[alma-config] Positive Comments on Frederik's ideas
John Conway
jconway at oso.chalmers.se
Mon Oct 8 11:05:32 EDT 2001
Hi,
So far most of the emails replying to Frederik's ideas have been
negative; this is not one of those emails! I think he has some very
important ideas to contribute which may well influence the design of
ALMA. Frederik gave me a draft copy of his second paper some weeks ago
and we have been having back-and-forth one-to-one discussions on some
points. Despite this, it is only in the last week that I have, I
think, understood more clearly what he is saying and its importance
(it has taken me a long time, I'm sorry Frederik). I've been trying to
explain the conventional view to him, talking too much and listening
too little. I believe there is a mental block concerning weighting,
affecting some of us old-timers, which prevents mutual understanding
between his point of view and ours.
This is a long email, so if you want to 'cut to the chase'
my main points about Frederik's ideas are in Section 4.
In some sense the arguments presented here argue for arrays which are
less centrally condensed than my beloved zoom arrays, at least in
their present strawman designs, so I'm not helping sell my earlier
designs - quite the opposite. It may be possible to 'close the gap'
and get all the operational and imaging advantages. If this is not
so, there may still be hard choices to make - perhaps between
operational capabilities and more complete uv coverage in the larger
array sizes. It is still better, I think, if we all fully understand
what the tradeoffs are, and understand each other's positions
properly, before making the final decisions.
1. History of Interferometry
As is well known to this group, my philosophy of imaging has been
close to that expressed by Dave, Mark and Leonia in recent emails. If
we go back to Martin Ryle's day in the 1960s/early 1970s, complete
sampling of the uv plane plus apodization was proposed as the -ONLY-
valid way to make images. In the late 1970s and 1980s non-linear
algorithms began to be used which gave higher sensitivity, higher
resolution, and didn't require you to make many tracks with, say,
WSRT to build up a complete uv coverage. All astronomers (except Ryle
and his crew), being a greedy lot, used the new powerful algorithms.
And they are powerful (see the snapshot restoration in
http://www.oso.chalmers.se/~jconway/ALMA/SIMULATIONS/SIM1/
with only 4032 uv samples, i.e. with a uv cell occupancy of about
0.05). Unfortunately the power is matched by the ugliness of
algorithms like CLEAN (applied to incompletely sampled data) and the
near impossibility of understanding what they do; hence a Faustian
deal was struck, because one is no longer sure what is real in one's
images.
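As an aside, the quoted occupancy number is easy to reproduce with a
few lines of numpy. In the sketch below both the antenna layout (a
random 64-antenna pattern) and the uv cell size are my own assumptions
for illustration, not the configuration actually used in the
simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 64-antenna layout: random positions in a unit disc.
# (The real simulation used an actual configuration design, of course.)
n_ant = 64
r = np.sqrt(rng.uniform(0, 1, n_ant))     # uniform density over the disc
th = rng.uniform(0, 2 * np.pi, n_ant)
x, y = r * np.cos(th), r * np.sin(th)

# Snapshot baselines, counting both (u,v) and (-u,-v): 64*63 = 4032.
u = x[:, None] - x[None, :]
v = y[:, None] - y[None, :]
off_diag = ~np.eye(n_ant, dtype=bool)
u, v = u[off_diag], v[off_diag]
print(u.size)                             # 4032 uv samples

# Grid into uv cells and count the occupied fraction of the uv disc.
# The cell size is an assumption, chosen so the disc holds ~8e4 cells.
n_cells = 320
u_max = np.hypot(u, v).max()
iu = np.clip(((u / u_max + 1) / 2 * n_cells).astype(int), 0, n_cells - 1)
iv = np.clip(((v / u_max + 1) / 2 * n_cells).astype(int), 0, n_cells - 1)
occupied = len(set(zip(iu.tolist(), iv.tolist())))
gi, gj = np.meshgrid(np.arange(n_cells), np.arange(n_cells))
in_disc = np.hypot(gi - n_cells / 2 + 0.5,
                   gj - n_cells / 2 + 0.5) <= n_cells / 2
occupancy = occupied / in_disc.sum()
print(round(occupancy, 3))                # of order 0.05 for this layout
```

The point is only the order of magnitude: a 64-antenna snapshot fills a
few percent of the uv cells out to the longest baseline.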
Frederik to some extent is advocating a return to the original
conception of interferometry. In the simplest form of his arguments
he advocates that we should aim for all uv cells out to the uv edge
being sampled (his second paper is more general, in that it assumes
that true, absolutely certain 'a priori' knowledge like positivity
can boost the imaging so that effectively unique images can be made
for a sampling density larger than alpha*Nyquist, where alpha is some
factor less than 1, to be discovered by simulations). The difference
now, however, compared to Ryle's day is that there are so many
baselines available that you can get both a moderately tapered uv
density distribution AND good sampling.
2. Deciding the Natural Taper
Much of the discussion back and forth on Frederik's ideas has been
about two areas which I would like to comment on quickly and then
sidestep before moving on. The two areas are 1) beam vs uv
optimisation, and 2) the advantages of complete versus partial
sampling of the uv plane. Considering point 1 - it's not an either/or
choice: as one goes to complete sampling with a Gaussian taper the
sidelobes go to zero as well. For point 2, again it's not an
either/or choice. The most compact arrays are going to be oversampled
in short observations whatever the detailed design. The only
question, which depends on the size of the natural taper chosen, is
how long one needs to synthesise to achieve completeness, and to what
maximum array size complete sampling is possible within a 6-hour
synthesis.
Frederik has realised that for ALMA, as long as the natural taper is
not too large, it's possible to have completely sampled arrays within
6hrs even out to 3km (for a 7dB natural taper, or 2km for a 10dB
natural taper). The length of observation it takes to reach full
sampling depends of course on the size of the natural taper in the uv
density distribution. For a natural taper of 0dB (uniform uv
distribution) the length of time is minimised and the maximum size of
an array which can achieve full sampling is maximised. The problem,
as we all know, is that such arrays will have large near-in
sidelobes. Either we apply CLEAN or MEM, which extrapolate the uv
data (and our image is not unique, and the whole point of aiming for
uniqueness is lost), or we weight down (apodise) long baselines,
which loses sensitivity.
Now I was surprised to realise, as noted in Frederik's paper (and as
he said at the telecon), that weighting down to 10dB at the edge from
a uniform distribution gives a loss of only 18% in sensitivity. To my
shame I have never bothered to actually calculate this loss; I have
always assumed it to be much larger than this - after all, the weight
at the edge is 10 times lower than in the centre, and the half-power
point of the density is at about half the outer radius. It looks like
the loss should be more than 18%, but one cannot argue with
mathematics. Again I must have missed something in my radio astronomy
education - as Stephane rightly noted at the telecon, a 10dB power
taper is typical for the illumination pattern of a single dish,
giving a good compromise between loss of sensitivity and low
sidelobes (sorry Stephane for not realising at the time that the
analogy is accurate).
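The 18% figure is easy to check numerically. Here is a sketch assuming
a circular Gaussian apodisation of an ideal uniform uv disc, 10dB down
at the edge (the exact taper profile is my assumption, and moves the
answer by a few percent):

```python
import numpy as np

# Checking the surprising 18% figure, assuming a uniform uv disc
# apodised by a circular Gaussian weight that is 10 dB down at the uv
# edge (the paper's exact taper profile may differ slightly).
a = np.log(10.0)                      # w(r) = exp(-a r^2); w(1) = 0.1
dr = 1.0 / 200000
r = (np.arange(200000) + 0.5) * dr    # midpoint rule over 0 <= r <= 1
w = np.exp(-a * r**2)

# Point-source sensitivity relative to natural (uniform) weighting is
# S = sum(w) / sqrt(N * sum(w^2)); for a continuous disc the sums
# become integrals with area element 2*pi*r dr.
num = np.sum(w * r) * dr
den = np.sqrt(np.sum(r) * dr * np.sum(w**2 * r) * dr)
loss = 1.0 - num / den
print(f"sensitivity loss: {loss:.1%}")   # 15.7% for this profile
```

This lands close to the quoted 18%, and either way the qualitative
surprise survives: a 10dB edge apodisation costs far less sensitivity
than intuition suggests.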
Still, none of us advocate aiming for uniform distributions for <4km
arrays, for all sorts of reasons:
1) Although an 18% sensitivity loss is smaller than I expected, it's
still unacceptable - we would not like to sacrifice the equivalent of
12 antennas out of 64.
2) A uniform uv density is anyway impossible to generate from the
autocorrelation pattern of antennas.
3) The maximum pad sharing for arrays a factor of 2 different in
resolution is only 25%, hence more resources are needed.
4) In contrast to uniform arrays, more centrally concentrated arrays
tend to have more short spacings (not a problem if you truly sample
all cells, but often the few cells that you tend to miss in an
automatic optimisation of cell occupancy are the ones in the centre,
which unfortunately are the most important ones to fill, as Mark
says).
5) A 0dB array is not needed to allow complete sampling out to 3km
(Boone 2001), so we can have higher tapers with their advantages.
Having rejected a 0dB array, the next question is what taper should
we choose? The PDR suggested 7dB-10dB. From the point of view of pad
sharing and allowing zoom array operation, somewhat larger tapers are
required (my strawmen have had 15dB-20dB tapers, but this could be
pushed down by careful design). Practical issues are important, but
for now let us put them aside and ask which is the optimum taper
solely from the point of view of imaging. First (Section 3) let us
make an important assumption - that we never allow ANY tapering of
the data, because we want ABSOLUTELY NO LOSS of sensitivity - and see
where it leads (to the conventional view expressed earlier by me,
Dave, Mark etc). Later (Section 4) we will allow some tapering and
see how it explains Frederik's view, and how allowing some tapering
is in fact justified.
3. The optimum taper assuming no reweighting and non-linear
deconvolution
Let us assume absolutely no reweighting and then try to estimate the
appropriate uv taper, which is effectively a constraint on the
near-in sidelobes. Dave Woody suggested the criterion that the taper
should be such that the near-in sidelobes are always smaller than the
far sidelobes; this seems a reasonable criterion, if not as
rigorously defined as perhaps one might like. In practice this is a
requirement that the near-in sidelobes are less than the far ones
even for a 6hr synthesis. This in turn requires a large taper, around
15dB or so (at a guess - I haven't got Dave's paper with me). The
consequence is that for all but the smallest arrays the outer part of
the uv plane is sampled much less densely than Nyquist, hence linear
methods can't be used. However, non-linear methods will tend to work
very well on such a uv distribution.
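The tradeoff behind this criterion, that a deeper edge taper buys
lower near-in sidelobes, can be seen in a throwaway 1-D model: a fully
sampled aperture with a Gaussian taper of varying depth at the edge.
The numbers below are for this toy, not for a real ALMA beam:

```python
import numpy as np

# 1-D toy: the "beam" is the Fourier transform of a filled, Gaussian-
# tapered 1-D aperture; we measure the highest near-in sidelobe as a
# function of the edge taper depth.
def near_in_sidelobe(taper_db, n=4096, m=64):
    u = np.linspace(-1.0, 1.0, 2 * m + 1)          # filled 1-D "uv disc"
    w = 10 ** (-(taper_db / 10.0) * u**2)          # -taper_db at the edge
    a = np.zeros(n)
    a[: 2 * m + 1] = w                             # zero-pad for resolution
    beam = np.abs(np.fft.fft(a))
    beam /= beam[0]                                # peak normalised to 1
    i = 0                                          # walk down the main lobe
    while beam[i + 1] < beam[i]:                   # stop at the first null
        i += 1
    return beam[i : n // 2].max()                  # highest sidelobe beyond

sidelobes = {t: near_in_sidelobe(t) for t in (0.0, 7.0, 10.0, 15.0)}
for t, s in sidelobes.items():
    print(f"{t:4.0f} dB taper -> near-in sidelobe {s:.3f}")
```

The untapered case reproduces the familiar ~22% sinc sidelobe, and the
sidelobe level falls steadily as the taper deepens; real, discretely
sampled 2-D coverage adds graininess on top of these idealised levels.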
In the above case the data reconstruction problem in the outer uv
plane becomes one of interpolation between sparsely sampled uv points
rather than extrapolation beyond a densely sampled edge. Common sense
and simulations suggest that such interpolation is much more robust
than extrapolation, especially after we have weighted down the outer
part of the uv plane in the convolution-by-the-CLEAN-beam step. One
might look at such a tapered uv distribution with maximum baseline
u_max and say that it has a horrible, ragged, poorly sampled outer uv
edge; Frederik argues there must come a point when the density is too
low. But one can ask the question: how would you rearrange the uv
points to have a higher, more uniform density, but still have the
same resolution? You have to move the uv points at u_max inward, so
now there are no uv points where the FT of the CLEAN beam reaches
-15dB. The alternative to having a low density in the region which
contributes to the CLEAN map at -15dB is having NO uv points in this
region. Surely a low density is better than zero density?
One criticism of the above argument for having a large taper (which
surprisingly has not been made, to my knowledge) is that we are
implicitly designing an array for high dynamic range imaging, and
many of the images we wish to make at millimetre wavelengths may well
be highly complex but may not require high dynamic range. If, as is
often the case at cm wavelengths, the image brightness is dominated
by a compact point source, then the near-in sidelobes in the dirty
map are large, and the appropriate criterion is that the near-in
sidelobes of the dirty beam after a long track are smaller than the
far sidelobes. However, if the source has only smooth structure, the
effect of the near-in sidelobes is much reduced. For such complex but
low dynamic range imaging, it might be thought better to concentrate
on achieving complete sampling rather than minimising the near-in
sidelobes. Effectively we are designing in an array taper to deal
with the highest dynamic range imaging; in cm imaging there is a
strong correlation between complex images and high dynamic range
ones, but such a correlation may not be so strong at mm wavelengths.
4. The optimum taper assuming some reweighting
In the telecon, when discussing the criterion for weighting, Dave
gave his 'rule of thumb' that NO loss of sensitivity is acceptable -
this translates to NO weighting. At our one and only configuration
meeting in Europe the same criterion was adopted, and it has for me
at least become almost a religious article of faith. I believe that
this assumption, rigorously adopted, leads to something of a mental
block, making us blind to the very useful ideas in Frederik's work.
As Dave said, most of the experiments done with ALMA will be
sensitivity limited, and so for those experiments indeed we don't
want to lose ANY sensitivity. However, these same experiments are
also low dynamic range. If we have arrays with natural tapers of
10dB, which can give complete sampling out to 2km (as proposed by
Boone), then these same arrays have natural-beam near-in sidelobes at
around 2%, and so one can image - without loss of sensitivity - even
point-dominated objects up to 50:1 dynamic range. Now, if one wants
to observe a brighter source requiring more dynamic range, one can
taper the uv data to reduce sidelobes without reducing the sampling
below Nyquist. The reweighting will of course decrease sensitivity,
but not by much: weighting by 5dB at the edge, so the effective taper
at the uv edge goes from 10dB to 15dB, ONLY LOSES 4% IN SENSITIVITY.
One can then win overall from the tapering, because even though
sensitivity is slightly worse, sidelobes and reconstruction errors
are reduced.
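The 4% figure can be checked the same way as the earlier sensitivity
calculation. A sketch, assuming Gaussian profiles for both the natural
uv density and the applied weight (the profiles are my assumption):

```python
import numpy as np

# Checking the 4% claim: data with an assumed Gaussian natural taper of
# 10 dB at the uv edge, reweighted by a further 5 dB Gaussian so the
# effective edge taper becomes 15 dB.  Relative to natural weighting of
# the same data, S = sum(d*w) / sqrt(sum(d) * sum(d*w^2)), where d is
# the natural uv density and w the applied weight.
b = np.log(10.0)          # d(r) = exp(-b r^2): -10 dB at the edge
c = np.log(10.0) / 2.0    # w(r) = exp(-c r^2):  -5 dB at the edge

dr = 1.0 / 200000
r = (np.arange(200000) + 0.5) * dr      # midpoint rule, area element r dr
d = np.exp(-b * r**2)
w = np.exp(-c * r**2)

num = np.sum(d * w * r) * dr
den = np.sqrt(np.sum(d * r) * dr * np.sum(d * w**2 * r) * dr)
loss = 1.0 - num / den
print(f"loss from the extra 5 dB of weighting: {loss:.1%}")   # about 3%
```

For these profiles the loss comes out a little over 3%, consistent
with the quoted "only 4%".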
There are two advantages to controlling the uv edge taper by a
weighting in the reconstruction algorithm rather than building it
into the array design. Firstly, it reduces the near-in sidelobes
without reducing the sampling rate at the uv edge, while only very
modestly increasing the noise; secondly, one can choose the amount of
taper to apply based on the type of source you are observing (higher
for point-dominated images where you want large dynamic range). In
contrast, if you build a large taper in from scratch, it's hard-wired
in metal and concrete.
5. Conditions for Oversampling and Linear Reconstruction Methods
One of the disagreements Frederik and I have had has been on the
nature of the sampling theorem. I will send, in a following email
today or tomorrow, a protomemo on this subject which I sent to him a
few weeks ago. The arguments are important because Frederik
calculated the conditions for complete sampling based on the
criterion of every uv cell out to the uv edge being sampled. I
believe that this condition may be somewhat too strict; it may be
possible to relax the criterion a bit, to say a maximum gap of 1.5
Nyquist at the uv edge, when one considers that any interpolation
error here is reduced by the effect of the restoring beam, and that
the total number of constraints (uv points) is much larger than the
number of unknowns. If this is so it would be good news for everyone,
because it would bridge the gap between the natural tapers of
7dB-10dB that Frederik calculated for complete sampling out to 2km or
3km based on cell occupancy arguments, the tapers of 10dB-15dB
calculated from requiring near-in sidelobes smaller than far
sidelobes, and the tapers of 15dB for self-similar zoom arrays.
I have been working on developing some methods of linear imaging
using matrix algebra, for imaging in the case of full sampling. I
have, I think, been having some success. Such a theory can answer
rigorously the question of when we are in the oversampled regime,
what increase in noise one gets, if any, etc. As Stephane mentioned,
I think it's possible using such an approach to show that the
increase in noise from linear imaging depends on the eigenvalue
spectrum of the dirty beam matrix compared to the eigenvalue spectrum
of the restoring beam, and hence on the sidelobe level of the dirty
beam. Hopefully I can find time in the next few days to write this up
so we can discuss it at the next telecon.
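To make the idea concrete, here is a toy 1-D sketch of the kind of
quantity such a matrix theory works with - my own illustration under
an assumed uv sampling, not the formalism described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D set-up: an image of npix pixels observed at nvis spatial
# frequencies, here a jittered, 3x-oversampled uv track (an assumption
# made purely for the demo).
npix, nvis = 32, 96
u = (np.arange(nvis) + rng.uniform(0, 1, nvis)) * npix / nvis - npix / 2
x = np.arange(npix)
A = np.exp(-2j * np.pi * np.outer(u, x) / npix)   # measurement matrix

# A^H A is the "dirty beam matrix": its rows are shifted copies of the
# dirty beam.  Its eigenvalue spectrum controls the noise increase of a
# linear (least-squares) image: noise in a weakly constrained image
# mode scales as 1/sqrt(eigenvalue).
lam = np.linalg.eigvalsh(A.conj().T @ A)
print(lam.min(), lam.max())      # all eigenvalues well above zero here

# Under-sampled case for contrast: with fewer visibilities than pixels
# the matrix is singular - some image modes are simply unconstrained,
# which is where non-linear algorithms have to invent information.
A2 = A[::4]                      # keep every 4th visibility (24 < 32)
lam2 = np.linalg.eigvalsh(A2.conj().T @ A2)
print(lam2.min())                # ~0: the linear problem is singular
```

In the oversampled case every eigenvalue is comfortably positive, so a
unique linear image exists with a bounded noise increase; eigenvalues
at zero flag the regime where uniqueness is lost.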
Cheers
John.