[alma-config] Positive Comments on Frederik's ideas
John Conway
jconway at oso.chalmers.se
Mon Oct 8 11:05:32 EDT 2001
Hi,
So far most of the emails replying to Frederik's ideas have been
negative; this is not one of those emails! I think he has some very
important ideas to contribute which may well influence the design of
ALMA. Frederik gave me a draft copy of his second paper some weeks ago
and we have been having back-and-forth one-to-one discussions on some
points. Despite this, it is only in the last week that I have, I
think, understood more clearly what he is saying and its importance
(it has taken me a long time, I'm sorry Frederik). I've been trying to
explain the conventional view to him, talking too much and listening
too little. I believe there is a mental block concerning weighting,
affecting some of us old-timers, which prevents mutual understanding
between his point of view and ours.
This is a long email, so if you want to 'cut to the chase'
my main points about Frederik's ideas are in Section 4.
In some sense the arguments presented here argue for arrays which are
less centrally condensed than my beloved zoom arrays, at least in
their present strawman designs, so I'm not helping sell my earlier
designs - quite the opposite. It may be possible to 'close the gap'
and get all the operational and imaging advantages. If this is not
so, there may still be hard choices to make - perhaps between
operational capabilities and more complete uv coverage in the larger
array sizes. It is still better, I think, if we all fully understand
what the tradeoffs are, and understand each other's positions
properly, before making the final decisions.
1. History of Interferometry
As is well known to this group, my philosophy of imaging has been
close to that expressed by Dave, Mark and Leonia in recent emails. If
we go back to Martin Ryle's day in the 1960s/early 1970s, complete
sampling of the uv plane plus apodization was proposed as the -ONLY-
valid way to make images. In the late 1970s and 1980s non-linear
algorithms began to be used which gave higher sensitivity, higher
resolution, and didn't require you to make many tracks with, say,
WSRT to build up a complete uv coverage. All astronomers (except Ryle
and his crew), being a greedy lot, used the new powerful algorithms.
And they are powerful (see the snapshot restoration in
http://www.oso.chalmers.se/~jconway/ALMA/SIMULATIONS/SIM1/
with only 4032 uv samples, i.e. with a uv cell occupancy of about
0.05). Unfortunately the power is matched by the ugliness of
algorithms like CLEAN (applied to incompletely sampled data) and the
near impossibility of understanding what they do; hence a Faustian
deal was struck, because one is no longer sure what is real in one's
images.
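As an aside, the quoted occupancy number is easy to reproduce with a
few lines of numpy. In the sketch below both the antenna layout (a
random 64-antenna pattern) and the uv cell size are my own assumptions
for illustration, not the configuration actually used in the
simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 64-antenna layout: random positions in a unit disc.
# (The real simulation used an actual configuration design, of course.)
n_ant = 64
r = np.sqrt(rng.uniform(0, 1, n_ant))     # uniform density over the disc
th = rng.uniform(0, 2 * np.pi, n_ant)
x, y = r * np.cos(th), r * np.sin(th)

# Snapshot baselines, counting both (u,v) and (-u,-v): 64*63 = 4032.
u = x[:, None] - x[None, :]
v = y[:, None] - y[None, :]
off_diag = ~np.eye(n_ant, dtype=bool)
u, v = u[off_diag], v[off_diag]
print(u.size)                             # 4032 uv samples

# Grid into uv cells and count the occupied fraction of the uv disc.
# The cell size is an assumption, chosen so the disc holds ~8e4 cells.
n_cells = 320
u_max = np.hypot(u, v).max()
iu = np.clip(((u / u_max + 1) / 2 * n_cells).astype(int), 0, n_cells - 1)
iv = np.clip(((v / u_max + 1) / 2 * n_cells).astype(int), 0, n_cells - 1)
occupied = len(set(zip(iu.tolist(), iv.tolist())))
gi, gj = np.meshgrid(np.arange(n_cells), np.arange(n_cells))
in_disc = np.hypot(gi - n_cells / 2 + 0.5,
                   gj - n_cells / 2 + 0.5) <= n_cells / 2
occupancy = occupied / in_disc.sum()
print(round(occupancy, 3))                # of order 0.05 for this layout
```

The point is only the order of magnitude: a 64-antenna snapshot fills a
few percent of the uv cells out to the longest baseline.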
Frederik to some extent is advocating a return to the original
conception of interferometry. In the simplest form of his arguments
he advocates that we should aim for all uv cells out to the uv edge
being sampled (his second paper is more general, in that it assumes
that true, absolutely certain 'a priori' knowledge like positivity
can boost the imaging so that effectively unique images can be made
for a sampling density larger than alpha*Nyquist, where alpha is some
factor less than 1, to be discovered by simulations). The difference
now, however, compared to Ryle's day is that there are so many
baselines available that you can get both a moderately tapered uv
density distribution AND good sampling.
2. Deciding the Natural Taper
Much of the discussion back and forth on Frederik's ideas has been
about two areas which I would like to comment on quickly and then
sidestep before moving on. The two areas are 1) beam vs uv
optimisation, and 2) the advantages of complete versus partial
sampling of the uv plane. Considering point 1 - it's not an either/or
choice: as one goes to complete sampling with a Gaussian taper the
sidelobes go to zero as well. For point 2, again it's not an
either/or choice. The most compact arrays are going to be oversampled
in short observations whatever the detailed design. The only
question, which depends on the size of the natural taper chosen, is
how long one needs to synthesise to achieve completeness, and to what
maximum array size complete sampling is possible within a 6-hour
synthesis.
Frederik has realised that for ALMA, as long as the natural taper is
not too large, it's possible to have completely sampled arrays within
6hrs even out to 3km (for a 7dB natural taper, or 2km for a 10dB
natural taper). The length of observation it takes to reach full
sampling depends of course on the size of the natural taper in the uv
density distribution. For a natural taper of 0dB (uniform uv
distribution) the length of time is minimised and the maximum size of
an array which can achieve full sampling is maximised. The problem,
as we all know, is that such arrays will have large near-in
sidelobes. Either we apply CLEAN or MEM, which extrapolate the uv
data (and our image is not unique, and the whole point of aiming for
uniqueness is lost), or we weight down (apodise) long baselines,
which loses sensitivity.
Now I was surprised to realise, as noted in Frederik's paper (and as
he said at the telecon), that weighting down to 10dB at the edge from
a uniform distribution gives a loss of only 18% in sensitivity. To my
shame I have never bothered to actually calculate this loss; I have
always assumed it to be much larger than this - after all, the weight
at the edge is 10 times lower than in the centre, and the half-power
point of the density is at about half the outer radius. It looks like
the loss should be more than 18%, but one cannot argue with
mathematics. Again I must have missed something in my radio astronomy
education - as Stephane rightly noted at the telecon, a 10dB power
taper is typical for the illumination pattern of a single dish,
giving a good compromise between loss of sensitivity and low
sidelobes (sorry Stephane for not realising at the time that the
analogy is accurate).
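The 18% figure is easy to check numerically. Here is a sketch assuming
a circular Gaussian apodisation of an ideal uniform uv disc, 10dB down
at the edge (the exact taper profile is my assumption, and moves the
answer by a few percent):

```python
import numpy as np

# Checking the surprising 18% figure, assuming a uniform uv disc
# apodised by a circular Gaussian weight that is 10 dB down at the uv
# edge (the paper's exact taper profile may differ slightly).
a = np.log(10.0)                      # w(r) = exp(-a r^2); w(1) = 0.1
dr = 1.0 / 200000
r = (np.arange(200000) + 0.5) * dr    # midpoint rule over 0 <= r <= 1
w = np.exp(-a * r**2)

# Point-source sensitivity relative to natural (uniform) weighting is
# S = sum(w) / sqrt(N * sum(w^2)); for a continuous disc the sums
# become integrals with area element 2*pi*r dr.
num = np.sum(w * r) * dr
den = np.sqrt(np.sum(r) * dr * np.sum(w**2 * r) * dr)
loss = 1.0 - num / den
print(f"sensitivity loss: {loss:.1%}")   # 15.7% for this profile
```

This lands close to the quoted 18%, and either way the qualitative
surprise survives: a 10dB edge apodisation costs far less sensitivity
than intuition suggests.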
Still, none of us advocate aiming for uniform distributions for <4km
arrays, for all sorts of reasons:
1) Although an 18% sensitivity loss is smaller than I expected, it's
still unacceptable - we would not like to sacrifice the equivalent of
12 antennas out of 64.
2) A uniform uv density is anyway impossible to generate from the
autocorrelation pattern of antennas.
3) The maximum pad sharing for arrays a factor of 2 different in
resolution is only 25%, hence more resources are needed.
4) In contrast to uniform arrays, more centrally concentrated arrays
tend to have more short spacings (not a problem if you truly sample
all cells, but often the few cells that you tend to miss in an
automatic optimisation of cell occupancy are the ones in the centre,
which unfortunately are the most important ones to fill, as Mark
says).
5) A 0dB array is not needed to allow complete sampling out to 3km
(Boone 2001), so we can have higher tapers with their advantages.
Having rejected a 0dB array, the next question is what taper should
we choose? The PDR suggested 7dB-10dB. From the point of view of pad
sharing and allowing zoom array operation, somewhat larger tapers are
required (my strawmen have had 15dB-20dB tapers, but this could be
pushed down by careful design). Practical issues are important, but
for now let us put them aside and ask which is the optimum taper
solely from the point of view of imaging. First (Section 3) let us
make an important assumption - that we never allow ANY tapering of
the data, because we want ABSOLUTELY NO LOSS of sensitivity - and see
where it leads (to the conventional view expressed earlier by me,
Dave, Mark etc). Later (Section 4) we will allow some tapering and
see how it explains Frederik's view, and how allowing some tapering
is in fact justified.
3. The optimum taper assuming no reweighting and non-linear
deconvolution
Let us assume absolutely no reweighting and then try to estimate the
appropriate uv taper, which is effectively a constraint on the
near-in sidelobes. Dave Woody suggested the criterion that the taper
should be such that the near-in sidelobes are always smaller than the
far sidelobes; this seems a reasonable criterion, if not as
rigorously defined as perhaps one might like. In practice this is a
requirement that the near-in sidelobes are less than the far ones
even for a 6hr synthesis. This in turn requires a large taper, around
15dB or so (at a guess - I haven't got Dave's paper with me). The
consequence is that for all but the smallest arrays the outer part of
the uv plane is sampled much less densely than Nyquist, hence linear
methods can't be used. However, non-linear methods will tend to work
very well on such a uv distribution.
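The tradeoff behind this criterion, that a deeper edge taper buys
lower near-in sidelobes, can be seen in a throwaway 1-D model: a fully
sampled aperture with a Gaussian taper of varying depth at the edge.
The numbers below are for this toy, not for a real ALMA beam:

```python
import numpy as np

# 1-D toy: the "beam" is the Fourier transform of a filled, Gaussian-
# tapered 1-D aperture; we measure the highest near-in sidelobe as a
# function of the edge taper depth.
def near_in_sidelobe(taper_db, n=4096, m=64):
    u = np.linspace(-1.0, 1.0, 2 * m + 1)          # filled 1-D "uv disc"
    w = 10 ** (-(taper_db / 10.0) * u**2)          # -taper_db at the edge
    a = np.zeros(n)
    a[: 2 * m + 1] = w                             # zero-pad for resolution
    beam = np.abs(np.fft.fft(a))
    beam /= beam[0]                                # peak normalised to 1
    i = 0                                          # walk down the main lobe
    while beam[i + 1] < beam[i]:                   # stop at the first null
        i += 1
    return beam[i : n // 2].max()                  # highest sidelobe beyond

sidelobes = {t: near_in_sidelobe(t) for t in (0.0, 7.0, 10.0, 15.0)}
for t, s in sidelobes.items():
    print(f"{t:4.0f} dB taper -> near-in sidelobe {s:.3f}")
```

The untapered case reproduces the familiar ~22% sinc sidelobe, and the
sidelobe level falls steadily as the taper deepens; real, discretely
sampled 2-D coverage adds graininess on top of these idealised levels.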
In the above case the data reconstruction problem in the outer uv
plane becomes one of interpolation between sparsely sampled uv points
rather than extrapolation beyond a densely sampled edge. Common sense
and simulations suggest that such interpolation is much more robust
than extrapolation, especially after we have weighted down the outer
part of the uv plane in the convolution-by-the-CLEAN-beam step. One
might look at such a tapered uv distribution with maximum baseline
u_max and say that it has a horrible, ragged, poorly sampled outer uv
edge; Frederik argues there must come a point when the density is too
low. But one can ask the question: how would you rearrange the uv
points to have a higher, more uniform density, but still have the
same resolution? You have to move the uv points at u_max inward, so
now there are no uv points where the FT of the CLEAN beam reaches
-15dB. The alternative to having a low density in the region which
contributes to the CLEAN map at -15dB is having NO uv points in this
region. Surely a low density is better than zero density?
One criticism of the above argument for having a large taper (which
surprisingly has not been made, to my knowledge) is that we are
implicitly designing an array for high dynamic range imaging, and
many of the images we wish to make at millimetre wavelengths may well
be highly complex but may not require high dynamic range. If, as is
often the case at cm wavelengths, the image brightness is dominated
by a compact point source, then the near-in sidelobes in the dirty
map are large, and the appropriate criterion is that the near-in
sidelobes of the dirty beam after a long track are smaller than the
far sidelobes. However, if the source has only smooth structure, the
effect of the near-in sidelobes is much reduced. For such complex but
low dynamic range imaging, it might be thought better to concentrate
on achieving complete sampling rather than minimising the near-in
sidelobes. Effectively we are designing in an array taper to deal
with the highest dynamic range imaging; in cm imaging there is a
strong correlation between complex images and high dynamic range
ones, but such a correlation may not be so strong at mm wavelengths.
4. The optimum taper assuming some reweighting
In the telecon, when discussing the criterion for weighting, Dave
gave his 'rule of thumb' that NO loss of sensitivity is acceptable -
this translates to NO weighting. At our one and only configuration
meeting in Europe the same criterion was adopted, and it has for me
at least become almost a religious article of faith. I believe that
this assumption, rigorously adopted, leads to something of a mental
block, making us blind to the very useful ideas in Frederik's work.
As Dave said, most of the experiments done with ALMA will be
sensitivity limited, and so for those experiments indeed we don't
want to lose ANY sensitivity. However, these same experiments are
also low dynamic range. If we have arrays with natural tapers of
10dB, which can give complete sampling out to 2km (as proposed by
Boone), then these same arrays have natural-beam near-in sidelobes at
around 2%, and so one can image - without loss of sensitivity - even
point-dominated objects up to 50:1 dynamic range. Now, if one wants
to observe a brighter source requiring more dynamic range, one can
taper the uv data to reduce sidelobes without reducing the sampling
below Nyquist. The reweighting will of course decrease sensitivity,
but not by much: weighting by 5dB at the edge, so the effective taper
at the uv edge goes from 10dB to 15dB, ONLY LOSES 4% IN SENSITIVITY.
One can then win overall from the tapering, because even though
sensitivity is slightly worse, sidelobes and reconstruction errors
are reduced.
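The 4% figure can be checked the same way as the earlier sensitivity
calculation. A sketch, assuming Gaussian profiles for both the natural
uv density and the applied weight (the profiles are my assumption):

```python
import numpy as np

# Checking the 4% claim: data with an assumed Gaussian natural taper of
# 10 dB at the uv edge, reweighted by a further 5 dB Gaussian so the
# effective edge taper becomes 15 dB.  Relative to natural weighting of
# the same data, S = sum(d*w) / sqrt(sum(d) * sum(d*w^2)), where d is
# the natural uv density and w the applied weight.
b = np.log(10.0)          # d(r) = exp(-b r^2): -10 dB at the edge
c = np.log(10.0) / 2.0    # w(r) = exp(-c r^2):  -5 dB at the edge

dr = 1.0 / 200000
r = (np.arange(200000) + 0.5) * dr      # midpoint rule, area element r dr
d = np.exp(-b * r**2)
w = np.exp(-c * r**2)

num = np.sum(d * w * r) * dr
den = np.sqrt(np.sum(d * r) * dr * np.sum(d * w**2 * r) * dr)
loss = 1.0 - num / den
print(f"loss from the extra 5 dB of weighting: {loss:.1%}")   # about 3%
```

For these profiles the loss comes out a little over 3%, consistent
with the quoted "only 4%".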
There are two advantages to controlling the uv edge taper by a
weighting in the reconstruction algorithm rather than building it
into the array design. Firstly, it reduces the near-in sidelobes
without reducing the sampling rate at the uv edge, while only very
modestly increasing the noise; secondly, one can choose the amount of
taper to apply based on the type of source you are observing (higher
for point-dominated images where you want large dynamic range). In
contrast, if you build a large taper in from scratch, it's hard-wired
in metal and concrete.
5. Conditions for Oversampling and Linear Reconstruction Methods
One of the disagreements Frederik and I have had has been on the
nature of the sampling theorem. I will send, in a following email
today or tomorrow, a protomemo on this subject which I sent to him a
few weeks ago. The arguments are important because Frederik
calculated the conditions for complete sampling based on the
criterion of every uv cell out to the uv edge being sampled. I
believe that this condition may be somewhat too strict; it may be
possible to relax the criterion a bit, to say a maximum gap of 1.5
Nyquist at the uv edge, when one considers that any interpolation
error here is reduced by the effect of the restoring beam, and that
the total number of constraints (uv points) is much larger than the
number of unknowns. If this is so it would be good news for everyone,
because it would bridge the gap between the natural tapers of
7dB-10dB that Frederik calculated for complete sampling out to 2km or
3km based on cell occupancy arguments, the tapers of 10dB-15dB
calculated from requiring near-in sidelobes smaller than far
sidelobes, and the tapers of 15dB for self-similar zoom arrays.
I have been working on developing some methods of linear imaging
using matrix algebra, for imaging in the case of full sampling. I
have, I think, been having some success. Such a theory can answer
rigorously the question of when we are in the oversampled regime,
what increase in noise one gets, if any, etc. As Stephane mentioned,
I think it's possible using such an approach to show that the
increase in noise from linear imaging depends on the eigenvalue
spectrum of the dirty beam matrix compared to the eigenvalue spectrum
of the restoring beam, and hence on the sidelobe level of the dirty
beam. Hopefully I can find time in the next few days to write this up
so we can discuss it at the next telecon.
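To make the idea concrete, here is a toy 1-D sketch of the kind of
quantity such a matrix theory works with - my own illustration under
an assumed uv sampling, not the formalism described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D set-up: an image of npix pixels observed at nvis spatial
# frequencies, here a jittered, 3x-oversampled uv track (an assumption
# made purely for the demo).
npix, nvis = 32, 96
u = (np.arange(nvis) + rng.uniform(0, 1, nvis)) * npix / nvis - npix / 2
x = np.arange(npix)
A = np.exp(-2j * np.pi * np.outer(u, x) / npix)   # measurement matrix

# A^H A is the "dirty beam matrix": its rows are shifted copies of the
# dirty beam.  Its eigenvalue spectrum controls the noise increase of a
# linear (least-squares) image: noise in a weakly constrained image
# mode scales as 1/sqrt(eigenvalue).
lam = np.linalg.eigvalsh(A.conj().T @ A)
print(lam.min(), lam.max())      # all eigenvalues well above zero here

# Under-sampled case for contrast: with fewer visibilities than pixels
# the matrix is singular - some image modes are simply unconstrained,
# which is where non-linear algorithms have to invent information.
A2 = A[::4]                      # keep every 4th visibility (24 < 32)
lam2 = np.linalg.eigvalsh(A2.conj().T @ A2)
print(lam2.min())                # ~0: the linear problem is singular
```

In the oversampled case every eigenvalue is comfortably positive, so a
unique linear image exists with a bounded noise increase; eigenvalues
at zero flag the regime where uniqueness is lost.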
Cheers
John.