[alma-config] reply to mark

John Conway jconway at oso.chalmers.se
Thu Oct 6 09:45:14 EDT 2005


Hi,

We have, I guess, an email bubble which must 'pop' soon,
as our energies for continuing it are exhausted.

Just to clarify things: the original email distributed by Al
contained an ALMA memo on the ACA and my ALMA SPEC document
on the main array configurations. The latter is an engineering
document; it was not really meant as a memo for general scientific
discussion on, say, the ALMA memo server. I do not believe the
intention of distributing it was to reopen the entire debate on the
design principles as agreed in the PDRs. I doubt the higher levels of
the project would appreciate it if we went to a completely different
design; however, I take Frederic's point that as one reduces the
number of antennas the design problems change. This was one of the
arguments I presented when it was suggested that ALMA might have as
few as 40 antennas - I argued that such a change would require a whole
re-think of the configurations, causing delays/costs (two words
project managers and politicians want to avoid). Going from 64 to 50
I think we are still just about OK; Frederic thinks it pushes us
(even more) over some edge.


We have been concentrating on the sampling issue (I may reply again
to Frederic's latest comments on this when I find the energy).
Just for the record, though, before it is forgotten, I wanted here to
comment on some of the other (non-Nyquist) points made in Frederic's
original email. These are (1) whether my reweighting scheme should
quote his memo 400, and (2) the other old argument about the increase
(or not) of noise in deconvolved images.

1) Frederic suggests that the optimised weighting I did in
my ALMA specifications document is not new and was first done
in the Appendices of his memo 400, and that therefore I should revise
my document to quote his. There must be some misunderstanding. The
only (slightly, for radio astronomy) novel aspect of what I did was
explicitly to choose weights for each uv point in a snapshot to
force the beam to a Gaussian shape. To control the ill-conditioned
nature of this problem I used a pseudo-inverse, selecting a number of
non-zero eigenvalues such that for any given desired accuracy of
modelling the beam a set of weights is chosen with minimum norm (and
hence minimum sensitivity loss). I don't see any of this discussion
in the appendices to memo 400.
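
(To make this concrete, here is a minimal numerical sketch, in
Python/numpy, of that kind of truncated pseudo-inverse weight fitting.
The random antenna layout, the target beam width and the singular
value cut-off are all invented for illustration; only the technique -
solve for minimum-norm weights that force the snapshot beam towards a
Gaussian - is the point.)

import numpy as np

# Hypothetical snapshot: a random antenna layout (purely illustrative).
rng = np.random.default_rng(1)
ants = rng.uniform(-1.0, 1.0, size=(20, 2))        # antenna positions
uv = np.array([a - c for i, a in enumerate(ants)   # baselines (u, v)
               for c in ants[i + 1:]])

# Image-plane points at which the snapshot beam is sampled.
l = np.linspace(-0.5, 0.5, 31)
L, M = np.meshgrid(l, l)
lm = np.column_stack([L.ravel(), M.ravel()])

# Each column of A is one baseline's contribution to the beam
# (real part only, assuming conjugate-symmetric uv coverage).
A = np.cos(2 * np.pi * lm @ uv.T)

# Target beam: a Gaussian of chosen width.
sigma = 0.08
b = np.exp(-(lm ** 2).sum(axis=1) / (2 * sigma ** 2))

# Truncated-SVD pseudo-inverse: keep singular values above a tolerance.
# This gives the minimum-norm weight vector for the requested beam
# accuracy, and minimum norm means minimum sensitivity loss.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
keep = s > 0.01 * s[0]
w = Vt[keep].T @ ((U[:, keep].T @ b) / s[keep])

# Sensitivity penalty relative to natural (equal) weighting.
penalty = np.sqrt(len(w) * np.sum(w ** 2)) / abs(np.sum(w))
print("kept", int(keep.sum()), "of", len(s), "singular values;",
      "beam rms error %.3f;" % np.sqrt(np.mean((A @ w - b) ** 2)),
      "sensitivity penalty %.2f" % penalty)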

Appendix A of memo 400 first discusses the idea of weighting the uv
densities to force a Gaussian-like beam. This indeed is not new
(as the memo admits); it was known and done by Martin Ryle in the
1950s, and Biggs of course discussed it at length in his thesis.
Because this is so well established, in my document I did not bother
giving this idea a reference and just took it for granted. Appendix
B of memo 400 then discusses modifying how the weights are adjusted
by giving them a 'floor' or 'coupling constant' (which I don't fully
understand). Both are interesting, but neither, I think, is related
to my idea of using a pseudo-inverse and fitting directly to the
desired beam shape (rather than forcing a given distribution in the
gridded uv weights).

Most implementations of beam forcing in radio astronomy generally
force uniform uv density on a grid, then Gaussian taper the result
(this is, I think, what is described in memo 400). This is implemented
in AIPS and all other radio astronomy packages as a standard option.
In my document I did something more like what is done in radar beam
forming from arrays. It was a quick way for me to calculate the
sensitivity loss for my snapshot configurations, which was 100%
accurate and avoided the need for me to write a gridding routine. For
snapshots it is likely more efficient and gives slightly lower noise
than uv grid methods, because in these the need to specify a grid
makes them sub-optimum (the 'uv density' is not well defined when one
only has a few uv points per grid cell, and the position of points
within a grid cell is not taken into account).
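
(For comparison, here is a rough sketch of the standard gridded scheme
just described - uniform weighting per cell followed by a Gaussian uv
taper - together with the point-source sensitivity penalty of any set
of weights. The cell size and taper width are arbitrary illustration
values; real packages do this with more care.)

import numpy as np

def uniform_taper_weights(u, v, cell, taper_sigma):
    """Gridded weighting: 1 / (points per cell), then a Gaussian uv
    taper.  'cell' and 'taper_sigma' are in the same units as u, v."""
    iu = np.round(u / cell).astype(int)
    iv = np.round(v / cell).astype(int)
    counts = {}
    for key in zip(iu, iv):                 # count uv points per cell
        counts[key] = counts.get(key, 0) + 1
    w = np.array([1.0 / counts[key] for key in zip(iu, iv)])
    return w * np.exp(-(u ** 2 + v ** 2) / (2 * taper_sigma ** 2))

def sensitivity_penalty(w):
    """Point-source noise increase relative to natural weighting."""
    return np.sqrt(len(w) * np.sum(w ** 2)) / abs(np.sum(w))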

It is not clear what is best for long tracks. Although one could try
to adapt my method, gridding methods are probably more efficient in
this case, but perhaps should be perfected by using multi-size grids
(similar to what is used in Frederic's uv coverage optimisation
software).

Again, as I note in my document, my idea is not new but really taken
from radar (and other) beam forming. The difference with these fields
is of course that they usually only care about the instantaneous beam
shape, whereas we (often) also care about the beam shape after many
hours, after combining many different rotations and stretches of the
array. Still, my approach might be usefully incorporated in some radio
astronomy packages for optimising the weighting of snapshots (maybe it
is already; I'm afraid I'm only up to date on classic AIPS), and maybe
can be developed further to apply to long tracks.


2) The question of the minimum increase in noise in deconvolved
images due to the effective 're-weighting' of the uv data going from
natural uv weights to clean beam weights.

In the appendix of my ALMA Spec document I argue that in some cases,
in doing deconvolution, one does NOT get an increase in noise equal to
or greater than the increase expected from re-weighting the observed
uv weight density to the clean beam weight density. Again, for ALMA we
are talking about a small effect, because the ALMA natural uv coverage
in most configurations is pretty close to Gaussian. For the VLA, with
a closer to power-law density, the effect is larger (I think I
remember estimating 3% once, but I would have to check); such an
effect, although not dominant, should be noticeable by observers.

So to state the problem: Frederic takes as a starting point in his
papers/memos that in all cases when imaging there must be a minimum
increase in noise, which is that given by the re-weighting of uv
density between the natural uv density and the clean beam uv density.
It can be larger than this, because in deconvolution we must
interpolate and extrapolate between uv points, but it can never be
smaller than this limit.
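
(The quantity the hypothesis refers to can be written down directly:
if the per-visibility weights change from their natural values to some
target set, the point-source noise grows by the factor computed below.
This little function is my own paraphrase of the claimed floor, not
code taken from Frederic's memos.)

import numpy as np

def reweighting_noise_factor(w_natural, w_target):
    """Noise increase implied by changing per-visibility weights from
    w_natural to w_target (in the point-source sense).  With
    w_natural = 1 everywhere this is the floor the hypothesis asserts
    for any deconvolved image."""
    def snr(w):
        w = np.asarray(w, dtype=float)
        return np.sum(w) / np.sqrt(np.sum(w ** 2))
    return snr(w_natural) / snr(w_target)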

This hypothesis sounds very reasonable and correct, and the natural
inclination is to think it is true. Still, our hunches are not
rigorous proofs. So let us consider this hypothesis for a moment and
see if it can be proved or disproved. First, can it be proved?
It can, I expect, be proved for linear methods of reconstruction (see
my definition of linear in yesterday's email), since most of these
boil down to re-weighting the uv data. For a general non-linear
algorithm, however, I would not even know how to start such a proof
(perhaps a clever mathematician can make progress). What about
disproving it? That is easier: since it claims to be a universal
principle, one only needs to find counter-examples to disprove it.

When we first discussed this many years ago at the Grenoble PDR, I
(and others) immediately wondered whether Frederic's proposed
hypothesis agreed with our experience of mapping compact sources with
CLEAN. It is natural, when hearing a new hypothesis, to check whether
it agrees with one's experience. The hypothesis sounded reasonable yet
did not seem to agree with thought experiments; this is a paradox, and
studying such apparent paradoxes is always fruitful in arriving at a
deeper understanding.

Notice that the hypothesis says nothing about the structure of the
source; the expected minimum increase in noise depends only on the
starting and finishing uv densities. It must therefore apply to all
types of source structure. Say I go to the VLA to observe what I hope
will be an interesting source. I know the expected noise and blindly
CLEAN the uv data with a stopping criterion of 3 sigma on the noise.
It turns out that the source is unresolved; then, as freely admitted
by Frederic, there is no noise increase in this case. If I repeat the
experiment many times, and so from an ensemble average form the rms
noise on the point source flux density, it will be the dirty map
noise. Frederic suggests this is because in this case of a point
source the process reduces to 'model fitting', and his principle,
which applies only to 'imaging', no longer applies. BUT I did not
specify any different algorithm to apply, and as far as I know MX has
no code for detecting point sources and then applying a different
algorithm. Instead CLEAN applies as it always does, finding the peak,
removing gain times beam, etc. etc.; in what sense then is this not
CLEAN imaging?
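
(A toy version of this thought experiment is easy to simulate. The
sketch below uses a one-dimensional 'interferometer' with invented
numbers and a bare-bones Hogbom CLEAN; the point is only to compare
the ensemble scatter of the restored peak with the dirty-map noise.)

import numpy as np

rng = np.random.default_rng(42)

# A toy 1D interferometer (all numbers purely illustrative).
N_vis = 200                                # visibilities per snapshot
u = rng.normal(0.0, 20.0, N_vis)           # 1D spatial frequencies
M = 256                                    # image pixels
x = (np.arange(M) - M // 2) / M            # image coordinates

# Dirty beam on a double-size grid so it can be shifted to any pixel.
xb = (np.arange(2 * M - 1) - (M - 1)) / M
beam = np.cos(2 * np.pi * np.outer(xb, u)).mean(axis=1)

def dirty_image(vis):
    return np.real(np.exp(2j * np.pi * np.outer(x, u)) @ vis) / N_vis

def hogbom(dirty, gain=0.1, threshold=0.0):
    """Plain Hogbom CLEAN: find peak, subtract gain*peak*beam, repeat."""
    res = dirty.copy()
    model = np.zeros_like(dirty)
    for _ in range(10000):
        p = int(np.argmax(np.abs(res)))
        if abs(res[p]) < threshold:
            break
        c = gain * res[p]
        model[p] += c
        res -= c * beam[M - 1 - p : 2 * M - 1 - p]   # beam centred on p
    return model, res

# Ensemble of noisy observations of a 1 Jy point source.
S, sigma_vis = 1.0, 1.0
sigma_map = sigma_vis / np.sqrt(N_vis)      # expected dirty-map noise
cb = np.exp(-0.5 * (xb * M / 4.0) ** 2)     # clean beam, 4-pixel sigma
peaks = []
for _ in range(200):
    noise = rng.normal(0, sigma_vis, N_vis) + 1j * rng.normal(0, sigma_vis, N_vis)
    model, res = hogbom(dirty_image(S + noise), threshold=3 * sigma_map)
    # Restore: convolve the model with the clean beam, add residuals.
    restored = np.array([model @ cb[M - 1 - j : 2 * M - 1 - j]
                         for j in range(M)]) + res
    peaks.append(restored[M // 2])

print("scatter of restored peak:", np.std(peaks))
print("dirty-map thermal noise :", sigma_map)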

BUT even if one argues that point sources are somewhat special for
some reason - no problem - we simply repeat the experiment, now for a
compact source extended over a beam or two (and a few pixels). Before
the stopping criterion is reached only a few pixel values are
adjusted, and there is simply no way that with these few degrees of
freedom we can fit all the noise in the uv data. The clean components
will only contain a fraction of the noise, and so on convolution with
the clean beam the total noise will only be very slightly increased.
Now, with standard CLEAN, if instead we have a source with emission
throughout the primary beam and I clean very deeply, so that the
residuals are less than the thermal noise, then I will be in the
opposite regime and I will recover the weighting loss that Frederic
expects (actually this process often takes an enormous number of CLEAN
iterations and is rarely done, although in many cases it maybe should
be), which is one reason that even for sources with extended emission
one does not notice an increase over the dirty image noise.
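
(The degrees-of-freedom point can also be checked directly: fitting k
free delta-function amplitudes to N noisy visibilities by least
squares absorbs, on average, only about k/N of the noise variance, so
with only a few components almost all the noise stays in the
residuals. The numbers below are again just for illustration.)

import numpy as np

rng = np.random.default_rng(0)
N_vis, k = 400, 6                # visibilities vs. free pixel amplitudes
u = rng.normal(0, 20, N_vis)     # 1D spatial frequencies
x_pix = np.linspace(-0.02, 0.02, k)    # a compact patch of k pixels

# Response of each pixel's delta function at each visibility.
A = np.exp(-2j * np.pi * np.outer(u, x_pix))

absorbed = []
for _ in range(500):
    n = rng.normal(0, 1, N_vis) + 1j * rng.normal(0, 1, N_vis)  # pure noise
    c, *_ = np.linalg.lstsq(A, n, rcond=None)                   # best fit
    absorbed.append(1 - np.var(n - A @ c) / np.var(n))
print("fraction of noise variance absorbed:", np.mean(absorbed),
      " (compare k/N =", k / N_vis, ")")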

A fundamental difference with Frederic, I think, is that I see the
purpose of all deconvolution algorithms as -fitting the data-; in that
sense all are model fitting methods, albeit with a particular image
representation (a bed of delta functions). Frederic, as far as I can
deconstruct it, starts from the idea of dirty imaging (FTing the
naturally weighted uv data), with the purpose of deconvolution being
to fix up missing regions and extrapolate slightly. Actually I see the
dirty image as the very odd and problematic construct: we convert
measurements of the FT of the true source into delta functions in the
uv domain whose -area- is equal to the visibility, and then we FT
those. The back FT of the dirty map never equals the uv measurements.

Back to CLEAN. If the source consists of a bunch of compact sources
(as CLEAN was designed for in 1974) and I clean with a 3 sigma
stopping criterion and restore with a clean beam, the noise on each of
the compact sources will be close to the dirty map value (this follows
from the degrees of freedom argument again). What is going on here?
How does it avoid the 'Boone limit' due to uv re-weighting? I think
what is going on is that CLEAN is doing a very effective separation of
signal and noise; what is left in the residuals after the stopping
criterion is reached is just noise. Boring old CLEAN, in the case of a
'measles' source, is working as an effective 'denoising' algorithm. Of
course, as soon as I have extended structure, the only way I have with
standard CLEAN of removing its sidelobes is to clean deeply into the
noise, and then I must get a noise increase. I suspect, though, that
we don't often see this in published maps because observers don't
clean deep enough into their residuals, and in published maps the
extended stuff which is contoured is still mostly in the residuals
(this is, I guess, sort of OK if the sidelobes of the extended stuff
are below the noise).

Modern variants on CLEAN, like multi-resolution and wavelet CLEAN, I
believe can act as effective denoising algorithms for images with
structure on many scales (for deconvolution this 'denoising' effect
counterbalances the increase in noise from the 'Boone effect'). If one
looks at the structure on different scales, the signal can be
identified above the noise and separated from it.

The idea of separating signal and noise may be shocking to many
astronomers. There is a common-sense, hair-shirt view that noise is
god-given and that there is nothing you can do to reduce it; it can
only be increased. This idea, prevalent in signal processing in the
1960s and 1970s and in many undergraduate courses today, is not, I
think, the view of modern signal processing. Given the problem
g = f*h + n and some knowledge of the statistics of n (and hopefully
of the image), one can make an estimate of f, called f', and also some
residuals that approximate n. The special case of h = delta function
is the problem of removing additive noise ('denoising'). If the Boone
hypothesis were a universal truth for all possible non-linear
algorithms, then a consequence would be that denoising is impossible;
instead it is now a well-practised and well-established procedure (see
many references on ADS). Denoising methods based on wavelets are now
very well established and very well proven mathematically. One does a
wavelet transform and then, based on the rms and statistical form of
the noise, there are well-established criteria for selecting on each
scale the coefficients which give 'signal' and those that give noise,
and discarding the noise-like terms (note the similarity with
multi-resolution and wavelet CLEAN).
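
(As an illustration of that standard recipe, here is a few-line
wavelet denoising sketch using the PyWavelets package and the textbook
'universal' soft threshold. The test signal and noise level are
invented, and in practice one would estimate the noise rms from the
data rather than assume it.)

import numpy as np
import pywt

rng = np.random.default_rng(3)

# A toy 1D signal with structure on two scales, plus Gaussian noise.
t = np.linspace(0.0, 1.0, 1024)
f = np.exp(-((t - 0.3) / 0.01) ** 2) + 0.5 * np.exp(-((t - 0.7) / 0.1) ** 2)
sigma = 0.1
g = f + rng.normal(0.0, sigma, t.size)

# Wavelet transform, soft-threshold the detail coefficients on every
# scale against the expected noise level, then transform back.
coeffs = pywt.wavedec(g, "db4", level=6)
thresh = sigma * np.sqrt(2.0 * np.log(g.size))      # 'universal' threshold
coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                        for c in coeffs[1:]]
f_hat = pywt.waverec(coeffs, "db4")[: g.size]

print("rms error, noisy data   :", np.std(g - f))
print("rms error, after denoise:", np.std(f_hat - f))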

While many may admit that some form of denoising may exist as some
esoteric effect, the resolution of the CLEAN compact-source paradox
suggests that it operates even for boring old standard CLEAN when
applied to a limited class of sources. New multi-resolution
deconvolution methods hold the promise that it will also apply to a
wider range of source structures.

   John