[daip] help with FLAGR

Thu Nov 2 14:51:53 EST 2006

Alicia Berciano Alba writes:

 > MY VERSION OF HOW FLAGR WORKS:
 > 
 > Let's concentrate on OPTYPE TIME.

      This is the most important mode I suspect.

 > The data are divided into samples of SOLINT seconds.

      No - the data are averaged in each SOLINT interval.  The task
never looks at data on a finer interval.

 > For a fixed antenna, IF, POL and SOLINT, the task calculates the average 
 > amplitude, rms and weight of the data points contained in that SOLINT.

      Again, not exactly so.  It averages the baseline-based data over
that interval.  Then it computes the antenna-based amplitude, phase,
weight, rms, etc. assuming that such quantities are actually antenna
rather than baseline based.  These answers are written to the XX table
which you can keep and examine.  When all the data have been read,
averaged, converted to antenna based, and written to the XX file, the
task rereads the XX file into memory.
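
     As an illustration only, the two steps above (average the
baseline-based data over each SOLINT, then convert to antenna-based
values) can be sketched in Python.  This is not the code inside FLAGR.
The function names are hypothetical, and the conversion assumes that
each baseline amplitude is the product of two antenna amplitudes, that
every baseline is present, and that all weights are equal; the real
task also handles phase, weight and rms.

```python
import math
from collections import defaultdict


def average_by_solint(samples, solint):
    """Average complex visibilities per (interval index, baseline).

    samples: iterable of (time_sec, ant1, ant2, complex_visibility).
    Returns {(interval, (ant1, ant2)): mean complex visibility}.
    """
    sums = defaultdict(lambda: [0j, 0])
    for t, a1, a2, vis in samples:
        key = (int(t // solint), (a1, a2))
        sums[key][0] += vis
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


def antenna_amplitudes(baseline_amps):
    """Least-squares antenna amplitudes from baseline amplitudes.

    Assumes amp(i, j) = g_i * g_j, so log amp(i, j) = x_i + x_j.
    With every baseline present and equal weights, the normal
    equations have the closed-form solution used below.
    baseline_amps: {(ant1, ant2): positive amplitude}.
    """
    ants = sorted({a for pair in baseline_amps for a in pair})
    n = len(ants)
    if n < 3 or len(baseline_amps) != n * (n - 1) // 2:
        raise ValueError("need every baseline among at least 3 antennas")
    row = dict.fromkeys(ants, 0.0)   # sum of log amplitudes per antenna
    total = 0.0                      # sum of log amplitudes, all baselines
    for (a1, a2), amp in baseline_amps.items():
        log_amp = math.log(amp)
        row[a1] += log_amp
        row[a2] += log_amp
        total += log_amp
    s = total / (n - 1)              # sum of all x_i
    return {a: math.exp((row[a] - s) / (n - 2)) for a in ants}
```

     For example, baseline amplitudes of 6, 8 and 12 on baselines
1-2, 1-3 and 2-3 give antenna amplitudes of 2, 3 and 4 in this sketch.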

 > If those averages are larger or smaller than certain values that we fix, 
 > that sample (the data contained in that SOLINT for that particular antenna, 
 > IF and POL) is questionable. We also have the condition that compares 
 > different SOLINTs, but let's forget about it for now.

     Actually that is the most important function of TIME.

 > 
 > For each IF/POL, the task counts how many antennas have that SOLINT 
 > marked as questionable.
 > If there is one IF/POL for which the number of bad antennas divided by 
 > the total number of antennas is bigger than BPAR(2), that IF/POL is bad 
 > and therefore flagged.
 > I suppose that this kind of flagging corresponds to the line "antenna 
 > independent, IF dependent flags" that is printed by FLAGR at the end of 
 > the run, right?

      Not exactly.  The task examines first whether the number of
questionable correlators over all IFs and polarizations exceeds a
maximum and then flags that time as a whole.  Those flags are called
antenna & IF independent in the message.  If the total test does not
cause the whole time to be deleted, then each IF is examined
individually and those with too many questionables are flagged.  That
is the antenna independent IF dependent count.  Note that these flags
are intrinsically polarization dependent although that may be
overridden.
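
     The order of the two tests can be sketched as follows.  This is
only an illustration under stated assumptions: the function and its
two threshold arguments are hypothetical names that play the roles
this thread gives to BPARM(1) (whole time) and BPARM(2) (one IF), and
polarization is ignored for brevity.

```python
def decide_time_flags(questionable, max_frac_all, max_frac_if):
    """Decide what to flag at one time from questionable marks.

    questionable: {if_number: list of bools, one per correlator},
    where True means that correlator was marked questionable.
    Returns (kind, ifs) with kind one of "all", "per_if", "none".
    """
    n_total = sum(len(marks) for marks in questionable.values())
    n_bad = sum(sum(marks) for marks in questionable.values())

    # First test: too many questionables over all IFs together,
    # so the whole time is flagged (antenna & IF independent).
    if n_total and n_bad / n_total > max_frac_all:
        return "all", sorted(questionable)

    # Second test, only if the first did not fire: examine each IF
    # on its own (antenna independent, IF dependent).
    bad_ifs = [
        if_no
        for if_no, marks in sorted(questionable.items())
        if marks and sum(marks) / len(marks) > max_frac_if
    ]
    return ("per_if" if bad_ifs else "none"), bad_ifs
```

     With three of four correlators questionable in IF 1 and none in
IF 2, limits of 0.5 and 0.5 flag only IF 1, while lowering the first
limit to 0.3 flags the whole time.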

 > But what is really flagged? As far as i understand, the task flags all 
 > the data in that SOLINT at that particular IF/POL in all antennas, right?
 > Then, if the number of IF/POL that are bad divided by the total number 
 > of IF/POL is bigger than BPARM(1), all the data for that SOLINT is bad 
 > and should be flagged. So in that case all data inside that SOLINT is 
 > flagged in every antenna, IF and POL. That should correspond to the line 
 > "antenna & IF independent flags", right?
 > 

    Yes - except that the order is first to flag all, and then to flag
one at a time.

 > And finally, we search for "clearly bad data" over all remaining antennas 
 > (the ones that were not flagged yet, I suppose).

   Yes.

 > The difference is that now the conditions that we used before to decide 
 > if one SOLINT was questionable, are now enough to decide if that SOLINT

    No - different subscripts in the xPARM adverbs are used for
questionable and clearly bad. 

 > (for a particular antenna, IF and POL) is bad. So those should be the 
 > "antenna & IF dependent flags".

     Yes.

 > But what about the lines "bad times" and "antenna & IF dependent, 
 > all-times flags" that are also printed when the task ends? What exactly 
 > do they mean, and how are they calculated?

    The bad times count is just the number of times that had at least
one IF exceed the allowed questionable range, and the total flags
count is just the number of lines written to the FG table.  Bad times
should be, more or less, the sum of the antenna & IF independent and
the antenna independent, IF dependent counts.  The all-times flags are
just that - a correlator that seemed to have no valid data at any time
(perhaps because it is missing, or because it was so bad that it did
not fit the closure relations at all, etc.).
 > 
 > I know that there is something wrong in that description of the task.
 > My idea was that, if a SOLINT interval is bad (because for example the 
 > average amplitude is too high), all the data points contained in that 
 > SOLINT interval are flagged. But I made some tests and I was able to 
 > flag single outliers inside a SOLINT keeping the rest of good points 
 > unflagged.

    Not with FLAGR in this mode.  There is a mode that compares the
data at each time (averaged, actually) with the times around it.  That
could be set to do what you describe - but not the TIME mode.

 > But since the task is working all the time with averages calculated over 
 > the SOLINT interval, how can it flag only some outliers inside the SOLINT??

     It can't in this mode.
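
     A toy example shows why.  The sketch below is not FLAGR itself;
the function name and the simple lower and upper limits are
hypothetical.  The decision is made on the SOLINT average only, so
either every sample in the interval is flagged or none is.

```python
def flag_intervals(times, amps, solint, lo, hi):
    """Return one flag per sample, decided on SOLINT averages only."""
    # Group sample indices by the SOLINT interval they fall in.
    groups = {}
    for i, t in enumerate(times):
        groups.setdefault(int(t // solint), []).append(i)

    flags = [False] * len(times)
    for idx in groups.values():
        mean = sum(amps[i] for i in idx) / len(idx)
        if mean < lo or mean > hi:
            # The whole interval is flagged, good samples included.
            for i in idx:
                flags[i] = True
    return flags


times = [0, 1, 2, 3, 10, 11, 12, 13]
amps = [1, 1, 9, 1, 1, 1, 1, 1]   # one outlier in the first interval
print(flag_intervals(times, amps, 10, 0.5, 1.5))
print(flag_intervals(times, amps, 10, 0.5, 4.0))
```

     In the first call the outlier raises the average of the first
interval to 3, so all four samples in it are flagged.  In the second
call the average stays inside the limits and the outlier survives.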

 > And the last thing that puzzles me a lot is the use of CP(1) and CP(2). 
 > I thought that I could use those parameters to specify the amplitude 
 > levels which determine what is an outlier.
 > What I did was use FINDR to determine the average amplitude and rms, and 
 > then I defined CPAR(1) and CPAR(2) as follows:
 > cparm[1]=amp-sigma*rms
 > cparm[2]=amp+sigma*rms
 > 
 > In that way I wanted to make the task flag points whose amplitude 
 > differs from the average by more than "sigma" times the rms, in order 
 > to flag the outliers. However, that is not working.

   CPARM(1) and CPARM(2) are for questionable data; CPARM(3) and
CPARM(4) are for clearly bad data.
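
   A minimal sketch of that distinction, assuming (by analogy with the
cparm[1] and cparm[2] recipe quoted above) that the odd-numbered
values are lower limits and the even-numbered values are upper
limits.  The function is hypothetical and ignores any special meaning
the task may give to zero values.

```python
def classify(amp, cparm):
    """Classify one antenna-based average amplitude.

    cparm[0]..cparm[3] stand in for CPARM(1)..CPARM(4):
    questionable low, questionable high, bad low, bad high
    (the low/high ordering is an assumption of this sketch).
    """
    q_lo, q_hi, b_lo, b_hi = cparm[:4]
    if amp < b_lo or amp > b_hi:
        return "bad"            # flagged on its own
    if amp < q_lo or amp > q_hi:
        return "questionable"   # only counted toward the BPARM tests
    return "good"


limits = [0.7, 1.3, 0.5, 1.5]
print([classify(a, limits) for a in (1.0, 1.4, 2.0, 0.4)])
```

   In this sketch an amplitude outside the first pair of limits is
only marked questionable; it is flagged directly only when it also
falls outside the second pair.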

 > 
 > Apart from that, BPAR(1) and BPAR(2) were set to 1 to avoid any flagging 
 > using the definition of questionable data, and I did the same with 
 > DPAR(5) and DPAR(6) to forget about the closure errors.
 > The rest of BPAR, CPAR and DPAR are set to zero.

I hope this helps some.

Eric Greisen