[Difx-users] Treatment of invalid sample bits in DiFX

Tue Aug 16 06:00:49 EDT 2016

Hi Adam,

Thank you for your explanation! Now things are quite clear. Anyway, these slightly different treatments won’t affect the final data quality very much.

According to the DiFX dokuwiki page:

http://www.atnf.csiro.au/vlbi/dokuwiki/doku.php/difx/amplitudescaling

Step 2 on that page says that after each accumulation period (AP), the visibility is divided by the number of valid bits in that AP. I would like to know whether the same treatment is applied to the autocorrelations. I ask because the cross spectrum will be affected if it is normalised by the auto spectra.
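
To make the concern concrete, here is a minimal Python sketch with made-up numbers. It is not DiFX code: the helper names, the toy values, and the simplification that both stations share the same valid-sample count are all my own assumptions. It only shows that the normalised cross-correlation changes if the cross and auto spectra are not scaled in the same way.

```python
import math

def scale_by_valid(accumulated, n_valid):
    """Divide an accumulated quantity by the valid-sample count (hypothetical helper)."""
    if n_valid == 0:
        raise ValueError("no valid samples in this AP")
    return accumulated / n_valid

def normalised_cross(cross, auto1, auto2):
    """Cross power normalised by the geometric mean of the two auto powers."""
    return cross / math.sqrt(auto1 * auto2)

# Toy AP (made-up numbers): 1000 samples possible, 900 of them valid.
n_total, n_valid = 1000, 900
per_sample_cross, per_sample_auto = 0.01, 1.0

# Accumulated sums grow with the number of valid samples.
acc_cross = per_sample_cross * n_valid
acc_auto = per_sample_auto * n_valid

# Case A: cross and autos are all divided by the valid count.
rho_consistent = normalised_cross(
    scale_by_valid(acc_cross, n_valid),
    scale_by_valid(acc_auto, n_valid),
    scale_by_valid(acc_auto, n_valid),
)

# Case B: cross divided by the valid count, autos by the total count.
rho_mixed = normalised_cross(
    scale_by_valid(acc_cross, n_valid),
    acc_auto / n_total,
    acc_auto / n_total,
)

print(rho_consistent)  # close to 0.01
print(rho_mixed)       # close to 0.01 / 0.9, i.e. biased upwards
```

In case B the result is off by the factor n_total / n_valid, which is why I would like to know how the autocorrelations are scaled.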

Best wishes,
Lei

> On Aug 8, 2016, at 8:34 PM, Adam Deller <deller at astron.nl> wrote:
> 
> Hi Lei,
> 
> I'm back from holidays this week and have had a second now to look at this question.  I concur with what Walter had already written: as long as there was at least some good data unpacked, then in DiFX the FFT will be performed and the datastreams cross-multiplied to produce visibilities.
> 
> I think the misunderstanding may indeed just be with the very first FFT.  The basic station-based processing loop in DiFX is:
> 
> 1. Unpack required samples
> 2. Process (FFT, fringe rotate, etc)
> repeat
> 
> In step 1, unpacking the required samples, if the first sample requested is located before the beginning of the datastream, nothing is done: weights are set to zero and that FFT is discarded.  So indeed, the very first "possible" FFT (the one which would commence with data from before the recorded datastream begins, but that partially overlaps the beginning of the recorded data) is in essence completely flagged.  Since recordings are typically assumed to be many seconds long and FFTs typically a few microseconds, this amounts to the loss of a tiny fraction of a percent of data, which was deemed acceptable when designing the system.  
> 
> (As an aside, the same logic applies to the very last FFT in a recording; the FFT which would overlap beyond the end of the recorded data is also totally discarded, because the data is never unpacked).
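
The edge behaviour described above can be sketched in a few lines of Python. This is an illustration only, not the DiFX implementation: the function name, the sample-index bookkeeping, and the toy numbers are assumptions, and it ignores the fractional weights that apply when frames are lost in the middle of a recording.

```python
def fft_weights(rec_start, rec_end, first_fft_start, n_ffts, fft_len):
    """Return one weight per FFT: 1.0 if it would be unpacked, 0.0 if discarded.

    The recording is assumed to cover sample indices [rec_start, rec_end).
    An FFT is discarded if it starts before the recording begins or would
    run past its end.
    """
    weights = []
    for k in range(n_ffts):
        begin = first_fft_start + k * fft_len
        end = begin + fft_len
        if begin < rec_start or end > rec_end:
            weights.append(0.0)  # nothing unpacked, FFT flagged
        else:
            weights.append(1.0)
    return weights

# Toy case: recording covers samples [76, 4000), FFTs of 1024 samples from 0.
w = fft_weights(rec_start=76, rec_end=4000, first_fft_start=0,
                n_ffts=4, fft_len=1024)
print(w)  # [0.0, 1.0, 1.0, 0.0]
```

In this toy case the first FFT has 948 of 1024 samples available (0.92578125) and is still dropped, and the last one is dropped because it would run past the end of the recording.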
> 
> Does this solve the discrepancy for you?
> 
> Cheers,
> Adam
> 
> On Tue, Aug 2, 2016 at 9:34 AM, Lei Liu <thirtyliu at gmail.com> wrote:
> Dear Walter,
> 
> I read the data efficiency of every integration period from the type_120 records of the post-processing Mk4 files. The Mk4 files are generated with difx2mark4 from the DiFX visibility output, in which the weight is copied from the SWIN header. The details of the sample data are:
> 
> code: geodetic observation IVS k14349
> station: Ny, Sh, Ts, Wz (Wz is Mark5a format, others are Mark5b format)
> bandwidth: 8 MHz
> sample bits: 1
> integration time: 0.983040 s
> FFT size: 1024
> 
> Target integration: scan 3 (07h08m49s - 07h09m09s), second integration (0.983040s - 1.96608s)
> 
> I wrote a small Python program to decode the raw Mark5b data of three stations (Ny, Sh, Ts) and to do integral sample time compensation (ISTC) based on the delay model and clock. I did not choose the first integration period because Ny and Sh start recording at 1 s.
> 
> As you suggested, I carefully checked the tvg flag and crcc of every frame, to guarantee that tvg is always 0 and crcc is correct. A buffer is created to hold all sample bits in the integration time range. A sample bit is valid if the corresponding data frame is present and passes these checks.
> 
> Below is the dumped information for the three stations over the integration time range:
> 
> Ny: total sample bits: 15728640, valid sample bits: 15208929, fraction: 0.966958
> Sh: total sample bits: 15728640, valid sample bits: 15241140, fraction: 0.969006
> Ts: total sample bits: 15728640, valid sample bits: 15728640, fraction: 1.0
> 
> If an FFT frame is composed of valid sample bits only, it is regarded as a valid FFT frame; otherwise it is invalid. The integration time range is composed of 15360 FFT frames. Below is the FFT frame information for the three stations:
> Ny: total FFT: 15360, valid FFT: 14852, fraction: 0.966927
> Sh: total FFT: 15360, valid FFT: 14883, fraction: 0.968945
> Ts: total FFT: 15360, valid FFT: 15360, fraction: 1.0
> 
> I dumped the corresponding data efficiency from the generated post-processing Mk4 files for the second integration period:
> Ny-Ts: 0.966927
> Sh-Ts: 0.968945
> This is consistent with the valid FFT frame fractions rather than the valid sample bit fractions, which means that, AT LEAST at the beginning of a scan, an FFT frame is discarded if it contains any invalid sample bits. In fact, for the first FFT frame in which valid sample bits are present, the fraction of valid sample bits is 0.92578125 (948 of 1024), but the frame is still discarded. This is why I say that MAYBE the treatment in DiFX is slightly different at the beginning of the data.
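
For reference, the two fractions compared above can be computed from a per-sample validity mask roughly as follows. This is a minimal sketch and not the actual decoding program: the function name and the toy mask are assumptions made for illustration.

```python
def valid_fractions(valid, fft_len):
    """Return (valid-sample fraction, valid-FFT-frame fraction).

    valid is a list of booleans, one per sample. An FFT frame counts as
    valid only if every sample in it is valid. Any trailing samples that
    do not fill a whole FFT frame are ignored.
    """
    n_ffts = len(valid) // fft_len
    if n_ffts == 0:
        raise ValueError("fewer samples than one FFT frame")
    used = valid[: n_ffts * fft_len]
    n_valid_samples = sum(used)
    n_valid_ffts = sum(
        all(used[k * fft_len:(k + 1) * fft_len]) for k in range(n_ffts)
    )
    return n_valid_samples / len(used), n_valid_ffts / n_ffts

# Toy mask: 4 FFT frames of 8 samples, 2 samples missing in the first frame.
fft_len = 8
valid = [True] * 32
valid[0] = valid[1] = False
print(valid_fractions(valid, fft_len))  # (0.9375, 0.75)
```

The sample fraction only loses the two missing samples, while the FFT frame fraction loses the whole first frame, which is the same kind of gap as between 0.966958 and 0.966927 for Ny.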
> 
> Best wishes,
> Lei
> 
> 
> > On Aug 2, 2016, at 1:03 AM, Walter Brisken <wbrisken at nrao.edu> wrote:
> >
> >
> > Where are you inspecting the data?  If you are looking at it in AIPS then it could be the WTTHRESH parameter of FITLD that is dropping records with reduced weight.
> >
> > It is true that for some formats (e.g., Mark5B and VDIF), if weights get very low and the continuity of the data stream cannot be established, some good data will be thrown out, but this would only be in cases where data loss is very high and data frames are very sporadic.  Other than that there is no dependence on data format.
> >
> > Note that for Mark5B and VDIF data the decoders honor the data invalid bit (called TVG bit for Mark5B) and will discard data with these set.
> >
> > -W
> >
> > On Tue, 2 Aug 2016, Lei Liu wrote:
> >
> >> Dear Walter,
> >>
> >> Thank you for your explanation! According to my observation (I read the data efficiency of the integration periods within one scan from DiFX's output), the first FFT frame, which is partly filled with valid sample bits from the beginning of the scan data, is discarded, even if the fraction of valid sample bits in this FFT frame is more than 90 percent. Otherwise I cannot explain the data efficiency of the integration period: it is the ratio of valid FFT periods to total FFT periods in one integration period, instead of the ratio of valid sample bits to total sample bits in one integration period. At least for Mark5b data and the first integration period, this is the case. May I take your description as applying to the middle of the scan, with a slightly different treatment at the beginning and end of the scan?
> >>
> >> Best wishes,
> >> Lei
> >>
> >>> On Aug 1, 2016, at 8:36 PM, Walter Brisken <wbrisken at nrao.edu> wrote:
> >>>
> >>>
> >>> My understanding is that if there are any valid samples in an FFT frame, the processing proceeds.  All invalid data is set to zero and the amount of good data processed is tracked and stored in the weight that comes with the data.  These weights are baseline-based.  After each sub-integration, the sub-int weights are multiplied for the two datastreams and accumulated. This does not yield an exactly correct weight, but in cases where weights are close to 1 it does a very good job.
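
As a rough illustration of the weight bookkeeping described above (a sketch under stated assumptions, not DiFX source code): the function name is hypothetical, and dividing the accumulated products by the number of sub-integrations is my own assumption about how the sum is turned into a weight between 0 and 1.

```python
def baseline_weight(station_a, station_b):
    """Approximate baseline weight from per-sub-integration station weights.

    Each input is a list of valid-data fractions (0.0 to 1.0), one entry
    per sub-integration. The two station weights are multiplied per
    sub-integration, accumulated, and averaged.
    """
    if not station_a or len(station_a) != len(station_b):
        raise ValueError("need equal-length, non-empty weight lists")
    accumulated = 0.0
    for wa, wb in zip(station_a, station_b):
        accumulated += wa * wb
    return accumulated / len(station_a)

# Toy case: two sub-integrations, both stations lose half of the second one.
print(baseline_weight([1.0, 0.5], [1.0, 0.5]))  # 0.625
```

In this toy case the product gives 0.25 for the second sub-integration, whereas the true jointly valid fraction could be anywhere from 0.0 to 0.5 depending on whether the two stations lost the same samples. That is the sense in which the product is only approximate when weights are far from 1.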
> >>>
> >>>     -Walter
> >>>
> >>> On Mon, 1 Aug 2016, Lei Liu wrote:
> >>>
> >>>> Dear All,
> >>>>
> >>>> I would like to ask a question about the treatment of invalid sample bits in DiFX:
> >>>>
> >>>> Assume data are sampled with one bit and recorded in Mark5a and Mark5b format. Some data frames might get lost during network transfer. What's more, in Mark5a format the frame header overwrites some sample bits (the first 160 bits of each 20000-bit frame on one track). Data in a lost frame or overwritten by a frame header will be labeled as invalid.
> >>>>
> >>>> We know that an integration period is composed of many FFT periods. For each FFT period, we can calculate the fraction of invalid sample bits among all the FFT points. For example, if the number of invalid sample bits is 160 and the number of FFT points is set to 1024, the fraction is 160 / 1024 = 0.15625. It seems that in DiFX there is a threshold on this invalid sample fraction: if the fraction exceeds the threshold, the FFT period is discarded and does not take part in the integration. If my assumption is correct, I would like to know what the threshold is. Is there a default value?
> >>>>
> >>>> Thank you in advance!
> >>>>
> >>>> Best wishes,
> >>>> Lei
> >>
> >>
> 
> 
> 
> _______________________________________________
> Difx-users mailing list
> Difx-users at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/difx-users
> 
> 
> 
> -- 
> !=============================================================!
> Dr. Adam Deller         
> Ph  +31 521595785 / Fax +31 521595101
> Staff Astronomer, Astronomy Group    
> ASTRON, Oude Hoogeveensedijk 4
> 7991 PD Dwingeloo, The Netherlands
> !=============================================================!
