[Difx-users] Treatment of invalid sample bits in DiFX

Mon Aug 8 08:34:27 EDT 2016

Hi Lei,

I'm back from holidays this week and have had a second now to look at this
question.  I concur with what Walter had already written: as long as there
was at least some good data unpacked, then in DiFX the FFT will be
performed and the datastreams cross-multiplied to produce visibilities.

I think the misunderstanding may indeed just be with the very first FFT.
The basic station-based processing loop in DiFX is:

1. Unpack required samples
2. Process (FFT, fringe rotate, etc)
repeat

In step 1, unpacking the required samples, if the first sample requested is
located before the beginning of the datastream, nothing is done: weights
are set to zero and that FFT is discarded.  So indeed, the very first
"possible" FFT (the one which would commence with data from before the
recorded datastream begins, but that partially overlaps the beginning of
the recorded data) is in essence completely flagged.  Since recordings are
typically assumed to be many seconds long and FFTs typically a few
microseconds, this amounts to the loss of a tiny fraction of a percent of
data, which was deemed acceptable when designing the system.
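
As a rough check on that "tiny fraction", here is a minimal sketch using
the numbers from Lei's message quoted below (8 MHz bandwidth, 1024 samples
per FFT, a 20 s scan). The sample rate of twice the bandwidth is an
assumption, although it agrees with the 15728640 samples Lei counts in one
0.983040 s integration. The variable names are illustrative only and are
not taken from DiFX.

```python
# Rough estimate of the data lost to the discarded first FFT.
# Values come from the quoted message; the real-sampled Nyquist rate
# (2 x bandwidth) is an assumption.
bandwidth_hz = 8e6
sample_rate_hz = 2 * bandwidth_hz   # 16 Msample/s
fft_samples = 1024                  # samples consumed by one FFT
scan_length_s = 20.0                # scan 3: 07h08m49s - 07h09m09s

fft_duration_s = fft_samples / sample_rate_hz
discarded_ffts = 1                  # the first, partially filled FFT
lost_fraction = discarded_ffts * fft_duration_s / scan_length_s

print(f"FFT duration: {fft_duration_s * 1e6:.1f} microseconds")
print(f"fraction lost: {lost_fraction * 100:.5f} percent")
```

For this setup one FFT spans 64 microseconds, so a single discarded FFT
is about 0.0003 percent of a 20 s scan.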

(As an aside, the same logic applies to the very last FFT in a recording;
the FFT which would overlap beyond the end of the recorded data is also
totally discarded, because the data is never unpacked).
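
The behaviour described above can be sketched as follows. This is not
DiFX code: the function fft_weights and its arguments are hypothetical,
and the sketch only assumes what is stated in this thread, namely that an
FFT whose window extends outside the recorded data is never unpacked
(weight 0), while any other FFT is processed with a weight equal to its
fraction of valid samples.

```python
def fft_weights(valid, data_start, data_end, nfft):
    """Per-FFT weights for one datastream (illustrative only).

    valid      -- list of booleans, one per requested sample
    data_start -- index of the first recorded sample
    data_end   -- index one past the last recorded sample
    nfft       -- number of samples consumed by each FFT
    """
    n_ffts = len(valid) // nfft
    weights = [0.0] * n_ffts
    for i in range(n_ffts):
        first = i * nfft
        last = first + nfft
        if first < data_start or last > data_end:
            continue  # window not fully inside the recording: never unpacked
        weights[i] = sum(valid[first:last]) / nfft
    return weights


nfft = 1024
n_samples = 4 * nfft
data_start, data_end = 76, n_samples - 100   # recording starts late, ends early

valid = [data_start <= i < data_end for i in range(n_samples)]
for i in range(2 * nfft, 2 * nfft + 160):    # a 160-sample gap inside FFT 2
    valid[i] = False

weights = fft_weights(valid, data_start, data_end, nfft)
print(sum(valid[:nfft]) / nfft)   # valid fraction of the first FFT window
print(weights)
```

In this toy case the first window is 948/1024 = 0.92578125 valid, the
same fraction Lei reports, yet it gets weight 0 because it begins before
the recording does. The window containing the gap in the middle of the
data keeps its partial weight of 0.84375.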

Does this solve the discrepancy for you?

Cheers,
Adam

On Tue, Aug 2, 2016 at 9:34 AM, Lei Liu <thirtyliu at gmail.com> wrote:

> Dear Walter,
>
> I read out the data efficiency of every integration period from the type_120
> section of the post-processing Mk4 files. The Mk4 files are generated with
> difx2mark4 from the DiFX visibility output, in which the weight is copied
> from the SWIN header. Below is the information about the sample data:
>
> code: geodetic observation IVS k14349
> station: Ny, Sh, Ts, Wz (Wz is Mark5a format, others are Mark5b format)
> bandwidth: 8 MHz
> sample bits: 1
> integration time: 0.983040 s
> FFT size: 1024
>
> Target integration: scan 3 (07h08m49s - 07h09m09s), second integration
> (0.983040s - 1.96608s)
>
> I wrote a small python program to decode the raw Mark5b data of three
> stations (Ny, Sh, Ts), and do integral sample time compensation (ISTC)
> based on delay model and clock. The reason for not choosing the first
> integration period is that Ny and Sh start recording at 1 s.
>
> As you suggested, I carefully checked the tvg flag and crcc for every
> frame, to guarantee that the tvg flags are all 0 and the crcc is correct. A
> buffer is created to accommodate all sample bits in the integration time
> range. A sample bit is valid if the corresponding data frame is present and
> passes these checks.
>
> Below are dumped information for three stations and integration time range:
>
> Ny: total sample bits: 15728640, valid sample bits: 15208929, fraction: 0.966958
> Sh: total sample bits: 15728640, valid sample bits: 15241140, fraction: 0.969006
> Ts: total sample bits: 15728640, valid sample bits: 15728640, fraction: 1.0
>
> If an FFT frame is composed of valid sample bits only, it is regarded as a
> valid FFT frame; otherwise it is invalid. The integration time range is
> composed of 15360 FFT frames. Below is the FFT frame information for the
> three stations:
> Ny: total FFT: 15360, valid FFT: 14852, fraction: 0.966927
> Sh: total FFT: 15360, valid FFT: 14883, fraction: 0.968945
> Ts: total FFT: 15360, valid FFT: 15360, fraction: 1.0
>
> I dumped out the corresponding data efficiency from generated
> postprocessing Mk4 files, for the second integration period:
> Ny-Ts: 0.966927
> Sh-Ts: 0.968945
> This is consistent with the valid FFT frame fractions rather than the valid
> sample bit fractions, which means that, AT LEAST at the beginning of the
> scan, an FFT frame is discarded if it contains any invalid sample bits. In
> fact, for the first FFT frame in which valid sample bits are present, the
> fraction of valid sample bits is 0.92578125, but the frame is still
> discarded. This is why I say that MAYBE the treatment in DiFX is slightly
> different at the beginning of the data.
>
> Best wishes,
> Lei
>
>
> > On Aug 2, 2016, at 1:03 AM, Walter Brisken <wbrisken at nrao.edu> wrote:
> >
> >
> > Where are you inspecting the data?  If you are looking at it in AIPS then
> it could be the WTTHRESH parameter of FITLD that is dropping records with
> reduced weight.
> >
> > It is true that for some formats (e.g., Mark5B and VDIF) if weights get
> very low and the continuity of the data stream cannot be established that
> some good data will be thrown out, but this would be for cases where data
> loss is very high and data frames are very sporadic.  Other than that there
> is no dependence on data format.
> >
> > Note that for Mark5B and VDIF data the decoders honor the data invalid
> bit (called TVG bit for Mark5B) and will discard data with these set.
> >
> > -W
> >
> > On Tue, 2 Aug 2016, Lei Liu wrote:
> >
> >> Dear Walter,
> >>
> >> thank you for your explanation! According to my observation (I read out
> the data efficiency of integration periods within one scan from DiFX's
> output), the first FFT frame, which is partly filled with valid sample bits
> from the beginning of scan data, is discarded, even if the fraction of
> valid sample bits in this FFT frame is more than 90 percent. Otherwise I
> cannot explain the data efficiency of the integration period: it is the
> ratio of valid FFT period to total FFT period in one integration period,
> instead of the ratio of valid sample bits to total sample bits in one
> integration period. At least for Mark5b data and the first integration
> period, this is the case. May I regard your description as for the middle
> range data of the scan? For the beginning and end of the scan, you have
> slightly different treatment.
> >>
> >> Best wishes,
> >> Lei
> >>
> >>> On Aug 1, 2016, at 8:36 PM, Walter Brisken <wbrisken at nrao.edu> wrote:
> >>>
> >>>
> >>> My understanding is that if there are any valid samples in an fft
> frame, the processing proceeds.  All invalid data is set to zero and the
> amount of good data processed is tracked and stored in the weight that
> comes with the data.  These weights are baseline-based.  After each
> sub-integration, the sub-int weights are multiplied for the two datastreams
> and accumulated. This does not yield an exactly correct weight, but in cases
> where weights are close to 1 it does a very good job.
> >>>
> >>>     -Walter
> >>>
> >>> On Mon, 1 Aug 2016, Lei Liu wrote:
> >>>
> >>>> Dear All,
> >>>>
> >>>> I would like to ask a question about the treatment of invalid sample
> bits in DiFX:
> >>>>
> >>>> Assume data are sampled in one bit and are recorded in Mark5a and
> Mark5b format. Some data frames might get lost due to network transfer.
> What's more, the frame header will overwrite some sample bits (the first
> 160 bits for a 20000 bits frame for one track) in Mark5a format. Data in
> the lost frame or overwritten by frame header will be labeled as invalid.
> >>>>
> >>>> We know the integration period is composed of many FFT periods. For
> each FFT period, we can calculate the fraction of invalid sample bits among
> the whole FFT points. For example, if the number of invalid sample bits is
> 160, and FFT point number is set to 1024, the fraction is 160 / 1024 =
> 0.15625. My question is, it seems that in DiFX, there is a threshold of
> invalid sample fraction. If the fraction exceeds the threshold, this FFT
> period will be discarded and will not take part in integration. If my
> postulation is correct, I would like to know what is the fraction? Is there
> any default value?
> >>>>
> >>>> Thank you in advance!
> >>>>
> >>>> Best wishes,
> >>>> Lei
> >>
> >>
>
>
>
> _______________________________________________
> Difx-users mailing list
> Difx-users at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>

-- 
!=============================================================!
Dr. Adam Deller
Ph  +31 521595785 / Fax +31 521595101
Staff Astronomer, Astronomy Group
ASTRON, Oude Hoogeveensedijk 4
7991 PD Dwingeloo, The Netherlands
!=============================================================!