[Difx-users] Debugging a Nightmare problem

Adam Deller deller at astron.nl
Wed Aug 24 03:20:39 EDT 2016


Hi Richard,

I have a few observations for you:

* Nothing strange in the file at first glance - countVDIFpackets and
printVDIF are happy with it.  It is 2-bit data.  Frame size is 1312 bytes,
and the number of frames per second indicates that this is 1 Gbps data.
* Using printVDIFheader tells me there are 8 channels in the single VDIF
thread.  Combined with the other info, that implies the bandwidth per
subband is 32 MHz? So then the format name (which you supply to the v2d
file and hence the .input file) should be VDIF_1280-1024-8-2, I think.

However, I then get funny results when I try to unpack the data using m5d
and that format name.  It's happy for a while, and then starts to give
unpack errors (which one usually gets if one mucks up the format name).  If
I instead say the number of channels is 1 (so VDIF_1280-1024-1-2), which
would mean a single 256 MHz wide channel, then it unpacks happily.
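
For reference, the arithmetic behind the two candidate format names can be
sketched as follows.  This is only an illustration, not code from DiFX or
mark5access.  It assumes a standard 32-byte (non-legacy) VDIF header,
real-sampled data (sample rate = 2 x bandwidth), and 100000 frames per
second, which is the figure in the vdifsummary output quoted further down
and matches the file size divided by the duration.  The function name
vdif_format is made up for this example.

```python
VDIF_HEADER_BYTES = 32  # assumption: standard, non-legacy VDIF header


def vdif_format(frame_bytes, frames_per_sec, nchan, bits):
    """Return (format name, bandwidth per channel in MHz) for a VDIF stream."""
    payload = frame_bytes - VDIF_HEADER_BYTES            # data bytes per frame
    mbps = payload * 8 * frames_per_sec // 1_000_000     # payload data rate
    msamples_per_chan = mbps / (nchan * bits)            # Msamples/s per channel
    bandwidth_mhz = msamples_per_chan / 2                # assumption: real sampling
    name = f"VDIF_{payload}-{mbps}-{nchan}-{bits}"
    return name, bandwidth_mhz


# Frame size and bit depth from the checks above; 8 vs 1 channels.
for nchan in (8, 1):
    name, bw = vdif_format(1312, 100_000, nchan, 2)
    print(name, bw, "MHz per channel")
```

With these inputs the payload rate comes out at 1024 Mbps, giving 32 MHz per
channel for 8 channels and 256 MHz for a single channel, which matches the
two format names above.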

So what's the deal with the number of subbands?  I think something is wrong
somewhere, either 8 has been written into the header where 1 should have
been, or something else like that.
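
If it helps to check the header field independently of the vdifio tools,
here is a minimal sketch of decoding the first four words of a VDIF frame
header.  The bit layout is my reading of the VDIF 1.1.1 specification
(little-endian 32-bit words, channel count stored as log2), so treat it as
an assumption to verify rather than as what printVDIFheader does
internally.  The function name decode_vdif_header and the synthetic header
are made up for this example.

```python
import struct


def decode_vdif_header(header: bytes) -> dict:
    """Decode words 0-3 of a VDIF frame header (layout assumed from VDIF 1.1.1)."""
    w0, w1, w2, w3 = struct.unpack("<4I", header[:16])
    return {
        "invalid": bool(w0 >> 31),
        "legacy": bool((w0 >> 30) & 1),
        "seconds": w0 & 0x3FFFFFFF,             # seconds since reference epoch
        "ref_epoch": (w1 >> 24) & 0x3F,
        "frame_in_second": w1 & 0xFFFFFF,
        "version": (w2 >> 29) & 0x7,
        "nchan": 1 << ((w2 >> 24) & 0x1F),      # stored as log2(nchan)
        "frame_bytes": (w2 & 0xFFFFFF) * 8,     # stored in units of 8 bytes
        "complex": bool(w3 >> 31),
        "bits": ((w3 >> 26) & 0x1F) + 1,        # stored as bits per sample - 1
        "thread_id": (w3 >> 16) & 0x3FF,
        "station_id": w3 & 0xFFFF,
    }


# Synthetic header for illustration only: 8 channels, 1312-byte frames,
# 2-bit samples, thread 1.  To inspect a real file, read its first 16 bytes.
w2 = (3 << 24) | (1312 // 8)
w3 = (1 << 26) | (1 << 16)
fields = decode_vdif_header(struct.pack("<4I", 0, 0, w2, w3))
print(fields["nchan"], fields["frame_bytes"], fields["bits"], fields["thread_id"])
```

Running this over the first header of each thread would show whether the
stored log2(nchan) field really is 3 (8 channels) or 0 (1 channel).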

Cheers,
Adam

On Wed, Aug 24, 2016 at 4:31 AM, Richard Dodson <richard.dodson at uwa.edu.au>
wrote:

> Hi Adam
>
> vdifsummary seems to be a file in ~/Util in oper as KASI. I guess it is
> something that Jan wrote. I will check.
>
> countVDIF is slow (took all night to finish) &  I should have looked at
> thread 1 not 0 (correct?). It is now running for 1. Nothing to note so far
> eg:
>
> For thread 1, at second 39896, read 29300000 frames, spotted 0 missing
> frames
> The start of the VDIF file (1GB) is at:
>  http://ict.icrar.org/store/staff/rdodson/k16mk02f_ktn_start.vdif
>
>   Thanks for your help
>      Richard
>
>
>
>
> On Mon, Aug 22, 2016 at 6:18 PM, Adam Deller <deller at astron.nl> wrote:
>
>> Hi Richard,
>>
>> Looks like there is a problem mid-file, and when it tries to re-sync the
>> header it finds is corrupted.  I can suggest a couple of things to try:
>>
>> You can run countVDIFpackets (a utility in vdifio) which is probably
>> slower than vdifsummary (what utility is this?  I'm not aware of a
>> "vdifsummary", there is a "vsum"...?) and is pretty basic but actually does
>> check for every packet, and prints a message every time a problem is seen.
>> That might give you some extra clues, so I'd try that first.  And if you
>> really want to get blasted away by lots of logging, you can use printVDIF,
>> which prints a little summary of each and every packet header.  You could
>> pipe that to grep to look for anomalies.
>>
>> Looks like the problem is very early in the file, so if you dd the first
>> second or so and put it on an ftp server somewhere, I could also take a
>> look.
>>
>> Cheers,
>> Adam
>>
>> On Mon, Aug 22, 2016 at 10:57 AM, Richard Dodson <
>> richard.dodson at uwa.edu.au> wrote:
>>
>>> Dear All
>>>
>>>  I have one of the usual nightmare twisted DiFX correlation problems.
>>>
>>>  I am trying to use DiFX on VDIF data which has been copied off the VERA
>>> OCTAVE systems (and similar) and converted.
>>>
>>>   The problem is almost certainly in the data copying -- but I need to
>>> provide some feedback on what is wrong for it to be fixed
>>>
>>>   The first problem that I found was in the VDIF file: all the invalid
>>> flags were set, the number of channels was wrong and the date was wrong by
>>> 1 day. :(
>>>
>>>   Jan has a program to fix all of these :) -- but he is not around to
>>> check if I have used this correctly :( :(
>>>
>>>    After these fixes the correlation runs, but the data file is empty.
>>> What messages should I be checking to work out what is happening? I append
>>> some messages which look suspicious but don't convey any information to me
>>> ...
>>>
>>>         All the best
>>>             Richard
>>>
>>> Comments:
>>>   vdifsummary reports seem OK
>>>
>>> # vdifsummary /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>>> [1:1] check k16mk02f_kava_miz.vdif -> Good! it is a VDIF data scan ->
>>> add to 1
>>> k16mk02f_kava_miz.vdif   4,108,790,400,000   31317 sec( 8:41:57)   57467
>>> Mar 20 2016y080d 11:00:03 - 19:41:59  1312 100000
>>> 3,827 GB(=  3.7 TB)(= 4,108,790,400,000 B)
>>>
>>> Log messages which might be relevant:
>>>
>>> 2016-08-22 16:30:32,548 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>>> k16mk02f_1   Datastream 1 has opened file index 0, which was
>>> /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>>>
>>> 2016-08-22 16:30:32,548 DiFXAlert VERBOSE MPI[ 2] compute-0-28.local
>>> k16mk02f_1   input.bad() is 0, input.fail() is 0
>>>
>>> 2016-08-22 16:30:32,700 DiFXAlert ERROR   MPI[ 1] compute-0-28.local
>>> k16mk02f_1   Lost Sync on segment 1! Will attempt to resync. Deltatime was
>>> -1.13239e+09
>>>
>>> 2016-08-22 16:30:32,701 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>>> k16mk02f_1   Config has changed!
>>>
>>> 2016-08-22 16:30:32,702 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>>> k16mk02f_1   After regaining sync, the frame start day is 70573, the frame
>>> start seconds is 70631, the frame start ns is -2147483648, readscan is 2,
>>> readseconds is 1132388471, readnanoseconds is -2147483648
>>>         note the 2^31 values !!!!
>>>
>>> _______________________________________________
>>> Difx-users mailing list
>>> Difx-users at listmgr.nrao.edu
>>> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>>>
>>>
>>
>>
>> --
>> !=============================================================!
>> Dr. Adam Deller
>> Ph  +31 521595785 / Fax +31 521595101
>> Staff Astronomer, Astronomy Group
>> ASTRON, Oude Hoogeveensedijk 4
>> 7991 PD Dwingeloo, The Netherlands
>> !=============================================================!
>>
>
>
>
> --
> -------------------------
> Dr Richard Dodson,
> International Centre for Radio Astronomy Research
> University of Western Australia
> P: +8 6488 7842 E: richard.dodson at icrar.org
>



-- 
!=============================================================!
Dr. Adam Deller
Ph  +31 521595785 / Fax +31 521595101
Staff Astronomer, Astronomy Group
ASTRON, Oude Hoogeveensedijk 4
7991 PD Dwingeloo, The Netherlands
!=============================================================!