[Difx-users] Debugging a Nightmare problem

Richard Dodson richard.dodson at uwa.edu.au
Tue Aug 23 22:31:00 EDT 2016


Hi Adam

vdifsummary seems to be a file in ~/Util in oper as KASI. I guess it is
something that Jan wrote. I will check.

countVDIFpackets is slow (it took all night to finish), and I should have
looked at thread 1, not 0 (correct?). It is now running for thread 1. Nothing
to note so far, e.g.:

For thread 1, at second 39896, read 29300000 frames, spotted 0 missing
frames
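
As a cross-check that does not depend on the vdifio tools, the per-thread
frame count can be sketched in Python. This is only a minimal sketch, assuming
standard 32-byte (non-legacy) VDIF headers with little-endian words; the names
parse_header and count_missing are hypothetical and are not part of vdifio.
It only looks for gaps in the frame number within each second, so a frame
dropped exactly at a second boundary would go unnoticed.

```python
import struct
from collections import defaultdict

HEADER_BYTES = 32  # assumption: non-legacy VDIF header


def parse_header(buf):
    """Decode the first four little-endian words of a VDIF header."""
    w0, w1, w2, w3 = struct.unpack("<4I", buf[:16])
    return {
        "invalid": w0 >> 31,                 # invalid-data flag
        "legacy": (w0 >> 30) & 0x1,          # legacy (16-byte) header flag
        "seconds": w0 & 0x3FFFFFFF,          # seconds from reference epoch
        "ref_epoch": (w1 >> 24) & 0x3F,      # reference epoch
        "frame": w1 & 0xFFFFFF,              # frame number within the second
        "log2_nchan": (w2 >> 24) & 0x1F,     # log2(number of channels)
        "frame_bytes": (w2 & 0xFFFFFF) * 8,  # frame length, header included
        "thread": (w3 >> 16) & 0x3FF,        # thread ID
    }


def count_missing(stream):
    """Return {thread: [frames_read, frames_missing]} for a binary stream."""
    stats = defaultdict(lambda: [0, 0])
    last = {}  # thread -> (seconds, frame) of the previous frame seen
    while True:
        head = stream.read(HEADER_BYTES)
        if len(head) < HEADER_BYTES:
            break
        h = parse_header(head)
        if h["frame_bytes"] < HEADER_BYTES:
            raise ValueError("implausible frame length; header may be corrupt")
        stream.seek(h["frame_bytes"] - HEADER_BYTES, 1)  # skip the payload
        t = h["thread"]
        stats[t][0] += 1
        prev = last.get(t)
        if prev is not None and prev[0] == h["seconds"]:
            gap = h["frame"] - prev[1] - 1
            if gap > 0:
                stats[t][1] += gap
        last[t] = (h["seconds"], h["frame"])
    return dict(stats)
```

Something like count_missing(open(path, "rb")) would then give the totals for
every thread in one pass, and printing the output of parse_header for the
first few frames shows whether the invalid flag, channel count and epoch look
sane.
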
The start of the VDIF file (1GB) is at:
 http://ict.icrar.org/store/staff/rdodson/k16mk02f_ktn_start.vdif
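
An excerpt like this can be cut with dd, as Adam suggested, or with a few
lines of Python. The following is a sketch only; copy_head is a hypothetical
helper, not an existing tool. Choosing nbytes as a multiple of the frame
length avoids leaving a truncated frame at the end of the excerpt.

```python
def copy_head(src_path, dst_path, nbytes, chunk=1 << 20):
    """Copy the first nbytes of src_path to dst_path; return bytes written."""
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while written < nbytes:
            # Read at most one chunk, and never past the requested total.
            block = src.read(min(chunk, nbytes - written))
            if not block:
                break  # source is shorter than nbytes
            dst.write(block)
            written += len(block)
    return written
```
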

  Thanks for your help
     Richard




On Mon, Aug 22, 2016 at 6:18 PM, Adam Deller <deller at astron.nl> wrote:

> Hi Richard,
>
> Looks like there is a problem mid-file, and when it tries to re-sync, the
> header it finds is corrupted.  I can suggest a couple of things to try:
>
> You can run countVDIFpackets (a utility in vdifio) which is probably
> slower than vdifsummary (what utility is this?  I'm not aware of a
> "vdifsummary", there is a "vsum"...?) and is pretty basic but actually does
> check for every packet, and prints a message every time a problem is seen.
> That might give you some extra clues, so I'd try that first.  And if you
> really want to get blasted away by lots of logging, you can use printVDIF,
> which prints a little summary of each and every packet header.  You could
> pipe that to grep to look for anomalies.
>
> Looks like the problem is very early in the file, so if you dd the first
> second or so and put it on an ftp server somewhere, I could also take a
> look.
>
> Cheers,
> Adam
>
> On Mon, Aug 22, 2016 at 10:57 AM, Richard Dodson <
> richard.dodson at uwa.edu.au> wrote:
>
>> Dear All
>>
>>  I have one of the usual nightmare twisted DiFX correlation problems.
>>
>>  I am trying to use DiFX on VDIF data which has been copied off the VERA
>> OCTAVE systems (and similar) and converted.
>>
>>   The problem is almost certainly in the data copying -- but I need to
>> provide some feedback on what is wrong for it to be fixed
>>
>>   The first problem that I found was in the VDIF file: all the invalid
>> flags were set, the number of channels was wrong and the date was wrong by
>> 1 day. :(
>>
>>   Jan has a program to fix all of these :) -- but he is not around to
>> check if I have used this correctly :( :(
>>
>>    After these fixes the correlation runs, but the data file is empty.
>> What messages should I be checking to work out what is happening? I append
>> some messages which look suspicious but don't convey any information to me
>> ...
>>
>>         All the best
>>             Richard
>>
>> Comments:
>>   vdifsummary reports seem OK
>>
>> # vdifsummary /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>> [1:1] check k16mk02f_kava_miz.vdif -> Good! it is a VDIF data scan -> add
>> to 1
>> k16mk02f_kava_miz.vdif   4,108,790,400,000   31317 sec( 8:41:57)   57467
>> Mar 20 2016y080d 11:00:03 - 19:41:59  1312 100000
>> 3,827 GB(=  3.7 TB)(= 4,108,790,400,000 B)
>>
>> Log messages which might be relevant:
>>
>> 2016-08-22 16:30:32,548 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>> k16mk02f_1   Datastream 1 has opened file index 0, which was
>> /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>>
>> 2016-08-22 16:30:32,548 DiFXAlert VERBOSE MPI[ 2] compute-0-28.local
>> k16mk02f_1   input.bad() is 0, input.fail() is 0
>>
>> 2016-08-22 16:30:32,700 DiFXAlert ERROR   MPI[ 1] compute-0-28.local
>> k16mk02f_1   Lost Sync on segment 1! Will attempt to resync. Deltatime was
>> -1.13239e+09
>>
>> 2016-08-22 16:30:32,701 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>> k16mk02f_1   Config has changed!
>>
>> 2016-08-22 16:30:32,702 DiFXAlert INFO    MPI[ 1] compute-0-28.local
>> k16mk02f_1   After regaining sync, the frame start day is 70573, the frame
>> start seconds is 70631, the frame start ns is -2147483648, readscan is 2,
>> readseconds is 1132388471, readnanoseconds is -2147483648
>>         note the -2^31 values (-2147483648)!
>>
>
>
> --
> !=============================================================!
> Dr. Adam Deller
> Ph  +31 521595785 / Fax +31 521595101
> Staff Astronomer, Astronomy Group
> ASTRON, Oude Hoogeveensedijk 4
> 7991 PD Dwingeloo, The Netherlands
> !=============================================================!
>



-- 
-------------------------
Dr Richard Dodson,
International Centre for Radio Astronomy Research
University of Western Australia
P: +8 6488 7842 E: richard.dodson at icrar.org