[Difx-users] Debugging a Nightmare problem
Richard Dodson
richard.dodson at uwa.edu.au
Tue Aug 23 22:31:00 EDT 2016
Hi Adam
vdifsummary seems to be a file in ~/Util in oper as KASI. I guess it is
something that Jan wrote. I will check.
countVDIFpackets is slow (it took all night to finish), and I should have
looked at thread 1, not 0 (correct?). It is now running for thread 1.
Nothing to note so far, e.g.:
For thread 1, at second 39896, read 29300000 frames, spotted 0 missing frames
The start of the VDIF file (1GB) is at:
http://ict.icrar.org/store/staff/rdodson/k16mk02f_ktn_start.vdif
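
For anyone who wants to inspect the headers in that file without the
vdifio tools, here is a minimal Python sketch that decodes the first four
words of each VDIF frame header and flags obvious anomalies (invalid bit
set, frame length changing, per-thread seconds jumping). It is an
illustration only, not how countVDIFpackets or printVDIF work internally.
It assumes the standard little-endian VDIF header layout and frames packed
back to back with a constant frame length. The function names
parse_vdif_header and find_anomalies are hypothetical, made up for this
example.

```python
import struct


def parse_vdif_header(raw: bytes) -> dict:
    """Decode words 0-3 of a VDIF frame header (little-endian 32-bit words)."""
    w0, w1, w2, w3 = struct.unpack("<4I", raw[:16])
    return {
        "invalid": bool(w0 >> 31),             # word 0, bit 31
        "legacy": bool((w0 >> 30) & 1),        # word 0, bit 30
        "seconds": w0 & 0x3FFFFFFF,            # seconds since reference epoch
        "ref_epoch": (w1 >> 24) & 0x3F,        # half-years since 2000
        "frame_num": w1 & 0xFFFFFF,            # frame number within the second
        "version": (w2 >> 29) & 0x7,
        "nchan": 1 << ((w2 >> 24) & 0x1F),     # stored as log2(channels)
        "frame_bytes": (w2 & 0xFFFFFF) * 8,    # stored in units of 8 bytes
        "complex": bool(w3 >> 31),
        "bits": ((w3 >> 26) & 0x1F) + 1,       # stored as bits/sample - 1
        "thread": (w3 >> 16) & 0x3FF,
        "station": w3 & 0xFFFF,
    }


def find_anomalies(data: bytes, max_frames: int = 100000) -> list:
    """Walk frames in `data`; return a list of (offset, reason, header)."""
    problems = []
    offset = 0
    expected_len = None   # assumption: every frame has the first frame's length
    last_sec = {}         # last seconds value seen, per thread
    n = 0
    while offset + 16 <= len(data) and n < max_frames:
        h = parse_vdif_header(data[offset:offset + 16])
        if expected_len is None:
            expected_len = h["frame_bytes"]
        if h["frame_bytes"] != expected_len or h["frame_bytes"] < 16:
            # Cannot step reliably past this point, so report and stop.
            problems.append((offset, "frame length changed", h))
            break
        if h["invalid"]:
            problems.append((offset, "invalid flag set", h))
        prev = last_sec.get(h["thread"])
        if prev is not None and not (0 <= h["seconds"] - prev <= 1):
            problems.append((offset, "time jump", h))
        last_sec[h["thread"]] = h["seconds"]
        offset += expected_len
        n += 1
    return problems
```

To try it, read the first few tens of MB of the file in binary mode and
pass the bytes to find_anomalies; the offset of the first reported problem
shows where in the file to look more closely.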
Thanks for your help
Richard
On Mon, Aug 22, 2016 at 6:18 PM, Adam Deller <deller at astron.nl> wrote:
> Hi Richard,
>
> Looks like there is a problem mid-file, and when it tries to re-sync the
> header it finds is corrupted. I can suggest a couple of things to try:
>
> You can run countVDIFpackets (a utility in vdifio) which is probably
> slower than vdifsummary (what utility is this? I'm not aware of a
> "vdifsummary", there is a "vsum"...?) and is pretty basic but actually does
> check for every packet, and prints a message every time a problem is seen.
> That might give you some extra clues, so I'd try that first. And if you
> really want to get blasted away by lots of logging, you can use printVDIF,
> which prints a little summary of each and every packet header. You could
> pipe that to grep to look for anomalies.
>
> Looks like the problem is very early in the file, so if you dd the first
> second or so and put it on an ftp server somewhere, I could also take a
> look.
>
> Cheers,
> Adam
>
> On Mon, Aug 22, 2016 at 10:57 AM, Richard Dodson <
> richard.dodson at uwa.edu.au> wrote:
>
>> Dear All
>>
>> I have one of the usual nightmare twisted DiFX correlation problems.
>>
>> I am trying to use DiFX on VDIF data which has been copied off the VERA
>> OCTAVE systems (and similar) and converted.
>>
>> The problem is almost certainly in the data copying, but I need to
>> provide some feedback on what is wrong for it to be fixed.
>>
>> The first problem that I found was in the VDIF file: all the invalid
>> flags were set, the number of channels was wrong and the date was wrong by
>> 1 day. :(
>>
>> Jan has a program to fix all of these :) -- but he is not around to
>> check if I have used this correctly :( :(
>>
>> After these fixes the correlation runs, but the data file is empty.
>> What messages should I be checking to work out what is happening? I append
>> some messages which look suspicious but don't convey any information to me
>> ...
>>
>> All the best
>> Richard
>>
>> Comments:
>> vdifsummary reports seem OK
>>
>> # vdifsummary /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>> [1:1] check k16mk02f_kava_miz.vdif -> Good! it is a VDIF data scan -> add
>> to 1
>> k16mk02f_kava_miz.vdif 4,108,790,400,000 31317 sec( 8:41:57) 57467
>> Mar 20 2016y080d 11:00:03 - 19:41:59 1312 100000
>> 3,827 GB(= 3.7 TB)(= 4,108,790,400,000 B)
>>
>> Log messages which might be relevant:
>>
>> 2016-08-22 16:30:32,548 DiFXAlert INFO MPI[ 1] compute-0-28.local
>> k16mk02f_1 Datastream 1 has opened file index 0, which was
>> /lustre/kjcc/k16mk02f/MIZ/k16mk02f_kava_miz.vdif
>>
>> 2016-08-22 16:30:32,548 DiFXAlert VERBOSE MPI[ 2] compute-0-28.local
>> k16mk02f_1 input.bad() is 0, input.fail() is 0
>>
>> 2016-08-22 16:30:32,700 DiFXAlert ERROR MPI[ 1] compute-0-28.local
>> k16mk02f_1 Lost Sync on segment 1! Will attempt to resync. Deltatime was
>> -1.13239e+09
>>
>> 2016-08-22 16:30:32,701 DiFXAlert INFO MPI[ 1] compute-0-28.local
>> k16mk02f_1 Config has changed!
>>
>> 2016-08-22 16:30:32,702 DiFXAlert INFO MPI[ 1] compute-0-28.local
>> k16mk02f_1 After regaining sync, the frame start day is 70573, the frame
>> start seconds is 70631, the frame start ns is -2147483648, readscan is 2,
>> readseconds is 1132388471, readnanoseconds is -2147483648
>> Note the -2^31 (= -2147483648) values!
>>
>> _______________________________________________
>> Difx-users mailing list
>> Difx-users at listmgr.nrao.edu
>> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>>
>>
>
>
> --
> !=============================================================!
> Dr. Adam Deller
> Ph +31 521595785 / Fax +31 521595101
> Staff Astronomer, Astronomy Group
> ASTRON, Oude Hoogeveensedijk 4
> 7991 PD Dwingeloo, The Netherlands
> !=============================================================!
>
--
-------------------------
Dr Richard Dodson,
International Centre for Radio Astronomy Research
University of Western Australia
P: +8 6488 7842 E: richard.dodson at icrar.org