[Difx-users] New to DiFX - issues with test dataset rdv70

Wed Aug 11 23:02:49 EDT 2021

Hi Rebecca,

I'll try to answer some of these questions below:

*1. Why is the calc file you generate from calcif2 and difxcalc different
to the reference case?* The small differences with difxcalc are expected
(it is a different code, after all). I see differences at up to the tens of
ps level, which is about what I would expect (that's ~1cm).  Unfortunately
I can't easily test calcif2 right now, as Fedora dropped support for the rpc
library.  I would have expected it to produce identical output, but it is
possible that a minor change in calcif2 some time in the last 10 years
means the result is not absolutely identical.
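
As a quick check of that ps-to-cm conversion, here is a minimal Python
sketch.  The function name is just for illustration; the only physics in it
is path length = delay x speed of light.

```python
# Convert a delay difference in picoseconds to the equivalent path length.
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def delay_to_path_cm(delay_ps):
    """Path length in cm corresponding to a delay given in picoseconds."""
    return delay_ps * 1e-12 * C_M_PER_S * 100.0


for ps in (10.0, 30.0, 50.0):
    print(f"{ps:5.1f} ps -> {delay_to_path_cm(ps):.2f} cm")
```

So differences of a few tens of ps correspond to roughly a centimetre of
path.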

*2. Why does the DiFX output differ to the reference output?*  Here I'm
pretty sure it was indeed the bug fix you mentioned. I have attached to
this email a set of outputs for the reference rdv70 correlation (where I've
just run the correlator on the existing rdv70 input dataset, not re-running
calc or anything) produced using DiFX-2.6.2.  I also included a second set
of output (and the associated input) for the case where I've generated the
im file using difxcalc.  The output with the original input fileset is
called reference_1.*, while the new fileset is example_1.* (where I just
ran vex2difx example.v2d; difxcalc example_1.calc; created the threads
file; and ran startdifx.) Try diffDiFX'ing these with the equivalent from
your output; they should give results consistent to the level of numerical
precision (1e-5, 1e-6 or so).  The reference test sets (both input and
output) really need to be updated, and I would like to move them over to
GitHub (and just leave the bulky baseband data on the FTP site). Any
volunteers to help with that would be much appreciated!
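
If you want to check a "consistent to 1e-5, 1e-6 or so" criterion on
visibilities you have already loaded yourself, a rough Python sketch is
below.  This is not how diffDiFX computes its statistics; the function name
and the choice of a mean fractional absolute difference are assumptions made
for illustration only.

```python
def mean_abs_fractional_diff(test, ref):
    """Mean of |test - ref| / |ref| over two equal-length complex sequences.

    Reference values that are exactly zero are skipped.
    """
    if len(test) != len(ref):
        raise ValueError("sequences must have the same length")
    diffs = [abs(t - r) / abs(r) for t, r in zip(test, ref) if r != 0]
    if not diffs:
        raise ValueError("no non-zero reference values to compare against")
    return sum(diffs) / len(diffs)


# Made-up visibilities: the "test" set is the reference scaled by 1 + 1e-6.
reference = [1.0 + 1.0j, 2.0 - 0.5j, -0.25 + 3.0j]
test_set = [v * (1.0 + 1e-6) for v in reference]
print(mean_abs_fractional_diff(test_set, reference))
```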

*3. Why isn't startdifx generating a valid machines file?* What is the
exact command you are giving to startdifx, and have you set up the
environment variable DIFX_MACHINES and pointed it at a valid cluster
definition file (as described briefly in
https://www.atnf.csiro.au/vlbi/dokuwiki/doku.php/difx/startdifx, and in
more detail on
https://www.atnf.csiro.au/vlbi/dokuwiki/doku.php/difx/clusterdef?s[]=genmachines
)?
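
A quick way to check the environment side of this from Python is sketched
below.  Only the variable name DIFX_MACHINES comes from the documentation
above; the helper name and the messages are made up for illustration, and it
does not validate the contents of the cluster definition file.

```python
import os


def check_difx_machines(environ=None):
    """Return a problem description, or None if DIFX_MACHINES looks usable."""
    if environ is None:
        environ = os.environ
    path = environ.get("DIFX_MACHINES")
    if not path:
        return "DIFX_MACHINES is not set"
    if not os.path.isfile(path):
        return f"DIFX_MACHINES points at {path!r}, which is not a file"
    return None


problem = check_difx_machines()
print(problem or "DIFX_MACHINES is set and points at an existing file")
```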

*4. How does DiFX use the delay information in the .im file?* The code is
necessarily quite complicated.  The im and calc files are parsed in
model.h/model.cpp.  They are then used in the process() method of the Mode
class, but there are numerous layers of abstraction: the 5th order
polynomial from the im file is first down-sampled to a 2nd order polynomial
covering the 1s region of interest, and then (normally) further
down-sampled to a linear interpolator across a single FFT window.  But
there are options for 0th order and 2nd order interpolation too, and the
actual application of both fringe rotation and fractional sample correction
is broken up into sub-window chunks to improve efficiency (by replacing
some expensive sin/cos operations with cheap complex multiplications).  All
of that makes the code quite hard to follow.  There is a python library for
parsing the DiFX input files, but unfortunately it doesn't treat the model
yet.  It is something we could look at adding.  To answer your specific
questions:
 * DRY and WET refer to specific sub-components of the delay model (the dry
and wet troposphere, respectively.) You don't need to worry about those -
they are just there in case for some reason a user wants to undo the
default calculations of these contributions to the delay model in
post-processing and re-apply their own.  (Mostly used by geodesists.)
 * EL GEOM refers to the geometric elevation of the antenna.  There are two
source positions that need to be considered - one is "where is the source
on the sky really" and the other is "where does the source appear to be on
the sky due to all kinds of propagation effects like atmospheric
refraction".  For some applications you want the former, and for some the
latter.  EL GEOM gives you the antenna elevation based on a straight path
to the source, as opposed to the direction that the radiation will actually
come in on after refraction etc is accounted for.  It and the azimuth (AZ)
are both given in degrees.
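
Since you are more at home in Python, here is a minimal sketch of the
interpolation chain described under question 4: evaluate the 5th order
polynomial, reduce it to a quadratic over a 1s span, reduce that to a
straight line over one FFT window, and build a phase rotator from two short
tables so that most samples cost one complex multiplication rather than a
sin/cos.  This is not taken from mpifxcorr: the function names, the
three-point quadratic fit, the coefficients, the window length and the
stride are all assumptions for illustration, and the real code may differ
in method and sign convention.

```python
import cmath


def eval_poly(coeffs, t):
    """Evaluate sum(coeffs[k] * t**k) using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def quadratic_through(f, t0, span):
    """Quadratic a + b*x + c*x**2 (x = t - t0) matching f at t0, t0+span/2, t0+span."""
    h = span / 2.0
    y0, y1, y2 = f(t0), f(t0 + h), f(t0 + span)
    c = (y2 - 2.0 * y1 + y0) / (2.0 * h * h)
    b = (y1 - y0) / h - c * h
    return y0, b, c


def linear_over_window(quad, x0, length):
    """Value at x0 and slope of the chord of a quadratic across one window."""
    a, b, c = quad
    x1 = x0 + length
    start = a + b * x0 + c * x0 * x0
    end = a + b * x1 + c * x1 * x1
    return start, (end - start) / length


def rotator_chunked(phi0, dphi, nsamples, stride):
    """exp(i*(phi0 + n*dphi)) for n < nsamples, built from two short tables.

    Exponentials are evaluated only for the `stride` fine steps and once per
    chunk; every output sample is then a single complex multiplication.
    """
    fine = [cmath.exp(1j * k * dphi) for k in range(stride)]
    out = []
    for start in range(0, nsamples, stride):
        coarse = cmath.exp(1j * (phi0 + start * dphi))
        out.extend(coarse * f for f in fine[: nsamples - start])
    return out


# Hypothetical 5th order delay polynomial in microseconds, t in seconds.
delay_us = [1234.5, 1.2, -3.0e-4, 2.0e-8, 1.0e-12, -1.0e-16]
quad = quadratic_through(lambda t: eval_poly(delay_us, t), t0=10.0, span=1.0)

window_s = 1.0e-3  # assumed FFT window length in seconds
d0, rate = linear_over_window(quad, x0=0.5, length=window_s)

# Compare the linear approximation with the full polynomial mid-window.
exact = eval_poly(delay_us, 10.5 + window_s / 2.0)
approx = d0 + rate * window_s / 2.0
print("delay error (us):", abs(exact - approx))

# Rotator for 100 samples with an arbitrary start phase and phase step.
rot = rotator_chunked(phi0=0.3, dphi=0.01, nsamples=100, stride=16)
print("first rotator sample:", rot[0])
```

The table trick works because exp(i(a + b)) = exp(ia) * exp(ib), so a phase
that is linear in sample number can be split into a per-chunk term and a
within-chunk term.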

Cheers,
Adam

On Thu, 12 Aug 2021 at 08:15, Rebecca Lin via Difx-users <
difx-users at listmgr.nrao.edu> wrote:

> Hi all,
>
> I'm new to DiFX and have been trying to test the latest DiFX-2.6.2 version
> on the rdv70 test set. I've followed the instructions in the README file
> but did not get identical results when comparing my final output to the
> reference files. I get the following output when comparing the visibility
> files:
>
> At the end, 960 records disagreed on the header
> After 1848 records, the mean percentage absolute difference is 0.05823633
> and the mean percentage mean difference is -0.01528535 + 0.00029177 i
>
> It appears that the reference files were generated prior to an update in
> DiFX-2.6.1 that may have affected Mark4 data. I've been using calcif2 (Calc
> 9.1, I believe) and found that the .im file generated is slightly different
> than the reference one. I've also tried using difxcalc (which I think is
> Calc 11) and the generated .im file is also different from the reference
> one.
>
> When testing startdifx on the relevant reference files (instead of
> generating my own .calc, im, .input files etc.), the generated visibility
> file still differed from the reference with the following output:
>
> At the end, 0 records disagreed on the header
> After 1848 records, the mean percentage absolute difference is 0.05454927
> and the mean percentage mean difference is -0.01417088 + 0.00020612 i
>
> Does anyone know why I'm getting different results from the reference?
>
> I also ran into another issue with startdifx - the machine file generated
> does not seem to follow the requirement of at least N+2 computer entries,
> where N is the number of telescopes. I'm currently passing my own machine
> file which seems to work but would like to know if I missed something when
> setting up startdifx.
>
> In addition, I'm having a hard time figuring out how exactly mpifxcorr
> uses the delay files generated by calcif2. I have a hard time following the
> C++ source code as I've only worked in Python. Could someone point me to
> where the phasing is done - ie. fringe rotation, subsample time
> corrections, cross multiplication and accumulation of visibilities etc.? As
> well, what are SRC 0 ANT 0 DRY (us), SRC 0 ANT 0 WET (us) and SRC 0 ANT 0
> EL GEOM referring to in the .im file and why is more than one SRC
> generated in the .im file when only 1 source is provided?
>
> Best,
>    Rebecca
> _______________________________________________
> Difx-users mailing list
> Difx-users at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>

-- 
!=============================================================!
A/Prof. Adam Deller
ARC Future Fellow
Centre for Astrophysics & Supercomputing
Swinburne University of Technology
John St, Hawthorn VIC 3122 Australia
phone: +61 3 9214 5307
fax: +61 3 9214 8797

office days (usually): Mon-Thu
!=============================================================!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: referenceoutputs.rdv70.tar.gz
Type: application/x-gzip
Size: 1055009 bytes
Desc: not available
URL: <http://listmgr.nrao.edu/pipermail/difx-users/attachments/20210812/aa250fa3/attachment-0001.bin>