[evlatests] July 5 Lab tests of L302 modules

Barry Clark bclark at nrao.edu
Thu Jul 6 16:31:45 EDT 2006


The lab test was set up yesterday to phase compare two L302s, both
being driven by an Executor, in as close as possible an imitation
of real system behavior.

There was no indication of the tens of millisecond unlock at command
time that we see in the field.  Things were normally very smooth at
command time.

Instead we observed three phenomena which have not been noticed
in the field.

1.  The phase comparison mixer would occasionally exhibit a pattern
of 180 degree jumps, with the jumps occurring at heartbeat times.  

The explanation is that the Walsh function patterns used by the two L302s
were out of step with each other by one.  The phase of the Walsh function
pattern is computed by assuming that the pattern begins at TAI 0, and 
repeats every 64 heartbeats.  In particular, the time for a particular
heartbeat interrupt is calculated by looking at the time provided by
the network NTP (Network Time Protocol) client.  If that time happens
to represent an odd number of half heartbeat intervals, the MIB can
become confused about which interrupt number to assign to it, and therefore
which phase of the Walsh pattern to send.  This phenomenon would be 
extremely obvious in the field, and does not seem to have occurred 
except in the single known case when our link to the site became saturated, 
and the NTP servers there drifted quite far from the TAI tick that 
synchronizes the antenna heartbeats.  It occurred downstairs because
the heartbeat tick is free running, not synchronized to anything, and 
it just happened to be near the ambiguity point, so that the normal
sloshing around of the NTP, a few hundred microseconds, was carrying 
the comparison back and forth across the ambiguity point, and the two
MIBs were making the decision to change from one interrupt count to the
other at slightly different times.
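
As a rough illustration of the ambiguity, here is a minimal Python sketch.
It assumes the MIB assigns each interrupt the nearest whole heartbeat count
since TAI 0, and it takes the heartbeat period as 20,000,000/3 cycles of
the 128 MHz clock (about 52 ms), a value inferred from the divider counts
quoted later in this message.  The function name and the returned margin
are hypothetical, not taken from the MIB code.

```python
import math
from fractions import Fraction

# Assumed heartbeat period: three heartbeats span 20,000,000 cycles of 128 MHz.
HEARTBEAT_PERIOD = Fraction(20_000_000, 3 * 128_000_000)  # 5/96 s
WALSH_LENGTH = 64  # the pattern repeats every 64 heartbeats


def assign_heartbeat(ntp_tai_seconds):
    """Return (heartbeat count, Walsh index, margin to ambiguity in seconds).

    The count is the nearest whole number of heartbeat periods since TAI 0.
    The margin is how far the NTP time is from the nearest odd multiple of
    half a heartbeat, where the rounding decision flips.
    """
    beats = Fraction(ntp_tai_seconds) / HEARTBEAT_PERIOD
    count = math.floor(beats + Fraction(1, 2))    # round to nearest
    offset = beats - count                        # lies in [-1/2, 1/2)
    margin = (Fraction(1, 2) - abs(offset)) * HEARTBEAT_PERIOD
    return count, count % WALSH_LENGTH, margin


# Two MIBs whose NTP readings differ by a few hundred microseconds,
# with the free-running heartbeat sitting near the ambiguity point.
ambiguity = Fraction(201, 2) * HEARTBEAT_PERIOD
slosh = Fraction(200, 1_000_000)
mib_a = assign_heartbeat(ambiguity - slosh)
mib_b = assign_heartbeat(ambiguity + slosh)
```

In this sketch the two MIBs come out with counts that differ by one, so
their Walsh indices differ by one as well, which is the out-of-step
condition described above.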

2.  The phase comparator showed steps of order 60 degrees, occurring at
10 second intervals.  The jumped phase was sometimes corrected back 
ten seconds later, sometimes it continued for four or five ten second
intervals before returning to the original phase.  By stopping commands
to one or the other L302, it could be seen that this originated in only
one of the two L302s.

This is an expected phenomenon.  I have been surprised that we have not
noticed it in the field.

A bit of background.  The station LVDS heartbeat is used to strobe a 
calculated phase into the DDS registers, which the chip then increments
as commanded by the frequency set into it.  The DDS chip runs on a 
128 MHz clock, and uses this clock to synchronize all of its operations.
Somewhere deep within the DDS chip, the heartbeat pulse is captured by
the 128 MHz clock.  If the transition edge of the heartbeat pulse is
too close to the transition edge of the 128 MHz clock, the capture may
not be secure.  If the capture is one clock later than expected, the
phase will be wrong by the fraction of a turn (DDS frequency)/(128 MHz),
typically around 60 degrees.  There is a memo by Chip Scott discussing
the problem.  This memo seems to imply that capture of the heartbeat 
pulse is unreliable if the transition of the heartbeat pulse is within
a two or three nanosecond bad spot relative to the transitions of the
128 MHz clock.  (I put a copy in 
www.aoc.nrao.edu/evla-internal/techdocs/test/testresults/20060706.bc/
because it's a bit big to mail to a list.  Not a test result you say?
So sue me.)
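
The size of the step follows directly from the ratio above.  A minimal
Python sketch of the arithmetic; the DDS frequency of 128/6 MHz used in
the example is only an illustrative value chosen to give 60 degrees, not
a setting taken from the test.

```python
DDS_CLOCK_HZ = 128e6  # DDS reference clock, from the text


def capture_slip_degrees(dds_frequency_hz, clocks_late=1):
    """Phase error if the heartbeat is captured clocks_late cycles late.

    The DDS accumulates (DDS frequency)/(128 MHz) turns per clock cycle,
    so a late capture shifts the phase by that fraction of a turn.
    """
    turns = clocks_late * dds_frequency_hz / DDS_CLOCK_HZ
    return (turns * 360.0) % 360.0


# One period of the 128 MHz clock, in nanoseconds, for comparison with
# the two or three nanosecond bad spot mentioned in the memo.
CLOCK_PERIOD_NS = 1e9 / DDS_CLOCK_HZ  # 7.8125 ns

example_slip = capture_slip_degrees(128e6 / 6)  # illustrative frequency
```

With a clock period of 7.8125 ns, a bad spot of two or three nanoseconds
covers roughly a quarter to a third of the possible heartbeat arrival
phases.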

If the capture is unreliable, exhibiting the 60 degree phase jumps as the 
L302 did in this lab test, the cure is to insert an extra couple of feet 
in the cable carrying the 128 MHz from the L305 to the L302, or equivalently 
to electronically delay the 128 MHz or the heartbeat pulse in the L305 fpga.

3.  There seemed to be a bit of rapid phase jitter.  When the setup 
was arranged to give a slow beat, the apparent trace on the oscilloscope
was slightly thicker at the zero crossing times than at the extrema.  
This would correspond to a fast phase fluctuation of a couple of degrees.
This is within spec, and not really a problem, even if not a consequence
of the ad hoc measuring system, as I tend to think it is.


Having noted these phenomena, and having concluded that number 1 is the
consequence of a fortuitous (or unfortuitous, rather) relation of the 
freewheeling heartbeat with TAI time, we proceeded to test the conclusion.
We powered down the L305 momentarily, with the intent of bringing back the
heartbeat pulse at a different, but still random, relation to TAI.
Much to my surprise, both phenomenon 1 and phenomenon 2 went away.  
Evidently the phase relationship between the heartbeat pulse and the 
128 MHz was different after the power-off reset than before.  This can
occur because of the use of the fpga delay-locked multiplier/divider in
the generation of the heartbeat pulse.  There are two ways it can happen.
First, the Executor knows that this phase relation will be different for
three consecutive pulses, and, to get reproducible results, only commands
the DDS to be reset on heartbeat counts (from 0h TAI) which are divisible
by three.  Resetting the relation between the heartbeat pulse and TAI may
simply have resulted in a more fortuitous selection of which pulse to
use.  Second, unless it is properly reset, the delay-locked
multiplier/divider can come up in an arbitrary state after the power cycle,
and will preserve that state thereafter.  I am not sure how the L305
reset works.  Does it ensure that this phase relationship is always 
preserved, even though the capture of the reset pulse itself is known
to be dicey?  If it doesn't, something needs to be done to ensure that.
(My favorite is to replace that circuit with something that uses the 
128 MHz clock and counts 6,666,666 pulses, 6,666,667 pulses, 6,666,667
pulses to trigger three heartbeat pulses.  Or even simpler, drive the 
LVDS lines with a 6.4 Hz signal dividing the 128 MHz by 20,000,000, since
we use only one pulse out of three anyway.)
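
The arithmetic behind the counting scheme can be checked with a short
Python sketch.  The function name is hypothetical and this only models
the counts, not any actual fpga logic.

```python
from fractions import Fraction

CLOCK_HZ = 128_000_000
COUNTS = (6_666_666, 6_666_667, 6_666_667)  # clocks between heartbeats


def heartbeat_edges(n_pulses):
    """Cumulative 128 MHz clock counts at which each heartbeat fires."""
    edges = []
    total = 0
    for i in range(n_pulses):
        total += COUNTS[i % len(COUNTS)]
        edges.append(total)
    return edges


clocks_per_cycle = sum(COUNTS)                        # 20,000,000
every_third_hz = Fraction(CLOCK_HZ, clocks_per_cycle)  # 32/5 = 6.4 Hz
mean_heartbeat_hz = 3 * every_third_hz                 # 96/5 = 19.2 Hz
```

Because the three counts sum to exactly 20,000,000, every third pulse
falls on a fixed count of the 128 MHz clock, so its phase relative to
that clock cannot wander.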

Suggested action items:

1.  The MIB knows how close we are to the ambiguity point of deciding 
on the count of a given heartbeat pulse from 0h TAI.  This should be
brought out on a monitor point.  It is probably not a problem, but it
would be nice to see a numerical demonstration of this.

2.  The question of the relative phase of the 128 MHz and the heartbeat
pulse should be examined and fixed if needed.

3.  During the power cycling of the L305, one of the L302s got itself
into the state in which the control loop indicates locked and the main
loop indicates unlocked.  This state is cured by issuing a reset to
the DDS interface.  This state also occurs in the field with reasonable
frequency.  I see no disadvantage in having the MIB recognize this
state and issue the DDS reset itself, rather than waiting for a person
to do so.

4.  Do something to try to explain why the L302s apparently behave
differently in the lab than in the field.  First we need to verify
that these particular L302s behave differently in the field than in the 
lab.  There are lots of small differences between modules (connectors, 
cable lengths, etc.), believed negligible, but perhaps not.  If these 
modules misbehave in the field, we can set AC and BD IFs to the same 
frequency, and see if the two L302s, receiving the same commands and 
connected to the same heartbeat, in exact analogy to the lab setup, 
misbehave identically, in a way that wouldn't be seen in the lab test.  
This can be carried even further by swapping the deformatter cables 
between the C and D IFs if needed, to look at the direct comparison on
the antenna self correlation output, in exact emulation of the lab setup.
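
For item 3, the recognition logic might look something like the Python
sketch below.  Everything here is hypothetical: the class, the callback,
and the idea of waiting a few consecutive polls before acting (so that a
normal relock transient does not trigger a reset) are assumptions, not a
description of the MIB software.

```python
def needs_dds_reset(control_loop_locked, main_loop_locked):
    """True for the stuck state: control loop locked, main loop unlocked."""
    return control_loop_locked and not main_loop_locked


class AutoReset:
    """Issue a DDS reset after the stuck state persists for hold_polls polls."""

    def __init__(self, issue_reset, hold_polls=3):
        self.issue_reset = issue_reset  # callback that resets the DDS interface
        self.hold_polls = hold_polls    # assumed debounce length
        self._stuck_polls = 0

    def poll(self, control_loop_locked, main_loop_locked):
        """Feed one pair of lock indications; return True if a reset was issued."""
        if needs_dds_reset(control_loop_locked, main_loop_locked):
            self._stuck_polls += 1
        else:
            self._stuck_polls = 0
        if self._stuck_polls >= self.hold_polls:
            self.issue_reset()
            self._stuck_polls = 0
            return True
        return False
```

The debounce count would need to be chosen from how long a legitimate
relock takes, which this sketch does not attempt to model.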


