[evlatests] New and bizarre DTS behavior

Rob Long rlong at nrao.edu
Thu May 3 13:58:16 EDT 2007


From 9:57 to about 10:27 last night, the operator ran sysstart due to 
unscheduled dynamic time.

Rob
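
For anyone who wants to check the overlap between the sysstart window 
above and the outage times quoted below, here is a minimal sketch. It 
assumes the 9:57 to 10:27 window is PM on the same night as the outages 
and treats "about 10:27" as exactly 10:27. The labels are copied as 
written in the quoted list; the function names are made up for this 
illustration.

```python
# Assumption: the sysstart window is PM, same night as the outages.
WINDOW_START = "9:57 PM"
WINDOW_END = "10:27 PM"  # "about 10:27" in the message, taken as exact

# (email timestamp, label) exactly as listed in the quoted message
OUTAGES = [
    ("10:09 PM", "16 A"),
    ("10:11 PM", "21 A"),
    ("10:12 PM", "26 A"),
    ("10:14 PM", "16 C"),
    ("10:16 PM", "16 D"),
    ("10:19 PM", "23 A"),
]


def minutes(t):
    """Convert a 12-hour clock string such as '10:09 PM' to minutes after midnight."""
    clock, meridiem = t.split()
    hour, minute = (int(part) for part in clock.split(":"))
    hour = hour % 12 + (12 if meridiem.upper() == "PM" else 0)
    return hour * 60 + minute


def inside_window(t, start=WINDOW_START, end=WINDOW_END):
    """True if time t falls within the (assumed) sysstart window, inclusive."""
    return minutes(start) <= minutes(t) <= minutes(end)


def gaps(times):
    """Minutes between consecutive events, in the order given."""
    m = [minutes(t) for t in times]
    return [later - earlier for earlier, later in zip(m, m[1:])]


if __name__ == "__main__":
    for stamp, label in OUTAGES:
        status = "inside" if inside_window(stamp) else "outside"
        print(f"{stamp}  {label}  {status} sysstart window")
    print("gaps (min):", gaps([stamp for stamp, _ in OUTAGES]))
```

Under those assumptions all six events fall inside the window. The 
timestamps have only 1 minute resolution, so this shows overlap in time 
and nothing more.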

Hichem Ben Frej wrote:
> The module-specific software for the D30x has not changed since 12/2005. 
> The MIB framework has been updated since then, but that has no effect on 
> the module-specific software.
> 
> Thanks,
> 
> Hichem
> 
> Mike Revnell wrote:
> 
>>The following is a rather lengthy description to justify the request I 
>>make in the last two paragraphs.
>>
>>We are observing a new and bizarre behavior in the DTS modules.
>>
>>This just started happening about 3 weeks ago. Some of the modules it has 
>>happened to have been working in antennas for a couple of years. We have 
>>made no changes to the affected parts of the modules for a long time, so 
>>I have difficulty believing it is bad hardware.
>>
>>At seemingly random times, seemingly random groups of DTS modules go 
>>off-line. The state we find them in is consistent with them having 
>>received a "PSCReset" command or an explicit command to turn off the 
>>formatter. The formatter board is powered down but the digitizer is 
>>still under power. Power to these two boards is controlled 
>>independently. If module power is interrupted, all module components are 
>>turned off. The logic that controls digitizer power runs off the same 
>>regulator as logic that controls formatter power.
>>
>>The PSCReset command causes the FPGA that controls formatter power to be 
>>reconfigured. Thus in this event the formatter gets powered down but the 
>>digitizer, if it is on, will not.
>>
>>Last night 6 of them went down in a 10 minute period. Here is the timing 
>>of events to 1 minute resolution. These times come from timestamps on 
>>email messages.
>>
>>10:09 PM 16 A
>>10:11 PM 21 A
>>10:12 PM 26 A
>>10:14 PM 16 C
>>10:16 PM 16 D
>>10:19 PM 23 A
>>
>>As far as we are able to tell, when more than one DTS goes off-line they 
>>do so in groups with time resolutions similar to the above. This timing 
>>is consistent with some person doing something.
>>
>>Given the state we find the modules in I have been able to think of only 
>>a couple scenarios to explain the behavior.
>>
>>1. Something or someone has started sending PSCReset commands to the 
>>modules. I can think of no reason to do this because the circumstances 
>>which normally require it happen only when a module loses then regains 
>>its time code input. If something were automatically doing it I would 
>>expect it to happen more frequently and see more links go off line. I 
>>would also expect all links to go off-line within the same minute. Since 
>>doing this would have required writing some new piece of software, at 
>>the expense of writing something more useful, I don't think this is the 
>>case.
>>
>>2. Some change in a seemingly unrelated system or procedure has started 
>>exploiting a hitherto unused bug in the DTS MIB code or FPGA code. This 
>>could be, for example, a command to the DTS module that for some reason 
>>through a sneak logic connection causes the flip-flop that controls the 
>>formatter power to be reset. Another possibility is that some command to 
>>the MIB is causing it to assert its chip-select line that causes the 
>>FPGA to reconfigure.
>>
>>Again, this is behavior observed for the first time about 3 weeks ago, 
>>in modules that have had no changes. Some of them have been running 
>>without problems in antennas for over 2 years.
>>
>>All that to justify this.
>>
>>Please review any changes to software and/or procedures which have been 
>>instituted in the last few weeks. It seems to me that we may be 
>>exercising a hitherto unused bug somewhere through a sneak path from an 
>>unrelated change. In this case, I believe coincidence does matter.
>>
>>If anyone might have an idea of what they or someone else was doing with 
>>the system around 10:09 last night it could provide a helpful clue.
>>
>>Thanks.
>>
>>Mike Revnell
>>
>>_______________________________________________
>>evlatests mailing list
>>evlatests at listmgr.cv.nrao.edu
>>http://listmgr.cv.nrao.edu/mailman/listinfo/evlatests