[evlatests] New and bizarre DTS behavior

Hichem Ben Frej hbenfrej at nrao.edu
Thu May 3 12:15:27 EDT 2007


The module-specific software for the D30x has not changed since 12/2005. 
The MIB framework has been updated since then, but that has no effect on 
the module-specific software.

Thanks,

Hichem

Mike Revnell wrote:
> The following is a rather lengthy description to justify the request I 
> make in the last two paragraphs.
>
> We are observing a new and bizarre behavior in the DTS modules.
>
> This just started happening about 3 weeks ago. Some of the modules it has 
> happened to have been working in antennas for a couple of years. We have 
> made no changes to the affected parts of the modules for a long time, so 
> I have difficulty believing it is bad hardware.
>
> At seemingly random times, seemingly random groups of DTS modules go 
> off-line. The state we find them in is consistent with them having 
> received a "PSCReset" command or an explicit command to turn off the 
> formatter. The formatter board is powered down but the digitizer is 
> still under power. Power to these two boards is controlled 
> independently. If module power is interrupted, all module components are 
> turned off. The logic that controls digitizer power runs off the same 
> regulator as logic that controls formatter power.
>
> The PSCReset command causes the FPGA that controls formatter power to be 
> reconfigured. Thus, in this event, the formatter gets powered down but the 
> digitizer, if it is on, does not.
>
> Last night 6 of them went down in a 10-minute period. Here is the timing 
> of events to 1-minute resolution. These times come from timestamps on 
> email messages.
>
> 10:09 PM 16 A
> 10:11 PM 21 A
> 10:12 PM 26 A
> 10:14 PM 16 C
> 10:16 PM 16 D
> 10:19 PM 23 A
>
> As far as we are able to tell, when more than one DTS goes off-line they 
> do so in groups with time resolutions similar to the above. This timing 
> is consistent with some person doing something.
>
> Given the state we find the modules in, I have been able to think of only 
> a couple of scenarios to explain the behavior.
>
> 1. Something or someone has started sending PSCReset commands to the 
> modules. I can think of no reason to do this because the circumstances 
> which normally require it happen only when a module loses then regains 
> its time code input. If something were automatically doing it I would 
> expect it to happen more frequently and see more links go off-line. I 
> would also expect all links to go off-line within the same minute. Since 
> doing this would have required writing some new piece of software, at 
> the expense of writing something more useful, I don't think this is the 
> case.
>
> 2. Some change in a seemingly unrelated system or procedure has started 
> exploiting a hitherto unused bug in the DTS MIB code or FPGA code. This 
> could be, for example, a command to the DTS module that for some reason 
> through a sneak logic connection causes the flip-flop that controls the 
> formatter power to be reset. Another possibility is that some command to 
> the MIB is causing it to assert its chip-select line that causes the 
> FPGA to reconfigure.
>
> Again, this is behavior observed for the first time about 3 weeks ago, 
> in modules that have had no changes. Some of them have been running 
> without problems in antennas for over 2 years.
>
> All that to justify this.
>
> Please review any changes to software and/or procedures which have been 
> instituted in the last few weeks. It seems to me that we may be 
> exercising a hitherto unused bug somewhere through a sneak path from an 
> unrelated change. In this case, I believe coincidence does matter.
>
> If anyone has an idea of what they or someone else was doing with 
> the system around 10:09 PM last night, it could provide a helpful clue.
>
> Thanks.
>
> Mike Revnell
>
> _______________________________________________
> evlatests mailing list
> evlatests at listmgr.cv.nrao.edu
> http://listmgr.cv.nrao.edu/mailman/listinfo/evlatests
>   


-- 
_________________________________________________________________________

Hichem Ben Frej - hbenfrej at aoc.nrao.edu

NRAO Array Operations Center       Phone:  +1-505-835-7292
P.O. Box 0 (1003 Lopezville Rd)    
Socorro NM 87801                   





