[evla-sw-discuss] severity of alerts

Boyd Waters bwaters at nrao.edu
Thu Jul 21 12:39:28 EDT 2005


I agree that we should use syslog as the alerting system, or make the  
case why not.
(How does a MIB emit syslog-compatible alerts, if not running syslogd?)
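
One possible answer to the question above, as a rough sketch only: a sender can build the BSD-syslog datagram itself and send it over UDP to a remote collector, with no local syslogd. Python is used purely for illustration; the facility choice (local0), the host and tag names, and the collector address are all assumptions, not anything decided for the MIBs.

```python
import socket
import time

LOCAL0 = 16  # syslog facility code for local0 (assumed choice)
ERR = 3      # syslog severity code for "error"

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

def format_syslog(facility, severity, host, tag, text, when=None):
    """Build a BSD-syslog (RFC 3164 style) datagram payload."""
    pri = facility * 8 + severity  # priority value = facility * 8 + severity
    t = time.localtime(when)       # when=None means "now"
    stamp = "%s %2d %02d:%02d:%02d" % (
        MONTHS[t.tm_mon - 1], t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)
    return ("<%d>%s %s %s: %s" % (pri, stamp, host, tag, text)).encode("ascii")

def send_alert(payload, collector=("127.0.0.1", 514)):
    """Send one datagram to a syslog collector (address is a placeholder)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, collector)

# Hypothetical host and tag names, for illustration only.
payload = format_syslog(LOCAL0, ERR, "mib-host", "mib", "value out of range")
```

The collector on the receiving end would be an ordinary syslogd configured to accept remote messages.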

I agree in principle about the hierarchy of alert reporting: MIBs
will report to subsystems, each subsystem will evaluate MIB-level alerts
and pass alarms up the hierarchy, and the eventual result is an alarm
displayed to the operator. The hierarchy is important because it may not
be possible to know everything at the MIB level, and because it allows
alert reporting to grow as the system evolves without unduly
impacting MIB implementation.
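
To make the reporting chain concrete, here is a minimal sketch, assuming a simple rule where each level passes an alert upward only if it is severe enough. The class name, node names, and thresholds are hypothetical; a real subsystem would evaluate alerts with more than a numeric cutoff.

```python
class Node:
    """One level of the reporting hierarchy (hypothetical name)."""

    def __init__(self, name, parent=None, escalate_at=4):
        self.name = name
        self.parent = parent
        self.escalate_at = escalate_at  # pass up severities numerically <= this
        self.received = []

    def report(self, source, severity, text):
        """Record an alert and pass it up the hierarchy if severe enough."""
        self.received.append((source, severity, text))
        # syslog convention: a lower number means a more severe condition
        if self.parent is not None and severity <= self.escalate_at:
            self.parent.report(source, severity, text)

# A flat two-level hierarchy: one subsystem reporting to the operator display.
operator = Node("operator-display")
antenna = Node("antenna-subsystem", parent=operator, escalate_at=3)

antenna.report("mib-1", 3, "value out of range")  # escalated to the operator
antenna.report("mib-1", 6, "routine status")      # stays at subsystem level
```

Adding a level later means inserting another Node between two existing ones; the MIB-side call does not change.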

I believe that the initial implementation of this hierarchy will be
rather flat and simple.


Alert hierarchy aside: there are some alarm conditions which are
known a priori - at the MIB level - to degrade data quality. The current
VLA system has a separate "flag" for that.

Rather than add an extra flag, we could devote one alert level to  
this "data fatal" flag, and implement a policy by which the data gets  
flagged immediately.

However, one runs into a semantic problem here: (made-up example)
a data point is at syslog error level 3 (out of range) AND it is
known to be "data fatal" (an additional flag). If one alert level is
reserved for "data fatal", a single level value cannot carry both facts.

The HLA notion of hierarchy implies that a non-MIB subsystem maintains
this list of "data fatal" conditions, and implements data-flagging
policy.

But I think that the data-fatal flag, if known at the MIB level, should
be maintained at the MIB level: it is documentation of the hardware
behavior.
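
A sketch of what that split could look like, under the assumption that the flag travels next to the severity instead of replacing it: the MIB owns the table of data-fatal points, and the flagging policy upstream only reads the flag. All names here, including the monitor point "lo_lock", are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alert:
    point: str        # monitor point name
    severity: int     # syslog-style severity, 0 (worst) to 7
    data_fatal: bool  # known a priori to degrade data quality

# Hypothetical MIB-level table: documents which points are data fatal.
DATA_FATAL_POINTS = {"lo_lock"}

def make_alert(point, severity):
    """MIB side: attach the data-fatal flag from the MIB's own table."""
    return Alert(point, severity, point in DATA_FATAL_POINTS)

def should_flag_data(alert):
    """Subsystem side: the policy reads the flag, it does not define it."""
    return alert.data_fatal

alert = make_alert("lo_lock", 3)  # severity 3 AND data fatal, both preserved
```

Severity 3 and "data fatal" are kept as independent facts, so the made-up example above no longer forces a choice between them.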

~ boyd



