[evla-sw-discuss] severity of alerts

Bryan Butler bbutler at nrao.edu
Thu Jul 21 14:10:44 EDT 2005



On 7/21/05 10:39, Boyd Waters wrote:
> 
> I agree that we should use syslog as the alerting system, or make the  
> case why not.
> (How does a MIB emit syslog-compatible alerts, if not running syslogd?)

does syslog support multicast?
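
On the question of emitting syslog-compatible alerts without a local syslogd: a minimal sketch, assuming the MIB can send UDP datagrams and that the BSD syslog line format is acceptable to the receiver. The helper names (format_syslog, send_alert), the host name "mib01", and the choice of the local0 facility are illustrative assumptions, not part of any EVLA design. The destination address is just a parameter here; whether a given receiver accepts multicast is not something this sketch settles.

```python
import socket
from datetime import datetime

LOCAL0 = 16  # standard syslog facility code for "local0"

def format_syslog(severity, host, tag, text, facility=LOCAL0, when=None):
    """Build a BSD-syslog-style line: <PRI>Mmm dd hh:mm:ss host tag: text."""
    when = when or datetime.now()
    pri = facility * 8 + severity          # PRI combines facility and severity
    # Day of month is space-padded, not zero-padded, in the BSD format.
    stamp = when.strftime("%b ") + f"{when.day:2d}" + when.strftime(" %H:%M:%S")
    return f"<{pri}>{stamp} {host} {tag}: {text}"

def send_alert(message, address=("127.0.0.1", 514)):
    """Send one alert as a single UDP datagram (514 is the usual syslog port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii", "replace"), address)
```

The sender needs no daemon of its own; only the receiving end has to understand the line format.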

> I agree in principle about the hierarchy of alert reporting: MIBs will
> report to subsystems, a subsystem will evaluate MIB-level alerts and
> pass alarms up the hierarchy, eventual result is alarm displayed to
> operator. The hierarchy is important because it may not be possible to
> know everything at the MIB level, and the hierarchy allows for alert
> reporting to be grown as the system evolves without unduly impacting
> MIB implementation.
> 
> I believe that initial implementation of this hierarchy will be rather
> flat and simple.

indeed.  the hierarchy goes MIB->Checker->Operator.  pretty simple.

> Alert hierarchy aside: there are some alarm conditions which are known
> a priori - at the MIB level - to degrade data quality. Current VLA
> system has a separate "flag" for that.
> 
> Rather than add an extra flag, we could devote one alert level to this
> "data fatal" flag, and implement a policy by which the data gets
> flagged immediately.

the problem is that flagging is more complex than that.  of course you 
can build the smarts into the system to figure it all out.  for 
instance, some flags will indicate to only flag one baseband or 
polarization.
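
To make the point about flag scope concrete, here is a rough sketch of a flag that can cover a whole antenna or only one baseband or polarization. The Flag class, its field names, and the example identifiers ("ea01", "A", "R") are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Flag:
    """A data flag whose scope may be narrowed to one baseband/polarization."""
    antenna: str
    baseband: Optional[str] = None      # None means "all basebands"
    polarization: Optional[str] = None  # None means "all polarizations"

    def applies_to(self, antenna, baseband, polarization):
        """True if this flag covers the given piece of data."""
        return (self.antenna == antenna
                and self.baseband in (None, baseband)
                and self.polarization in (None, polarization))

# A flag limited to baseband "A" leaves baseband "B" on the same antenna alone.
narrow = Flag("ea01", baseband="A")
```

A single "data fatal" severity level has no room for this scope, which is why the alert itself cannot carry the whole flagging decision.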

> However, one runs into a semantic problem here, in that (made-up
> example here) a data point is at syslog-error-level 3 (out of range)
> AND it is known to be "data fatal" (an additional flag).

use a combination of bits?  but then that violates the "8 levels of 
severity" (0-7) of syslog.
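
A sketch of what the "combination of bits" could look like, assuming the severity keeps the low three bits and one extra bit marks "data fatal". The names (DATA_FATAL, pack_alert, unpack_alert) are made up for illustration. Note that in a real syslog PRI value the bits above the low three hold the facility (PRI = facility * 8 + severity), so a code like this could not be sent as the PRI itself; it would have to travel in the message text or a separate field.

```python
SEVERITY_MASK = 0x07  # low 3 bits: syslog severity, 0 (emergency) .. 7 (debug)
DATA_FATAL = 0x08     # hypothetical extra bit: data known to be degraded

def pack_alert(severity, data_fatal=False):
    """Combine a syslog severity and a data-fatal flag into one small integer."""
    if not 0 <= severity <= 7:
        raise ValueError("syslog severity must be in 0..7")
    return severity | (DATA_FATAL if data_fatal else 0)

def unpack_alert(code):
    """Split a packed code back into (severity, data_fatal)."""
    return code & SEVERITY_MASK, bool(code & DATA_FATAL)
```

With this layout the made-up example from the quoted text, severity 3 plus "data fatal", packs to 11 and unpacks back to (3, True), and the severity ordering is untouched for consumers that mask off the extra bit.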

> HLA notion of hierarchy implies that a non-MIB subsystem maintains this
> list of "data fatal" conditions, and implements data-flagging policy.

yes, agreed.  i think we are all agreeing here.

> But I think that data-fatal flag, if known at the MIB level, should be
> maintained at the MIB level: it is documentation of the hardware
> behavior.

probably true.  but some data-fatal flags are at a higher level 
(combinations of MIBs) - as tom has pointed out.
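
For the higher-level case, one possible shape is a rule table held by the Checker, where each rule is a set of (MIB, condition) pairs that together make the data fatal. This is only a sketch; the function name, the MIB names, and the conditions below are invented for illustration.

```python
def data_fatal(active, rules):
    """Return True if any rule's conditions are all currently active.

    active: set of (mib, condition) pairs reported by the MIBs
    rules:  iterable of frozensets of (mib, condition) pairs
    """
    return any(rule <= active for rule in rules)

# Hypothetical rules: one single-MIB condition, one two-MIB combination.
RULES = [
    frozenset({("mib_lo", "unlocked")}),
    frozenset({("mib_a", "over_temp"), ("mib_b", "fan_stopped")}),
]
```

Single-MIB rules could still be documented at the MIB level and simply copied into the table, so the two views need not conflict.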

	-bryan


