[evla-sw-discuss] severity of alerts
Boyd Waters
bwaters at nrao.edu
Thu Jul 21 12:39:28 EDT 2005
I agree that we should use syslog as the alerting system, or else make
the case for why not.
(How does a MIB emit syslog-compatible alerts, if not running syslogd?)
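One possible answer, as a rough sketch only: a BSD-style syslog message
is just a short text datagram sent over UDP (conventionally to port
514), so a device could format and send one itself without running
syslogd. The Python below illustrates the message layout; it is not
proposed MIB code. The function names, the host name "mib-example", the
tag "mib" and the collector address are all made up for illustration.

```python
import socket
import time

def syslog_packet(facility, severity, hostname, tag, message, when=None):
    """Build a BSD-syslog style datagram (bytes) without any syslog daemon.

    All argument values used by callers here are illustrative assumptions.
    """
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0..23 and severity 0..7")
    # The priority value encodes both fields: facility * 8 + severity.
    pri = facility * 8 + severity
    t = time.localtime(when)  # when=None means "now"
    # Timestamp layout "Mmm dd hh:mm:ss", day of month padded with a space.
    stamp = "%s %2d %s" % (
        time.strftime("%b", t), t.tm_mday, time.strftime("%H:%M:%S", t))
    text = "<%d>%s %s %s: %s" % (pri, stamp, hostname, tag, message)
    return text.encode("ascii", "replace")

def send_alert(packet, collector=("127.0.0.1", 514)):
    """Send one datagram to a remote collector (address is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, collector)

# facility 16 is "local0", severity 3 is "error"
packet = syslog_packet(16, 3, "mib-example", "mib", "value out of range")
```

Calling send_alert(packet) would hand the datagram to whatever collector
is listening; UDP is fire-and-forget, so the sender gets no confirmation.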
I agree in principle about the hierarchy of alert reporting: MIBs
will report to subsystems; each subsystem will evaluate MIB-level
alerts and pass alarms up the hierarchy; the eventual result is an
alarm displayed to the operator. The hierarchy is important because it
may not be possible to know everything at the MIB level, and the
hierarchy allows alert reporting to grow as the system evolves without
unduly impacting MIB implementation.
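To make the reporting chain concrete, here is a minimal sketch of it,
assuming each level simply forwards anything at or above its own
severity threshold. The class name, the threshold rule and the names
"antenna1" and "mib7" are assumptions for illustration, not a design.

```python
# syslog severities: a lower number means a more severe condition
ERR, WARNING = 3, 4

class Subsystem:
    """One level of the alert hierarchy (hypothetical sketch)."""

    def __init__(self, name, parent=None, threshold=WARNING):
        self.name = name
        self.parent = parent        # None means this is the operator display
        self.threshold = threshold  # least severe level that is passed up
        self.displayed = []         # alarms shown to the operator

    def report(self, source, severity, text):
        # Evaluate the incoming alert; drop it if it is not severe enough.
        if severity > self.threshold:
            return
        if self.parent is not None:
            # Pass the alarm up, recording the path it took.
            self.parent.report(self.name + "/" + source, severity, text)
        else:
            self.displayed.append((source, severity, text))

operator = Subsystem("operator")
antenna = Subsystem("antenna1", parent=operator, threshold=ERR)

antenna.report("mib7", ERR, "value out of range")   # passed up
antenna.report("mib7", WARNING, "value drifting")   # filtered here
```

The MIB only ever calls report() on its subsystem, so the evaluation
rule at each level can change later without touching the MIB.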
I believe that the initial implementation of this hierarchy will be
rather flat and simple.
Alert hierarchy aside: there are some alarm conditions which are
known a priori - at the MIB level - to degrade data quality. The
current VLA system has a separate "flag" for that.
Rather than add an extra flag, we could devote one alert level to
this "data fatal" flag, and implement a policy by which the data gets
flagged immediately.
However, one runs into a semantic problem here: in a made-up example,
a data point is at syslog severity level 3 (out of range) AND it is
known to be "data fatal" (an additional flag). A single alert level
cannot express both facts at once.
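The made-up example can be written down as a record with two independent
fields, which shows why a single level is not enough. This is only a
sketch; the Alert type, its field names and the point names are
assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alert:
    point: str        # name of the monitored data point (made up here)
    severity: int     # syslog severity, 0 (most severe) to 7
    data_fatal: bool  # known a priori to degrade data quality

def should_flag_data(alert):
    """Flagging policy driven by the extra flag, not by severity."""
    return alert.data_fatal

# Two alerts at the same severity that need different treatment.
a = Alert("example_point_a", severity=3, data_fatal=True)
b = Alert("example_point_b", severity=3, data_fatal=False)
```

Both alerts are "error" to syslog, yet only the first should flag data.
Reusing one severity value to mean "data fatal" would force a to give up
either its out-of-range level or its data-fatal marking.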
The HLA notion of hierarchy implies that a non-MIB subsystem maintains
this list of "data fatal" conditions, and implements the data-flagging
policy.
But I think that the data-fatal flag, if known at the MIB level,
should be maintained at the MIB level: it is documentation of the
hardware behavior.
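As a sketch of what "maintained at the MIB level" might look like: the
MIB carries a small table of conditions known to be data fatal and
stamps the flag onto each alert at the source, so upstream subsystems
only have to honor it. The table contents, point names and function
name below are hypothetical.

```python
# Hypothetical per-MIB table: (point, condition) pairs that this
# hardware is known a priori to make data fatal.
DATA_FATAL = {
    ("example_point", "out_of_range"),
}

def mib_alert(point, condition, severity):
    """Build an alert at the MIB, attaching the data-fatal flag there."""
    return {
        "point": point,
        "condition": condition,
        "severity": severity,  # syslog severity, independent of the flag
        "data_fatal": (point, condition) in DATA_FATAL,
    }

fatal = mib_alert("example_point", "out_of_range", 3)
benign = mib_alert("other_point", "out_of_range", 3)
```

Keeping the table next to the hardware description means it is updated
when the hardware changes, rather than in a separate subsystem's list.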
~ boyd