[evla-sw-discuss] severity of alerts
Bryan Butler
bbutler at nrao.edu
Thu Jul 21 14:10:44 EDT 2005
On 7/21/05 10:39, Boyd Waters wrote:
>
> I agree that we should use syslog as the alerting system, or make the
> case why not.
> (How does a MIB emit syslog-compatible alerts, if not running syslogd?)
does syslog support multicast?
> I agree in principle about the hierarchy of alert reporting: MIBs will
> report to subsystems, a subsystem will evaluate MIB-level alerts and
> pass alarms up the hierarchy, eventual result is alarm displayed to
> operator. The hierarchy is important because it may not be possible to
> know everything at the MIB level, and the hierarchy allows for alert
> reporting to be grown as the system evolves without unduly impacting
> MIB implementation.
>
> I believe that initial implementation of this hierarchy will be rather
> flat and simple.
indeed. the hierarchy goes MIB->Checker->Operator. pretty simple.
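
A minimal sketch of that three-stage chain, to make the flow concrete. The class names, the `threshold` parameter, and the rule "forward anything at or below a severity threshold" are all assumptions for illustration, not the EVLA design; severity follows the syslog convention where lower numbers are more severe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    source: str    # name of the reporting MIB (hypothetical)
    severity: int  # syslog-style: 0 is most severe, 7 is least
    message: str


class Operator:
    """End of the chain: collects alarms that would be displayed."""

    def __init__(self) -> None:
        self.displayed: List[Alert] = []

    def show(self, alert: Alert) -> None:
        self.displayed.append(alert)


class Checker:
    """Middle stage: decides which MIB alerts become operator alarms."""

    def __init__(self, operator: Operator, threshold: int = 3) -> None:
        self.operator = operator
        self.threshold = threshold  # assumed policy, not from the thread

    def receive(self, alert: Alert) -> bool:
        # Forward only alerts at least as severe as the threshold.
        if alert.severity <= self.threshold:
            self.operator.show(alert)
            return True
        return False


class MIB:
    """Bottom stage: reports raw alerts upward, knows nothing of policy."""

    def __init__(self, name: str, checker: Checker) -> None:
        self.name = name
        self.checker = checker

    def report(self, severity: int, message: str) -> bool:
        return self.checker.receive(Alert(self.name, severity, message))
```

Keeping the forwarding policy in the Checker means it can grow later without touching MIB code, which is the point made above about the hierarchy.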
> Alert hierarchy aside: there are some alarm conditions which are known
> a priori - at the MIB level - to degrade data quality. Current VLA
> system has a separate "flag" for that.
>
> Rather than add an extra flag, we could devote one alert level to this
> "data fatal" flag, and implement a policy by which the data gets
> flagged immediately.
the problem is that flagging is more complex than that. of course you
can build the smarts into the system to figure it all out. for
instance, some flags will indicate to only flag one baseband or
polarization.
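
One way to picture why a single "data fatal" level is not enough: a flag needs a scope. The sketch below is hypothetical (the `DataFlag` type, its field names, and the example values such as "ea01", "A" and "R" are assumptions), and only shows a flag that can be narrowed to one baseband or one polarization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataFlag:
    antenna: str
    baseband: Optional[str] = None      # None means every baseband
    polarization: Optional[str] = None  # None means every polarization

    def applies_to(self, antenna: str, baseband: str, polarization: str) -> bool:
        """True if this flag covers the given data selection."""
        return (
            self.antenna == antenna
            and self.baseband in (None, baseband)
            and self.polarization in (None, polarization)
        )
```

For example, `DataFlag("ea01", baseband="A")` would cover both polarizations of baseband A on that antenna but leave baseband B untouched.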
> However, one runs into a semantic problem here, in that (made-up
> example here) a data point is at syslog-error-level 3 (out of range)
> AND it is known to be "data fatal" (an additional flag).
use a combination of bits? but then that violates the "7 levels of
severity" of syslog.
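
A rough sketch of the "combination of bits" idea, assuming the syslog severity keeps its usual range of 0 through 7 (three bits) and the data-fatal flag is carried in one extra bit above it. The function and constant names are made up for illustration.

```python
from typing import Tuple

SEVERITY_MASK = 0b0111   # syslog severity occupies values 0..7
DATA_FATAL_BIT = 0b1000  # assumed extra bit, not part of syslog


def pack(severity: int, data_fatal: bool) -> int:
    """Combine a syslog severity with a data-fatal flag into one code."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0..7")
    return severity | (DATA_FATAL_BIT if data_fatal else 0)


def unpack(code: int) -> Tuple[int, bool]:
    """Split a combined code back into (severity, data_fatal)."""
    return code & SEVERITY_MASK, bool(code & DATA_FATAL_BIT)
```

Any code with the extra bit set is larger than 7, so it no longer fits in a syslog severity field. That is the violation: the flag would have to travel somewhere else, such as in the message body.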
> HLA notion of hierarchy implies that a non-MIB subsystem maintains this
> list of "data fatal" conditions, and implements data-flagging policy.
yes, agreed. i think we are all agreeing here.
> But I think that data-fatal flag, if known at the MIB level, should be
> maintained at the MIB level: it is documentation of the hardware behavior.
probably true. but some data-fatal flags are at a higher level
(combinations of MIBs) - as tom has pointed out.
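
As an illustration of a data-fatal condition that only exists above the MIB level, here is a small sketch in which a rule fires only when conditions from more than one MIB are active together. The MIB names, condition names, and the rule table are invented for the example.

```python
from typing import FrozenSet, List, Set, Tuple

Condition = Tuple[str, str]  # (mib name, condition name), both hypothetical

# Each rule is a set of conditions that are data fatal only in combination.
COMBINED_RULES: List[FrozenSet[Condition]] = [
    frozenset({("lo_mib", "unlocked"), ("if_mib", "low_power")}),
]


def data_fatal(active: Set[Condition]) -> bool:
    """True if every condition of at least one combined rule is active."""
    return any(rule <= active for rule in COMBINED_RULES)
```

Neither MIB could decide this on its own, so a table like this would have to live in the checker, alongside whatever per-MIB data-fatal flags the MIBs document themselves.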
-bryan