[evla-sw-discuss] severity of alerts
Bruce Rowen
browen at aoc.nrao.edu
Thu Jul 21 11:46:28 EDT 2005
Bryan Butler wrote:
>
> all,
>
> we've gotten to the point where we need to define a severity level for
> alerts. the operators need this in order to tell the importance level
> of them as they arrive on the checker screen.
>
> i propose that we define an integer alert level from 0 to 5, with 0
> being the highest importance (issues of safety) and 5 being
> informational only. if somebody can make a case for more granularity
> (do we need 10 levels?), that's fine.
The correlator CMIBs will use the Linux/Unix syslog system for general
internal error reporting and quite possibly extend its use for all
error/log messaging. I think it would be prudent, at a minimum, to adopt
the numbering system used by syslog so as not to preclude its many
attributes from being used in the future for the EVLA. Look at
/usr/include/sys/syslog.h and the manual page (man syslogd) for more
details.
The main concern here is to keep at least the order (0 being most severe)
and the granularity (0-7) in common with syslog.
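
For reference, here is a minimal sketch of the syslog severity numbering as given in <sys/syslog.h>, written in Python only for illustration. The class name Severity and the helper is_more_severe are my own hypothetical names; how EVLA alerts would map onto these eight levels is not decided here.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity codes as numbered in <sys/syslog.h>; 0 is the most severe."""
    EMERG = 0    # system is unusable
    ALERT = 1    # action must be taken immediately
    CRIT = 2     # critical conditions
    ERR = 3      # error conditions
    WARNING = 4  # warning conditions
    NOTICE = 5   # normal but significant condition
    INFO = 6     # informational
    DEBUG = 7    # debug-level messages


def is_more_severe(a: Severity, b: Severity) -> bool:
    """Return True if a outranks b; a lower number means higher severity."""
    return a < b
```

Keeping the same order and range means an alert's severity could later be passed straight to syslog without translation.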
>
> the engineers will be going over each MIB and its monitor points and
> assigning this severity code to its alerts. pat van buskirk is going
> to do the leg work of pestering the engineers on this.
>
> in addition to a severity level, an "action" for the operator has to
> be defined for each of these alerts. this would be similar to the
> page at: http://www.vla.nrao.edu/operators/alarms/ for the VLA.
It is my opinion that these two items should be applied to the alert by
a higher level system (above the MIB) to avoid too much thinking and
policy at the MIB level.
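
As a rough sketch of what "applied by a higher level system" could look like, the fragment below keeps severity and operator action in one table keyed by MIB and monitor point, so the MIB itself only reports the alert. Everything in it is an assumption for illustration: the names AlertPolicy, POLICY_TABLE and annotate_alert, the MIB and monitor point names, and the action strings are all made up.

```python
from typing import NamedTuple, Optional


class AlertPolicy(NamedTuple):
    severity: int  # 0 = most severe
    action: str    # what the operator should do


# Hypothetical entries; real ones would come from the engineers' review.
POLICY_TABLE = {
    ("example-mib", "example_point_a"): AlertPolicy(0, "example action A"),
    ("example-mib", "example_point_b"): AlertPolicy(4, "example action B"),
}


def annotate_alert(mib: str, monitor_point: str) -> Optional[AlertPolicy]:
    """Look up severity and action for an alert; None if no entry exists."""
    return POLICY_TABLE.get((mib, monitor_point))
```

Changing a severity or an action then means editing one table entry rather than rebuilding MIB images.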
>
> once they have them defined, then we need to support them. there are
> two ways that i see to do this:
> 1 - each MIB has coded into it these severity levels, just as it
> has coded into it the levels at which alerts are triggered, and
> when the alert is sent out, the severity code goes out with it;
> 2 - there is a lookup table which checker uses, given the MIB and
> the monitor point/alert, to assign severity, and any program
> that receives the alerts can use that lookup table to retrieve
> the severity level.
>
> the advantage to 1 is that it keeps the information closest to the
> MIB. it also saves the "management" software upstream. the
> disadvantage is that if you decide to change anything you have to
> modify all of those MIB images. the advantage to 2 is that you avoid
> that MIB image modification, and can centralize everything (in a
> database or similar). another advantage is that you can also include
> the "action" in this database, as well as flagging information. since
> you are going to need these other things there, you might as well add
> a column for severity.
>
> i prefer the lookup table/database, but would like to hear other
> opinions.
One beauty of syslog is its ability to sort and dispatch log/error
messages to a highly tunable set of target machines and/or users. All
built in, all programmed and tested, and already in use by the sysadmin
people for their needs. No need to create our own proprietary logging
system.
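
To illustrate the kind of sorting and dispatching I mean, here is a small Python sketch of threshold-based routing, loosely modeled on the way a traditional syslog.conf selector matches a given priority and anything more severe. It is not syslog itself; ROUTES, targets_for and the target names are hypothetical.

```python
# Each route is (threshold, target): the target receives every message
# whose severity number is at or below the threshold (0 = most severe).
ROUTES = [
    (2, "operator-console"),  # hypothetical target: critical and worse
    (4, "engineer-mail"),     # hypothetical target: warning and worse
    (7, "archive-log"),       # hypothetical target: everything
]


def targets_for(severity: int) -> list:
    """Return the targets that should receive a message of this severity."""
    return [target for threshold, target in ROUTES if severity <= threshold]
```

With syslog the equivalent table already lives in its configuration file, so none of this would need to be written or maintained by us.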
-Bruce
>
> -bryan
>
>
> _______________________________________________
> evla-sw-discuss mailing list
> evla-sw-discuss at listmgr.cv.nrao.edu
> http://listmgr.cv.nrao.edu/mailman/listinfo/evla-sw-discuss