[evla-sw-discuss] severity of alerts
Sonja Vrcic
sonja.vrcic at nrc.gc.ca
Thu Jul 21 14:15:28 EDT 2005
This is probably a good time to define a system-wide format for alarms
and logs.
The antenna MIBs may require only four alarm priority levels, but other
parts of the system will implement more levels (e.g. trace, debug,
information, etc.). The antenna MIBs may use only a subset of the
levels supported by the EVLA system and a subset of the attributes that
are defined in the XML Schema for the Alarm/Log record.
In the absence of a system-wide definition of alarm / log priority
levels, MCCC / VCI (Master Correlator Control Computer / Virtual
Correlator Interface) uses the levels defined for ALMA Common Software
(ACS). I adopted the ACS levels on the assumption that higher-level
software (e.g. alarm handling) may be the same for ALMA and EVLA.
If alarms / logs are XML encoded, mnemonics may be used instead of integers.
Here is an example of the MCCC log (time is shown in two formats as an
example):
<Log level="DEBUG" code="2" descriptor="INSTANT">
  <TimeStamp millis="1115850114977"
             dateTime="2005-05-11T15:21:54.977-07:00"/>
  <Host type="MCCC" instance="0"/>
  <Originator class="TestGui" method="main" thread="1"/>
  <Logger name="global"/>
  <Description>ENTRY</Description>
</Log>
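As a rough illustration of how a consumer might read such a record, here is a minimal Python sketch that parses the MCCC log example above and checks that the two time formats describe the same instant. The helper name parse_log and the dictionary keys are hypothetical and chosen only for this example; they are not part of the MCCC software.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# The MCCC log record from the example, reproduced as a string.
RECORD = """<Log level="DEBUG" code="2" descriptor="INSTANT">
<TimeStamp millis="1115850114977"
dateTime="2005-05-11T15:21:54.977-07:00"/>
<Host type="MCCC" instance="0"/>
<Originator class="TestGui" method="main" thread="1"/>
<Logger name="global"/>
<Description>ENTRY</Description>
</Log>"""

# Assumption: "millis" counts milliseconds since 1970-01-01T00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_log(xml_text):
    """Return selected fields of one Log record as a dict (hypothetical helper)."""
    log = ET.fromstring(xml_text)
    stamp = log.find("TimeStamp")
    return {
        "level": log.get("level"),          # mnemonic, e.g. "DEBUG"
        "code": int(log.get("code")),       # integer form of the level
        "host": log.find("Host").get("type"),
        "description": log.findtext("Description"),
        # Integer arithmetic avoids floating-point rounding of the milliseconds.
        "from_millis": EPOCH + timedelta(milliseconds=int(stamp.get("millis"))),
        "from_datetime": datetime.fromisoformat(stamp.get("dateTime")),
    }


record = parse_log(RECORD)
print(record["level"], record["code"], record["description"])
# Timezone-aware datetimes compare as instants, regardless of UTC offset.
print(record["from_millis"] == record["from_datetime"])
```

Carrying both the mnemonic and the integer in the record lets a reader that knows only one of the two conventions still interpret the level.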
The following table from the DRAO Memo 22 (Alarms and Logging) shows
how Linux and Java alarm levels can be converted to ACS, and vice versa.
ACS        ACS       UNIX/LINUX    UNIX/LINUX  JAVA      JAVA
level      priority  level         priority    level     priority
---------  --------  ------------  ----------  --------  --------
Trace       2        KERN_DEBUG    7           FINEST     400
Debug       3        KERN_DEBUG    7           FINE       700
Info        4        KERN_INFO     6           INFO       800
Notice      5        KERN_NOTICE   5           INFO       800
Warning     6        KERN_WARNING  4           WARNING    900
Error       8        KERN_ERR      3           WARNING    901
Critical    9        KERN_CRIT     2           WARNING    902
Alert      10        KERN_ALERT    1           WARNING    903
Emergency  11        KERN_EMERG    0           SEVERE    1000
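To make the conversion concrete, here is a minimal Python sketch of the ACS and UNIX/LINUX columns of the table as lookup dictionaries. The names LEVELS, ACS_TO_SYSLOG and SYSLOG_TO_ACS are hypothetical. Because Trace and Debug both map to KERN_DEBUG (7), the reverse direction is ambiguous; the sketch assumes, purely for illustration, that syslog 7 converts back to Debug.

```python
# (ACS name, ACS priority, syslog name, syslog priority), copied from the table.
LEVELS = [
    ("Trace", 2, "KERN_DEBUG", 7),
    ("Debug", 3, "KERN_DEBUG", 7),
    ("Info", 4, "KERN_INFO", 6),
    ("Notice", 5, "KERN_NOTICE", 5),
    ("Warning", 6, "KERN_WARNING", 4),
    ("Error", 8, "KERN_ERR", 3),
    ("Critical", 9, "KERN_CRIT", 2),
    ("Alert", 10, "KERN_ALERT", 1),
    ("Emergency", 11, "KERN_EMERG", 0),
]

# ACS priority -> syslog priority (many-to-one: ACS 2 and 3 both give 7).
ACS_TO_SYSLOG = {acs: sys_prio for _, acs, _, sys_prio in LEVELS}

# syslog priority -> ACS priority. Assumption: where two ACS levels share
# one syslog level, keep the higher ACS priority (Debug rather than Trace).
SYSLOG_TO_ACS = {}
for _, acs, _, sys_prio in LEVELS:
    SYSLOG_TO_ACS[sys_prio] = max(acs, SYSLOG_TO_ACS.get(sys_prio, acs))

for name, acs, sys_name, sys_prio in LEVELS:
    print(f"{name:<10} ACS {acs:>2} -> {sys_name:<12} {sys_prio}")
```

Note that the two scales run in opposite directions: a larger ACS number is more severe, while a smaller syslog number is more severe. The ACS column of the table also has no entry for priority 7.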
ACS priority levels are defined in the document ALM Logging and
Archiving, page 23.
http://www.eso.org/~almamgr/AlmaAcs/OnlineDocs/
http://www.eso.org/~almamgr/AlmaAcs/OnlineDocs/ACS_docs/schemas/index.html
XML Schema for ACS Alarm (an XML element is defined for each level):
<xs:element name="Alarm">
  <xs:complexType mixed="true">
    <xs:sequence maxOccurs="unbounded" minOccurs="0">
      <xs:element ref="Data"/>
    </xs:sequence>
    <xs:attribute ref="TimeStamp"/>
    <xs:attribute ref="File"/>
    <xs:attribute ref="Line"/>
    <xs:attribute ref="Routine"/>
    <xs:attribute ref="SourceObject"/>
    <xs:attribute ref="Host"/>
    <xs:attribute ref="Process"/>
    <xs:attribute ref="Context"/>
    <xs:attribute ref="Thread"/>
    <xs:attribute ref="StackId"/>
    <xs:attribute ref="StackLevel"/>
    <xs:attribute ref="LogId"/>
    <xs:attribute ref="Priority"/>
    <xs:attribute ref="Uri"/>
  </xs:complexType>
</xs:element>
DRAO Memo 22 :
http://www.drao-ofr.hia-iha.nrc-cnrc.gc.ca/science/widar/private/Memos.html
XML Schema for the MCCC Log Record :
http://widar8/~vrcics/
Sonja
Bryan Butler wrote:
>
>
> On 7/21/05 09:46, Bruce Rowen wrote:
> > Bryan Butler wrote:
> >
> >>
> >> all,
> >>
> >> we've gotten to the point where we need to define a severity level for
> >> alerts. the operators need this in order to tell the importance level
> >> of them as they arrive on the checker screen.
> >>
> >> i propose that we define an integer alert level from 0 to 5, with 0
> >> being the highest importance (issues of safety) and 5 being
> >> informational only. if somebody can make a case for more granularity
> >> (do we need 10 levels?), that's fine.
> >
> >
> > The correlator CMIBs will use the Linux/Unix syslog system for general
> > internal error reporting and quite possibly extend its use for all
> > error/log messaging. I think it would be prudent, at a minimum, to adopt
> > the numbering system used by syslog so as not to preclude its many
> > attributes from being used in the future for the EVLA. Look at
> > /usr/include/sys/syslog.h and the manual page (man syslogd) for more
> > details.
> >
> > Of concern here is to at least keep the order (0 being most severe) and
> > granularity (0-7) common with syslog.
>
> this seems like a fine suggestion to me.
>
> >> the engineers will be going over each MIB and its monitor points and
> >> assigning this severity code to its alerts. pat van buskirk is going
> >> to do the leg work of pestering the engineers on this.
> >>
> >> in addition to a severity level, an "action" for the operator has to
> >> be defined for each of these alerts. this would be similar to the
> >> page at: http://www.vla.nrao.edu/operators/alarms/ for the VLA.
> >
> > It is my opinion that these two items should be applied to the alert by
> > a higher level system (above the MIB) to avoid too much thinking and
> > policy at the MIB level.
> >
>
> OK - we're in agreement here then.
>
> >> once they have them defined, then we need to support them. there are
> >> two ways that i see to do this:
> >> 1 - each MIB has coded into it these severity levels, just as it
> >> has coded into it the levels at which alerts are triggered, and
> >> when the alert is sent out, the severity code goes out with it;
> >> 2 - there is a lookup table which checker uses, given the MIB and
> >> the monitor point/alert, to assign severity, and any program
> >> that receives the alerts can use that lookup table to retrieve
> >> the severity level.
> >>
> >> the advantage to 1 is that it keeps the information closest to the
> >> MIB. it also saves the "management" software upstream. the
> >> disadvantage is that if you decide to change anything you have to
> >> modify all of those MIB images. the advantage to 2 is that you avoid
> >> that MIB image modification, and can centralize everything (in a
> >> database or similar). another advantage is that you can also include
> >> the "action" in this database, as well as flagging information. since
> >> you are going to need these other things there, you might as well add
> >> a column for severity.
> >>
> >> i prefer the lookup table/database, but would like to hear other
> >> opinions.
> >
> > One beauty of syslog is its ability to sort and dispatch log/error
> > messages to a highly tunable set of target machines and/or users. All
> > built in, all programmed and tested, and already in use by the sysadmin
> > people for their needs. No need to create our own proprietary logging
> > system.
>
> yes, but we already have the distribution mechanism in place - the
> multicasting of the alerts. so we don't need the dispatch feature of
> syslog. or maybe i'm not understanding how you mean to use syslog...
>
>
> -bryan
--
Sonja Vrcic
National Research Council
Herzberg Institute of Astrophysics
Dominion Radio Astrophysical Observatory,
Penticton, BC, Canada
Tel:(250)490-4309/(250)493-2277ext.309
Sonja.Vrcic at nrc-cnrc.gc.ca
http://www.drao-ofr.hia-iha.nrc-cnrc.gc.ca/