[evla-sw-discuss] severity of alerts

Sonja Vrcic sonja.vrcic at nrc.gc.ca
Thu Jul 21 14:15:28 EDT 2005


This is probably a good time to define a system-wide format for alarms
and logs.

The antenna MIBs may require only four alarm priority levels, but other
parts of the system will implement more levels (e.g. trace, debug,
information, etc.).  The antenna MIBs may use only a subset of the
levels supported by the EVLA system and a subset of the attributes that
are defined in the XML Schema for the Alarm/Log record.

In the absence of a system-wide definition of alarm / log priority
levels, MCCC / VCI (Master Correlator Control Computer / Virtual
Correlator Interface) uses the levels defined for ALMA Common Software
(ACS).  I adopted the ACS levels on the assumption that higher-level
software (e.g. alarm handling) may be the same for ALMA and EVLA.

If alarms / logs are XML encoded, mnemonics may be used instead of integers.

Here is an example of the MCCC log (time is shown in two formats as an 
example):

<Log level="DEBUG" code="2" descriptor="INSTANT">
    <TimeStamp millis="1115850114977"
               dateTime="2005-05-11T15:21:54.977-07:00"/>
    <Host type="MCCC" instance="0"/>
    <Originator class="TestGui" method="main" thread="1"/>
    <Logger name="global"/>
    <Description>ENTRY</Description>
</Log>
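
For anyone consuming such records, here is a minimal sketch of reading the
example above. It is written in Python purely for illustration; the
function name parse_log and the keys of the returned dictionary are made up
for this example and are not part of MCCC. It also checks that the two
timestamp formats describe the same instant.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# The MCCC log record from the example, copied verbatim.
LOG_RECORD = """<Log level="DEBUG" code="2" descriptor="INSTANT">
    <TimeStamp millis="1115850114977"
               dateTime="2005-05-11T15:21:54.977-07:00"/>
    <Host type="MCCC" instance="0"/>
    <Originator class="TestGui" method="main" thread="1"/>
    <Logger name="global"/>
    <Description>ENTRY</Description>
</Log>"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_log(xml_text):
    """Hypothetical helper: extract a few fields from one <Log> record."""
    root = ET.fromstring(xml_text)
    stamp = root.find("TimeStamp")
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    from_millis = EPOCH + timedelta(milliseconds=int(stamp.get("millis")))
    from_iso = datetime.fromisoformat(stamp.get("dateTime"))
    return {
        "level": root.get("level"),
        "code": int(root.get("code")),
        "host": root.find("Host").get("type"),
        "description": root.findtext("Description"),
        "time_utc": from_millis,
        # Timezone-aware datetimes compare as instants, so the UTC value
        # and the -07:00 value are equal if they agree.
        "timestamps_agree": from_millis == from_iso,
    }


record = parse_log(LOG_RECORD)
print(record["level"], record["time_utc"].isoformat(), record["timestamps_agree"])
```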


The following table from DRAO Memo 22 (Alarms and Logging) shows
how Linux and Java alarm levels can be converted to ACS levels, and vice versa.

ACS Priority Level     UNIX/LINUX Priority Level JAVA Priority Level
Name        Value      Name           Value      Name      Value
----------------------------------------------------------------
Trace           2      KERN_DEBUG         7      FINEST      400
Debug           3      KERN_DEBUG         7      FINE        700
Info            4      KERN_INFO          6      INFO        800
Notice          5      KERN_NOTICE        5      INFO        800
Warning         6      KERN_WARNING       4      WARNING     900
Error           8      KERN_ERR           3      WARNING     901
Critical        9      KERN_CRIT          2      WARNING     902
Alert          10      KERN_ALERT         1      WARNING     903
Emergency      11      KERN_EMERG         0      SEVERE     1000
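
The table can be captured as a simple lookup, sketched below in Python. The
names LevelRow, LEVELS, acs_to_syslog and syslog_to_acs are hypothetical;
only the level names and numbers come from the table. Because KERN_DEBUG
corresponds to both Trace and Debug, the reverse lookup returns a list
rather than a single level.

```python
from typing import List, NamedTuple


class LevelRow(NamedTuple):
    acs: int           # ACS priority value (larger = more severe)
    syslog_name: str   # UNIX/LINUX priority name
    syslog: int        # UNIX/LINUX priority value (smaller = more severe)
    java_name: str     # Java priority name
    java: int          # Java priority value as listed in the table


# One entry per table row, keyed by ACS level name (illustrative structure).
LEVELS = {
    "Trace":     LevelRow(2,  "KERN_DEBUG",   7, "FINEST",  400),
    "Debug":     LevelRow(3,  "KERN_DEBUG",   7, "FINE",    700),
    "Info":      LevelRow(4,  "KERN_INFO",    6, "INFO",    800),
    "Notice":    LevelRow(5,  "KERN_NOTICE",  5, "INFO",    800),
    "Warning":   LevelRow(6,  "KERN_WARNING", 4, "WARNING", 900),
    "Error":     LevelRow(8,  "KERN_ERR",     3, "WARNING", 901),
    "Critical":  LevelRow(9,  "KERN_CRIT",    2, "WARNING", 902),
    "Alert":     LevelRow(10, "KERN_ALERT",   1, "WARNING", 903),
    "Emergency": LevelRow(11, "KERN_EMERG",   0, "SEVERE",  1000),
}


def acs_to_syslog(acs_name: str) -> int:
    """Return the UNIX/LINUX priority value for an ACS level name."""
    return LEVELS[acs_name].syslog


def syslog_to_acs(priority: int) -> List[str]:
    """Return every ACS level name that maps to the given syslog priority."""
    return [name for name, row in LEVELS.items() if row.syslog == priority]


print(acs_to_syslog("Error"))   # UNIX/LINUX value for ACS Error
print(syslog_to_acs(7))         # ACS levels that share KERN_DEBUG
```

Keeping the whole row in one record means a receiver can translate in either
direction from a single table, which fits the idea of using mnemonics in the
XML and integers elsewhere.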




ACS priority levels are defined in the document ALM Logging and 
Archiving, page 23.

http://www.eso.org/~almamgr/AlmaAcs/OnlineDocs/

http://www.eso.org/~almamgr/AlmaAcs/OnlineDocs/ACS_docs/schemas/index.html

XML Schema for ACS Alarm (an XML element is defined for each level):

<xs:element name="Alarm">
    <xs:complexType mixed="true">
        <xs:sequence maxOccurs="unbounded" minOccurs="0">
            <xs:element ref="Data"/>
        </xs:sequence>
        <xs:attribute ref="TimeStamp"/>
        <xs:attribute ref="File"/>
        <xs:attribute ref="Line"/>
        <xs:attribute ref="Routine"/>
        <xs:attribute ref="SourceObject"/>
        <xs:attribute ref="Host"/>
        <xs:attribute ref="Process"/>
        <xs:attribute ref="Context"/>
        <xs:attribute ref="Thread"/>
        <xs:attribute ref="StackId"/>
        <xs:attribute ref="StackLevel"/>
        <xs:attribute ref="LogId"/>
        <xs:attribute ref="Priority"/>
        <xs:attribute ref="Uri"/>
    </xs:complexType>
</xs:element>


DRAO Memo 22 :
http://www.drao-ofr.hia-iha.nrc-cnrc.gc.ca/science/widar/private/Memos.html

XML Schema for the MCCC Log Record :
http://widar8/~vrcics/


Sonja




Bryan Butler wrote:

>
>
> On 7/21/05 09:46, Bruce Rowen wrote:
> > Bryan Butler wrote:
> >
> >>
> >> all,
> >>
> >> we've gotten to the point where we need to define a severity level for
> >> alerts.  the operators need this in order to tell the importance level
> >> of them as they arrive on the checker screen.
> >>
> >> i propose that we define an integer alert level from 0 to 5, with 0
> >> being the highest importance (issues of safety) and 5 being
> >> informational only.  if somebody can make a case for more granularity
> >> (do we need 10 levels?), that's fine.
> >
> >
> > The correlator CMIBs will use the Linux/Unix syslog system for general
> > internal error reporting and quite possibly extend its use for all
> > error/log messaging. I think it would be prudent, at a minimum, to adopt
> > the numbering system used by syslog so as not to preclude its many
> > attributes from being used in the future for the EVLA.  Look at
> > /usr/include/sys/syslog.h and the manual page (man syslogd) for more
> > details.
> >
> > Of concern here is to at least keep the order (0 being most severe) and
> > granularity (0-7) common with syslog.
>
> this seems like a fine suggestion to me.
>
> >> the engineers will be going over each MIB and its monitor points and
> >> assigning this severity code to its alerts.  pat van buskirk is going
> >> to do the leg work of pestering the engineers on this.
> >>
> >> in addition to a severity level, an "action" for the operator has to
> >> be defined for each of these alerts.  this would be similar to the
> >> page at: http://www.vla.nrao.edu/operators/alarms/ for the VLA.
> >
> > It is my opinion that these two items should be applied to the alert by
> > a higher level system (above the MIB) to avoid too much thinking and
> > policy at the MIB level
> >
>
> OK - we're in agreement here then.
>
> >> once they have them defined, then we need to support them.  there are
> >> two ways that i see to do this:
> >>  1 - each MIB has coded into it these severity levels, just as it
> >>      has coded into it the levels at which alerts are triggered, and
> >>      when the alert is sent out, the severity code goes out with it;
> >>  2 - there is a lookup table which checker uses, given the MIB and
> >>      the monitor point/alert, to assign severity, and any program
> >>      that receives the alerts can use that lookup table to retrieve
> >>      the severity level.
> >>
> >> the advantage to 1 is that it keeps the information closest to the
> >> MIB.  it also saves the "management" software upstream.  the
> >> disadvantage is that if you decide to change anything you have to
> >> modify all of those MIB images.  the advantage to 2 is that you avoid
> >> that MIB image modification, and can centralize everything (in a
> >> database or similar).  another advantage is that you can also include
> >> the "action" in this database, as well as flagging information.  since
> >> you are going to need these other things there, you might as well add
> >> a column for severity.
> >>
> >> i prefer the lookup table/database, but would like to hear other
> >> opinions.
> >
> > One beauty of syslog is its ability to sort and dispatch log/error
> > messages to a highly tunable set of target machines and/or users. All
> > built in, all programmed and tested, and already in use by the sysadmin
> > people for their needs. No need to create our own proprietary logging
> > system.
>
> yes, but we already have the distribution mechanism in place - the
> multicasting of the alerts.  so we don't need the dispatch feature of
> syslog.  or maybe i'm not understanding how you mean to use syslog...
>
>
>         -bryan

-- 
Sonja Vrcic

National Research Council
Herzberg Institute of Astrophysics
Dominion Radio Astrophysical Observatory,
Penticton, BC, Canada
Tel:(250)490-4309/(250)493-2277ext.309
Sonja.Vrcic at nrc-cnrc.gc.ca
http://www.drao-ofr.hia-iha.nrc-cnrc.gc.ca/
