[evla-sw-discuss] alerts - reliable delivery

Wed Mar 9 18:21:12 EST 2005

Bill -

If you use an actual event service the producer (e.g. MIB) does not need
to know who is receiving the message, if anyone, or their IP address.
For the producer you have much the same thing you have now with multicast:
one just emits an asynchronous event message of a certain class (this
is like the IGMP group).  The message goes to the event service which
is at a well-known address.  It is the service which keeps track of
subscribers, each of which receives a unicast copy of the event (for a
TCP based service).  Subscribers can dynamically subscribe or unsubscribe,
much like joining or leaving an IGMP group.
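
As a rough illustration of the above, here is a minimal in-process sketch
in Python.  It is not a real event service: there is no network transport,
and the names EventService, subscribe, unsubscribe, publish and the
"hypothetical.point" monitor point are all assumptions made up for the
example.  It only shows that the subscriber table lives in the service, so
the producer never sees who (if anyone) is listening.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, dict], None]


class EventService:
    """Toy stand-in for an event service at a well-known address."""

    def __init__(self) -> None:
        # event class -> subscriber callbacks.  The service owns this
        # table; producers never see it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_class: str, handler: Handler) -> None:
        # Analogous to joining an IGMP group.
        if handler not in self._subscribers[event_class]:
            self._subscribers[event_class].append(handler)

    def unsubscribe(self, event_class: str, handler: Handler) -> None:
        # Analogous to leaving an IGMP group.
        if handler in self._subscribers[event_class]:
            self._subscribers[event_class].remove(handler)

    def publish(self, event_class: str, payload: dict) -> int:
        # Each subscriber gets its own copy, like a unicast copy per
        # subscriber.  Returns the number of deliveries made.
        handlers = list(self._subscribers.get(event_class, ()))
        for handler in handlers:
            handler(event_class, dict(payload))
        return len(handlers)


# Usage: the producer only names the event class, never a receiver.
service = EventService()
seen = []


def flagger(event_class: str, payload: dict) -> None:
    seen.append((event_class, payload["point"], payload["state"]))


service.subscribe("alert", flagger)
delivered_on = service.publish(
    "alert", {"point": "hypothetical.point", "state": "on"})
service.unsubscribe("alert", flagger)
delivered_off = service.publish(
    "alert", {"point": "hypothetical.point", "state": "off"})
```

After the unsubscribe the second publish reaches nobody, and the producer
code is unchanged either way.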

Logically this is identical to multicast, with essentially the same
mechanisms, the difference being that it is done in software rather than
at the link layer, and uses a reliable protocol.  It is not as efficient
for large numbers of subscribers as multicast, nor is it real time, but
for most applications this does not matter, for example when there are
only a few subscribers or when the data rate is modest.  In some cases it
can be more efficient, e.g., when events can be batched, plus events can
be guaranteed to be delivered in sequence and can be flow-controlled.
These can be important characteristics for streaming large quantities
of data.  If latency is an issue, multiple services can be used to isolate
group traffic.
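
A small sketch of the batching and in-sequence points, again in Python and
again under assumptions: BatchingPublisher and OrderedSubscriber are
hypothetical names, batches are handed over as plain lists rather than sent
on a socket, and flow control is not modelled at all (with a TCP based
service the transport would provide it).  The publisher numbers every event
and groups events into batches; the subscriber checks that the numbers
arrive without gaps or reordering.

```python
from typing import Any, List, Tuple

Batch = List[Tuple[int, Any]]


class BatchingPublisher:
    """Numbers events and groups them into fixed-size batches."""

    def __init__(self, batch_size: int = 3) -> None:
        self.batch_size = batch_size
        self.sent_batches: List[Batch] = []  # stands in for the wire
        self._pending: Batch = []
        self._next_seq = 0

    def emit(self, event: Any) -> None:
        self._pending.append((self._next_seq, event))
        self._next_seq += 1
        if len(self._pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is pending as one batch, even if it is short.
        if self._pending:
            self.sent_batches.append(self._pending)
            self._pending = []


class OrderedSubscriber:
    """Accepts batches and insists on gap-free, in-order sequence numbers."""

    def __init__(self) -> None:
        self.expected = 0
        self.events: List[Any] = []

    def receive(self, batch: Batch) -> None:
        for seq, event in batch:
            if seq != self.expected:
                raise RuntimeError(
                    f"expected seq {self.expected}, got {seq}")
            self.events.append(event)
            self.expected += 1


# Usage: seven events with a batch size of three go out as three batches.
publisher = BatchingPublisher(batch_size=3)
for n in range(7):
    publisher.emit(f"event-{n}")
publisher.flush()

subscriber = OrderedSubscriber()
for batch in publisher.sent_batches:
    subscriber.receive(batch)
```

A lost or reordered batch shows up immediately as a sequence mismatch,
which is the property the one-shot multicast alerts lack.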

	- Doug

On Wed, 9 Mar 2005, Bill Sahr wrote:

> The current scheme for alerts coming from EVLA and VLA antennas
> is to multicast an alert-on message once when a monitor point
> goes into an alert state, and to multicast an alert-off message
> once when a monitor point exits an alert state.  As I have
> mentioned on several occasions I do not consider this scheme
> robust w.r.t. dropped packets and other network glitches.  We
> already have examples of alert-off messages not being seen for
> a corresponding alert-on message even though direct query of
> the mib shows the monitor point to have exited the alert state.
> 
> In the MIB Issues meeting of 3/8 we decided that the alert status
> of a monitor point would be included in all packets containing
> the value of the monitor point, be those packets multicasts destined
> for the archive, for software processes managing an observation, for
> screens, or unicast UDP datagrams returned in response to a "get"
> command received over the service port.  W.r.t. catching alerts,
> this scheme should work well for clients interested in monitor
> point values, but it seems to be extremely inefficient for clients
> that are interested in alerts, but not in monitor point values
> (such as checker & flagger).
> 
> Ken suggested a "box" or a layer whose job is to catch all monitor
> point values and alerts and then make this information globally
> available.  I narrowed/modified his suggestion to a proposal that
> we have a layer that catches only alerts, makes the alerts globally
> available, and uses reliable communications (tcp/ip rather than udp)
> between the mib and the destination layer.
> 
> One of the nice things about multicast is that the sender need have
> no concept of destination address.  The sender simply puts the message
> onto the wire, using a pre-determined multicast IP address.  Parties
> interested in receiving the multicast send out an IGMP "join group"
> message that is handled by the network routers in the system, not by
> applications.  The forwarding tables are maintained in the routers.
> If multicast (or broadcast) is not used for delivery of alerts, then
> pretty much all of the alternatives would require that the MIBs have a
> notion of a destination address for the delivery of alerts.
> 
> I think I would be willing to tolerate requiring the mibs to know one
> address, and I think I find that requirement preferable to some of
> the alternatives such as periodic retransmission of alert-on and
> alert-off messages.  The latter gets very messy once you begin to dig
> into it.
> 
> OK.  So maybe the alert-on and alert-off messages are still sent only
> once, but now they use reliable (tcp/ip) rather than unreliable (udp
> or multicast) delivery, and the mibs do have to know the destination
> for the alerts.
> 
> So, who or what is the destination for the alerts and does that mean we
> would have to poll that destination to determine alert status ?
> 
> My opinion is that the proper destination for the alert-on and alert-off
> messages is the antenna server layer, the one that is to be spun off from
> the Executor, where I have suggested that we maintain antenna state.  The
> picture in my head is now as follows:
> 
> - the antenna server layer is made up of the network addressable antenna
>    objects that we spoke of in the early days of the DO Comm team and that
>    Kevin has frequently advocated
> 
> - antenna state is maintained in these antenna objects.  They tap into
>    the ostream multicasts of monitor data, and are the destination for
>    alerts issued by EVLA and VLA antenna subsystems.
> 
> - Alert-on and alert-off messages produced by VLA and EVLA are still sent
>    only once, but are delivered to the antenna objects via a reliable
>    protocol, presumably tcp/ip.  The tcp/ip connections would be make &
>    break.  Not persistent.
> 
> - The antenna objects now serve as a distribution point for processes
>    interested in antenna state and alerts.  But, please, no polling.  It's
>    a real drag on real-time systems.  Wastes CPU cycles, is not timely,
>    and interferes with timing.  The antenna objects are deployed in
>    well-resourced boxes.  They can support a publish-subscribe mechanism
>    for interested parties such as checker & flagger.
>    Of course, the antenna objects are free to develop their own alerts,
>    based on alerts and monitor data from the antenna subsystems.
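
The picture in the list above can be sketched roughly as follows, in
Python for brevity.  Everything here is an assumption for illustration,
not the EVLA design: the AntennaObject class, the one-JSON-line message
format, the "ok" reply, the loopback address and the "hypothetical.point"
monitor point are all made up.  The sender opens a TCP connection, delivers
one alert, waits for the reply and closes (make & break); the antenna
object updates its state and pushes the alert to its subscribers, so
nobody polls.

```python
import json
import socket
import socketserver
import threading


class AntennaObject:
    """Holds alert state and pushes each alert to subscribers."""

    def __init__(self) -> None:
        self.alerts = {}  # monitor point name -> True while in alert
        self._subscribers = []
        self._lock = threading.Lock()

    def subscribe(self, callback) -> None:
        with self._lock:
            self._subscribers.append(callback)

    def handle_alert(self, message: dict) -> None:
        with self._lock:
            self.alerts[message["point"]] = message["state"] == "on"
            subscribers = list(self._subscribers)
        for callback in subscribers:
            callback(message)  # push, so subscribers never poll


def make_server(antenna: AntennaObject) -> socketserver.TCPServer:
    class Handler(socketserver.StreamRequestHandler):
        def handle(self) -> None:
            line = self.rfile.readline()  # one alert per connection
            if line:
                antenna.handle_alert(json.loads(line))
                self.wfile.write(b"ok\n")  # application-level reply

    # Port 0 lets the OS pick a free port for this demonstration.
    return socketserver.TCPServer(("127.0.0.1", 0), Handler)


def send_alert(address, point: str, state: str) -> bool:
    """Make-and-break delivery: connect, send one alert, read reply, close."""
    message = json.dumps({"point": point, "state": state}).encode() + b"\n"
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message)
        with conn.makefile("rb") as reply:
            return reply.readline() == b"ok\n"


# Usage: one antenna object, one subscriber, one alert-on and one alert-off.
antenna = AntennaObject()
received = []
antenna.subscribe(received.append)

server = make_server(antenna)
threading.Thread(target=server.serve_forever, daemon=True).start()
acked_on = send_alert(server.server_address, "hypothetical.point", "on")
acked_off = send_alert(server.server_address, "hypothetical.point", "off")
server.shutdown()
server.server_close()
```

The explicit reply is a design choice in this sketch: TCP tells the sender
that the bytes were delivered to the peer's stack, not that the receiving
application acted on them, so a sender that gets no "ok" knows to retry.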
> 
> Of course, I will hold back from posting this message until I have a
> diagram to attach.
> 
> Very little, if any, of what I have written is new, and it supports many
> of the ideas that have been floated, such as a layered hierarchy for alert
> generation and distribution.  It also echoes some of Doug Tody's recent
> comments.  The chief difference between our earlier conversations and now
> is that we are no longer talking about concepts, we are talking about what
> we will actually implement, over the next few months.
> 
> Anyone know of a nice, thin layer or set of libraries that could be used
> as the publish-subscribe mechanism between the antenna objects and the
> interested parties, such as flagger ?  Maybe something like ACE toolkit/
> framework, only for Java rather than C & C++ ?
> 
>