[evla-sw-discuss] Atomicity of Scheduling Blocks

David Harland dharland at nrao.edu
Wed Aug 15 13:09:02 EDT 2007


SUMMARY

This note asks about the atomicity of the scheduling block (SB)
and the implications atomicity has for scheduling and execution,
especially for SBs with authorized counts > 1.


BACKGROUND

Some of our documents refer to the SB as the atomic unit of
scheduling.  Being atomic seems to imply that a whole SB is
scheduled as a cohesive unit, not something that is divided
up and dispersed over time.  It also implies that it succeeds
or fails as a whole, not that some part of it failed while
some other part succeeded.

The current SB (i.e., the one used in observations today) has
the concept of an authorized count.  The SSS SB also has this
property.  In addition, the SSS SB has a loop of scans, which
itself has a repetition count.


QUESTIONS

1. Is the SB truly atomic, or is it really a single execution
of an SB that is atomic?  For example, if an SB has a rep count
of 5, and if it has already had 3 successful executions, what
happens if the 4th fails?  Are the results of the whole SB scrapped,
and the SB rescheduled?  Or, alternatively, are the 3 successful
executions kept, and only the 4th thrown away and then retried?
(If the latter, then calling the SB "atomic" seems inappropriate.)

2. Does the scheduler think of an SB as an indivisible thing,
or does it really look at each execution of the SB as the
atomic element of scheduling?  Take the SB, above, that has
a rep count of 5.  If the scheduler truly thinks of the SB
as atomic, it would seem that it would schedule the whole 5
reps as one thing, probably neither caring nor knowing that these
repetitions exist.  When this SB's turn came up, it would be
passed to the executor, which would then see that it is to get
5 good executions from it.  If, on the other hand, each potential
execution is viewed as the atomic element of scheduling, then the
scheduler would be free to disperse those 5 executions over time.
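
To make the two readings in question 2 concrete, here is a minimal
sketch in Python (not taken from the SSS code) of what the scheduler
would see under each view.  The function name and the tuple layout
are hypothetical, and I am assuming that the "rep count of 5" above
means the SB's authorized count.

```python
def schedulable_units(sb_name, authorized_count, per_execution):
    """Return the items the scheduler would have to place.

    Each item is a hypothetical (sb_name, executions_in_item) tuple.
    per_execution=False: the whole SB is one indivisible item.
    per_execution=True:  each execution is its own item, so the
                         scheduler may spread them over time.
    """
    if per_execution:
        return [(sb_name, 1) for _ in range(authorized_count)]
    return [(sb_name, authorized_count)]


# SB as the atomic unit: one item carrying all 5 executions.
whole = schedulable_units("SB-1", 5, per_execution=False)

# Single execution as the atomic unit: 5 independent items.
split = schedulable_units("SB-1", 5, per_execution=True)
```

Under the first view the scheduler places one item and the executor
is left to obtain the 5 good executions.  Under the second, the
scheduler places 5 items and is free to put other SBs between them.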


REASON FOR QUESTIONS

In the initial implementation of the SSS SB we took this
atomicity of the SB to heart and wrote code assuming the
entire SB, meaning all of its potential executions, would
be scheduled at once, contiguously, and if something went
wrong during execution, all data would be discarded and the
whole SB rescheduled.
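
As a rough illustration only (Python, not our actual code), the
discard-everything behavior described above can be set next to the
alternative from question 1.  All class and function names here are
hypothetical, and "authorized_count" stands for the number of good
executions wanted.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulingBlock:
    name: str
    authorized_count: int                      # good executions wanted
    kept: list = field(default_factory=list)   # attempts whose data is kept


def discard_whole_sb(sb):
    """SB is atomic: one failure scraps every execution so far."""
    sb.kept.clear()


def discard_failed_execution_only(sb):
    """Execution is atomic: earlier successes are untouched."""
    return None


def run(sb, outcomes, on_failure):
    """Apply a sequence of outcomes (True = success) to the SB.

    Returns how many good executions are still needed afterwards.
    """
    for attempt, ok in enumerate(outcomes, start=1):
        if len(sb.kept) >= sb.authorized_count:
            break                              # SB already complete
        if ok:
            sb.kept.append(attempt)
        else:
            on_failure(sb)
    return sb.authorized_count - len(sb.kept)


# The case from question 1: 3 successes, then the 4th fails.
outcomes = [True, True, True, False]

sb_a = SchedulingBlock("SB-1", 5)
left_a = run(sb_a, outcomes, discard_whole_sb)               # 5 still needed

sb_b = SchedulingBlock("SB-1", 5)
left_b = run(sb_b, outcomes, discard_failed_execution_only)  # 2 still needed
```

With the whole-SB policy the three good executions are lost and all
5 must be redone.  With the per-execution policy only 2 remain.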

We later found out that this might be an incorrect view, and
that calling the SB "atomic" is misleading.  I'm getting ready
to undo our original coding and treat a single execution of an
SB as the atomic unit of scheduling.  Before doing so, I want
to make sure there is a consensus view on this, so that we
don't waste time undoing our original view only to return
to it later.


Thanks,
David



