[fitsbits] Bintable proposals

Mark Calabretta Mark.Calabretta at atnf.CSIRO.AU
Mon Nov 19 00:27:21 EST 2001



Many thanks to those who responded to my posting.  There's too much to
answer in detail but I do have some general comments - and some solutions!

To recapitulate, this is what I said in my original email (which was in
response to a request from Bill Pence for comments concerning
standardisation of variable length arrays):

> In general, the only requirement is to be able to determine the end of the
> table and I can only think of the following impediments:
> 
>    1) Extensions following the BINARY table,
> 
>    2) Table row length less than 2880 bytes (i.e. whole table rows may fit
>       in the block padding),
> 
>    3) Use of variable length arrays.
> 
> (1) and (2) do not affect us but anyway they do not appear to be basic
> problems.  Only (3) is fundamentally limiting.

I suggested that (1) and (2) were not fundamentally limiting.

  (1) is not because readers can test for the start of a new extension,
      i.e. 'XTENSION= ' (or more) at the start of a 2880 byte block, with
      reasonable reliability.

  (2) is not because the last 2880 byte block can be filled with a signature
      bit pattern.  In fact, judging by Bill Cotton's suggested solution, it
      appears that there may already be a de facto convention for this using
      null-filled blocks.
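
The two tests in (1) and (2) can be sketched in a few lines of Python.  This
is only an illustration: the function name block_kind is hypothetical, and
the check relies on the FITS card layout in which the 8-character keyword is
followed directly by '= ', so the block begins with the bytes 'XTENSION= '.
As noted above, the test is reliable only "reasonably", since binary data
could in principle begin with the same bytes.

```python
BLOCK = 2880  # FITS logical block size in bytes


def block_kind(block: bytes) -> str:
    """Classify one 2880-byte block met while scanning for the table end."""
    if len(block) != BLOCK:
        raise ValueError("expected a full 2880-byte block")
    if block.startswith(b"XTENSION= "):
        return "extension"  # looks like the header of the next extension
    if not any(block):
        return "null-fill"  # signature pattern: every byte is zero
    return "data"           # anything else is taken to be table data
```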

In fact, there is a very simple construct which makes both of these
foolproof: simply require that when NAXIS2 = -1, column 1 must contain a
non-zero integer value, e.g. the 1-relative table row number.  The first
row in the null-filled block will then have a row number of 0, which means
"the end".  The data section ends at the end of that block.

I suggested that (3) was fundamentally limiting but in fact the above
construct solves this too!  The terminating row, i.e. with row number 0,
must be a complete row and the heap starts immediately thereafter.  By the
time the heap is encountered its size will be known by accumulating the
variable length array sizes in all rows preceding it (as discussed in the
bintable document).  The heap ends when the data is exhausted.  THEAP,
PCOUNT and maxelem in the 'rP(maxelem)' TFORMn descriptor would be set
to -1 in the header.
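
As a rough sketch of the accumulation step, assume a single variable length
array column whose 'P' descriptor (two 32-bit big-endian integers: element
count, then heap offset) sits at a known byte offset in each row, and assume
the arrays are packed in the heap with no gaps or sharing.  The name
heap_bytes and its parameters are hypothetical.

```python
import struct


def heap_bytes(rows, desc_offset, elem_size):
    """Sum the heap space claimed by one variable length array column.

    rows        -- iterable of raw table rows (bytes), terminator excluded
    desc_offset -- byte offset of the 'P' descriptor within each row
    elem_size   -- size in bytes of one array element (e.g. 4 for 'J')
    """
    total = 0
    for row in rows:
        # The descriptor is (number of elements, offset into the heap).
        nelem, _offset = struct.unpack_from(">ii", row, desc_offset)
        total += nelem * elem_size
    return total
```

A reader would call this over the rows preceding the terminating row and
then read that many bytes of heap, or simply read until the data is
exhausted.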

However, I'd also point out that our reason for using NAXIS2 = -1, i.e.
dumping large amounts of data to tape quickly, tends to preclude the use
of a heap, since writing a heap requires disk caching.

So, given the choice between non-standard bintables using NAXIS2 = -1 and
RPFITS, our traditional, simple but grossly non-standard solution (e.g.
RPFITS begins with SIMPLE = F and even has a 2560 byte block size - enough
to thwart any would-be reader!), I think that the non-standard bintable has
the potential to be portable, extends but does not contradict current
usage, could conceivably gain acceptance, and in any case is easily
correctable.

Some people suggested setting SIMPLE = F.  However, when you think about
it, really the only thing that software can do with SIMPLE = F is either
ignore it or reject the file outright.  While I'm sure this is quite
appropriate for RPFITS (which contradicts the standard), using it for the
NAXIS2 = -1 bintable is tantamount to saying that this usage will never be
sanctioned.
That is, if NAXIS2 = -1 ever did become recognised usage, future readers
might still reject these old files on the basis of SIMPLE = F.  I've
decided to be optimistic.  Hence

   1) SIMPLE =  T,
   2) NAXIS2 = -1,
   3) column 1 contains the row number,
   4) null-fill the last block,

and we never did intend to use variable length arrays.
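
For illustration only, here is a minimal round trip of points 3 and 4,
assuming column 1 is a 32-bit big-endian integer (TFORM1 = '1J') and every
row is NAXIS1 bytes long.  The names write_table and read_table are
hypothetical.  Writing an explicit zero row before the null fill is my own
assumption in this sketch; it guarantees a complete terminating row even
when the padding alone would be shorter than one row.

```python
import io
import struct

BLOCK = 2880  # FITS logical block size in bytes


def write_table(out, payloads, naxis1):
    """Write numbered rows, a zero terminating row, then null-fill."""
    size = 0
    for n, payload in enumerate(payloads, start=1):
        row = struct.pack(">i", n) + payload  # column 1 = 1-relative row number
        if len(row) != naxis1:
            raise ValueError("row length must equal NAXIS1")
        size += out.write(row)
    size += out.write(bytes(naxis1))          # terminating row, row number 0
    out.write(bytes(-size % BLOCK))           # null-fill to the block boundary


def read_table(stream, naxis1):
    """Yield raw rows until a row number of 0 ("the end") is read."""
    while True:
        row = stream.read(naxis1)
        if len(row) < naxis1:
            return                            # data exhausted, no terminator
        if struct.unpack_from(">i", row)[0] == 0:
            return                            # data ends with this block
        yield row
```

The reader never needs NAXIS2: it stops at the first row whose column 1 is
zero, and the writer never needs to seek back to patch the header.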

BTW, I should point out that this problem is specific to pulsar data
acquisition; all of our other data, aperture synthesis visibility data and
single-dish spectral data, goes to disk - albeit as RPFITS!

Mark Calabretta
ATNF




