[fitsbits] Bintable proposals

Don Wells dwells at NRAO.EDU
Wed Nov 14 22:49:34 EST 2001


Mark, Bill, et al.,

Assertions that the FITS data format is not suitable for real-time use
have been made repeatedly over the 22-year history of FITS.

I do not agree with these assertions.

Mark Calabretta writes:
 > The fact that NAXIS2 appears in the header and has no default value
 > means that when writing to a streaming device such as a tape drive
 > the size of the output binary table must be known before writing can
 > commence.  In practice, this precludes the use of binary tables for
 > acquisition of data without using a disk as intermediary since the
 > table size is not generally known in advance.

It is important to note that 'using a disk as intermediary' in a
real-time data acquisition system is a practical and effective solution
for the NAXIS2 problem in the majority of cases today.

Ten years ago I designed the recording system for the VLBA Correlator,
with a specification of 0.5 MB/s sustained.  The control software ran
under VxWorks.  We had a disk drive with a capacity of order 1 GB.  The
disk was capable of reading or writing at more than 1 MB/s, so it was
possible for one set of real-time processes to be writing FITS files to
disk while another real-time process copied previously written files
from disk to tape (DAT tapes in our case).

The system needed to be able to record arbitrarily large observational
datasets, with many GB of data. We decided that, rather than writing
single large FITS files, these datasets should be written as a set of
FITS files of some convenient size.  We didn't want to continue a FITS
file from one cassette to another cassette.  It was easy for us to
choose a file size such that we could achieve 90+% packing efficiency
without multi-volume files.  Sizes in the range 50-200 MB will do this.
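
The file-size argument can be illustrated with a little arithmetic.  The
sketch below is only an illustration: the cassette capacity is an assumed
figure (the text above does not state one), and the function names are
hypothetical.  If only whole files are written to a cassette, the wasted
space is always less than one file, so the packing efficiency is better
than 1 - file_size/capacity.

```python
def packing_efficiency(tape_capacity_mb, file_size_mb):
    """Fraction of a cassette filled when only whole files are written."""
    whole_files = int(tape_capacity_mb // file_size_mb)
    return whole_files * file_size_mb / tape_capacity_mb


def worst_case_efficiency(tape_capacity_mb, file_size_mb):
    """Lower bound: at most one file's worth of tape is left unused."""
    return 1.0 - file_size_mb / tape_capacity_mb


# ASSUMPTION: a cassette capacity of about 2 GB, chosen for illustration.
TAPE_CAPACITY_MB = 2000

if __name__ == "__main__":
    for size_mb in (50, 100, 200):
        print(size_mb,
              packing_efficiency(TAPE_CAPACITY_MB, size_mb),
              worst_case_efficiency(TAPE_CAPACITY_MB, size_mb))
```

Under that assumed capacity, any file size up to 200 MB keeps the lower
bound at 90% or better, which is the sense in which a convenient size is
easy to choose.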

The VLBA is a radio aperture synthesis imaging instrument, so the files
consist of BINTABLE extensions containing complex fringe visibility
samples, plus other BINTABLE extensions which describe the objects being
observed and the instrumental configuration.  The latter set of tables
is repeated in all of the FITS files, for reliability and convenience.
The disk capacity was sufficient to hold a considerable number of the
smaller FITS files, so data acquisition could continue while writing was
stopped to change cassettes.  The disk was actually much faster than the
DAT tape, so the design supported running at higher speed in a 'burst
mode' until the disk filled.  In fact, we had multiple DAT tape drives,
so it was also possible to effectively synthesize a faster tape drive by
running two or more tape writing processes to copy separate FITS files
from disk to different DAT tapes simultaneously (file label information
went to the RDBMS so that the FITS files could later be retrieved from
multiple tape cassettes in proper order). This is a form of 'striping'
for tapes.

In addition to the NAXIS2 update requirement, there is another technical
problem which FITS poses for real-time designers.  This problem is the
CPU-time cost of generating headers and ancillary extensions, especially
at initialization, *if* the whole FITS file is being generated by one
process and streamed to output.  There is a simple solution for this
problem if a disk is being used to stage the files.  That is to break up
the FITS file into a set of files on the disk, so that the various
BINTABLEs each have their own file. In fact, the headers of each
extension should also be in separate files.  This permits the generation
of the headers to be done in the background by lower-priority real-time
processes ('threads' in the modern real-time parlance).  Calibration and
configuration tables can be written into their files by lower-priority
processes. The high priority bulk data acquisition processes write their
binary data structures into simple files without worrying about headers
and the recording of ancillary information. When such a high-priority
process completes its file (BINTABLE data array), it updates the NAXIS2
value in the associated header file (and this too can even be done by a
thread at lower-priority!).

There is a simple table-of-contents file which lists the filenames of
the segments which should be concatenated to form the final FITS file on
tape.  With appropriate transaction procedures, it is even possible for
the disk-to-tape copy routine to find and copy FITS files on the disk
when restarting after an unplanned interruption of processing due to a
hardware/software failure.
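
The segment scheme described above can be sketched roughly as follows.
This is a minimal illustration, not the VLBA Correlator code: the file
names, the function names, the one-column table layout and the simplified
card formatter are all assumptions made for the example.  It stages a
primary header, a BINTABLE header and the table data as separate files,
patches NAXIS2 in the header segment once the row count is known, and then
concatenates the segments listed in a table-of-contents file.

```python
import os
import tempfile

BLOCK = 2880  # FITS block size in bytes
CARD = 80     # header card length in bytes


def card(keyword, value):
    """Format one 80-byte header card (simplified fixed format)."""
    if isinstance(value, bool):
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, int):
        text = str(value).rjust(20)
    else:
        text = ("'" + str(value).ljust(8) + "'").ljust(20)
    return f"{keyword:<8}= {text}".ljust(CARD).encode("ascii")


def write_header(path, cards):
    """Write a header segment: cards, END, blank padding to a full block."""
    body = b"".join(cards) + b"END".ljust(CARD)
    body += b" " * (-len(body) % BLOCK)
    with open(path, "wb") as f:
        f.write(body)


def update_card(path, keyword, value):
    """Overwrite one card in place, e.g. NAXIS2 once the row count is known."""
    key = keyword.ljust(8).encode("ascii")
    with open(path, "r+b") as f:
        data = f.read()
        for off in range(0, len(data), CARD):
            if data[off:off + 8] == key:
                f.seek(off)
                f.write(card(keyword, value))
                return
    raise KeyError(keyword)


def assemble(toc_path, out_path):
    """Concatenate the segments named in the table of contents."""
    workdir = os.path.dirname(toc_path)
    with open(toc_path) as toc, open(out_path, "wb") as out:
        for name in toc.read().split():
            with open(os.path.join(workdir, name), "rb") as seg:
                out.write(seg.read())


def stage_one_file(workdir, nrows=1000, row_bytes=16):
    """Stage header and data as separate segments, then assemble them."""
    write_header(os.path.join(workdir, "primary.hdr"), [
        card("SIMPLE", True), card("BITPIX", 8),
        card("NAXIS", 0), card("EXTEND", True)])
    write_header(os.path.join(workdir, "table.hdr"), [
        card("XTENSION", "BINTABLE"), card("BITPIX", 8), card("NAXIS", 2),
        card("NAXIS1", row_bytes), card("NAXIS2", 0),  # row count not yet known
        card("PCOUNT", 0), card("GCOUNT", 1),
        card("TFIELDS", 1), card("TFORM1", f"{row_bytes}B")])
    data_path = os.path.join(workdir, "table.dat")
    with open(data_path, "wb") as f:  # stands in for the high-priority writer
        for _ in range(nrows):
            f.write(bytes(row_bytes))
        f.write(bytes(-(nrows * row_bytes) % BLOCK))  # zero-pad the data unit
    update_card(os.path.join(workdir, "table.hdr"), "NAXIS2", nrows)
    toc_path = os.path.join(workdir, "toc.txt")
    with open(toc_path, "w") as toc:
        toc.write("primary.hdr\ntable.hdr\ntable.dat\n")
    out_path = os.path.join(workdir, "assembled.fits")
    assemble(toc_path, out_path)
    return out_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(os.path.getsize(stage_one_file(workdir)))
```

Because every card is exactly 80 bytes, the NAXIS2 update is a fixed-size
overwrite in the small header file, and the bulk data file is never
rewritten.  In a real system the header writer, the data writer and the
copy to tape would be separate processes or threads.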

 > .. pulsar pulse profile data directly to DLT at close to its
 > maximum I/O rate.  It was not considered feasible to stage this
 > data to disk.

If I were designing a system for this too-fast-for-disk-staging problem,
I would build an analogy to the VLBA disk staging design, but do it in
RAM, writing the file segments to malloc-ed chunks of memory, with
linked lists of pointers to tie it all together. RAM is cheap today.
Obviously semaphores would guard the critical regions in which the
linked lists are manipulated. With attention to the details of multiple
threads accessing and updating the lists, I expect that such a design
would be robust, and probably could mostly avoid contention for the
semaphores. Just as with the disk solution, this RAM buffering scheme
would enable lower-priority threads to generate headers and ancillary
data in the background.
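
A rough sketch of the RAM staging idea follows.  It is an illustration
under stated assumptions, not a tested real-time design: the class and
function names are hypothetical, Python lists of byte strings stand in for
malloc-ed chunks, a single 80-byte NAXIS2 card stands in for a full header,
and Python threads have no priorities, so the scheduling aspect is not
modelled.  One semaphore guards the list of completed files and a second
one counts how many are waiting.

```python
import io
import threading
from collections import deque


class CompletedFiles:
    """List of fully staged files, guarded by semaphores."""

    def __init__(self):
        self._files = deque()
        self._guard = threading.Semaphore(1)      # protects the list itself
        self._available = threading.Semaphore(0)  # counts files ready to copy

    def put(self, staged):
        with self._guard:
            self._files.append(staged)
        self._available.release()

    def get(self):
        self._available.acquire()  # block until a file is ready
        with self._guard:
            return self._files.popleft()


def acquire(queue, n_rows, rows_per_file, row_bytes):
    """Acquisition side: fill chunks in RAM, hand over each file when full."""
    chunks, rows = [], 0
    for i in range(n_rows):
        chunks.append(bytes([i % 256]) * row_bytes)
        rows += 1
        if rows == rows_per_file:
            queue.put((rows, chunks))
            chunks, rows = [], 0
    if rows:
        queue.put((rows, chunks))
    queue.put(None)  # sentinel: no more files


def tape_writer(queue, tape, row_counts):
    """Copy side: the row count is known before any data byte is written."""
    while True:
        staged = queue.get()
        if staged is None:
            return
        rows, chunks = staged
        row_counts.append(rows)
        tape.write(f"NAXIS2  = {rows:>20}".ljust(80).encode("ascii"))
        for chunk in chunks:
            tape.write(chunk)


def run_demo(n_rows=2500, rows_per_file=1000, row_bytes=16):
    queue, tape, row_counts = CompletedFiles(), io.BytesIO(), []
    writer = threading.Thread(target=tape_writer,
                              args=(queue, tape, row_counts))
    producer = threading.Thread(target=acquire,
                                args=(queue, n_rows, rows_per_file, row_bytes))
    writer.start()
    producer.start()
    producer.join()
    writer.join()
    return tape.getvalue(), row_counts


if __name__ == "__main__":
    data, counts = run_demo()
    print(len(data), counts)
```

The guarded regions are only the list append and pop, so the acquisition
thread and the copy thread should rarely wait on each other.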

 > In general, the only requirement is to be able to determine the end
 > of the table and I can only think of the following impediments:
 > 
 >    1) Extensions following the BINARY table,
 > 
 >    2) Table row length less than 2880 bytes (i.e. whole table rows
 >       may fit in the block padding), 
 > 
 >    3) Use of variable length arrays.
 > 
 > (1) and (2) do not affect us but anyway they do not appear to be
 > basic problems.  Only (3) is fundamentally limiting.

It appears to me that the design ideas which I described above will work
just fine when writing data into the variable length array heap, because
the table rows which contain the pointers into the heap can be written
into the table segment while the bulk variable-length data are written
into the heap segment.  On a disk these segments are separate files, in
RAM they are separately malloc-ed blocks of memory.  At completion of
the FITS file the heap size and offset (PCOUNT and THEAP) must be
written to the header segment along with the final NAXIS2 dimension.
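
The bookkeeping for the variable length array case can be sketched as
below.  This is a simplified illustration with hypothetical names: it
assumes a table with a single '1PB' column (one 8-byte descriptor per row,
holding the element count and the byte offset into the heap as two
big-endian 32-bit integers) and no gap between the table and the heap.
Two bytearrays stand in for the separate table and heap segments.

```python
import struct


class VarLenTable:
    """Table segment and heap segment for one variable length byte column."""

    ROW_BYTES = 8  # one 'P' descriptor: (element count, heap offset)

    def __init__(self):
        self.rows = bytearray()  # table segment (descriptors only)
        self.heap = bytearray()  # heap segment (bulk data)
        self.nrows = 0

    def append_row(self, samples):
        """Write the bulk data to the heap and its descriptor to the table."""
        offset = len(self.heap)
        self.heap += samples
        self.rows += struct.pack(">ii", len(samples), offset)
        self.nrows += 1

    def final_keywords(self):
        """Values to patch into the header segment at completion."""
        return {
            "NAXIS1": self.ROW_BYTES,
            "NAXIS2": self.nrows,
            "PCOUNT": len(self.heap),              # bytes after the main table
            "THEAP": self.ROW_BYTES * self.nrows,  # heap follows the table
        }


if __name__ == "__main__":
    table = VarLenTable()
    table.append_row(b"abc")
    table.append_row(b"de")
    print(table.final_keywords())
```

None of the three values is needed while data are arriving; all of them
are derived from counters that the writer already maintains.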

Bill Cotton writes:
 > What you address is a fundamental FITS limitation rather than just a
 > binary tables one.  Each header is required to say what the length of
 > its data unit is.  This does present problems for data acquisition.
 > A slightly messier, but legal, solution is to break up the data into
 > smaller table units of, say, 1000 or 10000 rows and start a new table
 > each time one fills.  The last table could be null filled.  Buffering
 > entire tables in memory could get around the variable length array
 > problem.

This idea (multiple small BINTABLEs in one big FITS file) is yet another
way of coping with the real-time problem for FITS files. However, I
contend that -- for a number of practical reasons -- it is generally
better to break up multi-gigabyte BINTABLE FITS files into separate
files of more manageable size, as we did with the VLBA Correlator. 

				 -=-

In summary, it is certainly true that the need for updating the value of
NAXIS2 (plus PCOUNT and THEAP in the variable length array case) in FITS
headers does pose problems for real-time data acquisition systems.
However, I believe that the disk-staging design concepts which I
outlined in the discussions above are a sufficient solution for the
problem of streaming FITS files to tape.  I further contend that these
same ideas can even be extended to the very high speed tape case, by
using RAM instead of disk for the staging.

-Don
-- 
  Donald C. Wells      Scientist - GBT Project        dwells at nrao.edu
                    http://www.cv.nrao.edu/~dwells
  National Radio Astronomy Observatory                +1-434-296-0277
  520 Edgemont Road,   Charlottesville, Virginia       22903-2475 USA
       (DCW is often in Green Bank, West Virginia, at +1-304-456-2146)


