[fitsbits] BINTABLE convention for >999 columns

Mark Taylor m.b.taylor at bristol.ac.uk
Fri Jul 28 10:05:17 EDT 2017


Coming back to this after a bit of a breather:

To summarise the discussion, enthusiasm for my proposed
convention for wide (>999 column) BINTABLES has not been
universal, but I am still planning to implement something
along these lines for my purposes (STIL/STILTS/TOPCAT).
The possibility exists of other software deciding to recognise
such a convention at some point in the future, but I'm not
relying on that or even necessarily recommending it.

In terms of the details, there was one main difference of opinion,
namely how to store the column metadata for the 'extended'
columns in the FITS header.  The suggestion I put forward was
to use a base-26 number giving headers TFORMAAA - TFORMZZZ,
which leads to a limit of 18574 columns.  Francois-Xavier
Pineau suggested instead using the HIERARCH convention,
which would allow a more or less unlimited column count.
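
As a rough check of the base-26 limit, here is a minimal sketch in
Python.  It assumes that columns 1-998 keep the ordinary numbered
keywords and that the three-letter suffixes AAA-ZZZ are assigned in
order starting at column 999; the function name and the exact mapping
are illustrative assumptions, not taken from any implementation.

```python
import string

N_STANDARD = 998          # assumption: columns 1..998 use ordinary TFORMnnn keywords
N_SUFFIXES = 26 ** 3      # three-letter suffixes AAA..ZZZ

def base26_suffix(icol):
    """Hypothetical mapping: column 999 -> 'AAA', 1000 -> 'AAB', and so on."""
    offset = icol - (N_STANDARD + 1)
    if not 0 <= offset < N_SUFFIXES:
        raise ValueError(f"column {icol} is outside the base-26 range")
    letters = string.ascii_uppercase
    return (letters[offset // 676]
            + letters[(offset // 26) % 26]
            + letters[offset % 26])

# Highest column index that can be described this way.
max_columns = N_STANDARD + N_SUFFIXES
print(max_columns, base26_suffix(999), base26_suffix(max_columns))
```

Under those assumptions the total is 998 + 26^3, which matches the
limit quoted above.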

For concreteness, this HIERARCH-based variant differs from
my original proposal
(https://listmgr.nrao.edu/pipermail/fitsbits/2017-July/002967.html)
in the following way:

    - Metadata for each extended column is encoded with keywords
      of the form HIERARCH XT XXXXXnnnnn, where XXXXX
      are the same keyword roots as used for normal BINTABLE extensions,
      and nnnnn is a decimal number written as usual (no leading zeros,
      as many digits as required).  Thus the formats for data
      columns 999, 1000, 1001 etc are declared with the keywords
      HIERARCH XT TFORM999, HIERARCH XT TFORM1000, HIERARCH XT TFORM1001
      etc.  Note this uses the ESO HIERARCH convention described at
      https://fits.gsfc.nasa.gov/registry/hierarch_keyword.html.
      The "name space" token has been chosen as "XT" (extended table).

and the example header looks identical to my original example up
to TFORM999, but the remaining entries differ:

   TTYPE998= 'var_min_s_2'        /  label for column 998
   TFORM998= 'D       '           /  format for column 998
   TUNIT998= 'counts/s'           /  units for column 998
   TTYPE999= 'XT_MORECOLS'        /  label for column 999
   TFORM999= '813I    '           /  format for column 999
   HIERARCH XT TTYPE999         = 'var_min_u_2' / label for column 999
   HIERARCH XT TFORM999         = 'D' / format for column 999
   HIERARCH XT TUNIT999         = 'counts/s' / units for column 999
   HIERARCH XT TTYPE1000        = 'var_prob_h_2' / label for column 1000 
   HIERARCH XT TFORM1000        = 'D' / format for column 1000 
    ...  
   HIERARCH XT TTYPE1203        = 'var_prob_w_2' / label for column 1203 
   HIERARCH XT TFORM1203        = 'D' / format for column 1203 
   HIERARCH XT TTYPE1204        = 'var_sigma_w_2' / label for column 1204 
   HIERARCH XT TFORM1204        = 'D' / format for column 1204 
   HIERARCH XT TUNIT1204        = 'counts/s' / units for column 1204
   END
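
For illustration only, header cards laid out like the ones above could
be generated along these lines.  This is a sketch, not the STIL code:
the function name is hypothetical, the alignment of "=" at column 30
simply copies the example, and only string-valued keywords are handled.

```python
def hierarch_card(root, icol, value, comment):
    """Build one 80-character card of the form HIERARCH XT <root><icol> = '<value>' / <comment>.

    root    -- standard BINTABLE keyword root, e.g. "TTYPE" or "TFORM"
    icol    -- 1-based column index, written without leading zeros
    value   -- string value; embedded single quotes are doubled
    comment -- free-text comment
    """
    keyword = f"HIERARCH XT {root}{icol}"
    quoted = "'" + value.replace("'", "''") + "'"
    # Pad the keyword to 29 characters so that "=" lands in column 30,
    # as in the example header (a layout choice, assumed not required).
    card = f"{keyword:<29}= {quoted} / {comment}"
    if len(card) > 80:
        raise ValueError("card exceeds 80 characters: " + card)
    return card.ljust(80)

print(hierarch_card("TTYPE", 999, "var_min_u_2", "label for column 999"))
print(hierarch_card("TFORM", 1000, "D", "format for column 1000"))
```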

I have implemented and tested both variants, and they both work.
The HIERARCH solution is a bit messier to implement because it relies
on a non-standard convention.

Summarising the pros and cons of these two variants:

   Base-26:
    - limited to about 18,000 columns (18574) ...
      ... but nobody has come up with a plausible case to need more
    - looks kludgy
    - not very human readable

   HIERARCH:
    - requires non-FITS convention (HIERARCH)
    - effectively no column count limit
    - 13 or so fewer characters available for column keyword values
    - easily human readable
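
The "13 or so" figure can be checked with a little arithmetic.  The
sketch below assumes 80-character cards, a standard card whose value
starts right after "KEYWORD= ", and a HIERARCH card written with the
minimal " = " separator; the helper name is made up for illustration.

```python
CARD_LEN = 80  # fixed length of a FITS header card

def value_space(prefix):
    """Characters left on the card after the keyword and value indicator."""
    return CARD_LEN - len(prefix)

standard = value_space("TTYPE999= ")               # ordinary BINTABLE keyword
hierarch = value_space("HIERARCH XT TTYPE999 = ")  # minimal HIERARCH spacing (assumed)
lost = standard - hierarch
print(standard, hierarch, lost)
```

Each extra digit in the column index costs one more character, and
padding the keyword so that "=" lines up, as in the example header,
costs a few more.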

Among those in this thread who have expressed a preference between
the two, the balance of opinion seems to have been in favour of the
HIERARCH option (Francois-Xavier Pineau, Bill Pence, Tom McGlynn)
as opposed to the Base-26 option (me, Rob Seaman, Arnold Rots?).
In view of that, and the nagging worry that somebody might come
up with some reason to store 20k+ columns, I think I'm just
about coming down on the HIERARCH side, though it does look
less FITSy to me.

This message is to give a last chance for anybody to weigh in
on one side or the other of the Base-26/HIERARCH question,
in particular anybody who thinks they might end up one day
wanting to implement support for this (which may be nobody!).
If there is no more input on that question (which is fine by me),
I'll decide one way or the other, implement and release it in
STIL/STILTS/TOPCAT, and report back here.

Thanks for reading and for the community input on this.

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
