[fitsbits] BINTABLE convention for >999 columns

jaffe jaffe at strw.leidenuniv.nl
Sat Jul 8 04:21:15 EDT 2017


My view is either do it right or don't do it.

If the problem is more or less one-off from a single application
then you should use multiple standard tables, with the connection between
the tables intrinsic to the application and not part of any standard.

If there is a generally recognized need for very wide tables then there
should be a generalized solution, not limited in width (say by
using base 26 coding).  Such a solution might be a separate table
defining the table format parameters for the wide table, but there
are probably other elegant solutions.

Walter
> Mark,
> 
> Where do these wide FITS tables (> 999 columns) that you are proposing
> to support come from in the first place?  Are you just trying to
> support conversion of other tabular formats that can support more than
> 999 columns into FITS format?  If so, I don't see the point since no
> other existing software will be able to read them properly.
> 
> Also, will TOPCAT have the ability to insert or delete columns within
> these wide FITS tables?  That is a rather complicated process.
> 
> The main issue I see with your convention is that it only provides a
> modest increase in the maximum number of columns from 999 to about
> 18000.  I'd prefer a convention that places no limit on the number of
> columns.   One of the previous posters suggested using the HIERARCH
> convention for encoding keywords like 'TFORM12345', which seems to me
> to be a more robust and easier to understand convention than using
> base 26 encoded strings.
> 
> Regards,
> Bill Pence
> 
>> On Jul 7, 2017, at 7:09 AM, Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:
>> 
>> Dear fitsbits,
>> 
>> I am considering a convention for storing table data in FITS files
>> where the number of columns exceeds the 999 limit implicitly imposed
>> by the standard BINTABLE extension type.  I have running code for
>> this (available on request) and plan to incorporate it in future
>> releases of STIL/STILTS/TOPCAT so that people can work with wide
>> tables in FITS while using those tools.  People using software
>> that is unaware of this convention would still see a legal BINTABLE
>> but not the later columns.
>> 
>> I'm posting the details here in case people want to comment,
>> or point out some major problem with the idea that I might have
>> overlooked, or tell me that there's already a convention for
>> this out there that I should be using instead.  Otherwise, please
>> feel free to ignore this post.  I'm not requesting that any
>> other software implements this, though if anyone wants to I
>> certainly don't object.
>> 
>> Mark
>> 
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>> 
>> Extended column convention for FITS BINTABLE
>> --------------------------------------------
>> 
>> The BINTABLE extension type as described in the FITS Standard
>> (FITS Standard v3.0, sec 7.3) requires table column metadata
>> to be described using 8-character keywords of the form XXXXXnnn,
>> where XXXXX represents one of an open set of mandatory, reserved
>> or user-defined root keywords up to five characters in length,
>> for instance TFORM (mandatory), TUNIT (reserved), TUCD (user-defined).
>> The nnn part is an integer between 1 and 999 indicating the
>> index of the column to which the keyword in question refers.
>> Since the header syntax confines this indexed part of the keyword
>> to three digits, there is an upper limit of 999 columns in
>> BINTABLE extensions.
>> 
>> Note that the FITS/BINTABLE format does not entail any restriction on
>> the storage of column *data* beyond the 999 column limit in the data
>> part of the HDU; the problem is just that client software
>> cannot be informed about the layout of this data using the
>> header cards in the usual way.
>> 
>> In some cases it is desirable to store FITS tables with a column
>> count greater than 999.  Whether that's a good idea is not within
>> the scope of this discussion.
>> 
>> To achieve this, I propose the following convention.
>> 
>> Definitions:
>> 
>> - 'BINTABLE columns' are those columns defined using the
>>      FITS BINTABLE standard
>> 
>> - 'Data columns' are the columns to be encoded
>> 
>> - N_TOT is the total number of data columns to be stored
>> 
>> - Data columns with (1-based) indexes from 999 to N_TOT inclusive
>>      are known as 'extended' columns.  Their data is stored
>>      within the 'container' column.
>> 
>> - BINTABLE column 999 is known as the 'container' column.
>>      It contains the byte data for all the 'extended' columns.
>> 
>> Convention:
>> 
>> - All column data (for columns 1 to N_TOT) is laid out in the data part
>>      of the HDU in exactly the same way as if there were no 999-column
>>      limit.
>> 
>> - The TFIELDS header is declared with the value 999.
>> 
>> - The container column is declared in the header with some
>>      TFORM999 value corresponding to the total field length required
>>      by all the extended columns ('B' is the obvious data type, but
>>      any legal TFORM value that gives the right width MAY be used).
>>      The byte count implied by TFORM999 MUST be equal to the
>>      total byte count implied by all extended columns.
>> 
>> - Other XXXXX999 headers MAY optionally be declared to describe
>>      the container column in accordance with the usual rules,
>>      e.g. TTYPE999 to give it a name.
>> 
>> - The NAXIS1 header is declared in the usual way to give the width
>>      of a table row in bytes.  This is equal to the sum of the widths
>>      of all the BINTABLE columns as usual.  It is also equal to the
>>      sum of the widths of all the data columns, which has the same value.
>> 
>> - Headers for Data columns 1-998 are declared as usual,
>>      corresponding to BINTABLE columns 1-998.
>> 
>> - Keyword XT_ICOL indicates the index of the container column.
>>      It MUST be present with the integer value 999 to indicate
>>      that this convention is in use.
>> 
>> - Keyword XT_NCOL indicates the total number of data columns encoded.
>>      It MUST be present with an integer value equal to N_TOT.
>> 
>> - Metadata for each extended column is encoded with keywords
>>      of the form XXXXXaaa, where XXXXX are the same keyword roots
>>      as used for normal BINTABLE extensions, and aaa is a 3-digit
>>      value in base 26 using the characters 'A' (0 in base 26) to
>>      'Z' (25 in base 26), and giving the 1-based data column index
>>      minus 999.  The sequence aaa MUST be exactly three characters
>>      long (leading 'A's are required).  Thus the formats for data
>>      columns 999, 1000, 1001, etc are declared with the keywords
>>      TFORMAAA, TFORMAAB, TFORMAAC etc.
>> 
>> - This convention MUST NOT be used for N_TOT<=999.
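The base-26 suffix rule in the list above can be sketched in a few lines of Python. This is only an illustration of the rule as written, not code from STIL; the function names (suffix_for_column, column_for_suffix, keyword) are made up for this sketch.

```python
import string

BASE = 26
CONTAINER_INDEX = 999  # BINTABLE column that holds the extended columns
ALPHABET = string.ascii_uppercase  # 'A' = 0 ... 'Z' = 25


def suffix_for_column(icol: int) -> str:
    """Return the three-letter suffix for 1-based data column index icol."""
    offset = icol - CONTAINER_INDEX
    if not 0 <= offset < BASE ** 3:
        raise ValueError(f"column {icol} is outside the extended range")
    hi, rest = divmod(offset, BASE * BASE)
    mid, lo = divmod(rest, BASE)
    return ALPHABET[hi] + ALPHABET[mid] + ALPHABET[lo]


def column_for_suffix(suffix: str) -> int:
    """Invert suffix_for_column: 'AAA' -> 999, 'AAB' -> 1000, ..."""
    if len(suffix) != 3 or any(c not in ALPHABET for c in suffix):
        raise ValueError(f"not a valid three-letter suffix: {suffix!r}")
    offset = 0
    for c in suffix:
        offset = offset * BASE + ALPHABET.index(c)
    return CONTAINER_INDEX + offset


def keyword(root: str, icol: int) -> str:
    """Header keyword for a root such as 'TFORM' and a data column index."""
    if icol < CONTAINER_INDEX:
        return f"{root}{icol}"
    return root + suffix_for_column(icol)
```

With these definitions, keyword("TFORM", 998) gives 'TFORM998' and keyword("TFORM", 1000) gives 'TFORMAAB', matching the keywords listed in the convention.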
>> 
>> The resulting HDU is a completely legal FITS BINTABLE extension.
>> Readers aware of this convention may use it to extract column
>> data and metadata beyond the 999-column limit.
>> Readers unaware of this convention will see 998 columns in their
>> intended form, and an additional (possibly large) column 999
>> which contains byte data but which cannot be easily interpreted.
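The difference between aware and unaware readers can be sketched as below. This is only a sketch: it assumes the header has already been parsed into a plain Python dict mapping keyword to value, and both function names are invented here for illustration.

```python
def uses_extended_columns(header: dict) -> bool:
    """True if the header declares the extended column convention."""
    ncol = header.get("XT_NCOL")
    return (
        header.get("XT_ICOL") == 999
        and header.get("TFIELDS") == 999
        and isinstance(ncol, int)
        and ncol > 999
    )


def visible_column_count(header: dict, aware: bool) -> int:
    """Number of columns a reader would present to its user."""
    if aware and uses_extended_columns(header):
        # All data columns, including those packed into the container.
        return header["XT_NCOL"]
    # Columns 1-998 plus the opaque container column 999.
    return header["TFIELDS"]
```

For a header with TFIELDS = 999, XT_ICOL = 999 and XT_NCOL = 1204, an aware reader would report 1204 columns and an unaware one 999.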
>> 
>> This convention can therefore allow encoding of tables with data
>> column counts N_TOT up to 998+26^3 = 18574.
>> 
>> An example header might look like this:
>> 
>>   XTENSION= 'BINTABLE'           /  binary table extension
>>   BITPIX  =                    8 /  8-bit bytes
>>   NAXIS   =                    2 /  2-dimensional table
>>   NAXIS1  =                 9229 /  width of table in bytes
>>   NAXIS2  =                   26 /  number of rows in table
>>   PCOUNT  =                    0 /  size of special data area
>>   GCOUNT  =                    1 /  one data group
>>   TFIELDS =                  999 /  number of columns
>>   XT_ICOL =                  999 /  index of container column
>>   XT_NCOL =                 1204 /  total columns including extended
>>   TTYPE1  = 'posid_1 '           /  label for column 1
>>   TFORM1  = 'J       '           /  format for column 1
>>   TTYPE2  = 'instrument_1'       /  label for column 2
>>   TFORM2  = '4A      '           /  format for column 2
>>   TTYPE3  = 'edge_code_1'        /  label for column 3
>>   TFORM3  = 'I       '           /  format for column 3
>>   TUCD3   = 'meta.code.qual'
>>    ...
>>   TTYPE998= 'var_min_s_2'        /  label for column 998
>>   TFORM998= 'D       '           /  format for column 998
>>   TUNIT998= 'counts/s'           /  units for column 998
>>   TTYPE999= 'XT_MORECOLS'        /  label for column 999
>>   TFORM999= '813I    '           /  format for column 999
>>   TTYPEAAA= 'var_min_u_2'        /  label for column 999
>>   TFORMAAA= 'D       '           /  format for column 999
>>   TUNITAAA= 'counts/s'           /  units for column 999
>>   TTYPEAAB= 'var_prob_h_2'       /  label for column 1000
>>   TFORMAAB= 'D       '           /  format for column 1000
>>    ...
>>   TTYPEAHW= 'var_prob_w_2'       /  label for column 1203
>>   TFORMAHW= 'D       '           /  format for column 1203
>>   TTYPEAHX= 'var_sigma_w_2'      /  label for column 1204
>>   TFORMAHX= 'D       '           /  format for column 1204
>>   TUNITAHX= 'counts/s'           /  units for column 1204
>>   END
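The rule that the byte count implied by TFORM999 must equal the total for the extended columns can be checked with a small width calculator. The sketch below uses the element sizes of the standard BINTABLE type codes; the function names are hypothetical, and only the leading repeat count and type code of each TFORM value are looked at. Because most cards of the example header are elided, the sketch says nothing about whether that particular header is consistent.

```python
import re

# Bytes per element for the BINTABLE data type codes.  'X' (bit array)
# is handled separately because its repeat count is given in bits.
ELEMENT_BYTES = {
    "L": 1, "B": 1, "I": 2, "J": 4, "K": 8, "A": 1,
    "E": 4, "D": 8, "C": 8, "M": 16, "P": 8, "Q": 16,
}

TFORM_RE = re.compile(r"^\s*(\d*)([LXBIJKAEDCMPQ])")


def tform_width(tform: str) -> int:
    """Width in bytes of one table cell described by a TFORM value."""
    match = TFORM_RE.match(tform)
    if match is None:
        raise ValueError(f"unrecognised TFORM value: {tform!r}")
    repeat = int(match.group(1)) if match.group(1) else 1
    code = match.group(2)
    if code == "X":
        return (repeat + 7) // 8  # bits rounded up to whole bytes
    return repeat * ELEMENT_BYTES[code]


def container_is_consistent(container_tform, extended_tforms) -> bool:
    """Check that the container width equals the summed extended widths."""
    total = sum(tform_width(t) for t in extended_tforms)
    return tform_width(container_tform) == total
```

For instance, tform_width('813I    ') is 813 x 2 = 1626 bytes, and container_is_consistent('16B', ['D', 'D']) holds because two 8-byte doubles fill 16 bytes.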
>> 
>> This general approach was suggested by William Pence on the FITSBITS
>> list in June 2012
>> (https://listmgr.nrao.edu/pipermail/fitsbits/2012-June/002367.html),
>> and by Francois-Xavier Pineau (CDS) in private conversation in 2016.
>> The details have been filled in by Mark Taylor (Bristol).
>> (F-X favours a different mechanism for encoding the extended
>> column metadata).
>> 
>> --
>> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
>> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
>> 
>> _______________________________________________
>> fitsbits mailing list
>> fitsbits at listmgr.nrao.edu
>> https://listmgr.nrao.edu/mailman/listinfo/fitsbits