[fitsbits] BINTABLE convention for >999 columns

Francois-Xavier PINEAU francois-xavier.pineau at astro.unistra.fr
Tue Jul 11 05:57:44 EDT 2017


Dear Arnold and fitsbits,

I am not sure what is meant by "full backward compatibility".
Could you point out explicitly which elements of Mark T's and the HIERARCH
solutions are not "fully backward compatible"?

Both previous solutions:
- do not break the 'upper case' rule on keyword names
   (whether this rule should be relaxed or not is probably a more general
   topic);
- overload the value of the TFIELDS keyword, keeping the 999 upper limit for
   unaware software.

In my opinion, Mark C's solution, which has no mechanism to overload the
TFIELDS value, is not compatible with unaware (versions of) FITS readers.
Thus, it is not really "software backward compatible".
So far, I tend to think that this may be problematic:
I would rather have software read a FITS table and print the description of
column 999 (saying e.g. that it contains additional columns and must be read
by wide-table-aware software) than have it crash or throw errors when it
opens a file containing a wide BINTABLE.
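
To make the numbering concrete, here is a minimal Python sketch of the
base-36 column suffixes in Mark C's proposal (quoted below). The helper names
column_suffix and column_index are hypothetical and not part of any proposal;
the offset 11960 and the range 1000 to 34695 are taken from his message.

```python
import string

# Base-36 digits: 0-9 followed by lower-case a-z, as in Mark C's proposal.
DIGITS = string.digits + string.ascii_lowercase
# Offset quoted in the proposal: int("a00", 36) - 11960 == 1000.
OFFSET = 11960


def column_suffix(col):
    """Keyword suffix for a 1-based column index (hypothetical helper)."""
    if 1 <= col <= 999:
        return str(col)              # ordinary decimal suffix, e.g. TFORM12
    if 1000 <= col <= 34695:
        n = col + OFFSET             # shift so that column 1000 maps to a00
        out = ""
        for _ in range(3):           # exactly three base-36 digits
            n, r = divmod(n, 36)
            out = DIGITS[r] + out
        return out
    raise ValueError("column index out of range for this scheme")


def column_index(suffix):
    """Inverse of column_suffix (hypothetical helper)."""
    if suffix.isdigit():
        return int(suffix)           # decimal suffix 1..999
    return int(suffix, 36) - OFFSET  # base-36 suffix a00..zzz


print(column_suffix(1000), column_suffix(34695))
```

Under these assumptions column 1000 becomes a00 and column 34695 becomes zzz,
so a header written this way contains lower-case keyword names and a TFIELDS
value above 999.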

In my current understanding, Mark T's and the HIERARCH approaches are
"software backward compatible" since unaware software simply interprets
column 999 as an array of unsigned bytes (or 16-bit integers in Mark T's
convention).
And they are both legal in the current (and previous) version(s) of the
FITS standard.

Additionally, considering HIERARCH as a convention that allows the length of
a keyword to be extended, and considering that a HIERARCH keyword may
overload a "classic" FITS keyword, I tend to think that the HIERARCH
approach is "fully backward compatible".
And, I repeat, FITS readers that already support HIERARCH, and that let
HIERARCH-defined keywords overload classic keywords, are very likely to
support wide tables already.

However, HIERARCH is not part of the FITS standard itself.
So, like Thierry, if a proposal for keyword length extension is ready
somewhere (and is software backward compatible), it would probably have my
preference.


François-Xavier Pineau


P.S.: Mark (T), in your convention, if I am correct, you are using an array
      of 16-bit integers.
      Why not use an array of unsigned bytes?
      In some cases (odd number of L, B, C, X/8 types in columns > 998)
      1 byte will be "wasted", right?
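
The arithmetic behind this question can be checked with a small Python
sketch. It assumes the rule in the convention quoted below that the byte
count implied by TFORM999 must equal the total byte count of the extended
columns. The function names tform_width and container_tform are hypothetical,
and only fixed-width type codes are handled (no P/Q descriptors).

```python
import re

# Bytes per element for fixed-width BINTABLE type codes.
BYTES_PER_ELEMENT = {
    "L": 1, "B": 1, "A": 1, "I": 2, "J": 4,
    "E": 4, "K": 8, "D": 8, "C": 8, "M": 16,
}


def tform_width(tform):
    """Byte width of one field described by a TFORM value such as '4A'."""
    m = re.match(r"\s*(\d*)([LXBAIJEKDCM])", tform)
    if m is None:
        raise ValueError(f"unsupported TFORM value: {tform!r}")
    repeat = int(m.group(1)) if m.group(1) else 1
    code = m.group(2)
    if code == "X":                      # bit array: round up to whole bytes
        return (repeat + 7) // 8
    return repeat * BYTES_PER_ELEMENT[code]


def container_tform(extended_tforms, element="B"):
    """TFORM999 value covering all extended columns with the given element type."""
    total = sum(tform_width(t) for t in extended_tforms)
    size = tform_width(element)
    if total % size:
        # An exact match is impossible, e.g. an odd byte total with 'I'.
        raise ValueError(f"{total} bytes cannot be expressed as n{element}")
    return f"{total // size}{element}"


print(container_tform(["D", "L", "J"]))        # 13 bytes as unsigned bytes
print(container_tform(["D", "D"], element="I"))
```

In this sketch an 'I' container only works when the extended columns add up
to an even number of bytes, whereas a 'B' container can express any total.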


On 10/07/2017 at 17:34, Arnold Rots wrote:
> From all the suggestions offered so far, Mark's is by far the most 
> sensible in my opinion since it provides a significant expansion while 
> preserving full backward compatibility.
>
>   - Arnold
>
> -------------------------------------------------------------------------------------------------------------
> Arnold H. Rots Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory tel:  +1 617 496 7701
> 60 Garden Street, MS 67   fax:  +1 617 495 7356
> Cambridge, MA 02138 arots at cfa.harvard.edu
> USA http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------------------------------------------
>
>
> On Fri, Jul 7, 2017 at 8:51 PM, Mark Calabretta <mark at calabretta.id.au> wrote:
>
>     Taking into consideration what others have said on this thread, I would
>     like to point out that up to 34695 bintable columns may easily be
>     accommodated, with full backward compatibility, via a simple extension
>     to the FITS standard.  Namely,
>
>     1. When encoding bintable-related keywords such as ijPCna, allow
>        lower-case letters to represent digits in a base-36 counting system.
>
>     2. Number bintable columns 1 to 999, followed by a00 to zzz, where an
>        offset (-11960) is applied to make a00 column 1000.  The total number
>        of columns is then 999 + 26*36*36 = 34695. (Alternatively, the full
>        range of three-digit base-36 counting, namely 46656, could be
>        recovered with a more elaborate ordering.)
>
>     Regards,
>     Mark Calabretta
>
>
>     On Fri, 7 Jul 2017 12:09:15 +0100 (BST)
>     Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:
>
>     Dear fitsbits,
>
>     I am considering a convention for storing table data in FITS files
>     where the number of columns exceeds the 999 limit implicitly imposed
>     by the standard BINTABLE extension type.  I have running code for
>     this (available on request) and plan to incorporate it in future
>     releases of STIL/STILTS/TOPCAT so that people can work with wide
>     tables in FITS while using those tools.  People using software
>     that is unaware of this convention would still see a legal BINTABLE
>     but not the later columns.
>
>     I'm posting the details here in case people want to comment,
>     or point out some major problem with the idea that I might have
>     overlooked, or tell me that there's already a convention for
>     this out there that I should be using instead. Otherwise, please
>     feel free to ignore this post.  I'm not requesting that any
>     other software implements this, though if anyone wants to I
>     certainly don't object.
>
>     Mark
>
>     . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>
>     Extended column convention for FITS BINTABLE
>     --------------------------------------------
>
>     The BINTABLE extension type as described in the FITS Standard
>     (FITS Standard v3.0, sec 7.3) requires table column metadata
>     to be described using 8-character keywords of the form XXXXXnnn,
>     where XXXXX represents one of an open set of mandatory, reserved
>     or user-defined root keywords up to five characters in length,
>     for instance TFORM (mandatory), TUNIT (reserved), TUCD (user-defined).
>     The nnn part is an integer between 1 and 999 indicating the
>     index of the column to which the keyword in question refers.
>     Since the header syntax confines this indexed part of the keyword
>     to three digits, there is an upper limit of 999 columns in
>     BINTABLE extensions.
>
>     Note that the FITS/BINTABLE format does not entail any restriction on
>     the storage of column *data* beyond the 999 column limit in the data
>     part of the HDU, the problem is just that client software
>     cannot be informed about the layout of this data using the
>     header cards in the usual way.
>
>     In some cases it is desirable to store FITS tables with a column
>     count greater than 999.  Whether that's a good idea is not within
>     the scope of this discussion.
>
>     To achieve this, I propose the following convention.
>
>     Definitions:
>
>      - 'BINTABLE columns' are those columns defined using the
>           FITS BINTABLE standard
>
>      - 'Data columns' are the columns to be encoded
>
>      - N_TOT is the total number of data columns to be stored
>
>      - Data columns with (1-based) indexes from 999 to N_TOT inclusive
>           are known as 'extended' columns.  Their data is stored
>           within the 'container' column.
>
>      - BINTABLE column 999 is known as the 'container' column.
>           It contains the byte data for all the 'extended' columns.
>
>     Convention:
>
>      - All column data (for columns 1 to N_TOT) is laid out in the data part
>           of the HDU in exactly the same way as if there were no 999-column
>           limit.
>
>      - The TFIELDS header is declared with the value 999.
>
>      - The container column is declared in the header with some
>           TFORM999 value corresponding to the total field length required
>           by all the extended columns ('B' is the obvious data type, but
>           any legal TFORM value that gives the right width MAY be used).
>           The byte count implied by TFORM999 MUST be equal to the
>           total byte count implied by all extended columns.
>
>      - Other XXXXX999 headers MAY optionally be declared to describe
>           the container column in accordance with the usual rules,
>           e.g. TTYPE999 to give it a name.
>
>      - The NAXIS1 header is declared in the usual way to give the width
>           of a table row in bytes.  This is equal to the sum of
>           all the BINTABLE columns as usual.  It is also equal to
>           the sum of all the data columns, which has the same value.
>
>      - Headers for Data columns 1-998 are declared as usual,
>           corresponding to BINTABLE columns 1-998.
>
>      - Keyword XT_ICOL indicates the index of the container column.
>           It MUST be present with the integer value 999 to indicate
>           that this convention is in use.
>
>      - Keyword XT_NCOL indicates the total number of data columns encoded.
>           It MUST be present with an integer value equal to N_TOT.
>
>      - Metadata for each extended column is encoded with keywords
>           of the form XXXXXaaa, where XXXXX are the same keyword roots
>           as used for normal BINTABLE extensions, and aaa is a 3-digit
>           value in base 26 using the characters 'A' (0 in base 26) to
>           'Z' (25 in base 26), and giving the 1-based data column index
>           minus 999.  The sequence aaa MUST be exactly three characters
>           long (leading 'A's are required).  Thus the formats for data
>           columns 999, 1000, 1001, etc are declared with the keywords
>           TFORMAAA, TFORMAAB, TFORMAAC etc.
>
>      - This convention MUST NOT be used for N_TOT<=999.
>
>     The resulting HDU is a completely legal FITS BINTABLE extension.
>     Readers aware of this convention may use it to extract column
>     data and metadata beyond the 999-column limit.
>     Readers unaware of this convention will see 998 columns in their
>     intended form, and an additional (possibly large) column 999
>     which contains byte data but which cannot be easily interpreted.
>
>     This convention can therefore allow encoding of tables with data
>     column counts N_TOT up to 998+26^3 = 18574.
>
>     An example header might look like this:
>
>        XTENSION= 'BINTABLE'           /  binary table extension
>        BITPIX  =                    8 /  8-bit bytes
>        NAXIS   =                    2 /  2-dimensional table
>        NAXIS1  =                 9229 /  width of table in bytes
>        NAXIS2  =                   26 /  number of rows in table
>        PCOUNT  =                    0 /  size of special data area
>        GCOUNT  =                    1 /  one data group
>        TFIELDS =                  999 /  number of columns
>        XT_ICOL =                  999 /  index of container column
>        XT_NCOL =                 1204 /  total columns including extended
>        TTYPE1  = 'posid_1 '           /  label for column 1
>        TFORM1  = 'J       '           /  format for column 1
>        TTYPE2  = 'instrument_1'       /  label for column 2
>        TFORM2  = '4A      '           /  format for column 2
>        TTYPE3  = 'edge_code_1'        /  label for column 3
>        TFORM3  = 'I       '           /  format for column 3
>        TUCD3   = 'meta.code.qual'
>         ...
>        TTYPE998= 'var_min_s_2'        /  label for column 998
>        TFORM998= 'D       '           /  format for column 998
>        TUNIT998= 'counts/s'           /  units for column 998
>        TTYPE999= 'XT_MORECOLS'        /  label for column 999
>        TFORM999= '813I    '           /  format for column 999
>        TTYPEAAA= 'var_min_u_2'        /  label for column 999
>        TFORMAAA= 'D       '           /  format for column 999
>        TUNITAAA= 'counts/s'           /  units for column 999
>        TTYPEAAB= 'var_prob_h_2'       /  label for column 1000
>        TFORMAAB= 'D       '           /  format for column 1000
>         ...
>        TTYPEAHW= 'var_prob_w_2'       /  label for column 1203
>        TFORMAHW= 'D       '           /  format for column 1203
>        TTYPEAHX= 'var_sigma_w_2'      /  label for column 1204
>        TFORMAHX= 'D       '           /  format for column 1204
>        TUNITAHX= 'counts/s'           /  units for column 1204
>        END
>
>     This general approach was suggested by William Pence on the FITSBITS
>     list in June 2012
>     (https://listmgr.nrao.edu/pipermail/fitsbits/2012-June/002367.html),
>     and by Francois-Xavier Pineau (CDS) in private conversation in 2016.
>     The details have been filled in by Mark Taylor (Bristol).
>     (F-X favours a different mechanism for encoding the extended
>     column metadata).
>
>     --
>     Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
>     m.b.taylor at bris.ac.uk   +44-117-9288776
>     http://www.star.bris.ac.uk/~mbt/
>
>     _______________________________________________
>     fitsbits mailing list
>     fitsbits at listmgr.nrao.edu
>     https://listmgr.nrao.edu/mailman/listinfo/fitsbits
>

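For reference, the base-26 keyword suffixes described in Mark T's convention
(quoted above) can be sketched in Python as follows. The helper names
extended_suffix and column_keyword are hypothetical; the mapping of data
column 999 to AAA and the upper limit 998+26^3 are taken from the quoted text.

```python
def extended_suffix(col):
    """Three-letter base-26 suffix for an extended data column (index >= 999)."""
    offset = col - 999                    # data column 999 maps to AAA
    if not 0 <= offset < 26 ** 3:
        raise ValueError("column index outside the extended range")
    letters = ""
    for _ in range(3):                    # always three letters, leading 'A's kept
        offset, r = divmod(offset, 26)
        letters = chr(ord("A") + r) + letters
    return letters


def column_keyword(root, col):
    """Header keyword holding metadata for a 1-based data column index."""
    if col < 999:
        return f"{root}{col}"             # ordinary BINTABLE columns 1-998
    # Data columns 999 and up use letters; TFORM999 itself describes the container.
    return root + extended_suffix(col)


for col in (998, 999, 1000, 1204):
    print(col, column_keyword("TFORM", col))
```

With these assumptions the sketch reproduces the keywords in the example
header, e.g. TFORMAAA for data column 999 and TFORMAHX for data column 1204.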