[fitsbits] BINTABLE convention for >999 columns
Rob Seaman
seaman at lpl.arizona.edu
Mon Jul 10 12:25:55 EDT 2017
Thanks for the info about usage context. Separating the tables into
multiple files or extensions still seems a reasonable way to address
these cases, but since Mark T's proposed convention (apparently
originally from Bill) is legal or near-legal FITS usage, the main
question is how best to discourage a diversity of keyword encodings, etc.
I also agree with Mark C's encoding, though I would suggest that mono-case
will be less of a confusing change than lower case. Mark C's option avoids
confusing usage like TFORM0AA or whatever interrupting the sort order. A
digit in character 6 would require digits in characters 7 and 8 as well.
Nobody has mentioned extremely wide table use cases (millions of
columns), and 34695 columns is enough to cover all the wide table DB
options listed in a previous email.
Rob
--
On 7/10/17 8:34 AM, Arnold Rots wrote:
> From all the suggestions offered so far, Mark's is by far the most
> sensible in my opinion since it provides a significant expansion while
> preserving full backward compatibility.
>
> - Arnold
>
> -------------------------------------------------------------------------------------------------------------
> Arnold H. Rots
> Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory
> 60 Garden Street, MS 67
> Cambridge, MA 02138, USA
> tel: +1 617 496 7701
> fax: +1 617 495 7356
> arots at cfa.harvard.edu
> http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------------------------------------------
>
>
> On Fri, Jul 7, 2017 at 8:51 PM, Mark Calabretta <mark at calabretta.id.au> wrote:
>
> Taking into consideration what others have said on this thread, I would
> like to point out that up to 34695 bintable columns may easily be
> accommodated, with full backward compatibility, via a simple extension
> to the FITS standard. Namely,
>
> 1. When encoding bintable-related keywords such as ijPCna, allow
> lower-case letters to represent digits in a base-36 counting
> system.
>
> 2. Number bintable columns 1 to 999, followed by a00 to zzz, where an
> offset (-11960) is applied to make a00 column 1000. The total number
> of columns is then 999 + 26*36*36 = 34695. (Alternatively, the full
> range of three-digit base-36 counting, namely 46656, could be
> recovered with a more elaborate ordering.)
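
A minimal sketch of the numbering described in the two points above may make the offset easier to follow. It is only an illustration under stated assumptions: the function names column_suffix and column_number are hypothetical, and the error handling is one possible choice, not part of the proposal.

```python
import string

# Base-36 digits: 0-9 followed by lower-case a-z, as in the proposal.
DIGITS = string.digits + string.ascii_lowercase
OFFSET = 11960                      # int("a00", 36) - 1000
MAX_COLUMN = 999 + 26 * 36 * 36     # 34695


def column_suffix(col):
    """Return the keyword suffix for a 1-based column number (hypothetical helper)."""
    if not 1 <= col <= MAX_COLUMN:
        raise ValueError(f"column {col} is outside 1..{MAX_COLUMN}")
    if col <= 999:
        return str(col)             # ordinary decimal suffix, unchanged
    value = col + OFFSET            # column 1000 becomes 12960, i.e. 'a00'
    chars = []
    for _ in range(3):
        value, rem = divmod(value, 36)
        chars.append(DIGITS[rem])
    return "".join(reversed(chars))


def column_number(suffix):
    """Inverse of column_suffix (hypothetical helper)."""
    if suffix and all(c in string.digits for c in suffix):
        return int(suffix)          # plain decimal column 1..999
    if (len(suffix) != 3
            or suffix[0] not in string.ascii_lowercase
            or any(c not in DIGITS for c in suffix)):
        raise ValueError(f"not a valid column suffix: {suffix!r}")
    return int(suffix, 36) - OFFSET
```

With these definitions, column_suffix(1000) gives 'a00' and column_suffix(34695) gives 'zzz', so TFORM for column 1000 would be written TFORMa00.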
>
> Regards,
> Mark Calabretta
>
>
> On Fri, 7 Jul 2017 12:09:15 +0100 (BST)
> Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:
>
> Dear fitsbits,
>
> I am considering a convention for storing table data in FITS files
> where the number of columns exceeds the 999 limit implicitly imposed
> by the standard BINTABLE extension type. I have running code for
> this (available on request) and plan to incorporate it in future
> releases of STIL/STILTS/TOPCAT so that people can work with wide
> tables in FITS while using those tools. People using software
> that is unaware of this convention would still see a legal BINTABLE
> but not the later columns.
>
> I'm posting the details here in case people want to comment,
> or point out some major problem with the idea that I might have
> overlooked, or tell me that there's already a convention for
> this out there that I should be using instead. Otherwise, please
> feel free to ignore this post. I'm not requesting that any
> other software implements this, though if anyone wants to I
> certainly don't object.
>
> Mark
>
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>
> Extended column convention for FITS BINTABLE
> --------------------------------------------
>
> The BINTABLE extension type as described in the FITS Standard
> (FITS Standard v3.0, sec 7.3) requires table column metadata
> to be described using 8-character keywords of the form XXXXXnnn,
> where XXXXX represents one of an open set of mandatory, reserved
> or user-defined root keywords up to five characters in length,
> for instance TFORM (mandatory), TUNIT (reserved), TUCD (user-defined).
> The nnn part is an integer between 1 and 999 indicating the
> index of the column to which the keyword in question refers.
> Since the header syntax confines this indexed part of the keyword
> to three digits, there is an upper limit of 999 columns in
> BINTABLE extensions.
>
> Note that the FITS/BINTABLE format does not entail any restriction on
> the storage of column *data* beyond the 999 column limit in the data
> part of the HDU; the problem is just that client software
> cannot be informed about the layout of this data using the
> header cards in the usual way.
>
> In some cases it is desirable to store FITS tables with a column
> count greater than 999. Whether that's a good idea is not within
> the scope of this discussion.
>
> To achieve this, I propose the following convention.
>
> Definitions:
>
> - 'BINTABLE columns' are those columns defined using the
> FITS BINTABLE standard
>
> - 'Data columns' are the columns to be encoded
>
> - N_TOT is the total number of data columns to be stored
>
> - Data columns with (1-based) indexes from 999 to N_TOT inclusive
> are known as 'extended' columns. Their data is stored
> within the 'container' column.
>
> - BINTABLE column 999 is known as the 'container' column.
> It contains the byte data for all the 'extended' columns.
>
> Convention:
>
> - All column data (for columns 1 to N_TOT) is laid out in the data
> part of the HDU in exactly the same way as if there were no
> 999-column limit.
>
> - The TFIELDS header is declared with the value 999.
>
> - The container column is declared in the header with some
> TFORM999 value corresponding to the total field length required
> by all the extended columns ('B' is the obvious data type, but
> any legal TFORM value that gives the right width MAY be used).
> The byte count implied by TFORM999 MUST be equal to the
> total byte count implied by all extended columns.
>
> - Other XXXXX999 headers MAY optionally be declared to describe
> the container column in accordance with the usual rules,
> e.g. TTYPE999 to give it a name.
>
> - The NAXIS1 header is declared in the usual way to give the width
> of a table row in bytes. This is equal to the sum of the widths of
> all the BINTABLE columns as usual. It is also equal to the sum of
> the widths of all the data columns, which has the same value.
>
> - Headers for Data columns 1-998 are declared as usual,
> corresponding to BINTABLE columns 1-998.
>
> - Keyword XT_ICOL indicates the index of the container column.
> It MUST be present with the integer value 999 to indicate
> that this convention is in use.
>
> - Keyword XT_NCOL indicates the total number of data columns encoded.
> It MUST be present with an integer value equal to N_TOT.
>
> - Metadata for each extended column is encoded with keywords
> of the form XXXXXaaa, where XXXXX are the same keyword roots
> as used for normal BINTABLE extensions, and aaa is a 3-digit
> value in base 26 using the characters 'A' (0 in base 26) to
> 'Z' (25 in base 26), and giving the 1-based data column index
> minus 999. The sequence aaa MUST be exactly three characters
> long (leading 'A's are required). Thus the formats for data
> columns 999, 1000, 1001, etc are declared with the keywords
> TFORMAAA, TFORMAAB, TFORMAAC etc.
>
> - This convention MUST NOT be used for N_TOT<=999.
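
The base-26 suffix rule in the list above can be sketched as follows. This is an illustration only, not the running code mentioned in the post: the names extended_suffix, extended_column and keyword are hypothetical, and the range checks are one reading of the rules.

```python
import string

LETTERS = string.ascii_uppercase    # 'A' is 0 ... 'Z' is 25
FIRST_EXTENDED = 999                # data column index written as 'AAA'
MAX_N_TOT = 998 + 26 ** 3           # 18574, the largest encodable N_TOT


def extended_suffix(col):
    """Three-letter suffix for a 1-based extended data column index."""
    if not FIRST_EXTENDED <= col <= MAX_N_TOT:
        raise ValueError(f"column {col} is not an extended column")
    hi, rest = divmod(col - FIRST_EXTENDED, 26 * 26)
    mid, lo = divmod(rest, 26)
    return LETTERS[hi] + LETTERS[mid] + LETTERS[lo]


def extended_column(suffix):
    """Inverse of extended_suffix: 'AAA' maps back to 999."""
    if len(suffix) != 3 or any(c not in LETTERS for c in suffix):
        raise ValueError(f"not a valid extended suffix: {suffix!r}")
    hi, mid, lo = (LETTERS.index(c) for c in suffix)
    return FIRST_EXTENDED + (hi * 26 + mid) * 26 + lo


def keyword(root, col):
    """Build a header keyword such as TFORMAAB from a root of up to 5 characters."""
    if not 1 <= len(root) <= 5:
        raise ValueError("keyword root must be 1 to 5 characters long")
    return root + extended_suffix(col)
```

For instance, keyword("TFORM", 1000) yields 'TFORMAAB', and extended_column("ZZZ") yields 18574, the upper limit on N_TOT under this convention.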
>
> The resulting HDU is a completely legal FITS BINTABLE extension.
> Readers aware of this convention may use it to extract column
> data and metadata beyond the 999-column limit.
> Readers unaware of this convention will see 998 columns in their
> intended form, and an additional (possibly large) column 999
> which contains byte data but which cannot be easily interpreted.
>
> This convention can therefore allow encoding of tables with data
> column counts N_TOT up to 998+26^3 = 18574.
>
> An example header might look like this:
>
> XTENSION= 'BINTABLE' / binary table extension
> BITPIX = 8 / 8-bit bytes
> NAXIS = 2 / 2-dimensional table
> NAXIS1 = 9229 / width of table in bytes
> NAXIS2 = 26 / number of rows in table
> PCOUNT = 0 / size of special data area
> GCOUNT = 1 / one data group
> TFIELDS = 999 / number of columns
> XT_ICOL = 999 / index of container column
> XT_NCOL = 1204 / total columns including extended
> TTYPE1 = 'posid_1 ' / label for column 1
> TFORM1 = 'J ' / format for column 1
> TTYPE2 = 'instrument_1' / label for column 2
> TFORM2 = '4A ' / format for column 2
> TTYPE3 = 'edge_code_1' / label for column 3
> TFORM3 = 'I ' / format for column 3
> TUCD3 = 'meta.code.qual'
> ...
> TTYPE998= 'var_min_s_2' / label for column 998
> TFORM998= 'D ' / format for column 998
> TUNIT998= 'counts/s' / units for column 998
> TTYPE999= 'XT_MORECOLS' / label for column 999
> TFORM999= '813I ' / format for column 999
> TTYPEAAA= 'var_min_u_2' / label for column 999
> TFORMAAA= 'D ' / format for column 999
> TUNITAAA= 'counts/s' / units for column 999
> TTYPEAAB= 'var_prob_h_2' / label for column 1000
> TFORMAAB= 'D ' / format for column 1000
> ...
> TTYPEAHW= 'var_prob_w_2' / label for column 1203
> TFORMAHW= 'D ' / format for column 1203
> TTYPEAHX= 'var_sigma_w_2' / label for column 1204
> TFORMAHX= 'D ' / format for column 1204
> TUNITAHX= 'counts/s' / units for column 1204
> END
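
As a rough illustration of how a reader aware of the convention might check a header like the one above, the sketch below verifies the rule that the byte count of the container column equals the total byte count of the extended columns. It assumes the header has already been parsed into a plain dict of keyword to value; the function names and the toy header are hypothetical, and the TFORM handling is simplified to the repeat count and type code.

```python
import re

# Bytes per element for the BINTABLE TFORM type codes.
# 'X' (bit array) is handled separately; 'P' and 'Q' are array descriptors.
ELEMENT_BYTES = {"L": 1, "B": 1, "A": 1, "I": 2, "J": 4, "K": 8,
                 "E": 4, "D": 8, "C": 8, "M": 16, "P": 8, "Q": 16}
TFORM_RE = re.compile(r"\s*(\d*)([LXBIJKAEDCMPQ])")
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def tform_width(tform):
    """Width in bytes of one table cell described by a TFORM value."""
    m = TFORM_RE.match(tform)
    if m is None:
        raise ValueError(f"unrecognised TFORM value {tform!r}")
    repeat = int(m.group(1)) if m.group(1) else 1
    code = m.group(2)
    if code == "X":
        return (repeat + 7) // 8    # bits are padded to whole bytes
    return repeat * ELEMENT_BYTES[code]


def extended_suffix(col, first=999):
    """Base-26 suffix ('AAA' for the first extended column)."""
    hi, rest = divmod(col - first, 26 * 26)
    mid, lo = divmod(rest, 26)
    return LETTERS[hi] + LETTERS[mid] + LETTERS[lo]


def container_matches(header):
    """True if the container width equals the summed extended column widths."""
    icol = header["XT_ICOL"]
    ncol = header["XT_NCOL"]
    container = tform_width(header[f"TFORM{icol}"])
    extended = sum(tform_width(header["TFORM" + extended_suffix(c, icol)])
                   for c in range(icol, ncol + 1))
    return container == extended


# Toy header with three extended columns (data columns 999 to 1001);
# the values are made up for illustration.
toy = {"XT_ICOL": 999, "XT_NCOL": 1001, "TFORM999": "8I",
       "TFORMAAA": "D", "TFORMAAB": "J", "TFORMAAC": "4A"}
print(container_matches(toy))       # True: 16 bytes on both sides
```

In the example header above, TFORM999 = '813I' corresponds to 813 * 2 = 1626 bytes, which tform_width('813I') reproduces.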
>
> This general approach was suggested by William Pence on the FITSBITS
> list in June 2012
> (https://listmgr.nrao.edu/pipermail/fitsbits/2012-June/002367.html),
> and by Francois-Xavier Pineau (CDS) in private conversation in 2016.
> The details have been filled in by Mark Taylor (Bristol).
> (F-X favours a different mechanism for encoding the extended
> column metadata).
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk   +44-117-9288776
> http://www.star.bris.ac.uk/~mbt/
>
> _______________________________________________
> fitsbits mailing list
> fitsbits at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/fitsbits