[fitsbits] BINTABLE convention for >999 columns

William Pence William.Pence at nasa.gov
Sun Jul 30 15:34:34 EDT 2017


Mark,

This seems to me to be a good solution to the particular use case you 
outlined, namely to allow TOPCAT users to temporarily store the results 
from a cross-correlation of 2 FITS tables for later analysis using 
TOPCAT.  This is not intended to be a general solution for supporting 
very wide tables in FITS.  If the FITS community decided that this was a 
serious issue that should be addressed, then I think a much better 
solution would be to just relax the 8-character limit on the length of 
keyword names so that the column number suffix on the keyword name can 
be longer than 3 digits.

As an aside, I think this 8-character limit on keyword names is probably 
the most serious current limitation in the FITS format.  Fixing this by 
allowing free-format 80-character header records where the equals sign 
is no longer required to be in byte 9 would not be difficult to 
implement and support.

-Bill

On 7/28/2017 10:05 AM, Mark Taylor wrote:
> Coming back to this after a bit of a breather:
>
> To summarise the discussion, enthusiasm for my proposed
> convention for wide (>999 column) BINTABLEs has not been
> universal, but I am still planning to implement something
> along these lines for my purposes (STIL/STILTS/TOPCAT).
> The possibility exists of other software deciding to recognise
> such a convention at some point in the future, but I'm not
> relying on that or even necessarily recommending it.
>
> In terms of the details, there was one main difference of opinion,
> namely how to store the column metadata for the 'extended'
> columns in the FITS header.  The suggestion I put forward was
> to use a base-26 number giving headers TFORMAAA - TFORMZZZ,
> which leads to a limit of 18574 columns.  Francois-Xavier
> Pineau suggested instead using the HIERARCH convention,
> which would allow a more or less unlimited column count.
>
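
For readers who want to check the 18574 figure, here is a minimal sketch of the base-26 suffix scheme. It assumes that columns 1 to 998 keep their ordinary keywords and that extended column 999 maps to AAA; that mapping is not spelled out in this message, so treat it as an assumption. The names base26_suffix, LAST_STANDARD and MAX_COLUMNS are made up for illustration.

```python
import string

BASE = 26
LAST_STANDARD = 998  # assumption: columns 1..998 use ordinary TFORMn keywords


def base26_suffix(col):
    """Return the 3-letter suffix for an extended column (assumes 999 -> AAA)."""
    offset = col - (LAST_STANDARD + 1)
    if not 0 <= offset < BASE ** 3:
        raise ValueError(f"column {col} is outside the base-26 range")
    letters = []
    for _ in range(3):
        # Peel off the least significant base-26 digit each time round.
        offset, digit = divmod(offset, BASE)
        letters.append(string.ascii_uppercase[digit])
    return "".join(reversed(letters))


# 998 ordinary columns plus 26**3 three-letter suffixes.
MAX_COLUMNS = LAST_STANDARD + BASE ** 3
```

Under those assumptions the format keyword for column 1000 would be "TFORM" + base26_suffix(1000), and the last addressable column is MAX_COLUMNS.
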
> For concreteness, this HIERARCH-based variant differs from
> my original proposal
> (https://listmgr.nrao.edu/pipermail/fitsbits/2017-July/002967.html)
> in the following way:
>
>     - Metadata for each extended column is encoded with keywords
>       of the form HIERARCH XT XXXXXnnnnn, where XXXXX
>       are the same keyword roots as used for normal BINTABLE extensions,
>       and nnnnn is a decimal number written as usual (no leading zeros,
>       as many digits as required).  Thus the formats for data
>       columns 999, 1000, 1001 etc are declared with the keywords
>       HIERARCH XT TFORM999, HIERARCH XT TFORM1000, HIERARCH XT TFORM1001
>       etc.  Note this uses the ESO HIERARCH convention described at
>       https://fits.gsfc.nasa.gov/registry/hierarch_keyword.html.
>       The "name space" token has been chosen as "XT" (extended table).
>
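
A rough sketch of how a writer might build these keywords, assuming single spaces around the equals sign and string values that contain no single quotes. The function names xt_keyword and xt_card are hypothetical and are not taken from STIL or any other library.

```python
CARD_LENGTH = 80  # a FITS header record is 80 characters long


def xt_keyword(root, column):
    """Keyword name for an extended column, e.g. HIERARCH XT TFORM1000."""
    return f"HIERARCH XT {root}{column}"


def xt_card(root, column, value, comment=""):
    """Build one 80-character header record for a string-valued keyword."""
    card = f"{xt_keyword(root, column)} = '{value}'"
    if comment:
        card += f" / {comment}"
    if len(card) > CARD_LENGTH:
        raise ValueError("card exceeds 80 characters")
    # Pad with trailing blanks to the full record length.
    return card.ljust(CARD_LENGTH)
```

For example, xt_card("TFORM", 1000, "D", "format for column 1000") gives a record beginning with HIERARCH XT TFORM1000.
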
> and the example header looks identical to my original example up
> to TFORM999, but the remaining entries differ:
>
>    TTYPE998= 'var_min_s_2'        /  label for column 998
>    TFORM998= 'D       '           /  format for column 998
>    TUNIT998= 'counts/s'           /  units for column 998
>    TTYPE999= 'XT_MORECOLS'        /  label for column 999
>    TFORM999= '813I    '           /  format for column 999
>    HIERARCH XT TTYPE999         = 'var_min_u_2' / label for column 999
>    HIERARCH XT TFORM999         = 'D' / format for column 999
>    HIERARCH XT TUNIT999         = 'counts/s' / units for column 999
>    HIERARCH XT TTYPE1000        = 'var_prob_h_2' / label for column 1000
>    HIERARCH XT TFORM1000        = 'D' / format for column 1000
>     ...
>    HIERARCH XT TTYPE1203        = 'var_prob_w_2' / label for column 1203
>    HIERARCH XT TFORM1203        = 'D' / format for column 1203
>    HIERARCH XT TTYPE1204        = 'var_sigma_w_2' / label for column 1204
>    HIERARCH XT TFORM1204        = 'D' / format for column 1204
>    HIERARCH XT TUNIT1204        = 'counts/s' / units for column 1204
>    END
>
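
On the reading side, a header like the one above could be handled roughly as follows. This is only a sketch under stated assumptions: cards arrive as strings with no leading blanks, only quoted string values are of interest, and the container column is recognised by its TTYPE value XT_MORECOLS as in the example. The function column_metadata and both regular expressions are illustrative, not part of any existing FITS library.

```python
import re

# Ordinary BINTABLE keywords such as TTYPE998 or TFORM999 (up to 3 digits).
STANDARD = re.compile(r"^(T[A-Z]{4})(\d{1,3})\s*=\s*'([^']*)'")
# Extended-column keywords such as HIERARCH XT TFORM1000.
EXTENDED = re.compile(r"^HIERARCH XT (T[A-Z]{4})(\d+)\s*=\s*'([^']*)'")


def column_metadata(cards):
    """Collect {column: {root: value}} from a list of header card strings."""
    standard, extended = {}, {}
    for card in cards:
        for pattern, target in ((EXTENDED, extended), (STANDARD, standard)):
            match = pattern.match(card)
            if match:
                root, col, value = match.groups()
                target.setdefault(int(col), {})[root] = value.rstrip()
                break
    # When extended metadata are present, drop the container column and
    # let the HIERARCH XT entries describe column 999 onwards.
    merged = {
        col: meta
        for col, meta in standard.items()
        if not (extended and meta.get("TTYPE") == "XT_MORECOLS")
    }
    merged.update(extended)
    return merged
```

With the example header, column 999 would then be reported as var_min_u_2 with format D rather than as the 813I container.
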
> I have implemented and tested both variants, and they both work.
> The HIERARCH solution is a bit messier to do because it relies
> on a non-standard convention.
>
> Summarising the pros and cons of these two variants:
>
>    Base-26:
>     - limited to 18574 columns ...
>       ... but nobody has come up with a plausible case to need more
>     - looks kludgy
>     - not very human readable
>
>    HIERARCH:
>     - requires non-FITS convention (HIERARCH)
>     - effectively no column count limit
>     - 13 or so fewer characters available for column keyword values
>     - easily human readable
>
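
The "13 or so" figure can be checked with a little arithmetic. The sketch below assumes the most compact HIERARCH layout, with a single space on each side of the equals sign; the example header quoted earlier pads the keyword further, which would cost more characters. The variable names are only for illustration.

```python
# Fixed-format card: keyword in bytes 1-8, "= " in bytes 9-10.
standard_prefix = "TTYPE999= "

# Compact HIERARCH cards for a 3-digit and a 4-digit column number.
hierarch_prefix_3 = "HIERARCH XT TTYPE999 = "
hierarch_prefix_4 = "HIERARCH XT TTYPE1000 = "

# Characters no longer available for the value and comment.
lost_3 = len(hierarch_prefix_3) - len(standard_prefix)
lost_4 = len(hierarch_prefix_4) - len(standard_prefix)

print(lost_3, lost_4)
```
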
> The balance of opinion in this thread of those who have expressed
> a preference between the two seems to have been in favour of the
> HIERARCH option (Francois-Xavier Pineau, Bill Pence, Tom McGlynn)
> as opposed to the Base-26 option (me, Rob Seaman, Arnold Rots?).
> In view of that, and the nagging worry that somebody might come
> up with some reason to store 20k+ columns, I think I'm just
> about coming down on the HIERARCH side, though it does look
> less FITSy to me.
>
> This message is to give a last chance for anybody to weigh in
> on one side or the other of the Base-26/HIERARCH question,
> in particular anybody who thinks they might end up one day
> wanting to implement support for this (which may be nobody!).
> If there is no more input on that question (which is fine by me),
> I'll decide one way or the other, implement and release it in
> STIL/STILTS/TOPCAT, and report back here.
>
> Thanks for reading and for the community input on this.
>
> Mark
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
>


