[fitsbits] Fwd: Re: BINTABLE convention for >999 columns

jaffe jaffe at strw.leidenuniv.nl
Sun Jul 30 08:54:11 EDT 2017


I agree with Mark (Calabretta).  This is a kludge.
Walter

-------- Original Message --------
Subject: Re: [fitsbits] BINTABLE convention for >999 columns
Date: 2017-07-29 10:18
From: Mark Calabretta <mark at calabretta.id.au>
To: Mark Taylor <m.b.taylor at bristol.ac.uk>
Cc: fitsbits at listmgr.nrao.edu

Hi Mark,

I will simply register my opinion that what you are proposing to do,
namely shoehorn multiple columns into one, is an unfortunate kludge(*).
It is made more egregious by the fact that a simple extension to the
standard would solve the problem straightforwardly.  Namely, increasing
the maximum permissible value of TFIELDS and using extended indexing, as
I outlined previously, on the existing column-related keywords (not
forgetting the many WCS bintable and pixlist keywords).  Only relatively
minor changes would be required to existing software to support it.

The underlying, unstated, problem here is the perception that it's too
difficult to change the FITS standard.  Understandably, this has led in
the past to calls for a replacement data format.  However, if I were
you I would simply publish what I intended to do, and write files that
violate the standard.  The end result would be the same, namely that
unaware FITS readers would not be able to cope with wide tables.  It
would actually be a benefit if they complained or even crashed in the
attempt, so alerting astronomers to the fact that their software cannot
handle the data.

(*) Reminiscent of DOS MBR extended/logical vs primary disk partitions.

Regards,
Mark Calabretta



On Fri, 28 Jul 2017 15:05:17 +0100 (BST)
Mark Taylor <m.b.taylor at bristol.ac.uk> wrote:

Coming back to this after a bit of a breather:

To summarise the discussion, enthusiasm for my proposed
convention for wide (>999 column) BINTABLEs has not been
universal, but I am still planning to implement something
along these lines for my purposes (STIL/STILTS/TOPCAT).
The possibility exists of other software deciding to recognise
such a convention at some point in the future, but I'm not
relying on that or even necessarily recommending it.

In terms of the details, there was one main difference of opinion,
namely how to store the column metadata for the 'extended'
columns in the FITS header.  The suggestion I put forward was
to use a base-26 number giving headers TFORMAAA - TFORMZZZ,
which leads to a limit of 18574 columns.  Francois-Xavier
Pineau suggested instead using the HIERARCH convention,
which would allow a more or less unlimited column count.
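
The 18574 figure follows from 998 ordinary columns plus 26^3 = 17576
three-letter suffixes.  A minimal sketch of that arithmetic, assuming
(this is not spelled out above) that suffix AAA belongs to column 999
and that suffixes count upwards in plain base 26; the function name
base26_suffix is hypothetical:

```python
import string

FIRST_EXTENDED = 999              # assumption: AAA corresponds to column 999
LETTERS = string.ascii_uppercase  # 'A'..'Z' as base-26 digits


def base26_suffix(col):
    """Return the three-letter suffix for 1-based column index `col`."""
    offset = col - FIRST_EXTENDED
    if not 0 <= offset < 26 ** 3:
        raise ValueError(f"column {col} is outside the AAA..ZZZ range")
    first, rest = divmod(offset, 26 * 26)
    second, third = divmod(rest, 26)
    return LETTERS[first] + LETTERS[second] + LETTERS[third]


# 998 ordinary columns plus 26**3 lettered ones
max_col = FIRST_EXTENDED - 1 + 26 ** 3

print("TFORM" + base26_suffix(999))      # TFORMAAA
print("TFORM" + base26_suffix(max_col))  # TFORMZZZ
print(max_col)                           # 18574
```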

For concreteness, this HIERARCH-based variant differs from
my original proposal
(https://listmgr.nrao.edu/pipermail/fitsbits/2017-July/002967.html)
in the following way:

     - Metadata for each extended column is encoded with keywords
       of the form HIERARCH XT XXXXXnnnnn, where XXXXX
       are the same keyword roots as used for normal BINTABLE extensions,
       and nnnnn is a decimal number written as usual (no leading zeros,
       as many digits as required).  Thus the formats for data
       columns 999, 1000, 1001 etc are declared with the keywords
       HIERARCH XT TFORM999, HIERARCH XT TFORM1000, HIERARCH XT TFORM1001
       etc.  Note this uses the ESO HIERARCH convention described at
       https://fits.gsfc.nasa.gov/registry/hierarch_keyword.html.
       The "name space" token has been chosen as "XT" (extended table).

and the example header looks identical to my original example up
to TFORM999, but the remaining entries differ:

    TTYPE998= 'var_min_s_2'        /  label for column 998
    TFORM998= 'D       '           /  format for column 998
    TUNIT998= 'counts/s'           /  units for column 998
    TTYPE999= 'XT_MORECOLS'        /  label for column 999
    TFORM999= '813I    '           /  format for column 999
    HIERARCH XT TTYPE999         = 'var_min_u_2' / label for column 999
    HIERARCH XT TFORM999         = 'D' / format for column 999
    HIERARCH XT TUNIT999         = 'counts/s' / units for column 999
    HIERARCH XT TTYPE1000        = 'var_prob_h_2' / label for column 1000
    HIERARCH XT TFORM1000        = 'D' / format for column 1000
     ...
    HIERARCH XT TTYPE1203        = 'var_prob_w_2' / label for column 1203
    HIERARCH XT TFORM1203        = 'D' / format for column 1203
    HIERARCH XT TTYPE1204        = 'var_sigma_w_2' / label for column 1204
    HIERARCH XT TFORM1204        = 'D' / format for column 1204
    HIERARCH XT TUNIT1204        = 'counts/s' / units for column 1204
    END
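
The keyword naming in the HIERARCH variant can be sketched as below.
This only illustrates the HIERARCH XT XXXXXnnnnn pattern described
above and is not the STIL implementation.  The helper names xt_keyword
and xt_card are hypothetical, and the single-space "keyword = value /
comment" layout is an assumption (the example header above pads the
keyword to a fixed width instead):

```python
CARD_LENGTH = 80  # every FITS header card is exactly 80 characters


def xt_keyword(root, col):
    """Keyword for an extended column: decimal index, no leading zeros."""
    return f"HIERARCH XT {root}{col}"


def xt_card(root, col, value, comment):
    """Build one 80-character card with a string value (assumed layout)."""
    quoted = value.replace("'", "''")  # FITS doubles embedded single quotes
    card = f"{xt_keyword(root, col)} = '{quoted}' / {comment}"
    if len(card) > CARD_LENGTH:
        raise ValueError("card does not fit in 80 characters")
    return card.ljust(CARD_LENGTH)


for col in (999, 1000, 1001):
    print(xt_card("TFORM", col, "D", f"format for column {col}"))
```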

I have implemented and tested both variants, and they both work.
The HIERARCH solution is a bit messier to do because it relies
on a non-standard convention.

Summarising the pros and cons of these two variants:

    Base-26:
     - limited to 18574 columns ...
       ... but nobody has come up with a plausible case to need more
     - looks kludgy
     - not very human readable

    HIERARCH:
     - requires non-FITS convention (HIERARCH)
     - effectively no column count limit
     - 13 or so fewer characters available for column keyword values
     - easily human readable
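
The "13 or so" figure can be checked by counting the characters that
precede the value on an 80-character card.  A rough sketch, assuming
the HIERARCH card is written with single spaces around the "=" (wider
padding, as in the example header above, costs more); the function
name value_space is hypothetical:

```python
CARD_LENGTH = 80  # fixed length of a FITS header card


def value_space(prefix):
    """Characters left for value and comment after the given prefix."""
    return CARD_LENGTH - len(prefix)


standard = "TTYPE999= "               # 8-character keyword plus "= "
hierarch = "HIERARCH XT TTYPE999 = "  # assumed minimal HIERARCH spacing

print(value_space(standard))                          # 70
print(value_space(hierarch))                          # 57
print(value_space(standard) - value_space(hierarch))  # 13
```

With a four-digit column index the HIERARCH prefix grows by one more
character, so the loss is 14 rather than 13.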

The balance of opinion in this thread of those who have expressed
a preference between the two seems to have been in favour of the
HIERARCH option (Francois-Xavier Pineau, Bill Pence, Tom McGlynn)
as opposed to the Base-26 option (me, Rob Seaman, Arnold Rots?).
In view of that, and the nagging worry that somebody might come
up with some reason to store 20k+ columns, I think I'm just
about coming down on the HIERARCH side, though it does look
less FITSy to me.

This message is to give a last chance for anybody to weigh in
on one side or the other of the Base-26/HIERARCH question,
in particular anybody who thinks they might end up one day
wanting to implement support for this (which may be nobody!).
If there is no more input on that question (which is fine by me),
I'll decide one way or the other, implement and release it in
STIL/STILTS/TOPCAT, and report back here.

Thanks for reading and for the community input on this.

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/

_______________________________________________
fitsbits mailing list
fitsbits at listmgr.nrao.edu
https://listmgr.nrao.edu/mailman/listin


