<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>I should comment that there's no reason this wouldn't compress as
normal using fpack, but the container column would not generally
compress efficiently because of the mixed data types. A future
update to fpack could become wide-table aware if deemed desirable.</p>
<p>Rob</p>
<p>--<br>
</p>
<br>
<div class="moz-cite-prefix">On 7/10/17 9:25 AM, Rob Seaman wrote:<br>
</div>
<blockquote type="cite"
cite="mid:be4c5cd8-c86e-42e0-f3d0-53d0178dea73@lpl.arizona.edu">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<p>Thanks for the info about usage context. Separating the tables
into multiple files or extensions still seems a reasonable way
to address these cases, but since Mark T's proposed convention
(apparently originally from Bill) is legal or near-legal FITS
usage, the main question is how best to discourage a diversity
of keyword encodings, etc.<br>
</p>
<p>Also agree with Mark C's encoding, though would suggest
mono-case will be less of a confusing change than lower case.
Mark C's option avoids confusing usage like TFORM0AA or whatever
interrupting the sort order. A digit in character 6 would
require digits in #7 and 8.</p>
<p>Nobody has mentioned extremely wide table use cases (millions
of columns), and 34695 columns is enough to cover all the wide
table DB options listed in a previous email.</p>
Rob<br>
--<br>
<br>
<div class="moz-cite-prefix">On 7/10/17 8:34 AM, Arnold Rots
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJXToE9PdjouEPs9gpvzc+8+31FeLeQBK7b+82jZxwX+pdap+Q@mail.gmail.com">
<div dir="ltr">
<div>From all the suggestions offered so far, Mark's is by far
the most sensible in my opinion since it provides a
significant expansion while preserving full backward
compatibility.<br>
<br>
</div>
- Arnold<br>
</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">-------------------------------------------------------------------------------------------------------------<br>
Arnold H. Rots
Chandra X-ray Science Center<br>
Smithsonian Astrophysical Observatory
tel: +1 617 496 7701<br>
60 Garden Street, MS 67
fax: +1 617 495 7356<br>
Cambridge, MA 02138
<a href="mailto:arots@cfa.harvard.edu"
target="_blank" moz-do-not-send="true">arots@cfa.harvard.edu</a><br>
USA <a
href="http://hea-www.harvard.edu/%7Earots/"
target="_blank" moz-do-not-send="true">http://hea-www.harvard.edu/~arots/</a><br>
--------------------------------------------------------------------------------------------------------------<br>
<br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">On Fri, Jul 7, 2017 at 8:51 PM, Mark
Calabretta <span dir="ltr">&lt;<a
href="mailto:mark@calabretta.id.au" target="_blank"
moz-do-not-send="true">mark@calabretta.id.au</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Taking
into consideration what others have said on this thread, I
would<br>
like to point out that up to 34695 bintable columns may
easily be<br>
accommodated, with full backward compatibility, via a
simple extension<br>
to the FITS standard. Namely,<br>
<br>
1. When encoding bintable-related keywords such as ijPCna,
allow<br>
lower-case letters to represent digits in a base-36
counting system.<br>
<br>
2. Number bintable columns 1 to 999, followed by a00 to
zzz, where an<br>
offset (-11960) is applied to make a00 column 1000.
The total number<br>
of columns is then 999 + 26*36*36 = 34695.
(Alternatively, the full<br>
range of three-digit base-36 counting, namely 46656,
could be<br>
recovered with a more elaborate ordering.)<br>
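<br>
A minimal Python sketch of the numbering in points 1 and 2 above, assuming
that a suffix beginning with a lower-case letter is read as a three-digit<br>
base-36 value and that columns 1 to 999 keep their decimal form. The helper
names are hypothetical and not part of the proposal.<br>
<pre>
```python
import string

# Digits 0-9 followed by lower-case a-z: the base-36 alphabet described above.
DIGITS = string.digits + string.ascii_lowercase
OFFSET = 11960                    # int("a00", 36) - 1000, so "a00" is column 1000
MAX_COLUMN = 999 + 26 * 36 * 36   # 34695


def column_suffix(col):
    """Return the keyword suffix for a 1-based column index (hypothetical helper)."""
    if col not in range(1, MAX_COLUMN + 1):
        raise ValueError(f"column index out of range: {col}")
    if col in range(1, 1000):
        return str(col)           # columns 1-999 are written in decimal as usual
    value = col + OFFSET
    chars = []
    for _ in range(3):            # exactly three base-36 digits
        value, digit = divmod(value, 36)
        chars.append(DIGITS[digit])
    return "".join(reversed(chars))


def column_index(suffix):
    """Inverse of column_suffix (hypothetical helper)."""
    if suffix.isdigit():
        return int(suffix)        # plain decimal suffix, columns 1-999
    if (len(suffix) != 3
            or suffix[0] not in string.ascii_lowercase
            or any(ch not in DIGITS for ch in suffix)):
        raise ValueError(f"not a valid extended suffix: {suffix!r}")
    return int(suffix, 36) - OFFSET
```
</pre>
With these helpers, column 1000 maps to a00 and column 34695 to zzz, so the
format keyword for column 1000 would be written TFORMa00.<br>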
<br>
Regards,<br>
Mark Calabretta<br>
<div class="HOEnZb">
<div class="h5"><br>
<br>
On Fri, 7 Jul 2017 12:09:15 +0100 (BST)<br>
Mark Taylor &lt;<a
href="mailto:M.B.Taylor@bristol.ac.uk"
moz-do-not-send="true">M.B.Taylor@bristol.ac.uk</a>&gt;
wrote:<br>
<br>
Dear fitsbits,<br>
<br>
I am considering a convention for storing table data
in FITS files<br>
where the number of columns exceeds the 999 limit
implicitly imposed<br>
by the standard BINTABLE extension type. I have
running code for<br>
this (available on request) and plan to incorporate it
in future<br>
releases of STIL/STILTS/TOPCAT so that people can work
with wide<br>
tables in FITS while using those tools. People using
software<br>
that is unaware of this convention would still see a
legal BINTABLE<br>
but not the later columns.<br>
<br>
I'm posting the details here in case people want to
comment,<br>
or point out some major problem with the idea that I
might have<br>
overlooked, or tell me that there's already a
convention for<br>
this out there that I should be using instead.
Otherwise, please<br>
feel free to ignore this post. I'm not requesting
that any<br>
other software implements this, though if anyone wants
to I<br>
certainly don't object.<br>
<br>
Mark<br>
<br>
. . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . .<br>
<br>
Extended column convention for FITS BINTABLE<br>
------------------------------<wbr>--------------<br>
<br>
The BINTABLE extension type as described in the FITS
Standard<br>
(FITS Standard v3.0, sec 7.3) requires table column
metadata<br>
to be described using 8-character keywords of the form
XXXXXnnn,<br>
where XXXXX represents one of an open set of
mandatory, reserved<br>
or user-defined root keywords up to five characters in
length,<br>
for instance TFORM (mandatory), TUNIT (reserved), TUCD
(user-defined).<br>
The nnn part is an integer between 1 and 999
indicating the<br>
index of the column to which the keyword in question
refers.<br>
Since the header syntax confines this indexed part of
the keyword<br>
to three digits, there is an upper limit of 999
columns in<br>
BINTABLE extensions.<br>
<br>
Note that the FITS/BINTABLE format does not entail any
restriction on<br>
the storage of column *data* beyond the 999 column
limit in the data<br>
part of the HDU; the problem is just that client
software<br>
cannot be informed about the layout of this data using
the<br>
header cards in the usual way.<br>
<br>
In some cases it is desirable to store FITS tables
with a column<br>
count greater than 999. Whether that's a good idea is
not within<br>
the scope of this discussion.<br>
<br>
To achieve this, I propose the following convention.<br>
<br>
Definitions:<br>
<br>
- 'BINTABLE columns' are those columns defined using
the<br>
FITS BINTABLE standard<br>
<br>
- 'Data columns' are the columns to be encoded<br>
<br>
- N_TOT is the total number of data columns to be
stored<br>
<br>
- Data columns with (1-based) indexes from 999 to
N_TOT inclusive<br>
are known as 'extended' columns. Their data is
stored<br>
within the 'container' column.<br>
<br>
- BINTABLE column 999 is known as the 'container'
column.<br>
It contains the byte data for all the 'extended'
columns.<br>
<br>
Convention:<br>
<br>
- All column data (for columns 1 to N_TOT) is laid
out in the data part<br>
of the HDU in exactly the same way as if there
were no 999-column<br>
limit.<br>
<br>
- The TFIELDS header is declared with the value 999.<br>
<br>
- The container column is declared in the header with
some<br>
TFORM999 value corresponding to the total field
length required<br>
by all the extended columns ('B' is the obvious
data type, but<br>
any legal TFORM value that gives the right width
MAY be used).<br>
The byte count implied by TFORM999 MUST be equal
to the<br>
total byte count implied by all extended
columns.<br>
<br>
- Other XXXXX999 headers MAY optionally be declared
to describe<br>
the container column in accordance with the
usual rules,<br>
e.g. TTYPE999 to give it a name.<br>
<br>
- The NAXIS1 header is declared in the usual way to
give the width<br>
of a table row in bytes. This is equal to the
sum of the widths of<br>
all the BINTABLE columns as usual. It is also
equal to<br>
the sum of the widths of all the data columns, which has the
same value.<br>
<br>
- Headers for Data columns 1-998 are declared as
usual,<br>
corresponding to BINTABLE columns 1-998.<br>
<br>
- Keyword XT_ICOL indicates the index of the
container column.<br>
It MUST be present with the integer value 999 to
indicate<br>
that this convention is in use.<br>
<br>
- Keyword XT_NCOL indicates the total number of data
columns encoded.<br>
It MUST be present with an integer value equal
to N_TOT.<br>
<br>
- Metadata for each extended column is encoded with
keywords<br>
of the form XXXXXaaa, where XXXXX are the same
keyword roots<br>
as used for normal BINTABLE extensions, and aaa
is a 3-digit<br>
value in base 26 using the characters 'A' (0 in
base 26) to<br>
'Z' (25 in base 26), and giving the 1-based data
column index<br>
minus 999. The sequence aaa MUST be exactly
three characters<br>
long (leading 'A's are required). Thus the
formats for data<br>
columns 999, 1000, 1001, etc are declared with
the keywords<br>
TFORMAAA, TFORMAAB, TFORMAAC etc.<br>
<br>
- This convention MUST NOT be used for N_TOT &lt;= 999.<br>
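<br>
The suffix rule for extended columns can be sketched in Python as follows.
This is only an illustration of the base-26 encoding described above;<br>
the function names are hypothetical and are not taken from STIL.<br>
<pre>
```python
import string

LETTERS = string.ascii_uppercase   # 'A' is 0 in base 26, 'Z' is 25
CONTAINER_INDEX = 999              # value required for XT_ICOL
MAX_N_TOT = 998 + 26 ** 3          # 18574, the largest encodable N_TOT


def extended_suffix(col):
    """Return the 3-letter suffix for a 1-based extended data column index."""
    if col not in range(CONTAINER_INDEX, MAX_N_TOT + 1):
        raise ValueError(f"not an extended column index: {col}")
    value = col - CONTAINER_INDEX  # data column 999 encodes as 0, i.e. AAA
    chars = []
    for _ in range(3):             # always three letters, leading 'A's kept
        value, digit = divmod(value, 26)
        chars.append(LETTERS[digit])
    return "".join(reversed(chars))


def extended_keyword(root, col):
    """Build a keyword such as TFORMAAB from a root of up to five characters."""
    if len(root) > 5:
        raise ValueError(f"keyword root too long: {root!r}")
    return root + extended_suffix(col)
```
</pre>
Here extended_keyword("TFORM", 999) gives TFORMAAA and
extended_keyword("TFORM", 1204) gives TFORMAHX, in line with the example
header in this post.<br>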
<br>
The resulting HDU is a completely legal FITS BINTABLE
extension.<br>
Readers aware of this convention may use it to extract
column<br>
data and metadata beyond the 999-column limit.<br>
Readers unaware of this convention will see 998
columns in their<br>
intended form, and an additional (possibly large)
column 999<br>
which contains byte data but which cannot be easily
interpreted.<br>
<br>
This convention can therefore allow encoding of tables
with data<br>
column counts N_TOT up to 998+26^3 = 18574.<br>
<br>
An example header might look like this:<br>
<br>
<pre>
XTENSION= 'BINTABLE'           / binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional table
NAXIS1  =                 9229 / width of table in bytes
NAXIS2  =                   26 / number of rows in table
PCOUNT  =                    0 / size of special data area
GCOUNT  =                    1 / one data group
TFIELDS =                  999 / number of columns
XT_ICOL =                  999 / index of container column
XT_NCOL =                 1204 / total columns including extended
TTYPE1  = 'posid_1 '           / label for column 1
TFORM1  = 'J       '           / format for column 1
TTYPE2  = 'instrument_1'       / label for column 2
TFORM2  = '4A      '           / format for column 2
TTYPE3  = 'edge_code_1'        / label for column 3
TFORM3  = 'I       '           / format for column 3
TUCD3   = 'meta.code.qual'
...
TTYPE998= 'var_min_s_2'        / label for column 998
TFORM998= 'D       '           / format for column 998
TUNIT998= 'counts/s'           / units for column 998
TTYPE999= 'XT_MORECOLS'        / label for column 999
TFORM999= '813I    '           / format for column 999
TTYPEAAA= 'var_min_u_2'        / label for column 999
TFORMAAA= 'D       '           / format for column 999
TUNITAAA= 'counts/s'           / units for column 999
TTYPEAAB= 'var_prob_h_2'       / label for column 1000
TFORMAAB= 'D       '           / format for column 1000
...
TTYPEAHW= 'var_prob_w_2'       / label for column 1203
TFORMAHW= 'D       '           / format for column 1203
TTYPEAHX= 'var_sigma_w_2'      / label for column 1204
TFORMAHX= 'D       '           / format for column 1204
TUNITAHX= 'counts/s'           / units for column 1204
END
</pre>
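<br>
The width rule for the container column (the byte count implied by TFORM999
MUST equal the total implied by all extended columns) could be checked<br>
along these lines. This is a sketch only: the function names are
hypothetical, the byte sizes are those of the BINTABLE type codes in the<br>
FITS Standard, and the column list in the usage note is made up for
illustration rather than taken from the header above.<br>
<pre>
```python
import re

# Bytes per element for the BINTABLE TFORM type codes.
# 'X' (bit array) is handled separately; 'P' and 'Q' are array descriptors.
BYTES_PER_ELEMENT = {
    "L": 1, "B": 1, "A": 1, "I": 2, "J": 4, "K": 8,
    "E": 4, "D": 8, "C": 8, "M": 16, "P": 8, "Q": 16,
}

# Optional repeat count followed by a single type code, e.g. "813I" or "D".
TFORM_PATTERN = re.compile(r"\s*(\d*)([LXBIJKAEDCMPQ])")


def tform_width(tform):
    """Return the number of bytes one table row uses for a TFORM value."""
    match = TFORM_PATTERN.match(tform)
    if match is None:
        raise ValueError(f"unrecognised TFORM value: {tform!r}")
    repeat = int(match.group(1)) if match.group(1) else 1
    code = match.group(2)
    if code == "X":
        return (repeat + 7) // 8   # bits are packed into whole bytes
    return repeat * BYTES_PER_ELEMENT[code]


def container_is_consistent(container_tform, extended_tforms):
    """Check that the container width equals the summed extended widths."""
    total = sum(tform_width(t) for t in extended_tforms)
    return tform_width(container_tform) == total
```
</pre>
For instance, tform_width("813I") is 1626 bytes, and a hypothetical
container declared as '24B' is consistent with extended columns of formats
D, J, 4A and D (8 + 4 + 4 + 8 bytes).<br>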
<br>
This general approach was suggested by William Pence
on the FITSBITS<br>
list in June 2012<br>
(<a
href="https://listmgr.nrao.edu/pipermail/fitsbits/2012-June/002367.html"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://listmgr.nrao.edu/<wbr>pipermail/fitsbits/2012-June/<wbr>002367.html</a>),<br>
and by Francois-Xavier Pineau (CDS) in private
conversation in 2016.<br>
The details have been filled in by Mark Taylor
(Bristol).<br>
(F-X favours a different mechanism for encoding the
extended<br>
column metadata).<br>
<br>
--<br>
Mark Taylor Astronomical Programmer Physics,
Bristol University, UK<br>
<a href="mailto:m.b.taylor@bris.ac.uk"
moz-do-not-send="true">m.b.taylor@bris.ac.uk</a> <a
href="tel:%2B44-117-9288776" value="+441179288776"
moz-do-not-send="true">+44-117-9288776</a> <a
href="http://www.star.bris.ac.uk/%7Embt/"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://www.star.bris.ac.uk/~<wbr>mbt/</a><br>
<br>
______________________________<wbr>_________________<br>
fitsbits mailing list<br>
<a href="mailto:fitsbits@listmgr.nrao.edu"
moz-do-not-send="true">fitsbits@listmgr.nrao.edu</a><br>
<a
href="https://listmgr.nrao.edu/mailman/listinfo/fitsbits"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://listmgr.nrao.edu/<wbr>mailman/listinfo/fitsbits</a><br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>