[fitsbits] Five FITS Proposals

Rob Seaman seaman at noao.edu
Thu Oct 24 15:06:29 EDT 2013


On Oct 23, 2013, at 10:13 AM, William Pence <William.Pence at nasa.gov> wrote:

> During the recent FITS Birds-of-a-Feather session at the ADASS meeting on Sept 30th, there was some discussion about the shortcomings of the current FITS format.

I missed the BoF but have read the discussion on the associated mailing list.  The participants there are writing up a white paper with extensive not-very-appreciative comments.  It would be best to separate the characterization of the perceived problem space from entertaining possible solutions.  Not only is it likely that there will continue to be disagreement on the nature of the requirements for astronomical data format(s), but it seems to me their comments have already exceeded the flexibility of FITS.

Rather, a prudent course would be to continue to support FITS in its current form, including conservative evolutionary steps building on the registry of conventions.  And *also* pursue a shiny new astronomical data standard meeting needs similar to those discussed.  (And may such a format be half as successful as FITS has been :-)

> This discussion has motivated me to consider what relatively simple changes could be made to FITS to address some of the most commonly heard complaints.  After some discussions with a few of my colleagues, I've drafted 5 specific proposals that should be fairly simple to implement:
> 
> 1.  Allow longer keyword names
> 2.  Allow arbitrarily long character string keyword values
> 3.  Allow additional characters in keyword names
> 4.  Introduce a new FITS version keyword
> 5.  Define a convention for preallocating space for keywords in FITS headers for later use
> 
> The full description of these proposals is available at
> 
> http://fits.gsfc.nasa.gov/proposals/proposals.html

I don't have a strong feeling either way about these, except that the evolution of FITS should be driven by the needs of FITS users, not by the critique of those who neither like nor use FITS.

> I think these proposals would be a good first step towards addressing the most basic problems with FITS.

FITS' weaknesses are often its strengths.  Rather than 5 new conventions to tinker with the current header restrictions, perhaps we should define one convention for embedding metadata in a new extension, as mentioned by Preben and others.  If this were implemented as a binary table it would benefit from the many tools that have already been deployed for handling such tables.  It could also benefit from the proposed tiled-table convention:

	http://arxiv.org/abs/1201.1340

A FITS file could then have a one-record data-less PHDU containing only structural and logistical metadata (CHECKSUM, etc.) followed by a sequence of imaging or tabular data EHDUs and ending with a metadata bintable EHDU containing the equivalent of what are now expressed as header keywords in the primary header.  The metadata for the data EHDUs would be in the same metadata EHDU (or perhaps in separate EHDUs) - a hierarchical tabular data structure could address not only the HIERARCH issue, but also keyword inheritance, etc.

Such a metadata bintable could then address the keyword name and value length restrictions, and could include support for non-ASCII characters (since it would be a binary table).  The preallocation requirement would be met by having the metadata extension at the end of the file, i.e., there would be no need to overwrite any preceding data records to update the metadata.  And if we want version keyword(s) we could add them in either the primary header or the metadata extension header record itself.
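
As a rough illustration of the idea (not a proposed schema), the rows of such a metadata table could look like the sketch below.  The column names, the "PRIMARY"/"SCI1" HDU labels, the dotted keyword names and the example values are all hypothetical; nothing here comes from the FITS standard or a registered convention.  It only shows long names, non-ASCII values, inheritance from the primary level, and updating by appending.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical row layout for a metadata table. These column names are
# assumptions made for illustration only.
@dataclass(frozen=True)
class MetaRecord:
    hdu: str           # HDU the record describes ("PRIMARY" for file-wide values)
    name: str          # keyword name, not limited to 8 characters
    value: str         # value as text, not limited to one 80-character card
    comment: str = ""


def lookup(records: List[MetaRecord], hdu: str, name: str) -> Optional[str]:
    """Return the value of `name` for `hdu`, falling back to PRIMARY.

    The fallback is one simple way to model keyword inheritance.
    """
    for scope in (hdu, "PRIMARY"):
        for rec in records:
            if rec.hdu == scope and rec.name == name:
                return rec.value
    return None


records = [
    MetaRecord("PRIMARY", "observatory.site.name", "Cerro Tololo"),
    MetaRecord("PRIMARY", "filter.description", "Hα narrow band"),
    MetaRecord("SCI1", "detector.amplifier.gain", "2.1"),
]

# An update is an append. With the metadata table as the last HDU of the
# file, this would not require moving any preceding data records.
records.append(MetaRecord("SCI1", "filter.description", "r broad band"))

print(lookup(records, "SCI1", "observatory.site.name"))  # inherited from PRIMARY
print(lookup(records, "SCI1", "filter.description"))     # overridden for SCI1
```

A real table would presumably also need a column for the value's data type and some rule for repeated keywords; both are left out here.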

This notion permits backwards compatibility via straightforward tools that would convert back-and-forth from the current format FITS keywords to metadata records.  FITS then looks like:

	1 primary header record (2880 bytes)
	N binary tables containing data
	1 binary table containing metadata
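
For concreteness, here is a minimal sketch of what the single 2880-byte primary header record could contain.  It uses only the mandatory keywords plus EXTEND; the helper names (card, minimal_primary_header) and the comment strings are made up for this example, and CHECKSUM/DATASUM cards are omitted.

```python
BLOCK = 2880  # FITS logical record size in bytes
CARD = 80     # every header card is 80 ASCII characters


def card(keyword: str, value: str, comment: str = "") -> str:
    """Format one card: keyword in columns 1-8, '= ' in columns 9-10,
    and the value right-justified so that it ends in column 30."""
    text = f"{keyword:<8}= {value:>20}"
    if comment:
        text += f" / {comment}"
    if len(text) > CARD:
        raise ValueError("card longer than 80 characters")
    return text.ljust(CARD)


def minimal_primary_header() -> bytes:
    """Build a data-less primary header padded to one 2880-byte record."""
    cards = [
        card("SIMPLE", "T", "conforms to the FITS standard"),
        card("BITPIX", "8"),
        card("NAXIS", "0", "no primary data array"),
        card("EXTEND", "T", "extensions may follow"),
        "END".ljust(CARD),
    ]
    header = "".join(cards)
    # Pad with ASCII blanks up to a whole number of 2880-byte records.
    n_blocks = -(-len(header) // BLOCK)
    return header.ljust(n_blocks * BLOCK).encode("ascii")


hdr = minimal_primary_header()
print(len(hdr))
```

Everything else that currently lives in the primary header would move into the trailing metadata table.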

This assumes that imaging data are tile-compressed (as why shouldn't they be? :-)  The extension header records would define the tables, support checksums, and not much else.  This format could be prototyped today and would be completely legal FITS with no need for additional conventions, e.g., to support native XML.  (And support already exists for embedding such metadata in tables; we would just need to define the precise table schema to meet the needs of "header" metadata.)
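
A prototype of the back-and-forth conversion between current header keywords and metadata records might start from something like the sketch below.  It is deliberately simplified and the Record layout and function names are hypothetical: it handles only plain "KEYWORD = value / comment" cards, keeps the value as raw text (quotes included), and ignores COMMENT, HISTORY, CONTINUE and HIERARCH cards.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    value: str    # raw value text as it appears in the card, quotes included
    comment: str


def card_to_record(card: str) -> Record:
    """Split one 80-character 'KEYWORD = value / comment' card."""
    if card[8:10] != "= ":
        raise ValueError("not a value card")
    name = card[:8].strip()
    body = card[10:]
    if body.lstrip().startswith("'"):
        # Quoted string: find the closing quote, treating '' as an
        # escaped quote so that a '/' inside the string is not a comment.
        start = body.index("'")
        i = start + 1
        while i < len(body):
            if body[i] == "'":
                if body[i + 1:i + 2] == "'":
                    i += 2
                    continue
                break
            i += 1
        value = body[start:i + 1]
        comment = body[i + 1:].partition("/")[2]
    else:
        value, _, comment = body.partition("/")
    return Record(name, value.strip(), comment.strip())


def record_to_card(rec: Record) -> str:
    """Write a record back as one card; fails where the legacy limits bite."""
    if len(rec.name) > 8:
        raise ValueError("keyword name too long for a classic card")
    if rec.value.startswith("'"):
        text = f"{rec.name:<8}= {rec.value:<20}"   # strings start in column 11
    else:
        text = f"{rec.name:<8}= {rec.value:>20}"   # numbers end in column 30
    if rec.comment:
        text += f" / {rec.comment}"
    if len(text) > 80:
        raise ValueError("record does not fit in one 80-character card")
    return text.ljust(80)


original = ("NAXIS   = " + "2".rjust(20) + " / number of axes").ljust(80)
rec = card_to_record(original)
print(rec)
print(record_to_card(rec) == original)
```

The two ValueError cases in record_to_card are exactly the records that could only live in the metadata table and have no equivalent in the current header format.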

It may well be that the astrodataformat folks won't find it acceptable to constrain data and metadata objects to continue to adhere to the basic FITS architecture.  In that case I think the question has to be whether they should be thinking about "FITS v2" at all, or simply define or adopt a brand new format.

Also, FITS originated as a data interchange format and has been spectacularly successful at this, with a rate of adoption that is the envy of other communities.  There have always been other formats that are used in production workflows (e.g., IRAF .imh images were observed on Cerro Tololo this week).  The role of FITS can return to providing interchange (and archive) support, with other formats like HDF5 being used on the astro battlefields of the 21st century, if native FITS as described still isn't sufficient.

Rob Seaman
NOAO




