[fitsbits] FITS 'keyword dictionaries'

Erik Bray embray at stsci.edu
Wed Apr 2 17:10:15 EDT 2014


On 04/01/2014 04:47 PM, Joe Hourcle wrote:
>
>
> I don't know if this is considered on topic for this mailing list or
> not.  If it isn't, I apologize, and would appreciate guidance on what
> might be a better forum for this question.
>
>
> I've been looking through the different ways that instrument teams in
> solar physics document the meaning of their FITS headers, and there
> doesn't seem to be a standard way to do it within our community.
>
> In looking at the FITS website, there are example 'Keyword Dictionaries' :
>
> 	http://fits.gsfc.nasa.gov/fits_dictionary.html
>
> As an example entry :
>
> 	NAME:         Time.date
> 	KEYWORD:      DATEOBS
> 	DEFAULT:      DATE-OBS
> 	INDEX:        none
> 	HDU:          primary & extension
> 	VALUE:        %s (date)
> 	UNITS:
> 	COMMENT:      Date of observation
> 	EXAMPLE:      '05/04/87'
> 	DESCRIPTION:
> 	    Default date for the observation.  This keyword is generally not used
> 	    and is DATE-OBS keyword for the start of the exposure on the detector
> 	    is used.
>
> I was wondering if this format is officially documented, or is it up to each
> archive to generate however they see fit.
>
> I say this, because I've seen some slight variances ... 'VALUE' may be a
> format specification such as for sprintf ('%d'), with an optional comment
> ('%s (date)'), or an English word ('integer').  Or, the field 'DATATYPE'
> is used in its place.
>
>
> In looking through all of the keyword dictionaries I've found so far, the
> possible fields include:
>
> 	KEYWORD
> 	NAME	     / ucd ?
> 	DEFAULT      / what to use if absent; a value (eg, 1.0) or another keyword
> 	INDEX        / value range of keyword index (eg, 'NAXISn' is 1-9)
> 	REFERENCE    / URL to documentation
> 	STATUS       / 'mandatory', 'reserved'
> 	HDU          / where it's valid, 'any', 'primary', 'table', 'extension', 'image'.
> 	VALUE        / format specification (%s) or 'string', 'integer', etc.
> 	DATATYPE     / 'string', 'integer', 'logical', 'real'.
> 	UNITS        / expected or suggested (assumed?) units for the value
> 	RANGE        / of the format '[min:max]', where max may be absent
> 	EXAMPLE      / example value
> 	COMMENT      / short description (to be used as default comment in FITS card)
> 	DESCRIPTION  / longer descriptive text
>
> Slight variations or other similar formats:
> 	Units/Options / units *or* list of enumerations
> 	Option Flag   / 'Y'=optional, 'N'=required, 'C'=constant
>   	Data Type     / Format specification; 'CHR', 'C20', 'I4', 'R4', etc.
>
>
> ....
>
> The reason that I ask is that I'm trying to:
>
> 	1. Come up with recommendations for groups documenting their data
> 	2. Identify where two projects have incompatible definitions
> 	   for a given keyword.
> 	3. Use the resulting documentation to automatically generate
> 	    database structures to store the HDU, and associated
> 	    search interface
>
> It's that last one that might be tricky, as I'd want to know if a
> given keyword is a constant, enumeration, or infinitely variable; how
> to expand coded values or bitmasks; etc.
>
> It might be that there are other projects working on similar efforts,
> but the only thing that I could find isn't directly related:
>
> 	Vocabularies in the Virtual Observatory
> 	http://www.ivoa.net/documents/latest/Vocabularies.html


Hi Joe,

It's interesting that you bring this up, as I have a *related* project that I
have been circulating among a handful of audiences.  I'll get to that below.

I do not know whether the format of the FITS keyword dictionary is specified
anywhere.  Another frustration I've found with such keyword dictionaries is
that they rarely specify relationships between keywords.  They just provide a
flat list of keywords that might show up in an arbitrary header, and seldom
describe which keywords do or do not belong with specific types of data
products from specific observatories, etc.

I've found the HST keyword dictionary to be incomplete or out of date in many
areas.  ESO seems to do better than most.  Regardless, all of these
dictionaries are disconnected from each other, so finding incompatible
definitions and the like could be challenging.  Really, there are myriad
"custom" formats and conventions out there, and not all of them are well
specified.
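
To illustrate the kind of cross-dictionary comparison you describe, here is a
minimal Python sketch.  It assumes entries in the "FIELD: value" layout from
your example; parse_entry, normalized_type, find_conflicts and the TYPE_ALIASES
mapping are hypothetical names made up for this illustration, not part of any
existing tool, and the mapping from format codes to type names is my own guess.

```python
import re

# A field line looks like "KEYWORD:      DATEOBS"; anything else is treated
# as a continuation of the previous field (e.g. a multi-line DESCRIPTION).
FIELD_RE = re.compile(r"^([A-Z][A-Z_ ]*):\s*(.*)$")

# Assumed mapping from printf-style codes to the English type names.
TYPE_ALIASES = {"%s": "string", "%d": "integer", "%f": "real", "%g": "real"}


def parse_entry(text):
    """Parse one dictionary entry into a {field: value} dict."""
    entry, field = {}, None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = FIELD_RE.match(stripped)
        if match:
            field = match.group(1)
            entry[field] = match.group(2)
        elif field is not None:
            entry[field] = (entry[field] + " " + stripped).strip()
    return entry


def normalized_type(entry):
    """Reduce the VALUE / DATATYPE variants to one comparable type name."""
    raw = entry.get("DATATYPE") or entry.get("VALUE") or ""
    tokens = raw.split()
    token = tokens[0].lower() if tokens else ""
    return TYPE_ALIASES.get(token, token)


def find_conflicts(dict_a, dict_b, fields=("UNITS",)):
    """Compare two {keyword: entry} dictionaries on their shared keywords."""
    conflicts = []
    for keyword in sorted(set(dict_a) & set(dict_b)):
        a, b = dict_a[keyword], dict_b[keyword]
        if normalized_type(a) != normalized_type(b):
            conflicts.append(
                (keyword, "type", normalized_type(a), normalized_type(b)))
        for field in fields:
            if a.get(field, "") != b.get(field, ""):
                conflicts.append(
                    (keyword, field, a.get(field, ""), b.get(field, "")))
    return conflicts
```

Each conflict is reported as a (keyword, field, value_a, value_b) tuple, so
two dictionaries that disagree on the units of the same keyword would show up
as one entry in the returned list.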

Now, as for what I'm working on related to this: I've been experimenting with
a data structure designed for managing collections of rules that specify which
keywords can be used in headers for different data products, the values those
keywords may take, and the relationships between different keywords within a
convention.  I'm calling this a "schema" for FITS headers.  The information
included in a schema is not unlike the keyword dictionary format you described
above, but is focused on exactly which keywords should go with a specific
header.  The work I've done so far implements schemas for the standard FITS
headers, but can easily be extended to describe local conventions.
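
To give a rough idea of what such rules can express, here is a minimal Python
sketch.  It is not the schema API from the prototype linked below; SCHEMA,
validate and the rule fields ("mandatory", "type", "allowed", "requires") are
hypothetical names chosen only for illustration.  The BITPIX values and the
NAXIS/NAXISn relationship follow the FITS standard.

```python
def _type_ok(value, expected):
    # bool is a subclass of int in Python, so handle it separately.
    if isinstance(value, bool):
        return expected is bool
    return isinstance(value, expected)


# Hypothetical rule set for a few standard primary-header keywords.
SCHEMA = {
    "SIMPLE": {"mandatory": True, "type": bool},
    "BITPIX": {"mandatory": True, "type": int,
               "allowed": {8, 16, 32, 64, -32, -64}},
    # Relationship between keywords: NAXIS = n implies NAXIS1..NAXISn.
    "NAXIS": {"mandatory": True, "type": int,
              "requires": lambda hdr: [f"NAXIS{n}"
                                       for n in range(1, hdr["NAXIS"] + 1)]},
}


def validate(header, schema=SCHEMA):
    """Return a list of human-readable problems found in a header dict."""
    errors = []
    for keyword, rule in schema.items():
        if keyword not in header:
            if rule.get("mandatory"):
                errors.append(f"{keyword}: missing mandatory keyword")
            continue
        value = header[keyword]
        if "type" in rule and not _type_ok(value, rule["type"]):
            errors.append(f"{keyword}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{keyword}: value {value!r} not allowed")
            continue
        # Check keywords whose presence depends on this keyword's value.
        for needed in rule.get("requires", lambda hdr: [])(header):
            if needed not in header:
                errors.append(f"{needed}: required by {keyword} but missing")
    return errors
```

For example, validate({"SIMPLE": True, "BITPIX": 16, "NAXIS": 2, "NAXIS1": 100})
reports that NAXIS2 is missing, which is the sort of inter-keyword rule a flat
keyword dictionary cannot express.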

I wasn't intending to broadcast this to the broader community yet, so I won't go 
into too much detail here.  I just decided to go ahead and bring it up since it 
was relevant to your search for related efforts.  The documentation for the 
current prototype can be read here:

http://embray.github.io/PyFITS/schema/users_guide/users_schema.html

A prototypical example of a schema can be seen here:

https://github.com/embray/PyFITS/blob/standard-keywords/lib/pyfits/hdu/base.py#L71

This was originally developed to be part of PyFITS, and one of the most
controversial aspects of it so far has been its close tie to Python.  But there
are good, practical reasons for that, and it does not limit its applicability
(really, it is an enhancement).  It can be used outside of PyFITS, or outside
an entirely Python-based software ecosystem, as a way of organizing
descriptions of FITS keywords.

Best,

Erik



