[fitsbits] FITS 'keyword dictionaries'
Erik Bray
embray at stsci.edu
Wed Apr 2 17:10:15 EDT 2014
On 04/01/2014 04:47 PM, Joe Hourcle wrote:
>
>
> I don't know if this is considered on topic for this mailing list or
> not. If it isn't, I apologize, and would appreciate guidance on what
> might be a better forum for this question.
>
>
> I've been looking through the different ways that instrument teams in
> solar physics document the meaning of their FITS headers, and there
> doesn't seem to be a standard way to do it within our community.
>
> In looking at the FITS website, there are example 'Keyword Dictionaries' :
>
> http://fits.gsfc.nasa.gov/fits_dictionary.html
>
> As an example entry :
>
> NAME: Time.date
> KEYWORD: DATEOBS
> DEFAULT: DATE-OBS
> INDEX: none
> HDU: primary & extension
> VALUE: %s (date)
> UNITS:
> COMMENT: Date of observation
> EXAMPLE: '05/04/87'
> DESCRIPTION:
> Default date for the observation. This keyword is generally not used
> and is DATE-OBS keyword for the start of the exposure on the detector
> is used.
>
> I was wondering if this format is officially documented, or whether it is up to each
> archive to generate it however they see fit.
>
> I say this, because I've seen some slight variations ... 'VALUE' may be a
> format specification such as for sprintf ('%d'), with an optional comment
> ('%s (date)'), or an English word ('integer'). Or, the field 'DATATYPE'
> is used in its place.
>
>
> In looking through all of the keyword dictionaries I've found so far, the
> possible fields include:
>
> KEYWORD
> NAME / ucd ?
> DEFAULT / what to use if absent; a value (eg, 1.0) or another keyword
> INDEX / value range of keyword index (eg, 'NAXISn' is 1-9)
> REFERENCE / URL to documentation
> STATUS / 'mandatory', 'reserved'
> HDU / where it's valid, 'any', 'primary', 'table', 'extension', 'image'.
> VALUE / format specification (%s) or 'string', 'integer', etc.
> DATATYPE / 'string', 'integer', 'logical', 'real'.
> UNITS / expected or suggested (assumed?) units for the value
> RANGE / of the format '[min:max]', where max may be absent
> EXAMPLE / example value
> COMMENT / short description (to be used as default comment in FITS card)
> DESCRIPTION / longer descriptive text
>
> Slight variations or other similar formats:
> Units/Options / units *or* list of enumerations
> Option Flag / 'Y'=optional, 'N'=required, 'C'=constant
> Data Type / Format specification; 'CHR', 'C20', 'I4', 'R4', etc.
>
>
> ....
>
> The reason that I ask is that I'm trying to:
>
> 1. Come up with recommendations for groups documenting their data
> 2. Identify where two projects have incompatible definitions
> for a given keyword.
> 3. Use the resulting documentation to automatically generate
> database structures to store the HDU, and associated
> search interface
>
> It's that last one that might be tricky, as I'd want to know if a
> given keyword is a constant, enumeration, or infinitely variable; how
> to expand coded values or bitmasks; etc.
>
> It might be that there are other projects working on similar efforts,
> but the only thing that I could find isn't directly related:
>
> Vocabularies in the Virtual Observatory
> http://www.ivoa.net/documents/latest/Vocabularies.html
Hi Joe,
It's interesting to me that you should bring this up, as I do have a *related*
project that I have been circulating a bit to a handful of audiences. I'll get
to that a bit below.
I don't know whether the format of the FITS keyword dictionary is
specified anywhere. Another frustration I've found with such keyword
dictionaries is that they rarely specify relationships between
keywords. They just provide a flat list of keywords that might show up in an
arbitrary header, without describing which keywords do or do not belong
with specific types of data products from specific observatories, etc.
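For what it's worth, the flat 'FIELD: value' layout in the example entry quoted
above is simple enough to read mechanically. The following is only a minimal
Python sketch: the `parse_entry` helper is hypothetical, and it assumes that
field names are written in upper case at the start of a line and that any
other line continues the previous field. Real dictionaries vary, as noted
above, so this would need adjusting per archive.

```python
import re

# Assumption: a field starts with an upper-case name followed by a colon.
FIELD_RE = re.compile(r"^([A-Z]+):\s*(.*)$")


def parse_entry(text):
    """Parse one keyword-dictionary entry into a {field: value} dict."""
    entry = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            # New field; its value may be empty (e.g. 'UNITS:').
            current, value = match.groups()
            entry[current] = value
        elif current is not None:
            # Continuation line, e.g. the body of DESCRIPTION.
            entry[current] = (entry[current] + " " + line).strip()
    return entry


sample = """\
NAME: Time.date
KEYWORD: DATEOBS
DEFAULT: DATE-OBS
INDEX: none
HDU: primary & extension
VALUE: %s (date)
UNITS:
COMMENT: Date of observation
EXAMPLE: '05/04/87'
DESCRIPTION:
Default date for the observation.
"""

entry = parse_entry(sample)
print(entry["KEYWORD"], "->", entry["DEFAULT"])
```

A flat parse like this recovers the individual fields, but it still says
nothing about how keywords relate to one another.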
I've found that the HST keyword dictionary is incomplete or out of date in
many areas. ESO's seems to do better than most. Regardless, all of these
dictionaries are disconnected from each other, so finding incompatible
definitions and the like could be challenging. Really, there are myriad "custom"
formats and conventions out there, and not all of them are well specified.
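As a rough illustration of what finding incompatible definitions would involve
once the dictionaries are in a common form, here is a minimal Python sketch. It
assumes each dictionary has already been converted to a mapping from keyword to
a dict of fields. The field names DATATYPE and UNITS come from the list quoted
above; the `find_conflicts` helper and the sample entries are made up for
illustration and do not describe any real project.

```python
def find_conflicts(dict_a, dict_b, fields=("DATATYPE", "UNITS")):
    """Return {keyword: {field: (value_a, value_b)}} where definitions differ."""
    conflicts = {}
    # Only keywords defined by both projects can conflict.
    for keyword in sorted(set(dict_a) & set(dict_b)):
        differing = {}
        for field in fields:
            a = dict_a[keyword].get(field)
            b = dict_b[keyword].get(field)
            # Flag a field only if both sides define it and disagree.
            if a is not None and b is not None and a != b:
                differing[field] = (a, b)
        if differing:
            conflicts[keyword] = differing
    return conflicts


# Hypothetical entries for two projects.
project_a = {
    "EXPTIME": {"DATATYPE": "real", "UNITS": "s"},
    "DATE-OBS": {"DATATYPE": "string"},
}
project_b = {
    "EXPTIME": {"DATATYPE": "real", "UNITS": "ms"},
    "DATE-OBS": {"DATATYPE": "string"},
    "TELESCOP": {"DATATYPE": "string"},
}

conflicts = find_conflicts(project_a, project_b)
print(conflicts)
```

The hard part in practice is the conversion step, since each archive spells
its fields and values differently ('%d' versus 'integer' versus 'I4').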
Now as for what I'm working on related to this: I've been experimenting with a
data structure designed exactly for the purpose of managing collections of rules
that specify what keywords can be used with headers for different data products,
the values those keywords may have associated with them, and also relationships
between different keywords within a convention. I'm calling this a "schema" for
FITS headers. The information included in a schema is not unlike the keyword
dictionary format you described above, but is focused on exactly what keywords
should go with a specific header. The work I've done so far implements schemas
for the standard FITS headers, but can easily be extended to describe local
conventions.
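To make the idea of a schema concrete, here is a minimal, hypothetical Python
sketch. It is not the API of the PyFITS prototype linked below; the `SCHEMA`
layout and the `validate` function are invented for illustration, and the
header is modelled as a plain dict. The rules themselves (SIMPLE, BITPIX and
NAXIS are mandatory in a primary header, BITPIX takes one of a fixed set of
values, and NAXIS = n requires NAXIS1 through NAXISn) come from the FITS
standard.

```python
# Per-keyword rules: expected Python type, whether mandatory, allowed values.
SCHEMA = {
    "SIMPLE": {"type": bool, "mandatory": True},
    "BITPIX": {"type": int, "mandatory": True,
               "allowed": {8, 16, 32, 64, -32, -64}},
    "NAXIS": {"type": int, "mandatory": True},
}


def validate(header, schema=SCHEMA):
    """Return a list of human-readable problems found in a header dict."""
    problems = []
    for keyword, rule in schema.items():
        if keyword not in header:
            if rule.get("mandatory"):
                problems.append(f"{keyword}: missing mandatory keyword")
            continue
        value = header[keyword]
        # bool is a subclass of int in Python, so compare exact types.
        if type(value) is not rule["type"]:
            problems.append(f"{keyword}: expected {rule['type'].__name__}")
        elif "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{keyword}: value {value!r} not allowed")
    # A relationship between keywords: NAXIS = n requires NAXIS1..NAXISn.
    naxis = header.get("NAXIS")
    if type(naxis) is int:
        for n in range(1, naxis + 1):
            if f"NAXIS{n}" not in header:
                problems.append(f"NAXIS{n}: required because NAXIS = {naxis}")
    return problems


header = {"SIMPLE": True, "BITPIX": 12, "NAXIS": 2, "NAXIS1": 100}
problems = validate(header)
for problem in problems:
    print(problem)
```

The per-keyword rules look much like keyword dictionary entries; the
relationship check at the end is the kind of information a flat dictionary
usually leaves out.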
I wasn't intending to broadcast this to the broader community yet, so I won't go
into too much detail here. I just decided to go ahead and bring it up since it
was relevant to your search for related efforts. The documentation for the
current prototype can be read here:
http://embray.github.io/PyFITS/schema/users_guide/users_schema.html
A prototypical example of a schema can be seen here:
https://github.com/embray/PyFITS/blob/standard-keywords/lib/pyfits/hdu/base.py#L71
This was originally developed to be part of PyFITS, and one of the most
controversial aspects of it so far has been its close tie to Python. But there
are good, practical reasons for that, and it does not limit its applicability
(really, I see the Python tie as an enhancement). It can be used outside of
PyFITS, or outside an entirely Python-based software ecosystem, as a way of
organizing descriptions of FITS keywords.
Best,
Erik