[fitsbits] Recommendations for long-term preservation of FITS files?
Joe Hourcle
oneiros at grace.nascom.nasa.gov
Wed Jan 4 13:59:49 EST 2012
(I tried sending this yesterday, but I don't think I had completed all of
the list subscription process, so it didn't go through. If it did, I
apologize in advance for the duplicate messages.)
As a newbie on the list, I don't want to spam it with lots of separate
messages, and these are mostly related:
1. The Library of Congress has a group that's documenting different
formats used in digital preservation, and they don't currently have any
information about FITS:
http://www.digitalpreservation.gov/formats/
They do, however, have this note on each of their lists:
Persons with specialized knowledge are encouraged to review and
comment on the descriptions using this contact form.
http://www.digitalpreservation.gov/formats/contact_format.shtml
I would think it would be useful for someone from the FITS community
to contact them to make sure FITS is properly represented.
2. I've been asked to come up with recommendations for people generating
FITS files in the solar physics community. It's currently in the shape
of a (poorly organized) checklist:
http://sdac.virtualsolar.org/fits_headers/fits_checklist.txt
It was suggested that I mock up some templates for organizations to
emulate, and it may also be useful to look at instructions on how to
use the various FITS-writing tools to achieve the things we want to
place in there (e.g., how do you get each tool to generate checksums?
How do you insert comments for a card, or insert comments before/after
a specific card?) Of course, as I don't actually generate FITS files
personally, I don't know what the capabilities are for the various
tools.
I was wondering if any of the other disciplines have done anything
similar, and if there might be interest in trying to make something
that's more discipline agnostic?
I envision a header template that has a series of logical groupings,
e.g., administrative (identifiers, checksums, responsible party),
instrument description, instrument location, pointing, observation
time, processing applied, a block for instrument-specific stuff, etc.
I imagine that some items are going to be solar-specific, but much of
it would apply to other fields, too.
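The grouped template described in item 2 could be sketched roughly as follows. This is a minimal illustration in plain Python, not the output of any existing FITS-writing tool: the group names follow the list above, while the helper names (`format_card`, `commentary_card`, `build_cards`, `build_header`) and every value such as 'EXAMPLE INSTITUTE' are hypothetical placeholders. It handles only string, logical, and integer values, and it leaves out CHECKSUM/DATASUM, which a real writer would compute over the finished file.

```python
# Simplified sketch: fixed-format 80-character cards for str, bool, int only.
COMMENTARY = ("COMMENT", "HISTORY")

def format_card(keyword, value, comment=""):
    """Format one value card: keyword in columns 1-8, '= ' in 9-10."""
    if len(keyword) > 8:
        raise ValueError(f"keyword too long: {keyword!r}")
    if isinstance(value, bool):
        field = ("T" if value else "F").rjust(20)   # logical in column 30
    elif isinstance(value, int):
        field = str(value).rjust(20)                # right-justified integer
    elif isinstance(value, str):
        # quotes inside strings are doubled; pad to at least 8 characters
        quoted = "'" + value.replace("'", "''").ljust(8) + "'"
        field = quoted.ljust(20)
    else:
        raise TypeError("this sketch handles only str, bool, and int")
    card = f"{keyword.upper():<8}= {field}"
    if comment:
        card += f" / {comment}"                     # per-card comment
    if len(card) > 80:
        raise ValueError(f"card exceeds 80 characters: {keyword}")
    return card.ljust(80)

def commentary_card(keyword, text):
    """Format a COMMENT or HISTORY card (no value indicator)."""
    card = f"{keyword:<8}{text}"
    if len(card) > 80:
        raise ValueError("commentary text too long for one card")
    return card.ljust(80)

# Mandatory cards for a header with no data array.
MANDATORY = [
    ("SIMPLE", True, "conforms to the FITS standard"),
    ("BITPIX", 8, "bits per data value"),
    ("NAXIS", 0, "no data array"),
]

# Logical groupings; all values are hypothetical placeholders.
TEMPLATE = {
    "Administrative": [
        ("ORIGIN", "EXAMPLE INSTITUTE", "institution that wrote this file"),
    ],
    "Instrument description": [
        ("TELESCOP", "EXAMPLE-OBS", "hypothetical observatory"),
        ("INSTRUME", "EXAMPLE-CAM", "hypothetical instrument"),
    ],
    "Observation time": [
        ("DATE-OBS", "2012-01-01T00:00:00", "start of observation (UTC)"),
    ],
    "Processing applied": [
        ("HISTORY", "dark subtraction applied (placeholder entry)", ""),
    ],
}

def build_cards(mandatory, groups):
    """Return the list of cards, with a COMMENT card before each group."""
    cards = [format_card(*entry) for entry in mandatory]
    for name, entries in groups.items():
        cards.append(commentary_card("COMMENT", f"---- {name} ----"))
        for keyword, value, comment in entries:
            if keyword in COMMENTARY:
                cards.append(commentary_card(keyword, value))
            else:
                cards.append(format_card(keyword, value, comment))
    cards.append("END".ljust(80))
    return cards

def build_header(mandatory, groups):
    """Join the cards and pad with spaces to a multiple of 2880 bytes."""
    text = "".join(build_cards(mandatory, groups))
    return text + " " * (-len(text) % 2880)
```

Calling `build_cards(MANDATORY, TEMPLATE)` gives one 80-character string per card, which makes it easy to see where a comment sits relative to a specific card. An organization adapting the idea would replace the placeholder values and add its own instrument-specific group.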
3. At the Fall AGU meeting, I presented a poster in which I looked at the
headers of FITS files from various solar physics data archives:
http://eposters.agu.org/abstracts/review-of-provenance-metadata-in-solar-physics-data-archives/
... and well, there seems to be some ambiguity in how people are using
various keywords. It may be worth clarifying exactly what people
should be putting in 'ORIGIN' (is it the institution, the software, or
both?) and 'DETECTOR' (most give an acronym or name, but one had 'CMOS
1Kx1K', which is useful, but it seems like there might be a better
keyword for model/manufacturer or similar)
(note, I looked it up since sending this the first time -- 'ORIGIN' is
clearly documented to be the institution but IRAF keeps getting mentioned
in there, as either 'KPNO-IRAF' or 'NOAO-IRAF', even from institutions
that aren't NOAO; the FITS 'common' keywords include a definition for
'DETNAME', while NOAO has DETECTOR with the description 'Detector name',
but it's not clear if it should be a unique name or a manufacturer's part
name, etc.)
I'll have to dig through my notes, as I think there were some other
issues but I've already managed to forget most of that week due to
sleep deprivation.
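The 'ORIGIN' ambiguity from item 3 is the kind of thing that can be checked mechanically across an archive. The following is only a sketch in plain Python that works on raw 80-character header cards: the function names are hypothetical, the marker list ("IRAF") is an assumption taken from the examples above, and the sample cards in the usage note are illustrative rather than taken from any particular file.

```python
def parse_string_card(card):
    """Return (keyword, value) for a card holding a quoted string, else None."""
    if card[8:10] != "= ":
        return None                      # no value indicator: commentary card
    keyword = card[:8].strip()
    rest = card[10:].lstrip()
    if not rest.startswith("'"):
        return None                      # not a string value
    chars = []
    i = 1
    while i < len(rest):
        if rest[i] == "'":
            if rest[i + 1:i + 2] == "'":  # doubled quote is a literal quote
                chars.append("'")
                i += 2
                continue
            break                        # closing quote
        chars.append(rest[i])
        i += 1
    return keyword, "".join(chars).rstrip()

def software_in_origin(cards, markers=("IRAF",)):
    """List ORIGIN values that mention one of the given software markers."""
    flagged = []
    for card in cards:
        parsed = parse_string_card(card)
        if parsed is None:
            continue
        keyword, value = parsed
        if keyword == "ORIGIN" and any(m in value.upper() for m in markers):
            flagged.append(value)
    return flagged
```

For example, passing the cards `"ORIGIN  = 'KPNO-IRAF'"` and `"ORIGIN  = 'EXAMPLE INSTITUTE'"` flags only the first. The same parser could be pointed at 'DETECTOR' or 'DETNAME' to list the values actually in use.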
-Joe
-----
Joe Hourcle
Programmer/Analyst
Solar Data Analysis Center
Goddard Space Flight Center