[fitsbits] FITS 'keyword dictionaries'

Joe Hourcle oneiros at grace.nascom.nasa.gov
Thu Apr 3 15:08:45 EDT 2014


On Apr 3, 2014, at 1:28 PM, Tom Kuiper wrote:

> On 04/03/2014 07:51 AM, Joe Hourcle wrote:
>> I was thinking that much like the 'file' command in UNIX, it might be worth making a list of values in use, so that if someone had some random FITS file, they could run it through a program that would attempt to identify the file, and possibly give a reference of where to find more information about that particular instrument. 
> I remember seeing something like that a few years ago when I was working on this issue but had forgotten.  As I recall, the suggestion was to provide a URL for a repository with the detailed information.  I had some concerns about that though.  For example, can we count on the website being maintained?  A complete FITS header, at least, will exist as long as the FITS file exists.

That might've been me.  I sent something out to the fitsbits mailing list in January of last year, asking about the REFERENC keyword, because I was hoping to use it to actually put the URL in the FITS HDU to link to documentation.
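
To make the 'file'-style identification idea a bit more concrete, here's a minimal sketch in Python. It assumes the header is laid out as plain 80-character cards with the keyword in columns 1-8 and '= ' in columns 9-10. The registry, its single entry, and the example.org URL are placeholders I made up for illustration; they are not a real instrument or a real service, and the function names are hypothetical.

```python
# Hypothetical registry: (TELESCOP, INSTRUME) -> where to find documentation.
# The entry below is a placeholder, not a real instrument or URL.
REGISTRY = {
    ("EXAMPLE-SCOPE", "EXAMPLE-CAM"): "https://example.org/docs/example-cam",
}

CARD_LEN = 80  # every FITS header card is 80 characters long


def read_cards(header_bytes):
    """Return {keyword: raw value text} for the value cards before END."""
    text = header_bytes.decode("ascii", errors="replace")
    cards = {}
    for i in range(0, len(text), CARD_LEN):
        card = text[i:i + CARD_LEN]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":  # value indicator in columns 9-10
            cards[keyword] = card[10:]
    return cards


def string_value(raw):
    """Extract a quoted string value, or None if the card holds no string."""
    raw = raw.lstrip()
    if not raw.startswith("'"):
        return None
    out = []
    i = 1
    while i < len(raw):
        if raw[i] == "'":
            # A doubled quote ('') inside the string is one literal quote.
            if raw[i + 1:i + 2] == "'":
                out.append("'")
                i += 2
                continue
            break
        out.append(raw[i])
        i += 1
    return "".join(out).rstrip()


def identify(header_bytes):
    """Return ((TELESCOP, INSTRUME), documentation reference or None)."""
    cards = read_cards(header_bytes)
    key = tuple(string_value(cards.get(k, "")) for k in ("TELESCOP", "INSTRUME"))
    return key, REGISTRY.get(key)
```

A real tool would probably need to match on more than these two keywords, since (as discussed below) groups don't use them consistently.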

One of the other things that I'm trying to do is to come up with some better ways to cite astronomical data that align with the recent Joint Declaration of Data Citation Principles:

	https://www.force11.org/datacitation

(in part, because I had a talk accepted on the topic for the AAS [1] ... and I've also got a poster in SPD on this topic of documenting FITS headers [2] to pressure me to do some work on it)

...

I think the important thing is that we need to be using some sort of persistent URLs to the documentation, rather than just a URL to some PI's website.  It might be possible that we could get AAS as a society to store & host the documentation along with the journal.  It's also possible that the NASA/ADS folks at SAO might consider this to be complementary to their work and agree to maintain either a repository or a redirection service.

(there's also a 'NASA Labs' grant that comes around once a year, from which we might be able to get some seed money (up to $30k) to get it started)



> In the rest of your e-mail, Joe, you give wonderful examples of the varied (mis?)use of supposedly standard keywords.  My inclination is to create much more specific keywords.  The reason is that in radio astronomy, at least, technology is now leading towards hardware which is very adaptable to an observer's specific requirements.  The most egregious example I can think of is the CASPER hardware.  We now have ROACH-1 boards with KATADCs for radio astronomy at each of our three DSN stations.  Observers can bring their own firmware, so that the same hardware may be a spectrometer for one group, a spectro-polarimeter for another, a pulsar timing back-end for a third group, a pulsar search engine for the fourth, and so on.  Our new receivers are configurable too, so that they provide HV polarization or RL polarization as requested, and the IF outputs can be I/Q (in/quadrature phase) or USB/LSB pairs.  The software that I'm trying to write must capture the state of that equipment and write it to FITS headers so the analyst will later know exactly what happened to convert photons to bits.

More specific keywords can help, but they don't warn people of things like 'this team has set DATE_OBS to DATE_END; you'll have to use EXPTIME to derive DATE-BEG' or 'EXPTIME is recorded in milliseconds'.
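
As a rough illustration of those two warnings, here's a sketch of how software might recover the start time once it has been told about a team's conventions. The QUIRKS dictionary and its field names are hypothetical (no such description format exists yet), and the timestamp and exposure value are made up.

```python
from datetime import datetime, timedelta

# Hypothetical per-collection notes; the field names are illustrative only.
QUIRKS = {
    "date_obs_means": "end",     # DATE_OBS holds the end of the exposure
    "exptime_per_second": 1000,  # EXPTIME is recorded in milliseconds
}


def observation_start(date_obs, exptime, quirks):
    """Return the start of the exposure as a datetime."""
    stamp = datetime.fromisoformat(date_obs)
    seconds = exptime / quirks["exptime_per_second"]
    if quirks["date_obs_means"] == "end":
        return stamp - timedelta(seconds=seconds)
    if quirks["date_obs_means"] == "mid":
        return stamp - timedelta(seconds=seconds / 2)
    return stamp  # DATE_OBS is already the start


start = observation_start("2014-04-03T15:08:45", 2500, QUIRKS)
print(start.isoformat())  # 2014-04-03T15:08:42.500000
```

The point is that the correction lives outside the FITS file, so nobody has to rewrite the archive to apply it.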

As for the issue with firmware, the closest concept that I can think of that's more universal is 'observing mode', but I don't know that you'd be able to put anything other than something very broad into OBS_MODE.  I don't know that you can really capture all of the details of the firmware sufficiently without depositing the firmware for inspection or otherwise publishing it, just like you can't have a complete scientific record without depositing the data that's behind a given journal article.


> If I go with a new keyword set (it seems almost unavoidable to me) then TELESCOP, INSTRUME, DETECTOR, and any others in common use could have some cleverly crafted values for traditional software.  I would hope that most analysts will use some software like ASAP that can be easily extended to handle the added keywords.  Observatory staff could help by making libraries available that provide such extensions.

DETNAME is also used for 'sub-instrument' by some groups.  In looking at the CHANDRA docs that Arnold sent, they're one of the groups that use it.

(and I'd also like to say that I really like the text file he linked to ( http://cxc.cfa.harvard.edu/contrib/arots/fits/content.txt ), as it discusses keywords in logical groups, rather than each one individually; they also discuss how to decode filenames, what acronyms mean, etc.)


-Joe




[1 & 2] :

Blah .. abstract central doesn't let you give useful URLs to abstracts ... so here's what I submitted:

======

Data Citation in Astronomy 

Many observatories maintain bibliographies to document their impact and justify their continued funding[1], an effort that requires humans to discover and curate links between the scientific papers and the data that was used as evidence. The "Best Practices for Creating a Telescope Bibliography", endorsed by IAU C5 WG Libraries, recommends full text searching and human examination of each paper.[2] These efforts do not scale well.

It is unlikely that articles published in journals from other disciplines would be found. This is particularly a problem for solar physics, as solar data has applicability in astrophysics, space weather, and even the earth sciences.

As our scientists are not on the editorial boards of the journals from other disciplines, we can't ensure proper attribution to allow these relationships to be discovered via full text searching.

To better deal with tracking cross-discipline data usage, a number of groups have come up with guidelines and principles for data citation. In 2012, the National Academy's Board on Research Data and Information released the report "For Attribution-Developing Data Attribution and Citation Practices and Standards" [3] and it was followed last year by the CODATA-ICSTI report "Out of Cite, Out of Mind".[4]

Participants from a number of groups synthesized a single set of principles for data citation that could be endorsed by all groups involved in research.[5] Implementing these principles can help to improve the scientific ecosystem by giving proper attribution to all contributors to data, improving transparency and reproducibility, and making data more easily reusable to both astronomers and other researchers.

We will present the Joint Declaration of Data Citation Principles, discuss the implications of them for astronomical data, and recommend steps towards implementation.

References:
[1] Accomazzi et al., 2012. http://adsabs.harvard.edu/abs/2012SPIE.8448E..0KA
[2] Bishop, Grothkopf & Lagerstrom, 2012. http://iau-commission5.wikispaces.com/file/view/Best+Practices+Final.pdf
[3] National Research Council, 2012. http://www.nap.edu/catalog.php?record_id=13564
[4] CODATA, 2013. http://dx.doi.org/10.2481/dsj.OSOM13-043
[5] FORCE11, 2014. http://www.force11.org/datacitation

======

Standardizing Documentation of FITS Headers 

Although the FITS file format[1] can be self-documenting, human intervention is often needed to read the headers and write the transformations necessary to make a given instrument team's data compatible with our preferred analysis package. External documentation may be needed to determine the meaning of coded values or unfamiliar acronyms.

Different communities have interpreted keywords slightly differently. This has resulted in ambiguous fields such as DATE-OBS, which could be either the start or mid-point of an observation.[2] 

Conventions for placing units and additional information within the comments of a FITS card exist, but they require re-writing the FITS file. This operation can be quite costly for large archives, and should not be taken lightly when dealing with issues of digital preservation.

We present what we believe is needed for a machine-actionable external file describing a given collection of FITS files. We seek comments from data producers, archives, and those writing software to help develop a single, useful, implementable standard.

References:
[1] Pence et al., 2010. http://dx.doi.org/10.1051/0004-6361/201015362
[2] Rots et al., (in preparation). http://hea-www.cfa.harvard.edu/~arots/TimeWCS/




