[fitsbits] 'Dataset Identifications' postings (digest)

Rob Seaman seaman at noao.edu
Tue Mar 23 16:54:40 EST 2004


Arnold Rots writes:

> Maybe it helps to state the practical purpose of the identifiers.
> It's put in there to inform users as to what dataset identifier to use
> if and when they insert such identifiers into their manuscripts.

Thanks!  Yes, that does help.

> The purpose of that is to facilitate the linkage between the
> literature and the archived datasets.  Those links are currently being
> maintained by a number of data centers (and the ADS) but it is rather
> labor-intensive.  This mechanism would allow for automatic harvesting.

An eminently desirable goal.  This causes me to strengthen my
recommendation that the reserved keyword name(s) be ADSID and ADSIDnnn.
(I imagine a thousand ADS dataset identifiers are sufficient for a
particular FITS HDU - are they?)
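
As a minimal sketch of the naming scheme suggested above (not part of any agreed convention): FITS keyword names are limited to 8 characters, so ADSID plus a three-digit index just fits. The zero-padded 000..999 numbering and the helper name adsid_keyword are assumptions made for illustration only.

```python
MAX_KEYWORD_LEN = 8  # FITS keyword names occupy at most 8 characters


def adsid_keyword(index=None):
    """Return 'ADSID' when no index is given, else 'ADSIDnnn'.

    The zero-padded three-digit form is an assumption; the posting
    only proposes the pattern ADSIDnnn.
    """
    if index is None:
        return "ADSID"
    if not 0 <= index <= 999:
        raise ValueError("index must be in the range 0..999")
    name = f"ADSID{index:03d}"
    # Guard against exceeding the 8-character keyword limit.
    assert len(name) <= MAX_KEYWORD_LEN
    return name


# A thousand distinct indexed keywords per HDU under this assumption.
all_names = [adsid_keyword(i) for i in range(1000)]
```

Under this numbering there is no room for a fourth digit, so a thousand indexed identifiers per HDU is the ceiling for this keyword root.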

> You will find the current list at:
> 	http://vo.ads.harvard.edu/dv/facilities.txt

A very interesting list.  Might I suggest that this list be itself
scrubbed and extended as part of this process?  There is a lot of
confusion about the organizations contained on the list.  For instance,
here are the overtly NOAO related entries:

    KPNO.12m        Kitt Peak National Observatory/12 meter Telescope 
    KPNO.2.1m       Kitt Peak National Observatory/2.1 meter Telescope 
    KPNO.BT         Kitt Peak National Observatory/Bok Telescope 
    KPNO.MAYALL     Kitt Peak National Observatory/Mayall Telescope 
    KPNO.MDMHT      Kitt Peak National Observatory/MDM Hitner Telescope 
    KPNO.MDMMH      Kitt Peak National Observatory/MDM HcGraw-Hill Telescope
    KPNO.MPT        Kitt Peak National Observatory/McMath-Pierce Telescope 
    KPNO.SARA       Kitt Peak National Observatory/Southeastern Association
                         for Reasearch in Astronomy Telescope 
    KPNO.SWT        Kitt Peak National Observatory/Space Watch Telescope 
    KPNO.WIYN       Kitt Peak National Observatory/WYIN,
                         Wisconson-Indiana-Yale-NOAO Telescope

    CTIO.1.5m       Cerro Tololo Inter-American Observatory/1.5 meter Telescope 
    CTIO.2MASS      Cerro Tololo Inter-American Observatory/2MASS Telescope
    CTIO.VBT        Cerro Tololo Inter-American Observatory/Victor Blanco
                          Telescope 
    CTIO.YALO       Cerro Tololo Inter-American Observatory/YALO,
                          Yale-AURA-Lisbon-OU Telescope 

First, note that the "National Optical Astronomy Observatory" is not
mentioned, yet NOAO is likely the legal owner of many data products
resulting from some of these facilities.

Second, note:

    1) that data from KPNO.12m is owned (I would think) by *NRAO* (as is
    the telescope),
    2) that data from KPNO.BT and KPNO.SWT is owned by the University
    of Arizona (or perhaps the state of Arizona),
    3) that data from KPNO.MPT is owned by the National Solar Observatory,
    4) that data from KPNO.MDMHT and KPNO.MDMMH is owned by whoever owned
    MDM during the epoch of the observations in question,
    5) that data from KPNO.SARA is owned by the SARA consortium,
    6) that data from KPNO.WIYN is owned by the WIYN consortium, one
    member of which is NOAO,
    7) that there are two 2MASS telescopes and only one is at CTIO, and
    8) that CTIO.YALO was run by the - you guessed it - YALO consortium
    and has since ceased operations.

It is quite likely that I got some of those nuances wrong myself :-)

There appears to be a confusion between a ground-based observing site
and an observatory - perhaps this is a result of the list being compiled
by our friends in the space-based astronomical community?

In general an observatory is a political entity, a telescope is a facility,
and a site like Kitt Peak is a piece of real estate that may host
multiple facilities from multiple observatories.  Depending on the details
of contracts or other binding operating agreements, an observatory may
"own" the data that result from a particular facility like a telescope,
instrument, archive or pipeline - or that ownership may devolve to a
specific member of some consortium.  In many cases, one imagines that
a funding agency or government or perhaps even the "people of the United
States of America" may ultimately own a particular data product.

So, an example.  NOAO operates twin 8Kx8K mosaic wide field imagers
at its sites on Kitt Peak in Arizona and on Cerro Tololo in Chile.
Depending on the phase of the moon (quite literally :-) the resulting
data may be owned by NOAO or by some instrumentalities associated with
the University of Wisconsin, Indiana University, Yale University and
in the near future perhaps the University of Maryland.  Confounded with
this question of ownership is the issue of proprietary rights.  Time
on NOAO facilities is awarded competitively and the successful PIs are
rewarded with sole access for some period (typically 18 months).

A dataset ID can be a relatively simple beast - perhaps as simple as
a data source ID and a serial number.  But the full taxonomy of dataset
provenance has to support many degrees of freedom.  At the very least:

    Nation
    Funding agency
    Observatory
    Consortium member ("partner")
    Telescope
    Instrument
    Date&Time
    Proposal ID
    PI and/or project ID
    ...
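
The degrees of freedom listed above could be sketched as one record that maps to several header keywords rather than a single one. This is only an illustration: the class name, the keyword names in KEYWORDS, and every example value (including the proposal ID) are invented here and are not taken from the ADS or NOAO schemes.

```python
from dataclasses import dataclass, asdict


@dataclass
class Provenance:
    """One field per degree of freedom in the list above."""
    nation: str
    funding_agency: str
    observatory: str
    partner: str
    telescope: str
    instrument: str
    date_time: str
    proposal_id: str
    pi_or_project: str


# Hypothetical keyword names, each within the 8-character FITS limit.
KEYWORDS = {
    "nation": "NATION",
    "funding_agency": "FUNDING",
    "observatory": "OBSERVAT",
    "partner": "PARTNER",
    "telescope": "TELESCOP",
    "instrument": "INSTRUME",
    "date_time": "DATE-OBS",
    "proposal_id": "PROPID",
    "pi_or_project": "PI",
}


def to_keywords(record):
    """Spread one provenance record over several keyword/value pairs."""
    return {KEYWORDS[name]: value for name, value in asdict(record).items()}


# Invented example values, for illustration only.
example = Provenance(
    nation="USA",
    funding_agency="NSF",
    observatory="WIYN",
    partner="Wisconsin",
    telescope="wiyn",
    instrument="Mosaic",
    date_time="2004-03-23T16:54:40",
    proposal_id="2004A-0000",
    pi_or_project="Example PI",
)
header_values = to_keywords(example)
```

Keeping each degree of freedom in its own keyword means a reader can filter on, say, partner or telescope without parsing a packed identifier string.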

The more I listen to myself talk, the more I convince (myself, anyway :-)
that a single DS_IDENT keyword is a very poor match to the underlying
requirements.  Not only might a single file belong to multiple datasets
certified by a particular entity (like ADS), but it may belong to
multiple other datasets certified by multiple other entities - and more
to the point, the design of the certification process will vary from one
to the next to the next.

In particular, the NOAO Science Archive has been discussing the precise
questions of ownership and proprietary access and has already selected
a subset of fields along the lines of Observatory (NOAO, WIYN, SOAR, etc.),
Partner (NOAO, Wisconsin, Indiana, Yale, Brazil, etc.), Telescope (kp4m,
ct4m, wiyn, soar, etc.), Instrument (too many to list), Date&Time, and
(most similar to the ADS scheme) the NOAO Proposal ID spanning all these
facilities.  Whatever we settle on will never fit within the confines of
any single keyword.  On the other hand, I'd love to *also* include an
ADSID tag to even further constrain the provenance.

Rob Seaman
NOAO Science Data Systems


