[fitsbits] REFERENC keyword, etc.

Joe Hourcle oneiros at grace.nascom.nasa.gov
Tue Jul 21 13:25:37 EDT 2015



On Tue, 21 Jul 2015, William Thompson wrote:

> But even persistent URLs aren't guaranteed to still exist 20 years from now, 
> while published literature is expected to remain around for centuries.
>
> I'm all for including a pointer to an informational web page (i.e. INFO_URL), 
> but this should be backed up with links to the published literature.


I admit, I felt the same way when I first saw the recommendation for 
'actionable DOIs' in citations, but I've come to realize that most people 
have no idea what to do with DOIs or bibcodes other than type them into 
Google.  Of course, that assumes that they don't just gloss right over 
them, not even aware that they're identifiers that could be resolved.

Although PURLs might not persist for 100+ years, there's at least a level 
of commitment.  In most cases, it's a group commitment, so if one 
organization folds, there's still someone else that can take over.

I'm more confident about HTTP than other protocols right now, simply 
because it's been in use for 20+ years at this point, and there are groups 
actively working on crawling and archiving HTTP sites.  Just last month, 
there was a conference on the topic:

 	https://library.columbia.edu/bts/web_resources_collection/Conferences/program.html



...

I want to be able to mint DOIs, but I want to be using resolvable URLs:

 	http://dx.doi.org/10.xxx/xxxxxx

For bibcodes, I'd recommend:

 	http://adsabs.harvard.edu/abs/xxxxx
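
As a rough sketch of what I mean, turning a bare identifier into one of 
these resolvable forms might look like the following.  The function names 
are made up for illustration (they're not from any existing library), and 
the bibcode in the usage note is a made-up placeholder, not a real paper.

```python
from urllib.parse import quote

# Resolver prefixes taken from the two URL forms above.
DOI_RESOLVER = "http://dx.doi.org/"
ADS_RESOLVER = "http://adsabs.harvard.edu/abs/"


def doi_to_url(doi):
    """Return a resolvable URL for a bare DOI (hypothetical helper)."""
    doi = doi.strip()
    # Accept the common 'doi:' prefix that people paste into citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # All DOIs start with the '10.' directory indicator.
    if not doi.startswith("10."):
        raise ValueError("does not look like a DOI: %r" % doi)
    # Keep the '/' between prefix and suffix; escape anything URL-unsafe.
    return DOI_RESOLVER + quote(doi, safe="/")


def bibcode_to_url(bibcode):
    """Return a resolvable URL for an ADS bibcode (hypothetical helper)."""
    bibcode = bibcode.strip()
    # Bibcodes are fixed-width, 19 characters long.
    if len(bibcode) != 19:
        raise ValueError("does not look like a bibcode: %r" % bibcode)
    # Bibcodes can contain '&' (e.g. in journal abbreviations), so escape it.
    return ADS_RESOLVER + quote(bibcode, safe="")
```

With this, doi_to_url("10.5281/zenodo.13802") gives back the dx.doi.org 
URL for that DOI, and a placeholder bibcode such as "2000A&A...000....0X" 
comes out with the '&' escaped as %26.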

...

If we don't trust these two groups to persist, then another option would 
be for the FITS community to set up a resolver that they manage, which 
would be trusted sufficiently to use in FITS headers.

If you go that route, I'd advise hosting it somewhere other than a US 
government site, both due to paperwork and the coming crackdown against 
HTTP on .gov sites.
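
To give a feel for how little such a resolver has to do, here's a minimal 
sketch of the lookup at its core.  Everything in it is an assumption for 
illustration: the table, the function names, and especially the example.org 
target, which is a placeholder and not where any real PURL points.

```python
# Hypothetical resolver table: short identifier -> current documentation URL.
PURL_TABLE = {
    "soho.uvcs": "http://example.org/uvcs/docs/",  # placeholder target
}


def resolve(path, table=PURL_TABLE):
    """Map a request path like '/soho.uvcs' to an HTTP status and headers."""
    key = path.strip("/")
    target = table.get(key)
    if target is None:
        return 404, {}
    # A temporary redirect, so clients keep citing the persistent URL
    # rather than caching the current target.
    return 302, {"Location": target}


def retarget(key, new_url, table=PURL_TABLE):
    """Point an existing identifier at a new location when the docs move."""
    if key not in table:
        raise KeyError(key)
    table[key] = new_url
```

The identifier that ends up in the FITS header never changes; only the 
table entry does, which is the one thing the managing group has to keep up.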

Yes, you could use HTTPS in place of HTTP for identifiers, but I've seen 
so many unmaintained SSL certificates out there that it'd likely cause 
more problems than any benefit gained.

-Joe





> On 07/20/15 18:55, Joe Hourcle wrote:
>> On Mon, 20 Jul 2015, William Thompson wrote:
>> 
>>> Folks:
>>> 
>>> In our community we've been discussing the need for including
>>> information in the FITS header about the instrument that produced the
>>> data.  The FITS standard defines the keyword REFERENC, which is described
>>> as being where the data are published.  We were thinking of using this or
>>> a similar keyword to point to the instrument paper.  However, the
>>> description of REFERENC doesn't appear to be compatible with this usage.
>>> Instead, it appears to be oriented towards a data rights model where
>>> specific observations are considered to belong to a group of researchers
>>> until the data are published.  That's not generally the case in our field.
>> 
>> I had brought up REFERENC for this use a while back, and I was told that
>> it's from the Vizier project: when they extracted the data from the
>> articles, they put a reference back to the paper (in this case, a
>> bibcode) in the resulting FITS files.
>> 
>> 
>>> One keyword that we've recently (tentatively) adopted is INFO_URL to
>>> point to a website where information about the instrument can be found.
>>> This can include copies of the instrument paper(s), user guides, software
>>> manuals, observer's logs, and the like.  However, one must always
>>> consider URLs as ultimately ephemeral, and a standard way to point to the
>>> published literature, which is considered to be more permanent, is highly
>>> desired.  I'm curious to know how other groups have tackled this problem.
>>> 
>>> One possible set of keywords which has occurred to me is:
>>> 
>>> INS_REF Instrument description paper
>>> CAL_REF Instrument calibration paper
>>> 
>>> The instrument calibration paper tends to come well after the instrument
>>> description paper.  I haven't discussed these keywords yet with our
>>> teams, but they seem sensible.  As with REFERENC, these should contain
>>> either the ADS bibcode or the DOI.
>>> 
>>> I'm interested to know what you think,
>> 
>> 
>> I've been trying to get access to mint DOIs through the EOSDIS here at
>> Goddard -- I likely need to ping them again, as I was told they'd have to
>> discuss it with some other people.
>> 
>> As an alternative to DOIs, we could also use PURLs, which are 'persistent
>> URLs'.  Basically, you have a DNS name that's used to redirect people, so
>> that you only have to make sure that one site stays up.  Should the site
>> hosting the documentation go down, you adjust the record at the PURL site
>> to redirect to the new location.
>> 
>> Personally, I'm against hard-linking to the instrument & calibration
>> papers, because they're static -- if you linked to a paper describing the
>> original EIT calibration, it wouldn't contain any information about the
>> degradation that wasn't detected until years later.  I would prefer to see
>> an 'INFO_URL' pointing to a website that the PI team could update.  They
>> could then provide up-to-date links to information about calibration, the
>> user's guide, etc.
>> 
>> For the solar community, I've set up 'http://data.virtualsolar.org/' for
>> PURLs.  It was set up as a stop-gap until I can mint DOIs.*
>> 
>> ...
>> 
>> 
>> If you're planning for the long term, as our community tends to define
>> headers early in the mission, and then avoid changing them, I'd actually
>> like to see slightly different PURLs if there are different types of data
>> released (e.g., if there are multiple detectors, different observing
>> modes, or different types of processing applied).  This way, we can more
>> easily differentiate between them.
>> 
>> We could also use these PURLs to serve metadata about the larger
>> collection of data.  See "Achieving human and machine accessibility of
>> cited data in scholarly publications", which discusses this plus some
>> recommendations on cross-discipline standards:
>>
>>      https://dx.doi.org/10.7717/peerj-cs.1
>> 
>> ...
>> 
>> For more on the arguments against linking to static documents, see the
>> handout from the poster "Linking Articles to Data":
>>
>>      http://dx.doi.org/10.5281/zenodo.13802
>> 
>> 
>> -Joe
>> 
>> 
>> * and currently, only has one entry:
>>
>>      http://data.virtualsolar.org/soho.uvcs
>> 
>> 
>> -----
>> Joe Hourcle
>> Programmer/Analyst
>> Solar Data Analysis Center
>> Goddard Space Flight Center
>> 
>> 
>
> -- 
> William Thompson
> NASA Goddard Space Flight Center
> Code 671
> Greenbelt, MD  20771
> USA
>
> 301-286-2040
> William.T.Thompson at nasa.gov
>


