[fitsbits] REFERENC keyword, etc.

Joe Hourcle oneiros at grace.nascom.nasa.gov
Tue Aug 11 12:41:04 EDT 2015



On Mon, 10 Aug 2015, Demitri Muna wrote:

> Hi,
>
> On Jul 21, 2015, at 1:25 PM, Joe Hourcle <oneiros at grace.nascom.nasa.gov> wrote:
>
>> I'm more confident about HTTP than other protocols right now, simply 
>> because it's been in use for 20+ years at this point, and there are 
>> groups actively working on crawling and archiving HTTP sites.  Just 
>> last month, there was a conference on the topic:
>>
>> 	https://library.columbia.edu/bts/web_resources_collection/Conferences/program.html
>>
>> I want to be able to mint DOIs, but I want to be using resolvable URLs:
>>
>> 	http://dx.doi.org/10.xxx/xxxxxx
>
> There is already a defined method to do this with the "doi" URI scheme:
>
> http://www.doi.org/doi_handbook/2_Numbering.html#2.6.2
> https://tools.ietf.org/html/draft-paskin-doi-uri-04#section-3
>
>> NOTE: Certain client or server software might be able to process DOIs 
>> using native resolution technology (i.e. doi:10.1006/jmbi.1998.2354 
>> would be interpreted by the browser and automatically resolved without 
>> the addition of the proxy server address).
>
> I think this is the correct, future-proof approach.

Does your e-mail render this as a link?

 	doi:10.7717/peerj-cs.1

It's only 'future-proof' in that it's broken now, so when it's broken in 
the future, no one will notice that anything has changed.  It'd be nice if 
it worked in e-mail readers, PDF makers, and the like, but the majority of 
tools have no idea that they're even supposed to do anything with it.

That's why the library community recommends a URL:

 	https://dx.doi.org/10.7717/peerj-cs.1

I admit that there are some issues with disambiguation, as all of the 
following resolve to the same place, so it's not ideal as an identifier:

 	http://dx.doi.org/10.7717/peerj-cs.1
 	https://dx.doi.org/10.7717/peerj-cs.1
 	http://doi.org/10.7717/peerj-cs.1
 	https://doi.org/10.7717/peerj-cs.1
 	http://doi.org/6q5
 	https://doi.org/6q5
 	http://dx.doi.org/6q5
 	https://dx.doi.org/6q5

... but if the goal is to get people from the short string of text in the 
FITS file to the actual document, an HTTP or HTTPS URL is better than a 
DOI URI.  Even if their viewer doesn't automatically make the URL 
clickable, our current generation of scientists has enough familiarity 
with seeing 'http://' all over the place that they know what to do with 
it.
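
For anyone who needs to compare references, the long URL forms listed above
can be collapsed to one key with a few lines of code.  This is only a rough
sketch in Python: the helper names (doi_key, resolver_url) are made up for
illustration, it assumes the only proxy hosts of interest are doi.org and
dx.doi.org, and it ignores percent-encoding.  It relies on DOI names being
case-insensitive.  The short '6q5' forms come out as a different key, since
matching them to the full DOI would still require asking the resolver.

```python
from urllib.parse import urlsplit

# Proxy hosts taken from the list above (assumed to be the only ones we care about).
DOI_PROXY_HOSTS = {"doi.org", "dx.doi.org"}


def doi_key(ref):
    """Reduce a 'doi:' URI or a DOI proxy URL to a bare, lower-cased DOI string.

    Hypothetical helper: returns None when the input is not recognized.
    """
    ref = ref.strip()
    if ref.lower().startswith("doi:"):
        name = ref[4:]
    else:
        parts = urlsplit(ref)
        if parts.scheme not in ("http", "https"):
            return None
        if parts.hostname not in DOI_PROXY_HOSTS:
            return None
        name = parts.path.lstrip("/")
    # DOI names are case-insensitive, so lower-casing gives a stable key.
    return name.lower() or None


def resolver_url(key):
    """Turn a bare DOI string back into something a person can click on."""
    return "https://doi.org/" + key


if __name__ == "__main__":
    for ref in ("http://dx.doi.org/10.7717/peerj-cs.1",
                "https://doi.org/10.7717/peerj-cs.1",
                "doi:10.7717/peerj-cs.1"):
        print(doi_key(ref), resolver_url(doi_key(ref)))
```

The key is what you would compare or index on; the URL is what you would
still show to a human.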

If we come up with a standard that only calls for URIs in the form 
'doi:...', I'll recommend that our community do:

 	REF_INST= 'doi:10.xxx/whatever'
 	COMMENT   http://dx.doi.org/10.xxx/whatever

... or something similar.
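
As a rough illustration of writing that pair of cards, here is a sketch in
plain Python with no FITS library.  The function names (fits_card,
reference_cards) are hypothetical, REF_INST is just the keyword from the
example above and not an agreed standard, and the formatting is simplified:
it pads each card to 80 characters and refuses anything longer instead of
splitting it across several cards.

```python
def fits_card(keyword, value=None, text=""):
    """Format one 80-character header card (simplified sketch)."""
    if value is None:
        # Commentary keyword (e.g. COMMENT): free text starts in column 9.
        card = f"{keyword:<8}{text}"
    else:
        # Value keyword: '= ' in columns 9-10, then a quoted string,
        # with any embedded single quote doubled.
        quoted = "'" + value.replace("'", "''") + "'"
        card = f"{keyword:<8}= {quoted}"
    if len(card) > 80:
        raise ValueError("card is longer than 80 characters")
    return card.ljust(80)


def reference_cards(doi, keyword="REF_INST"):
    """Return the machine-readable 'doi:' card plus a human-friendly URL card."""
    return [
        fits_card(keyword, value="doi:" + doi),
        fits_card("COMMENT", text="  http://dx.doi.org/" + doi),
    ]


if __name__ == "__main__":
    for card in reference_cards("10.7717/peerj-cs.1"):
        print(card.rstrip())
```

The point is only that carrying both forms costs one extra 80-byte card.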

If we're talking about archiving, we should be worrying about making sure 
that the information is usable to both current & future generations, not
worrying about shaving a few bytes off a file.  If we were concerned about 
the size of the headers, then we'd come up with standards to store the 
information so it's normalized to at least 3NF.

-Joe


-----
Joe Hourcle
Programmer/Analyst
Solar Data Analysis Center
Goddard Space Flight Center


