[fitsbits] JSONFITS, a suggestion for rendering FITS files in an Internet standards friendly format

Wed Sep 30 15:16:38 EDT 2015

Hi,

I think this is a very good idea, but I concur that a base library like cfitsio or nom-tam-fits is necessary.
More importantly, I think the basis should be an XSD describing FITS in full detail; this can then be used for JSON or XML.
After that, a FITS stream could be converted into a JSON/XML stream with a minimal buffer.
(That converter would be based on a standard FITS library.)
The resulting stream can then be parsed by any normal parser, but especially by SAX, StAX, or any other event-based JSON/XML parser.
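
As a rough illustration of the minimal-buffer idea, the sketch below reads a FITS header one 2880-byte block at a time and emits one small JSON object per header card, which a line-oriented or event-based consumer could pick up as it arrives. It is only a sketch under assumptions: it uses no FITS library and no XSD, the names iter_header_cards and header_to_json_lines are hypothetical, the line-per-card JSON layout is invented here, and the value parsing is deliberately naive.

```python
import io
import json

BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each header card is 80 ASCII characters


def iter_header_cards(stream):
    """Yield (keyword, raw_value) pairs from one header, one block at a time."""
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            raise ValueError("truncated FITS header")
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            keyword = card[:8].strip()
            if keyword == "END":
                return
            # Value cards carry "= " in columns 9-10. Comments after "/" are
            # dropped naively (a quoted string containing "/" would break this).
            if card[8:10] == "= ":
                value = card[10:].split("/", 1)[0].strip()
                yield keyword, value


def header_to_json_lines(stream, out):
    """Write one JSON object per card; at most one block is held in memory."""
    for keyword, value in iter_header_cards(stream):
        out.write(json.dumps({"keyword": keyword, "value": value}) + "\n")


def make_card(text):
    """Pad a card to 80 characters (helper for the synthetic header below)."""
    return text.ljust(CARD).encode("ascii")


# A synthetic minimal header, padded with blanks to a full block.
cards = [
    make_card("SIMPLE  =                    T / conforms to FITS"),
    make_card("BITPIX  =                    8"),
    make_card("NAXIS   =                    0"),
    make_card("END"),
]
header = b"".join(cards).ljust(BLOCK, b" ")

out = io.StringIO()
header_to_json_lines(io.BytesIO(header), out)
print(out.getvalue())
```

A real converter would sit on top of a standard FITS library, as suggested above, and would handle string values, CONTINUE cards, and the data units as well.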

The potential is big. Just think, for example, of a Jason report based directly on a FITS file ...

    Ritchie
________________________________________
From: fitsbits [fitsbits-bounces at listmgr.nrao.edu] on behalf of Tom McGlynn (NASA/GSFC Code 660.1) [tom.mcglynn at nasa.gov]
Sent: Wednesday, September 30, 2015 5:37 PM
To: Brian McConnell; fitsbits at nrao.edu
Subject: Re: [fitsbits] JSONFITS, a suggestion for rendering FITS files in an Internet standards friendly format

Hi Brian,

I'm curious why you decided that you would do a specialized encoding of
the payload rather than render it in JSON directly. Is this simply a
size decision?  My guess is that floating point numbers might incur a
significant penalty, but I'm not sure about the other types (when you
include compression). Have you looked at this for real FITS files?
If you're in a context where you're using JSON, providing the data in a
JSON format would seem to make sense.  If you only want to provide FITS
metadata, then just encode the headers.
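
One way to look at the size question is to measure it. The sketch below is only an illustration on synthetic data (uniform random doubles, which are close to a worst case for both text rendering and compression), so real FITS payloads may behave quite differently. It compares base64-encoded big-endian doubles with the same values written directly as a JSON array, before and after gzip.

```python
import base64
import gzip
import json
import random
import struct

random.seed(0)  # synthetic data only; not taken from any real FITS file
values = [random.random() for _ in range(10000)]

# Option A: big-endian IEEE doubles (the byte order FITS uses), base64-encoded.
raw = struct.pack(">%dd" % len(values), *values)
as_base64 = base64.b64encode(raw)

# Option B: the same numbers written directly as a JSON array.
as_json = json.dumps(values).encode("ascii")

# Print uncompressed and gzip-compressed sizes for each rendering.
for label, blob in (("raw", raw), ("base64", as_base64), ("json", as_json)):
    print(label, len(blob), len(gzip.compress(blob)))

# Python writes the shortest repr that round-trips, so the direct JSON
# rendering of doubles loses no precision when read back by Python.
assert json.loads(as_json) == values
```

Running the same comparison on integer columns and on real observation data would show whether the penalty is specific to floating point.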

I'm not sure I understand why you would have any loss of precision in
floating point numbers as John Parejko suggests -- if it's just the
quality of the supporting libraries then I don't think that need drive
your standard.   However you would have to worry about loss of precision
of longs with large absolute values where they are not exactly
representable as doubles.  These you could represent as strings.
Variable length records -- particularly for a FITS file that reuses heap
locations -- are probably another difficult area.  While I think the
majority of actual FITS files could be handled very naturally, it's
legal to write FITS files where the same set of bytes is read as an
integer for one row and a float for another.  So in this most general
case an intermediary byte array is required.  Despite these issues, I
think a more complete JSON rendering would be more popular.
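
Both points can be shown in a few lines. The sketch below is illustrative only: encode_long is a hypothetical helper, and the threshold of 2**53 is a conservative choice (some larger integers are exactly representable, but every integer up to that magnitude is). The second half shows the same four bytes read as a float and as an integer, which is why the most general case needs an intermediate byte array.

```python
import json
import struct

SAFE = 2**53  # every integer with |n| <= 2**53 is exactly representable as a double


def encode_long(n):
    """Return n unchanged if a double holds it exactly, otherwise as a string."""
    return n if abs(n) <= SAFE else str(n)


big = 2**53 + 1
# Converting to a double silently rounds to the neighbouring value.
assert int(float(big)) == 2**53

row = {"SMALL": encode_long(42), "BIG": encode_long(big)}
print(json.dumps(row))

# The same four heap bytes interpreted two ways (big-endian, as in FITS).
heap_bytes = struct.pack(">f", 1.0)             # bytes 3F 80 00 00
as_float = struct.unpack(">f", heap_bytes)[0]   # read as a 32-bit float
as_int = struct.unpack(">i", heap_bytes)[0]     # read as a 32-bit integer
print(as_float, as_int)
```

Python's own json module keeps large integers intact, so the string fallback matters mainly for consumers that parse every JSON number as a double.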

     Regards,
     Tom McGlynn

Brian McConnell wrote:
> Hello,
>
> Bill Pence suggested I post here. I'm working on a project to make
> FITS data more accessible to software developers who are not
> necessarily familiar with the format. Specifically, we're looking at
> ways to make SETI observational data available to third party
> developers who may be very knowledgeable about software engineering,
> digital signal processing, etc, but not field specific formats like
> this.
>
> What I am working on is a utility that renders FITS files as JSON, a
> widely used interchange format (think XML without the bloat), with
> base64 encoded binary for payload data. The result is a file that is 7
> bit friendly, can be viewed in any text editor, and is trivial to
> parse. A typical JSONFITS file would look like:
>
> [
> {"TARGNAME":"foo","TELESCOP":"arecibo","BINDATA":"base64encodedblobofdatagoeshere"},
> {"TARGNAME":"bar","TELESCOP":"arecibo","BINDATA":"morebase64encodeddata"}
> ]
>
> Someone consuming this data can then do so very easily, as in the
> Python example:
>
> import base64
> import json
>
> # do_something_with() stands in for the consumer's own processing code
> with open('test.jsn', 'r') as f:
>     blocks = json.load(f)
> for b in blocks:
>     target_name = b.get('TARGNAME', '')
>     telescope = b.get('TELESCOP', '')
>     payload = base64.b64decode(b.get('BINDATA', ''))
>     do_something_with(target_name, telescope, payload)
>
> I should point out that I am not suggesting a new file format, but am
> thinking of this as an output filter for rendering FITS files in an
> internet friendly format.
>
> Now you might ask: won't base64 encoding bloat the file/download size?
> Yes, by about 4:3, but it turns out lossless compression largely
> undoes this, and as on-the-fly gzip compression is a built-in feature
> in most web servers nowadays, this is basically a non-issue (same
> thing for storage if you use a compressed volume). So you can have
> your cake (easy to parse, human readable files) and eat it too
> (an overall footprint similar to that of binaries).
>
> Well, I wanted to put this out there as a discussion item, and see if
> there's other work along these lines underway. My intent with the
> rendering utility is to make it available as a tiny Python library
> that people can use and build on.
>
> Thanks for your time.
>
> Brian McConnell <bsmcconnell at gmail.com>
>
> _______________________________________________
> fitsbits mailing list
> fitsbits at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/fitsbits
