[fitsbits] start of Public Comment Period on compressed FITS image and tables

William Pence William.Pence at nasa.gov
Wed Jul 1 16:52:06 EDT 2015


The most obvious way to make use of multiple cores with tile compressed 
images is to assign a different core to each tile and then uncompress 
multiple tiles in parallel.  CFITSIO does not currently make use of this 
technique, but it could be done.
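
As a rough illustration of the tile-per-core idea above, here is a minimal Python sketch. It is not how CFITSIO works: zlib stands in for whatever codec a tile actually uses, and the function names, worker count, and sample "tiles" are all hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_tiles(tiles):
    """Compress each tile independently (zlib is only a stand-in codec)."""
    return [zlib.compress(tile) for tile in tiles]

def decompress_tiles_parallel(compressed_tiles, max_workers=4):
    """Decompress independent tiles concurrently, preserving tile order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so tiles line up again.
        return list(pool.map(zlib.decompress, compressed_tiles))

# Hypothetical data: four "tiles" of raw bytes.
tiles = [bytes([i]) * 4096 for i in range(4)]
restored = decompress_tiles_parallel(compress_tiles(tiles))
```

Because every tile is compressed on its own, no worker needs state from another tile; that independence is what makes this approach straightforward.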

If I understand correctly, you are suggesting that it might also be 
beneficial to be able to use multiple cores when uncompressing a single 
tile.  This probably could be done and would only require defining one 
or more new compression algorithms that support multiple cores.
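
To make the single-tile case concrete, the length-prefixed block scheme described in the quoted message below might look roughly like this in Python. This is only a sketch under assumptions: the message does not say how wide the block-size integer is or which byte order it uses, so a 4-byte big-endian unsigned integer is assumed, zlib stands in for the real codec, and the 64 KiB block size is arbitrary.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical uncompressed block size

def compress_blocked(data, block_size=BLOCK_SIZE):
    """Write [length][compressed block] pairs, ending with a zero length."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        block = zlib.compress(data[start:start + block_size])
        out += struct.pack(">I", len(block)) + block  # assumed 4-byte prefix
    out += struct.pack(">I", 0)  # zero length marks the end of the stream
    return bytes(out)

def split_blocks(stream):
    """Walk the length prefixes and return the compressed blocks."""
    blocks, pos = [], 0
    while True:
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        if length == 0:
            return blocks
        blocks.append(stream[pos:pos + length])
        pos += length

def decompress_blocked(stream, max_workers=4):
    """Decompress all blocks concurrently and join them in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return b"".join(pool.map(zlib.decompress, split_blocks(stream)))

payload = bytes(range(256)) * 1000
roundtrip = decompress_blocked(compress_blocked(payload))
```

A non-parallel reader can use the same `split_blocks` loop and decompress each block in turn, which matches the remark in the quoted message that a serial implementation stays close to ordinary decompression.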

-Bill


On 7/1/2015 3:21 PM, van Nieuwenhoven, Richard wrote:
> OK, at Tom's request I did some programming to test the benefits of using blocked compression.
> Using Java, I have thrown together a very raw and basic implementation. The results are very promising.
>
> What I did was a very simple extension of the current compression system. The difference is that I write an uncompressed integer (containing the block size) before each block of compressed data, and continue until a block size of 0 is reached.
>
> The speed gain using the fork-join pattern (every block is decompressed by parallel threads) was 33% per extra core; without any optimisation my PC compressed and decompressed 3 times faster. A more sophisticated implementation could probably gain more.
>
> If we specify in the standard that the blocks must fall on row boundaries, the row construction can also be done in parallel.
>
> A non-parallel implementation would still be very similar to the standard decompression.
>
> Any thoughts on this?
>
>        Ritchie
> ________________________________________
> From: fitsbits [fitsbits-bounces at listmgr.nrao.edu] on behalf of van Nieuwenhoven, Richard [Richard.vanNieuwenhoven at adesso.at]
> Sent: Friday, June 26, 2015 7:32 AM
> To: fitsbits at nrao.edu
> Subject: Re: [fitsbits] start of Public Comment Period on compressed FITS image and tables
>
> As a programmer I have another concern: FITS files can get very big
> and will become even bigger in the future. Today's computers gain more
> power by using more cores instead of more speed per core. So it would be
> good if the standard "helps" in the use of multiple cores to process and
> decompress FITS files.
>
> The use of tiles already helps a lot because they can be handled in
> parallel. But the compression algorithms do not help at all because
> most of them cannot use multiple cores to do the job.
>
> One possibility to get around this is to use blocks of compressed data,
> with every block compressed on its own. Another is to have some kind of
> index with multiple entry points into the compressed data. This will be
> difficult to bring in line with the currently used compressions. A
> simple solution for that could be to add a blocked version of every
> compression type, which then uses a predefined block size.
>
> A minor concern is that it would help if there was some kind of index or
> other way to have jump points to the separate HDUs. Currently this is
> only possible by calculating the size from the header data and then
> jumping over the body to the next HDU. This could be solved by adding a
> special index HDU to the end of the file where the entry points of the
> different HDUs are stored.
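
For readers unfamiliar with the size calculation mentioned above, a minimal Python sketch follows. It assumes the header has already been parsed into BITPIX, the NAXISn values, PCOUNT and GCOUNT, and that the header length in bytes is known (a multiple of 2880 in a valid file). The function names are hypothetical, and the special case of random groups (NAXIS1 = 0) is not handled.

```python
from math import prod

FITS_BLOCK = 2880  # FITS files are organised in 2880-byte blocks

def padded_data_bytes(bitpix, naxes, pcount=0, gcount=1):
    """Size of an HDU's data unit in bytes, padded to a whole FITS block."""
    if not naxes:  # NAXIS = 0 means the HDU has no data unit
        return 0
    nbytes = abs(bitpix) // 8 * gcount * (pcount + prod(naxes))
    return -(-nbytes // FITS_BLOCK) * FITS_BLOCK  # round up to 2880

def next_hdu_offset(hdu_start, header_bytes, bitpix, naxes, pcount=0, gcount=1):
    """Byte offset of the following HDU, found by skipping header and data."""
    return hdu_start + header_bytes + padded_data_bytes(bitpix, naxes, pcount, gcount)

# Hypothetical example: a 100 x 100 image of 16-bit integers with a
# one-block header, located at the start of the file.
offset = next_hdu_offset(0, 2880, 16, [100, 100])
```

Reaching the n-th HDU this way means reading every earlier header in sequence, which is the cost an index HDU would avoid.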
>
> These suggestions would enable software to process FITS files a lot
> faster, and as the trend continues (more cores but not much more speed
> per core), the standard should prepare for it.
>
>       Ritchie
>
>
>
> On 2015-06-25 at 16:38, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>> While I'm generally supportive of the compression proposal (at least for
>> images), I feel that the current text reflects the sense of this being a
>> convention rather than part of the standard.  By this I mean that if we
>> are going to support compressed images and tables then they should be
>> incorporated into the standard as first class objects.  The current text
>> makes it clear that these compressed HDUs are compressed
>> representations of virtual uncompressed images and tables.  Implicitly
>> the idea is that the user converts from the compressed image to the
>> uncompressed version and then processes that.  Instead we should
>> recognize that a compressed image is just one of the ways that FITS
>> allows one to store an image, just like a primary image array, an
>> extension image or a vector value in a table.
>>
>> So I would suggest that the ZSIMPLE, ZEXTEND, ZBLOCKED and such keywords
>> be made optional with wording something like:  "If a compressed image is
>> being used to compress an existing FITS image extension, the ZXTENSION
>> keyword MAY be used to contain the value of the original extension."
>> I'd suggest that future use of these keywords be discouraged.
>>
>> The recommended practice would be that users treat the compressed image
>> as the image and not worry about some intermediate image representation.
>>
>>
>> My second major concern with this convention is that it does seem
>> rather ad hoc.  I think that it would be much better if the proposal
>> rigorously separated the algorithmic aspects from the non-algorithmic
>> elements.  A mechanism for how additional compression techniques could
>> be added should be noted.  E.g., the discussion of quantization should be
>> part of the implementation of the lossy compression algorithms and the
>> ZQUANTIZ parameter should probably be one of the ZVALn, ZNAMEn
>> elements.  Table 36 should be titled something like:
>>    Supported Compression Algorithms
>> with the first column being the name of the compression algorithm, the
>> second the value of ZCMPTYPE, and also including the ZNAMEs used
>> (flagging the critical ones).
>>
>> I've almost no insight into table compression.  Given that no one seems
>> to be using this convention, my suggestion would be that it's premature
>> to add it to the standard.
>>
>> Overall I suspect that the tiling capabilities are going to be
>> increasingly essential for handling large images, so that at least that
>> much needs to be made part of the standard.  However I don't feel this
>> text is ready to be finalized.
>>
>>      Regards
>>      Tom McGlynn
>>
>> Lucio Chiappetti wrote:
>>> ANNOUNCEMENT:  START OF FORMAL PUBLIC COMMENT PERIOD
>>>
>>> This is to announce the official start of a 3-week formal Public
>>> Comment Period on the incorporation of the Tiled Image Compression and
>>> Tiled Table Compression conventions in the FITS Standard.
>>>
>>> This is part of a process to incorporate the most useful and widely
>>> used registered conventions (which are valid FITS constructs) into the
>>> official definition of the standard.
>>>
>>> Among these, the two compression conventions benefit from common
>>> handling. Given their relative complexity they are better discussed
>>> first, before other easier conventions.
>>>
>>> The proposed text consists
>>>
>>> - in the ADDITION of an entire new chapter (10)  to the FITS Standard
>>>    Document which describes the two conventions in a common prescriptive
>>>    framework.
>>> - It also includes the ADDITION of a new non-prescriptive Appendix I,
>>> - plus the addition of the necessary bibliographic references,
>>>
>>>    and has been prepared by a technical team including L.Chiappetti,
>>>    W.Pence, A.Dobrzycki, R.A.Shaw and W.Thompson (main editor Dick Shaw).
>>>
>>> - If the proposal is approved also Appendix C will be updated listing the
>>>    new keywords, and a section H.3 will be added to Appendix H describing
>>>    the updates and the differences with the registered convention.
>>>
>>>    All the updates are shown in blue colour in their current context (with
>>>    the exception of the NEW chapter 10 which is black)
>>>
>>> The proposed draft text is available at
>>> http://sax.iasf-milano.inaf.it/~lucio/FITS/Conventions/compression-upd2.pdf
>>>
>>>
>>> Supporting material is provided in the FITS Convention Registry at the
>>> following URLs
>>> http://fits.gsfc.nasa.gov/registry/tilecompression.html
>>> http://fits.gsfc.nasa.gov/registry/tiletablecompression.html
>>>
>>> Considering that the convention(s) have been in use for several
>>> years, are legal FITS, were discussed on FITSBITS when the conventions
>>> were entered in the Registry and therefore their usage is well proven
>>> (also as far as interoperability is concerned), the Public Comment
>>> Period is reduced to 3 weeks.
>>>
>>> Also the review by FITS Working Group Executive can be speeded up and
>>> handled in parallel or quickly after the conclusion of the Public
>>> Comment Period.
>>>
>>> Please review the text carefully and post any comments, criticisms, or
>>> suggestions on the FITSBITS mailing list (not on iauwfg or elsewhere)
>>> ==================================================================
>>>
>>> The Public Comment Period starts today 16 June 2015 and will last
>>> formally for 3 weeks until July 6
>>>
>>> ==================================================================
>>> Background information on the FITS approval process
>>>
>>> Under the "Rules and Procedures" of the IAU FITS Working Group,
>>> http://fits.gsfc.nasa.gov/iaufwg/iaufwg_rules.html, the first step in
>>> the official approval process of any FITS proposal will be a formal
>>> Public Comment Period to take place on the FITSBITS mailing list.
>>> After that the IAU FITS Working Group Executive will review the
>>> results. Following that the IAU FITS Working Group will then conduct a
>>> final vote to approve or disapprove the proposal.
>>>
>>
>> _______________________________________________
>> fitsbits mailing list
>> fitsbits at listmgr.nrao.edu
>> https://listmgr.nrao.edu/mailman/listinfo/fitsbits
>
>
>


-- 
____________________________________________________________________
Dr. William Pence    Astrophysicist     William.Pence at nasa.gov
NASA/GSFC Code 662     [Emeritus]       +1-301-286-4599 (voice)
Greenbelt MD 20771                      +1-301-286-1684 (fax)


