[fitsbits] start of Public Comment Period on compressed FITS image and tables

van Nieuwenhoven, Richard Richard.vanNieuwenhoven at adesso.at
Wed Jul 22 01:21:29 EDT 2015


Ok, Bill convinced me that the tiles are already so small by default
that there is no need for further splitting. The default tile size in
cfitsio is already one row (or 16 rows in the case of hcompress). My
lack of experience with compressed tiles led to my wrong assumption. So
using multiple cores to (de)compress multiple tiles of one image at
once is enough to keep all the CPU's cores busy.
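
To make the point concrete, here is a minimal Python sketch of per-tile
parallelism. It is not cfitsio code: zlib stands in for the real tile
codecs, and the name decompress_tiles and the one-row tiles are
assumptions made only for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decompress_tiles(compressed_tiles, max_workers=None):
    """Decompress independently compressed tiles in parallel, keeping order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so tile i stays at index i.
        return list(pool.map(zlib.decompress, compressed_tiles))


# Hypothetical image: 8 rows of 1024 bytes, one row per tile.
rows = [bytes([i]) * 1024 for i in range(8)]
tiles = [zlib.compress(row) for row in rows]
restored = decompress_tiles(tiles)
```

Because every tile is compressed on its own, no worker needs data from
another tile, which is why no further splitting is required.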

But I have two questions that came up while reading the compression
code.

1. Since 2005 the largest BITPIX values are -64 (double) and 64 (long).
Since the value range of keywords has been extended, should there not
also be support for -128 and 128? I do not know whether that kind of
precision will be necessary in the next ~10 years or whether it is
overkill. Note that these are a lot more difficult to implement, so
they should only be included if they are really necessary.

2. As a coder and a poor reader of mathematical text, could I ask
somebody to check whether the description of the compression
algorithms needs to be extended for BITPIX=64? I noticed that, for
example, the Rice compression algorithm in cfitsio directly supports
only 8, 16 and 32 bit. If there is a need for -128 and 128, then
these should be checked as well.
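
As a rough illustration of the two questions, the sketch below lists
the BITPIX values the FITS Standard currently defines and flags the
integer width that falls outside what I saw in the Rice code. The set
RICE_DIRECT_BITS is taken from my observation above and has not been
verified against the cfitsio source; the function names are
hypothetical.

```python
# BITPIX values defined by the FITS Standard: value -> (meaning, bytes per element)
BITPIX_TYPES = {
    8: ("unsigned integer", 1),
    16: ("signed integer", 2),
    32: ("signed integer", 4),
    64: ("signed integer", 8),
    -32: ("IEEE single-precision float", 4),
    -64: ("IEEE double-precision float", 8),
}

# Assumption from this message: integer widths Rice handles directly in cfitsio.
RICE_DIRECT_BITS = {8, 16, 32}


def element_bytes(bitpix):
    """Return the size in bytes of one data element, rejecting unknown BITPIX."""
    if bitpix not in BITPIX_TYPES:
        raise ValueError(f"BITPIX={bitpix} is not defined by the FITS Standard")
    return abs(bitpix) // 8


def rice_needs_check(bitpix):
    """True for a valid integer BITPIX wider than the directly supported widths."""
    element_bytes(bitpix)  # validates the value
    return bitpix > 0 and bitpix not in RICE_DIRECT_BITS
```

With these definitions only BITPIX=64 is flagged, and a value such as
128 is rejected because it is not part of the current standard.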

    Ritchie

On 2015-07-06 at 21:37, William Pence wrote:
> OK, why don't we continue talking about this privately and maybe do some
> experiments to see if this technique proves useful.
> 
> -Bill
> 
> On 7/6/2015 3:04 PM, van Nieuwenhoven, Richard wrote:
>> the thing I am aiming at is to use the blocking for two purposes:
>> 1. decompression of one tile using multiple threads
>> 2. skipping a part of the tile if it is not needed
>>
>> The normal compression algorithms can be used, but just on separate
>> series of rows instead of on the whole tile at once.
>>
>> The blocking could be specified by a suffix or prefix to the algorithm
>> specification.
>>
>> The integer for the block size and a short for the number of rows in
>> the block are the only special features of the blocking type.
>>
>> This way the software reading the FITS files can get optimal
>> performance from all the threads.
>>
>>       Ritchie
>> ________________________________________
>> From: William Pence [William.Pence at nasa.gov]
>> Sent: Monday, July 06, 2015 5:08 PM
>> To: van Nieuwenhoven, Richard; fitsbits at nrao.edu
>> Subject: Re: [fitsbits] start of Public Comment Period on compressed
>> FITS image and tables
>>
>> I'm confused.  If you are talking about the details of a new type of
>> multi-threaded compression algorithm, this is not something we would
>> want to try to implement immediately, so this is not really relevant to
>> the current discussion about the compression convention itself.  We
>> could perhaps continue this discussion offline...
>>
>> But on the other hand, the current convention as described does record
>> all the keywords necessary to determine which rows and columns of pixels
>> are stored in each tile.  This allows software to skip over the tiles
>> that are not needed when reading just a part of the image or table.
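
The tile-skipping described above can be sketched in Python for a 2-D
image. ZNAXISn and ZTILEn are the convention's keywords for image and
tile sizes; the function name, the 0-based pixel coordinates, and the
assumption that tiles are stored with axis 1 varying fastest are
choices made for this illustration only.

```python
import math


def tiles_for_region(znaxis, ztile, first, last):
    """Return 0-based tile indices that overlap a rectangular pixel region.

    znaxis: (ZNAXIS1, ZNAXIS2) image size; ztile: (ZTILE1, ZTILE2) tile size.
    first, last: inclusive 0-based (x, y) corners of the requested region.
    Assumes tiles are numbered with axis 1 varying fastest.
    """
    tiles_x = math.ceil(znaxis[0] / ztile[0])  # tiles along axis 1
    tx0, tx1 = first[0] // ztile[0], last[0] // ztile[0]
    ty0, ty1 = first[1] // ztile[1], last[1] // ztile[1]
    return [ty * tiles_x + tx
            for ty in range(ty0, ty1 + 1)
            for tx in range(tx0, tx1 + 1)]


# Row-by-row tiling of a 1000 x 800 image: only rows 10..12 are needed.
row_tiles = tiles_for_region((1000, 800), (1000, 1), (100, 10), (199, 12))
# 32 x 32 tiles on a 100 x 100 image: the region straddles four tiles.
square_tiles = tiles_for_region((100, 100), (32, 32), (30, 30), (40, 40))
```

Here row_tiles is [10, 11, 12] and square_tiles is [0, 1, 4, 5]; every
other tile can be skipped without being decompressed.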
>>
>> -Bill
>>
>> On 7/6/2015 1:27 AM, van Nieuwenhoven, Richard wrote:
>>> Hi,
>>>
>>> maybe it would be good also to include the number of rows. So one
>>> unsigned integer for the size and one unsigned short for the number of
>>> rows. In case the program only needs to read a part of the table/image
>>> it can just fast forward over the blocks that would be skipped anyway.
>>>
>>>        Ritchie
>>>
>>> On 2015-07-02 at 06:37, van Nieuwenhoven, Richard wrote:
>>>> Yes, by defining a "blocked" variant of every compression type. Or
>>>> just add a prefix/suffix to the compression algorithm identifier;
>>>> that way the case of a new compression type is also clearly defined.
>>>> The number of blocks is free for the user to define, but normally 16
>>>> would be sufficient for most cases.
>>>>
>>>> In the attachment, as requested, a visual description.
>>>>
>>>>       Ritchie
>>>>
>>>>
>>>> On 2015-07-01 at 22:52, William Pence wrote:
>>>>> The most obvious way to make use of multiple cores with tile
>>>>> compressed images is to assign a different core to each tile and
>>>>> then uncompress multiple tiles in parallel.  CFITSIO does not
>>>>> currently make use of this technique, but it could be done.
>>>>>
>>>>> If I understand correctly, you are suggesting that it might also be
>>>>> beneficial to be able to use multiple cores when uncompressing a
>>>>> single tile.  This probably could be done and would only require
>>>>> defining one or more new compression algorithms that support
>>>>> multiple cores.
>>>>>
>>>>> -Bill
>>>>>
>>>>>
>>>>> On 7/1/2015 3:21 PM, van Nieuwenhoven, Richard wrote:
>>>>>> OK, at Tom's request I did some programming to test the benefits of
>>>>>> using blocked compression. Using Java I have thrown together a very
>>>>>> raw and basic implementation. The results are very promising.
>>>>>>
>>>>>> What I did was a very simple extension of the current compression
>>>>>> system. The difference is that I write an uncompressed integer
>>>>>> (containing the block size) before the compressed data of each
>>>>>> block, and continue doing that until a length of 0 marks the end.
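
The framing described above can be sketched as follows. This is not
the Java test code mentioned in the message: it is a Python
illustration in which zlib stands in for the tile codec, and the
4-byte big-endian length field and all function names are assumptions.

```python
import struct
import zlib


def write_blocked(data, block_size):
    """Compress data in independent blocks, each preceded by its length."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        block = zlib.compress(data[start:start + block_size])
        out += struct.pack(">I", len(block)) + block
    out += struct.pack(">I", 0)  # a length of 0 terminates the stream
    return bytes(out)


def split_blocks(stream):
    """Walk the length prefixes and return the compressed blocks, unopened."""
    blocks, pos = [], 0
    while True:
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        if length == 0:
            return blocks
        blocks.append(stream[pos:pos + length])
        pos += length


def read_blocked(stream):
    """Serial reader: decompress every block and join the results."""
    return b"".join(zlib.decompress(block) for block in split_blocks(stream))
```

Since split_blocks only reads the length prefixes, the blocks it
returns could be handed to separate threads, or skipped, without
touching the compressed bytes of their neighbours.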
>>>>>>
>>>>>> The speed gain using the fork/join pattern (every block is
>>>>>> decompressed by parallel threads) was 33% per extra core; without
>>>>>> any optimisation my PC compressed and decompressed 3 times faster.
>>>>>> A more sophisticated implementation should probably gain even
>>>>>> more.
>>>>>>
>>>>>> If we specify in the standard that the blocks must be on row
>>>>>> boundaries, the row construction can also be done in parallel.
>>>>>>
>>>>>> A non-parallel implementation would still be very similar to the
>>>>>> standard decompression.
>>>>>>
>>>>>> any thoughts on this?
>>>>>>
>>>>>>          Ritchie
>>>>>> ________________________________________
>>>>>> From: fitsbits [fitsbits-bounces at listmgr.nrao.edu] on behalf of van
>>>>>> Nieuwenhoven, Richard [Richard.vanNieuwenhoven at adesso.at]
>>>>>> Sent: Friday, June 26, 2015 7:32 AM
>>>>>> To: fitsbits at nrao.edu
>>>>>> Subject: Re: [fitsbits] start of Public Comment Period on compressed
>>>>>> FITS image and tables
>>>>>>
>>>>>> As a programmer I have another concern: FITS files can get very
>>>>>> big and will become even bigger in future. Today's computers gain
>>>>>> more power by using more cores instead of more speed per core. So
>>>>>> it would be good if the standard "helped" in the use of multiple
>>>>>> cores to process and decompress FITS files.
>>>>>>
>>>>>> The use of tiles already helps a lot because they can be handled in
>>>>>> parallel. But the compression algorithms do not help at all,
>>>>>> because most of them cannot use multiple cores to do the job.
>>>>>>
>>>>>> One possibility to get around this is to use blocks of compressed
>>>>>> data where every block is compressed by itself, or to have some
>>>>>> kind of index with multiple entry points into the compressed data.
>>>>>> This will be difficult to bring in line with the currently used
>>>>>> compressions. A simple solution for that could be to add a blocked
>>>>>> version of every compression type, which then uses a predefined
>>>>>> block size.
>>>>>>
>>>>>> A minor concern is that it would help if there were some kind of
>>>>>> index or other way to have jump points to the separate HDUs.
>>>>>> Currently this is only possible by calculating the size from the
>>>>>> header data and then jumping over the body to the next HDU. This
>>>>>> could be solved by adding a special index HDU to the end of the
>>>>>> file where the entry points of the different HDUs are stored.
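
The size calculation mentioned above, which a reader currently has to
repeat for every HDU it wants to jump over, looks roughly like this in
Python. The formula is the one given in the FITS Standard for
extensions; the function name is hypothetical, and random-groups
structures are not covered by this sketch.

```python
BLOCK = 2880  # FITS files are organised in 2880-byte blocks


def data_unit_bytes(bitpix, naxes, pcount=0, gcount=1):
    """Size in bytes of an HDU's data unit, padded to whole FITS blocks.

    naxes is the tuple (NAXIS1, ..., NAXISm); an empty tuple means NAXIS = 0.
    """
    if not naxes:  # NAXIS = 0: the HDU has no data unit
        return 0
    npix = 1
    for n in naxes:
        npix *= n
    nbytes = abs(bitpix) // 8 * gcount * (pcount + npix)
    return -(-nbytes // BLOCK) * BLOCK  # round up to a block boundary


image_size = data_unit_bytes(16, (100, 100))          # 16-bit 100 x 100 image
table_size = data_unit_bytes(8, (24, 10), pcount=100)  # small binary table
```

Here image_size is 20160 and table_size is 2880. The offset of the
next HDU is the current header's start plus its padded header length
plus this value, so reaching the n-th HDU means parsing n-1 headers
first, which is what an index HDU would avoid.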
>>>>>>
>>>>>> These suggestions would enable software to process FITS files a
>>>>>> lot faster, and as the trend continues (more cores but not much
>>>>>> more speed per core), the standard should prepare for it.
>>>>>>
>>>>>>         Ritchie
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2015-06-25 at 16:38, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>>>>>>> While I'm generally supportive of the compression proposal (at
>>>>>>> least for images), I feel that the current text reflects the sense
>>>>>>> of this being a convention rather than part of the standard.  By
>>>>>>> this I mean that if we are going to support compressed images and
>>>>>>> tables then they should be incorporated into the standard as first
>>>>>>> class objects.  The current text makes it clear that these
>>>>>>> compressed HDUs are compressed representations of virtual
>>>>>>> uncompressed images and tables.  Implicitly the idea is that the
>>>>>>> user converts from the compressed image to the uncompressed
>>>>>>> version and then processes that.  Instead we should recognize that
>>>>>>> a compressed image is just one of the ways that FITS allows one to
>>>>>>> store an image, just like a primary image array, an extension
>>>>>>> image or a vector value in a table.
>>>>>>>
>>>>>>> So I would suggest that the ZSIMPLE, ZEXTEND, ZBLOCKED and such
>>>>>>> keywords be made optional, with wording something like:  "If a
>>>>>>> compressed image is being used to compress an existing FITS image
>>>>>>> extension, the ZTENSION keyword MAY be used to contain the value
>>>>>>> of the original extension." I'd suggest that the future use of
>>>>>>> these keywords be discouraged.
>>>>>>>
>>>>>>> The recommended practice would be that users treat the compressed
>>>>>>> image as the image and not worry about some intermediate image
>>>>>>> representation.
>>>>>>>
>>>>>>>
>>>>>>> My second major concern with this convention is that it does seem
>>>>>>> rather ad hoc.  I think that it would be much better if the
>>>>>>> proposal rigorously separated the algorithmic aspects from the
>>>>>>> non-algorithmic elements.  A mechanism for how additional
>>>>>>> compression techniques could be added should be noted.  E.g., the
>>>>>>> discussion of quantization should be part of the implementation of
>>>>>>> the lossy compression algorithms, and the ZQUANTIZ parameter
>>>>>>> should probably be one of the ZVALn, ZNAMEn elements.  Table 36
>>>>>>> should be titled something like:
>>>>>>>      Supported Compression Algorithms
>>>>>>> with the first column being the name of the compression algorithm,
>>>>>>> the second the value of ZCMPTYPE, and also including the ZNAMEs
>>>>>>> used (flagging the critical ones).
>>>>>>>
>>>>>>> I've almost no insight into table compression.  Given that no one
>>>>>>> seems to be using this convention, my suggestion would be that
>>>>>>> it's premature to add it to the standard.
>>>>>>>
>>>>>>> Overall I suspect that the tiling capabilities are going to be
>>>>>>> increasingly essential for handling large images, so that at least
>>>>>>> that much needs to be made part of the standard.  However I don't
>>>>>>> feel this text is ready to be finalized.
>>>>>>>
>>>>>>>        Regards
>>>>>>>        Tom McGlynn
>>>>>>>
>>>>>>> Lucio Chiappetti wrote:
>>>>>>>> ANNOUNCEMENT:  START OF FORMAL PUBLIC COMMENT PERIOD
>>>>>>>>
>>>>>>>> This is to announce the official start of a 3-week formal Public
>>>>>>>> Comment Period on the incorporation of the Tiled Image
>>>>>>>> Compression and Tiled Table Compression conventions in the FITS
>>>>>>>> Standard.
>>>>>>>>
>>>>>>>> This is part of a process to incorporate the most useful and
>>>>>>>> widely used registered conventions (which are valid FITS
>>>>>>>> constructs) into the official definition of the standard.
>>>>>>>>
>>>>>>>> Among these, the two compression conventions benefit from common
>>>>>>>> handling. Given their relative complexity, they are better
>>>>>>>> discussed first, before other easier conventions.
>>>>>>>>
>>>>>>>> The proposed text consists of
>>>>>>>>
>>>>>>>> - the ADDITION of an entire new chapter (10) to the FITS Standard
>>>>>>>>   Document, which describes the two conventions in a common
>>>>>>>>   prescriptive framework,
>>>>>>>> - the ADDITION of a new non-prescriptive Appendix I,
>>>>>>>> - plus the addition of the necessary bibliographic references,
>>>>>>>>
>>>>>>>> and has been prepared by a technical team including L.Chiappetti,
>>>>>>>> W.Pence, A.Dobrzycki, R.A.Shaw and W.Thompson (main editor Dick
>>>>>>>> Shaw).
>>>>>>>>
>>>>>>>> If the proposal is approved, Appendix C will also be updated to
>>>>>>>> list the new keywords, and a section H.3 will be added to
>>>>>>>> Appendix H describing the updates and the differences from the
>>>>>>>> registered convention.
>>>>>>>>
>>>>>>>> All the updates are shown in blue colour in their current context
>>>>>>>> (with the exception of the NEW chapter 10, which is black).
>>>>>>>>
>>>>>>>> The proposed draft text is available at
>>>>>>>> http://sax.iasf-milano.inaf.it/~lucio/FITS/Conventions/compression-upd2.pdf
>>>>>>>>
>>>>>>>> Supporting material is provided in the FITS Convention Registry
>>>>>>>> at the following URLs
>>>>>>>> http://fits.gsfc.nasa.gov/registry/tilecompression.html
>>>>>>>> http://fits.gsfc.nasa.gov/registry/tiletablecompression.html
>>>>>>>>
>>>>>>>> Considering that the convention(s) have been in use for several
>>>>>>>> years, are legal FITS, and were discussed on FITSBITS when the
>>>>>>>> conventions were entered in the Registry, so that their usage is
>>>>>>>> well proven (also as far as interoperability is concerned), the
>>>>>>>> Public Comment Period is reduced to 3 weeks.
>>>>>>>>
>>>>>>>> The review by the FITS Working Group Executive can also be sped
>>>>>>>> up and handled in parallel with, or quickly after, the conclusion
>>>>>>>> of the Public Comment Period.
>>>>>>>>
>>>>>>>> Please review the text carefully and post any comments,
>>>>>>>> criticisms, or suggestions on the FITSBITS mailing list (not on
>>>>>>>> iaufwg or elsewhere)
>>>>>>>> ==================================================================
>>>>>>>>
>>>>>>>> The Public Comment Period starts today 16 June 2015 and will last
>>>>>>>> formally for 3 weeks until July 6
>>>>>>>>
>>>>>>>> ==================================================================
>>>>>>>> Background information on the FITS approval process
>>>>>>>>
>>>>>>>> Under the "Rules and Procedures" of the IAU FITS Working Group,
>>>>>>>> http://fits.gsfc.nasa.gov/iaufwg/iaufwg_rules.html, the first
>>>>>>>> step in the official approval process of any FITS proposal will
>>>>>>>> be a formal Public Comment Period to take place on the FITSBITS
>>>>>>>> mailing list. After that the IAU FITS Working Group Executive
>>>>>>>> will review the results. Following that, the IAU FITS Working
>>>>>>>> Group will conduct a final vote to approve or disapprove the
>>>>>>>> proposal.
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> fitsbits mailing list
>>>>>>> fitsbits at listmgr.nrao.edu
>>>>>>> https://listmgr.nrao.edu/mailman/listinfo/fitsbits
>>>>>>
>>>>>> -- 
>>>>>> BSc Richard van Nieuwenhoven
>>>>>> Software Architekt
>>>>>>
>>>>>> adesso Austria GmbH
>>>>>> floridotower 26. Stock              T +43 1 2198790-0
>>>>>> Floridsdorfer Hauptstr. 1           F +43 1 2198790-13
>>>>>> A-1210 Wien                         H +43 664 88614710
>>>>>>                                     E richard.vannieuwenhoven at adesso.at
>>>>>>                                     www.adesso.at
>>>>>> -------------------------------------------------------------
>>>>>>             >>> business. people. technology. <<<
>>>>>> -------------------------------------------------------------
>>>>>> adesso Austria GmbH mit Sitz in Wien
>>>>>> Handelsgericht Wien FN231467v
>>>>>>
> 




