[fitsbits] 16-bit floats
Thomas Robitaille
thomas.robitaille at gmail.com
Fri Jul 25 06:10:44 EDT 2025
Hi everyone,
Thanks for all the feedback! And thanks Paul for offering to write up a
case, let me know if I can help. A few thoughts/comments regarding things
people have mentioned here:
First, I think we should focus on 16-bit floats and set aside 128-bit
floats for now, since the latter are much more complicated: most languages
don't truly support 128 bits of precision but fall back to 80 or 64 bits
depending on the platform, even for types advertised as 128-bit. 16-bit
floats are far more straightforward, because languages either represent
them natively or can use 32-bit floats in memory without any loss of
precision.
I think adding 16-bit float support would have the biggest immediate
impact, and I was only asking about 128-bit floats at the same time out of
curiosity.
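To make that last point concrete, here is a tiny numpy check (not part of
the astropy PR, just an illustration) showing that promoting binary16 to
binary32 and back is exact, so languages without a native half type lose
nothing by reading the data into 32-bit floats:

    import numpy as np

    # Enumerate every possible binary16 bit pattern.
    all_bits = np.arange(2**16, dtype=np.uint32).astype(np.uint16)
    halves = all_bits.view(np.float16)

    # Promote to 32-bit and demote again; for all finite values the
    # round trip is exact, so nothing is lost by upcasting on read.
    finite = halves[np.isfinite(halves)]
    assert np.array_equal(finite, finite.astype(np.float32).astype(np.float16))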
Here is a prototype implementation of 16-bit float support for Primary and
ImageHDU extensions in astropy, which is largely independent of CFITSIO (we
only use CFITSIO for some of the tile compression/decompression
algorithms):
https://github.com/astrofrog/astropy/pull/139
Obviously this needs tests, and similar changes to support 16-bit floats in
tables, but it is a pretty simple change (2 lines added!). I don't think
supporting 16-bit floats will be very difficult for most packages compared
to some other changes, and languages that don't natively support 16-bit
floats can simply use 32-bit floats in memory instead.
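For anyone wondering what the change boils down to, the essence is just
extending the mapping from BITPIX to an in-memory type. A rough sketch of
the idea (this is not the actual astropy diff, and the value -16 for half
precision is my assumption by analogy with -32 and -64):

    import numpy as np

    # BITPIX -> numpy dtype; FITS data are big-endian on disk.
    BITPIX_TO_DTYPE = {
        8: np.uint8,
        16: np.int16,
        32: np.int32,
        64: np.int64,
        -16: np.float16,  # proposed half-precision code (assumed here)
        -32: np.float32,
        -64: np.float64,
    }

    def data_dtype(bitpix):
        """Return the big-endian numpy dtype for a given BITPIX value."""
        return np.dtype(BITPIX_TO_DTYPE[bitpix]).newbyteorder(">")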
The main use cases that interest me have been mentioned in this thread:
- Radio cubes, which can be very large (hundreds of GB or more in the
future) and can in principle have high dynamic range, say 1e6 or more.
Because of that dynamic range, 16-bit integers with BSCALE/BZERO aren't
really an option, so currently one would have to use 32-bit floats.
However, in many cases 16-bit floats would, if available, provide enough
significant digits while still covering a large dynamic range (see the
short numerical sketch after this list).
- HiPS datasets, which can be very large (TB or more). If the tiles are
stored in FITS, 16-bit floats would in general be sufficient for data
visualization, and would be faster to access than having to decompress
compressed data. (And of course there is a combination of this and radio
data: the proposed HiPS3D format can in principle be used to serve
gigantic datasets.)
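To put some rough numbers on the precision/range trade-off mentioned in the
radio-cube item above (a quick numpy check, nothing astropy-specific):
binary16 keeps roughly three significant digits over its whole normal
range, from about 6e-5 up to about 6.5e4, whereas 16-bit integers with a
linear BSCALE/BZERO give only 65536 evenly spaced levels, so faint emission
gets quantized away once the dynamic range approaches 1e6.

    import numpy as np

    info = np.finfo(np.float16)
    print(info.max, info.tiny, info.eps)  # ~65504, ~6.1e-5, ~9.8e-4

    # Relative error stays at the ~1e-3 level over many orders of magnitude.
    values = np.array([1e-4, 1e-2, 1.0, 1e2, 1e4])
    stored = values.astype(np.float16).astype(np.float64)
    print(np.abs(stored - values) / values)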
An alternative is to use e.g. RICE compression on 32-bit floats, but this
has a large computational cost at these data volumes. I did some
preliminary tests with the astropy 16-bit float implementation above on a
'small' radio cube that was 7 GB in 32-bit floats: writing out the data as
16-bit floats was 15x faster than RICE-compressing it, and accessing the
full data was at least 2x faster with 16-bit floats than with the
RICE-compressed 32-bit data. The biggest performance benefit of 16-bit
floats, however, would be that tools could access only the exact data that
is needed (via e.g. memory-mapping, or range requests when dealing with
remote data) without having to decompress tiles. For instance, accessing a
single spaxel in my example cube is again almost 15x faster with 16-bit
floats than with the compressed data. Obviously this will change somewhat
depending on the tiling parameters, but that is in fact another benefit:
whereas with tiled compression it is often impossible to optimise all use
cases with a single chunk size/shape, there is no such choice to make for
16-bit floats.
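For what it's worth, the spaxel-access pattern I have in mind is simply
this (the file name and pixel indices are made up, and reading a float16
BITPIX of course requires the prototype branch above; memmap=True is
standard astropy):

    import numpy as np
    from astropy.io import fits

    with fits.open("cube_float16.fits", memmap=True) as hdul:
        data = hdul[0].data  # memory-mapped; nothing is read yet
        # Basic slicing touches only the pages holding this spaxel,
        # one small read per spectral plane rather than the whole cube.
        spectrum = np.array(data[:, 123, 456])
    print(spectrum.dtype, spectrum.shape)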
Regarding comments about the FITS working group and bringing in new people,
with my astropy.io.fits maintainer hat on I would be happy to volunteer if
there was indeed interest in getting new people on board.
Cheers,
Tom
On Fri, 25 Jul 2025 at 03:08, Barrett, Paul via fitsbits <
fitsbits at listmgr.nrao.edu> wrote:
> No, FITS does not need to wait on the sub-16-bit standard. I see no use
> for it in the foreseeable future. I mention it only to emphasize that if
> one thinks 16-bit floats are not useful, then sub-16-bit floats are even
> less so.
>
> Note that the use of 16-bit floats is kind of a chicken and egg problem.
> If there is no astronomical data format to store them, then no one is going
> to write software to use them. And if no one writes software to use them,
> then there are no use cases to justify including them in an astronomical
> data format.
>
> I'll try to find the time late next week to write up a use case. I've got
> some deadlines next week.
>
> -- Paul
>
>
> On Thu, Jul 24, 2025 at 9:46 PM Seaman, Robert Lewis - (rseaman) <
> rseaman at arizona.edu> wrote:
>
>> In that case, it should be easy to write up a few specific astronomical
>> use cases. And collect citations to pertinent literature.
>>
>>
>>
>> Does FITS need to wait on the new sub-16-bit standard before taking
>> action?
>>
>>
>>
>> Rob
>>
>> On 7/24/25, 5:08 PM, "fitsbits" wrote:
>>
>> Malcolm:
>>
>>
>>
>> Just because we are old does not mean that we are not working on
>> state-of-the-art software. This is exactly why we want 16-bit floats:
>> radio astronomy software would benefit from them.
>>
>>
>>
>> All:
>>
>> As I noted previously, the IEEE is working on sub-16-bit (<= 15 bit)
>> floating-point formats for AI. There is clearly a need for such data types
>> in the machine learning community. Currently, hardware vendors are using
>> various sub-16-bit formats; the new standard is meant to converge on a
>> common one, just as IEEE 754 did decades ago.
>>
>>
>>
>> -- Paul
>>
>> On Thu, Jul 24, 2025 at 6:32 PM Malcolm J. Currie via fitsbits <
>> fitsbits at listmgr.nrao.edu> wrote:
>>
>> Being (allegedly) retired and having had little contact with the
>> astronomical programming world post-COVID, I am not surprised that I too
>> was unaware of 16-bit floats before this discussion.
>>
>> My sympathies are with Walter's view, as I'm inclined to regard FITS as
>> an interchange format rather than a processing format. My bias is partly
>> due to having our own flexible and extensible format in Starlink. I'm wary
>> of additions to the standard whose usage would be quite limited, serving
>> only FITS as a working format.
>>
>> I agree with the points and questions raised by Rob, especially how widely
>> these two additions would be used and what evidence there is of their
>> benefits. A discussion document covering these points, how the new data
>> types would be implemented, and the implications for existing software
>> packages should be presented. In many packages the float16 FITS data would
>> presumably have to be converted to another data type, or the package would
>> have to report that it is not supported.
>>
>> Starlink's data system is now built on HDF5 (rather than the in-house
>> HDS that preceded it). A quick search turned up an RFC to add 16-bit
>> floats to HDF5, and a long user discussion of the RFC, but I've yet to
>> find HDF's justification for the introduction of float16, or whether it
>> has been implemented.
>>
>> Wikipedia, that trusty source of knowledge and wisdom,
>> (https://en.wikipedia.org/wiki/Half-precision_floating-point_format)
>> relates that the concept has been around since at least 1982. I'd also
>> forgotten that IEEE 754 has a binary16, presumably because there wasn't a
>> perceived need and it wasn't in the floating-point addition to FITS.
>> We already had BSCALE/BZERO to increase the dynamic range. The
>> Wikipedia entry does give some reasons for using 16-bit floats, such as
>> machine learning, and lists programming languages that support it, but
>> I'd certainly like to hear more on potential benefits in astronomy.
>>
>> Lucio:
>> > That is true, and it was what I referred to as the early times when one
>> > could write a reader of FITS magtapes on 36-bit or 60-bit mainframes. Or
>> > different endianness.
>>
>> Hence the 2880-byte logical record length, which must look strange to a
>> young audience. That goes along with the restrictive 80-character header
>> cards that must seem so antiquated to postdocs and students. Are they doing
>> their own thing rather than joining the FITS discussion? Most of the
>> participants in this thread are of, let's say, a certain vintage.
>>
>> Lucio:
>> > Also I seem to remember that in the past there was a rule, or at least a
>> > practice, that a new feature had to be supported by two independent
>> > existing implementations.
>>
>> Indeed.
>>
>> >> And finally, finally, discussions like this and about JPEGXL
>> >> demonstrate that we need to revive the IAU FITS WG.
>> >
>> > And that's even more true than all the rest, especially for a delicate
>> > item like basic data formats.
>>
>> Yes, but who would serve on it? There needs to be a mix of people with
>> decades of FITS experience and a new generation who work with
>> cutting-edge data and state-of-the-art tools, or who are designing
>> upcoming major data-generating projects.
>>
>> Malcolm
>>
>>