[fitsbits] UTF-8 in BINTABLE String Columns {External}

Lucio Chiappetti lucio at lambrate.inaf.it
Sat Apr 11 15:47:50 EDT 2026


On Sat, 4 Apr 2026, Mark Taylor via fitsbits wrote:
> However UTF-8 is inherently a variable-length encoding, so each
> character (strictly, each Unicode code point) may be encoded
> as 1, 2, 3 or 4 bytes.

That's why I advocate a specialized type which gives both the number of 
bytes and the number of code points (jVk).
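
To make the distinction concrete, here is a minimal Python sketch of why
the two counts differ. The function names are hypothetical and not part of
any FITS library; the sketch only assumes that the string is stored as
UTF-8.

```python
def bytes_per_code_point(text):
    """Return the UTF-8 encoded length (1 to 4 bytes) of each code point."""
    return [len(ch.encode("utf-8")) for ch in text]


def utf8_lengths(text):
    """Return (number of bytes, number of code points) for a string."""
    return len(text.encode("utf-8")), len(text)


# Sample with one code point of each encoded width:
# 'a' (U+0061), e-acute (U+00E9), euro sign (U+20AC), math alpha (U+1D6FC)
sample = "a\u00e9\u20ac\U0001d6fc"
widths = bytes_per_code_point(sample)
n_bytes, n_points = utf8_lengths(sample)
```

A fixed-width column sized in bytes therefore holds a variable number of
code points, which is why a type carrying both numbers is useful.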

On Mon, 6 Apr 2026, William Pence wrote:

> So this proposal is to allow UTF-8 characters in ‘A’ TFORM columns 
> in ASCII and Binary tables.  FITS headers would still be restricted to 
> ASCII characters only. Correct?

That, I would say, is a must of "once FITS forever FITS".

On Mon, 6 Apr 2026, Seaman, Robert Lewis - (rseaman) via fitsbits wrote:

> overall length calculations. But it seems a waste of time to try to
> squeeze UTF-8 (in all its glory) into ASCII FITS header records. This is
> another good reason to de-emphasize or deprecate ASCII FITS headers for
> header-style metadata in a bintable.

Yes, Unicode is essentially just for metadata, not science data, and a 
specific metadata bintable would be desirable.

I thought about something like that some 12 years ago; I cannot remember 
whether at the time I circulated it to some restricted list. I recently 
sent it privately to some of you, but I think it could now just as well be 
disclosed generally:

http://sax.iasf-milano.inaf.it/~lucio/FITS/NewTG/metaext.html
http://sax.iasf-milano.inaf.it/~lucio/FITS/NewTG/metaext.html#uni

Lucio Chiappetti


