<div dir="ltr">Since this is already something of a breaking change, would it not be beneficial in the long run to extend it to UTF-16 and UTF-32 as well?<div><br></div><div> -- Paul</div><div><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Thu, Mar 26, 2026 at 9:44 AM Francois-Xavier PINEAU via fitsbits &lt;<a href="mailto:fitsbits@listmgr.nrao.edu">fitsbits@listmgr.nrao.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear fitsbits,</p>
<p><br>
# Background</p>
<p>VOTable (v1.5) is closely compatible with the FITS Binary Table
format:<br>
<a href="https://www.ivoa.net/documents/VOTable/20250116/REC-VOTable-1.5.html#tth_sEc2.3" target="_blank">https://www.ivoa.net/documents/VOTable/20250116/REC-VOTable-1.5.html#tth_sEc2.3</a></p>
<p>In the current draft of VOTable 1.6<br>
<a href="https://github.com/ivoa-std/VOTable/releases/download/auto-pdf-preview/VOTable-draft.pdf" target="_blank">https://github.com/ivoa-std/VOTable/releases/download/auto-pdf-preview/VOTable-draft.pdf</a>
,<br>
UTF-8 strings replace the previous ASCII-only strings.<br>
</p>
<p>If FITS cannot store UTF-8, <strong>lossless round-trip conversion from VOTable to
FITS will no longer be possible</strong>.<br>
Some limitations already exist (e.g., unsigned integer logical
types), but UTF-8 seems more critical.</p>
<p>My own use cases include using HEALPix-sorted and
indexed BINTABLEs to build on-the-fly HATS products<br>
or intermediary HiPS catalogue representations from VizieR data
(which will contain more and more UTF-8).<br>
* HATS: <a href="https://www.ivoa.net/documents/Notes/HATS/" target="_blank">https://www.ivoa.net/documents/Notes/HATS/</a><br>
* HiPS catalogue: <a href="https://www.ivoa.net/documents/HiPS/" target="_blank">https://www.ivoa.net/documents/HiPS/</a><br>
* VizieR: <a href="https://vizier.u-strasbg.fr/" target="_blank">https://vizier.u-strasbg.fr/</a><br>
<br>
</p>
<p># Possible Solutions</p>
<p>## 1. Use UTF-8 in existing `TFORMn=rA`</p>
<p>As in VOTable 1.6, interpret `r` as a number of bytes instead of a number of
characters.<br>
This may break truncation operations (TDISP) if a multi-byte UTF-8
character is split.</p>
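<p>To illustrate the truncation issue, here is a minimal Python sketch. It is not taken from any existing FITS library: the function name `truncate_utf8` is hypothetical, and the input is assumed to be valid UTF-8. It cuts at a byte limit and then drops a trailing incomplete multi-byte sequence instead of leaving it in the field.</p>
<pre>
```python
def truncate_utf8(data, max_bytes):
    """Return at most max_bytes bytes of data, never ending inside a character.

    Assumption: data is valid UTF-8. With errors="ignore", any invalid bytes
    elsewhere in the input would be silently dropped as well.
    """
    # Cut at the byte limit, then discard a trailing incomplete sequence, if any.
    text = data[:max_bytes].decode("utf-8", errors="ignore")
    return text.encode("utf-8")


name = "Strömgren".encode("utf-8")  # 9 characters, 10 bytes ("ö" takes 2 bytes)
naive = name[:4]                    # byte-wise cut: ends with half of "ö"
safe = truncate_utf8(name, 4)       # boundary-aware cut: only "Str" is kept
```
</pre>
<p>Here `naive` is no longer valid UTF-8, while `safe` is one byte shorter than the limit but still decodes cleanly.</p>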
<p>## 2. Logical type "UTF-8" backed by a byte array</p>
<p>TFORMn = rB <br>
TLOGTn = 'UTF-8' / LOGT stands for LOGical Type<br>
<br>
Unaware readers see a byte array; UTF-8 aware readers interpret it
as a string.<br>
Introduces two string types in FITS (ASCII and UTF-8).</p>
<p>## 3. New TFORM type (e.g., `TFORMn=rU`)</p>
<p>This would definitely break current readers, which do not recognise the new type code.</p>
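<p>As an illustration of why, here is a simplified Python sketch of a strict TFORM parser. It is not the code of any particular reader: `parse_tform` is a hypothetical helper, it only handles simple `rT` values (no variable-length array descriptors or substring conventions), and `U` is the type code suggested above.</p>
<pre>
```python
import re

# Data type codes defined for BINTABLE TFORMn values in the FITS standard.
KNOWN_CODES = set("LXBIJKAEDCMPQ")


def parse_tform(tform):
    """Split a simple TFORMn value such as '16A' into (repeat, code)."""
    match = re.fullmatch(r"(\d*)([A-Z])", tform.strip())
    if match is None:
        raise ValueError(f"unrecognised TFORM value {tform!r}")
    repeat = int(match.group(1) or "1")  # the repeat count defaults to 1
    code = match.group(2)
    if code not in KNOWN_CODES:
        # A strict reader has no way to interpret an unknown type code.
        raise ValueError(f"unsupported TFORM type code {code!r}")
    return repeat, code
```
</pre>
<p>With this sketch, `parse_tform("16A")` returns `(16, "A")`, whereas `parse_tform("16U")` raises a `ValueError`. A real reader might skip the column instead of failing, but it could not use the data either way.</p>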
<p><br>
</p>
<p># Existing Implementations</p>
<p> * TOPCAT/STILTS (Java): Prototype supports Solutions 1 and 2 for
read/write (private communication with Mark Taylor).<br>
* fitstable (Rust): Supports Solutions 1 and 2 for reading
(<a href="https://github.com/cds-astro/cds-fitstable-rust" target="_blank">https://github.com/cds-astro/cds-fitstable-rust</a>).<br>
* VizieR: Appears to provide UTF-8 in TFORMn=rA columns (Solution
1).<br>
* ??</p>
<p><br>
</p>
<p># Feedback Requested</p>
<p>I am curious about:<br>
* other possible approaches<br>
* opinions from fitsbits readers on the most practical solution<br>
* other people interested in having UTF-8 in BINTABLE columns<br>
<br>
Currently, Solution 1 seems the simplest and Solution 2 the
safest,<br>
but I welcome constructive comments and experience from the
community.</p>
<p>Best regards,</p>
<div>-- <br>
<p style="font-family:Arial,sans-serif;color:black;font-size:14px"><span style="font-weight:bold">Francois-Xavier Pineau</span><br>
Ingénieur de Recherche<br>
Tél : +33 (0)3 68 85 24 14,<br>
<a href="mailto:francois-xavier.pineau@astro.unistra.fr" title="Contacter francois-xavier.pineau@astro.unistra.fr" target="_blank">francois-xavier.pineau@astro.unistra.fr</a><br>
<br>
Centre de Données astronomiques de Strasbourg (CDS)<br>
11, rue de l'Université - E03<br>
<br>
<br>
</p>
</div>
</div>
</blockquote></div>