[fitsbits] start of Public Comment Period on the column limits convention

William Pence William.Pence at nasa.gov
Sun Jul 12 23:04:06 EDT 2015


On 7/10/2015 9:42 AM, Lucio Chiappetti wrote:
> I collate here some (late) replies to suggested textual changes (all
> meaningful, most sensible). These are just my opinions, not an ultimate
> decision. I present them here to give public visibility "for the
> records", while at the same time drawing the attention of the convention
> editing team.
>
> On Tue, 23 Jun 2015, Mark Calabretta wrote:
>
>> - As they are specific to tables, TDMINn/MAXn & TLMINn/MAXn belong in
>>  Chapter 7, not 4, specifically, in Sects. 7.2.2 and repeated in 7.3.2.
>
> It is possible. Putting them here just avoids the repetition of one
> column of text.
>
> Or to avoid duplication should we move 4.4.2.7 to 7.4 "Features common
> to ASCII and binary tables" ?
>
>
>>  (I also suggest that the current misleading title of Sect. 4.4.2.6 be
>>  changed to "HDU labelling keywords".)
>
> 4.4.2.6 or 4.4.2.7 ??
>
>> - In the following sentence
>>
>>    Any undefined elements in the column (or any other IEEE special
>>    values in the case of floating point columns in binary tables)
>>    *shall* be excluded when determining the value of these keywords.
>>
>>  if the intention of the reference to IEEE special values is to exclude
>>  plus and minus infinity, then that should be stated explicitly, and
>>  some sort of explanation given.

> I don't think that was the case (personally I think mainly of NaN but I
> do not know what the original authors had in mind ... it seems wise to
> exclude all IEEE special values)

The intention is to exclude all IEEE special values.  This wording is 
similar to the definition of DATAMIN and DATAMAX in section 4.4.2.5.
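
A minimal sketch of that exclusion rule in Python follows. The helper
name column_limits is hypothetical, and None is used here as a stand-in
for an undefined column element; neither comes from the Standard.

```python
import math

def column_limits(values):
    """Return (min, max) over defined, finite values, or None if there are none.

    None marks an undefined element (an assumption of this sketch);
    NaN and +/-Inf are the IEEE special values to be skipped.
    """
    finite = [v for v in values if v is not None and math.isfinite(v)]
    if not finite:
        return None
    return min(finite), max(finite)

column = [3.5, float("nan"), -1.25, float("inf"), None, 7.0, float("-inf")]
limits = column_limits(column)
```

With this input only 3.5, -1.25 and 7.0 take part in the comparison, so
the infinities never become the reported limits.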

>> - In the following statements
>>
>>  1. TDMINn Keyword.  The value field *shall* contain ...  the minimum
>>  2. TDMAXn Keyword.  The value field *shall* contain ...  the maximum
>>  3. If the value of TDMINn is greater than TDMAXn, ..., then the values
>>     of the pair of keywords *should* be interpreted as undefined.
>>
>>  from (1) and (2), which are imperative, it follows that TDMINn <=
>>  TDMAXn imperatively, so TDMINn > TDMAXn can only happen by mistake.
>
> I think so. My personal inclination would be to swap TDMINn and TDMAXn
> if TDMINn > TDMAXn, but that would be a change wrt the original
> convention text from which the statements are taken.
>
>>  Surely the standard needn't elaborate on the infinity of mistakes
>>  that FITS writers might make.  Anyway, in such a case, "*should* be
>>  interpreted" ought to be "*shall* be interpreted".
>
> Ditto. We can change the wording wrt the original convention, but we
> have to record it in Appendix H.3 (and more important, authors of s/w
> using the convention shall change their code !)

Setting TDMINn > TDMAXn is not a mistake; it is an option that software
that modifies the values in that column can use to flag that the min and
max values are (perhaps) no longer valid.  This provides an alternative
to the more computationally expensive options of a) recomputing the new
min and max values, or b) deleting the keywords from the header.

In practice, however, I don't think any software has ever used this 
option, so I doubt if there will be any objections to eliminating this 
feature from the description of these keywords in the FITS Standard.
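
For reference, a reader that honours the flag as currently worded could
look roughly like the Python sketch below. The function name
read_tdlimits and the use of a plain dict for the header are assumptions
made for illustration, as is the choice to treat a missing keyword like
an undefined pair.

```python
def read_tdlimits(header, n):
    """Return (TDMINn, TDMAXn), or None if the pair is absent or flagged.

    `header` is a plain dict of keyword -> value (hypothetical layout).
    """
    tdmin = header.get(f"TDMIN{n}")
    tdmax = header.get(f"TDMAX{n}")
    if tdmin is None or tdmax is None:
        return None  # assumption: an incomplete pair is treated as undefined
    if tdmin > tdmax:
        return None  # writer flagged the limits as no longer valid
    return tdmin, tdmax

valid = read_tdlimits({"TDMIN3": 1.0, "TDMAX3": 9.0}, 3)
flagged = read_tdlimits({"TDMIN3": 1, "TDMAX3": 0}, 3)
```

A writer that edits the column would then only need to overwrite the two
keyword values (e.g. 1 and 0) rather than rescan the column.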

>> - Instead of saying
>>    the minimum physical value actually contained in column n
>>  why not just say
>>    the minimum value in column n
>
> I believe the reason is to distinguish the TD from the TL keywords.
> For photon lists, the TL keywords specify the legal limit of XY pixels,
> or PHA channels or alike ... usually something like 0-1023, 0-255 etc.
> while the TD are the particular values actually found in a specific file
> which can be in a narrower range.

The defined term (see section 2.2) 'physical value' is used here to make 
it clear that the keyword value refers to the column values *after* the 
scaling by TSCALn and TZEROn has been applied.  Similar wording is used 
in the definition of the DATAMIN and DATAMAX keywords.
I agree that the words 'actually contained' are not needed.
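
To illustrate the point about scaling, here is a small Python sketch
that applies the usual relation physical = TZEROn + TSCALn * stored
before taking the limits. The function name and the sample numbers are
made up for the example.

```python
def physical_values(stored, tscal=1.0, tzero=0.0):
    """Apply the TSCALn/TZEROn scaling to the stored field values."""
    return [tzero + tscal * v for v in stored]

stored = [0, 100, 200]                      # values as written in the table
physical = physical_values(stored, tscal=0.5, tzero=-10.0)

# TDMINn/TDMAXn describe the scaled values, not the stored ones.
tdmin, tdmax = min(physical), max(physical)
```

Note that with a negative TSCALn the smallest stored value maps to the
largest physical value, which is one reason the limits have to be taken
after scaling.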

>> - Given the two statements
>>  1. TLMINn Keyword.  The value field shall contain ...  minimum legally
>>  2. TLMAXn Keyword.  The value field shall contain ...  maximum legally
>
>>  logically it follows that a value outside the range [TLMINn,TLMAXn]
>>  must be illegal, which is not what the following says:
>>
>>  3. It is permissible to have values in a column that are less than
>>     TLMINn or greater than TLMAXn, however, the interpretation of any
>>     such out-of-range column elements is not defined.
>>
>>  Instead, I imagine it should be
>>
>>     Column values outside the range TLMINn to TLMAXn *shall* be
>>     interpreted as undefined.
>
> I have no idea why the original wording was in the convention. If it
> were some sort of ADC readout, I guess it cannot by construction be
> outside [TLMINn,TLMAXn] unless it is explicitly set for flagging purposes.

The TLMINn and TLMAXn keywords are used in practice to define the range 
of values that should be included when creating histograms of the values 
in that column (e.g., when creating a 2D image from the X and Y pixel 
list columns).  For various reasons, observatories may want to exclude 
events that are located outside of what is considered to be a 'legal' 
(or 'good') range of values (e.g. when 'over clocking' the readout from
a CCD detector).  Events that are located outside of this 'legal' range 
are still perfectly valid, but are excluded when generating standard 2D 
images or 1D spectra from the observation.
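
A rough Python sketch of that usage, for a 1D integer channel column:
TLMINn and TLMAXn set the histogram axis, and events outside the range
are counted but not binned. The function name and the sample values are
hypothetical.

```python
def histogram_in_legal_range(values, tlmin, tlmax):
    """Bin integer values from tlmin to tlmax inclusive, one bin per value.

    Returns (counts, excluded); out-of-range events are still valid
    data, they are simply left out of the standard product.
    """
    counts = [0] * (tlmax - tlmin + 1)
    excluded = 0
    for v in values:
        if tlmin <= v <= tlmax:
            counts[v - tlmin] += 1
        else:
            excluded += 1
    return counts, excluded

pha = [0, 1, 1, 3, 5, -2, 3]
counts, excluded = histogram_in_legal_range(pha, tlmin=0, tlmax=3)
```

The same idea extends to a 2D image built from X and Y columns, with one
TLMINn/TLMAXn pair per axis.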

>
>> - Instead of saying
>>    the minimum legally defined physical value that may be contained in
>>    column n
>>  why not just say
>>    the legal minimum for a value in column n
>
> is there a nuance about "legally defined physical value" meaning a
> physical impossibility (again case of ADC values and alike ?)
>
>> - I guess if one or other of TLMINn and TLMAXn are omitted, the other
>>  defaults to plus or minus infinity.
>
> Hmm ... that does not make sense in my X-ray mindset.

I've never seen a FITS file that only has one of the pair of keywords, 
but this is not forbidden.  If the keyword is not present, then I would 
presume that the value defaults to the min or max value that can be 
represented by the data type of that column (e.g. -32768 or +32767 for 
an 'I' column in a binary table.)
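
That presumed default could be sketched in Python as below. This is only
an illustration of the presumption above, not something the Standard
specifies; the table covers the integer binary-table types, and the
names NATIVE_RANGE and effective_tl_limits are invented for the example.

```python
# Native ranges of the integer binary-table column types (TFORMn codes).
NATIVE_RANGE = {
    "B": (0, 255),                  # 8-bit unsigned byte
    "I": (-32768, 32767),           # 16-bit integer
    "J": (-2**31, 2**31 - 1),       # 32-bit integer
    "K": (-2**63, 2**63 - 1),       # 64-bit integer
}

def effective_tl_limits(header, n, tform_code):
    """Fill in a missing TLMINn or TLMAXn from the column's native range."""
    lo_default, hi_default = NATIVE_RANGE[tform_code]
    tlmin = header.get(f"TLMIN{n}", lo_default)
    tlmax = header.get(f"TLMAX{n}", hi_default)
    return tlmin, tlmax

only_min = effective_tl_limits({"TLMIN2": 0}, 2, "I")
neither = effective_tl_limits({}, 2, "I")
```

The sketch ignores any TSCALn/TZEROn scaling of the column, which would
shift the native range of the physical values.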

> On Thu, 25 Jun 2015, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>
>> One question that I'd like to see clarified for these keywords is
>> whether, if I have TDMINx and TDMAXx, there is required to be an actual
>> row I can point to which has this value?

Since TDMINn and TDMAXn represent the min and max values in the column, it
follows that there must be at least one row in the column with those 
values.  How could this not be the case?

>>
>> Do we need to worry about round off errors between the binary and text
>> representation of numbers?

Well, software developers certainly need to be aware of the effects of 
rounding errors (and other numerical precision issues) when reading and
writing FITS data, but this is a ubiquitous issue that pervades all of 
FITS and is not unique to these particular keywords.

>
> my inclination would be no to both questions.
>
>
> On Sun, 28 Jun 2015, Tom McGlynn wrote:
>
>>  - Can these keywords be used to refer to columns that are vector
>>    valued.  If so does the limit apply to each element of the vector?
>

yes

> my inclination would be to say yes
>
>>    The analogy to BUNIT suggests yes, but the wording regarding 'same
>>    type data type as physical values' is ambiguous.
>

> "same scalar data type" ? "same ... values of the individual elements in
> the associated column" ?

To my mind, 'data type' has nothing to do with whether it is a scalar or 
a vector.

>>  - Should there be a mechanism to allow for min/max values of strings?
>>    Is it illegal to extend these keywords to use for
>>    non-integer/floating point data?  (i.e., strings, complex, bit,
>>    boolean)?  I think the answer to that is yes, but I believe it should
>>    be explicit.
>
> well ... strings and boolean are not numeric, min-max does not make
> sense in the complex plane (unless one thinks of abs). The text states
> "(either an integer or a floating point number)" ... is that not clear
> enough ?

>>  - Is it legal to have an integer limit and a floating point column?
>>    Vice versa?

Yes (to the first question) because the decimal point is optional in a 
floating point number, so '2' is a valid floating point number. See 
section 4.2.4.  But '3.5' is not a valid integer, so that is not allowed.
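
The asymmetry can be shown with a short Python check. The two patterns
are a simplified approximation of the integer and floating point value
syntax, written for this example only; they are not a complete rendering
of the grammar in the Standard, and the function name is hypothetical.

```python
import re

# Simplified patterns (assumption): optional sign, digits, and for floats
# an optional decimal point and an optional E or D exponent.
INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[ED][+-]?\d+)?")

def is_valid_limit(text, column_kind):
    """Check a keyword value string against an 'int' or 'float' column."""
    pattern = INTEGER if column_kind == "int" else FLOAT
    return pattern.fullmatch(text) is not None

int_limit_on_float_column = is_valid_limit("2", "float")
float_limit_on_int_column = is_valid_limit("3.5", "int")
```

Every string accepted by the integer pattern is also accepted by the
floating point one, but not the other way round.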

> I would say no because it is written "These keywords must have the same
> data type as the physical values in the associated column"
>
>>  - If limits on vector valued columns are allowed, then the situation
>>    for complex values is slightly more confused.
>
> I assume they are excluded
>
>>  - The language suggested in 7.2.2 might suggest that the use of these
>> keywords is mandatory. While this is addressed elsewhere I might
>> suggest a language like:
>>  "When describing... a user <shall>..."
>>  The current phrase
>>   "To describe...  a user <shall> ..." sounds to me (at least in
>> isolation) like this is something a user has to do, as opposed to the
>> required way of doing something if a user chooses to do it.
>
> If we maintain the short text in 7.2.2 and 7.3.2, this is my fault. I
> wrote the cross-references outside the main text. Being a non-native
> speaker, I might have used inappropriate wording. I meant "if one
> needs to do it, then use those keywords (and do not invent others)"
>
> Of course if we remove 4.4.2.7 and duplicate it, the short text goes away.
>

-Bill
-- 
____________________________________________________________________
Dr. William Pence    Astrophysicist     William.Pence at nasa.gov
NASA/GSFC Code 662     [Emeritus]       +1-301-286-4599 (voice)
Greenbelt MD 20771                      +1-301-286-1684 (fax)


