From jaffe at strw.leidenuniv.nl  Tue Nov 18 08:11:45 2025
From: jaffe at strw.leidenuniv.nl (jaffe)
Date: Tue, 18 Nov 2025 13:11:45 +0000
Subject: [fitsbits] Fwd: Error estimates in OI_FITS (and elsewhere).
 {External}
In-Reply-To: <5ca0658506d5c07af0860f0abb21c2b5@mail.strw.leidenuniv.nl>
References: <5ca0658506d5c07af0860f0abb21c2b5@mail.strw.leidenuniv.nl>
Message-ID: <de24f6a3619cf4859b158b9beb78cf72@mail.strw.leidenuniv.nl>


In the MATISSE OIR interferometry group we are developing a rather 
useful and general modelling program called OIMODELER (see 
https://github.com/oimodeler/oimodeler).  The modeller assumes OI_FITS 
files as inputs. We are bumping up against the fact that the error model 
represented in OI_FITS is inadequate.  The OI_FITS convention describes 
tables with entries like VIS2DATA (squared visibilities) with an 
associated VIS2ERR.  These are listed as wavelength vectors for each 
UV-point.  Similar entries exist for differential phase and closure 
phases. The problem is that the modeller has to assume that the errors 
are uncorrelated in wavelength and/or time and/or baseline, while in 
reality correlations exist.  The most common case is that in the 
wavelength direction there is both true noise, from Poisson photon noise 
and detector readout noise, which is almost uncorrelated between 
recorded wavelength pixels, and also calibration errors that are highly 
correlated, usually almost constant or slowly varying over the whole 
observed band.  In the time direction successive raw records are 
(almost) uncorrelated, but sometimes the reduced and calibrated data has 
been averaged in time to reduce the data volume.  If the averaging is 
e.g. a Gaussian convolution, then the processed records are correlated 
in time.
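
The difference between true binning and a convolution can be sketched 
numerically.  The example below is only an illustration under stated 
assumptions: the raw records are taken to be independent with unit 
variance, and the bin width (3) and Gaussian kernel width (1.5 records) 
are arbitrary values chosen for the demonstration.  It propagates the raw 
covariance through each linear operation and compares the resulting 
correlation matrices.

```python
import numpy as np

def binning_matrix(n_in, width):
    """Non-overlapping block average: each output uses its own inputs."""
    n_out = n_in // width
    B = np.zeros((n_out, n_in))
    for i in range(n_out):
        B[i, i * width:(i + 1) * width] = 1.0 / width
    return B

def gaussian_smoothing_matrix(n_in, sigma_rec):
    """Row-normalised Gaussian smoothing; neighbouring outputs share inputs."""
    idx = np.arange(n_in)
    G = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma_rec) ** 2)
    return G / G.sum(axis=1, keepdims=True)

def correlation(cov):
    """Convert a covariance matrix into a correlation matrix."""
    s = np.sqrt(np.diag(cov))
    return cov / np.outer(s, s)

n = 12
raw_cov = np.eye(n)  # assumption: independent raw records, unit variance

B = binning_matrix(n, 3)
G = gaussian_smoothing_matrix(n, 1.5)

# For a linear operation y = A x, the covariance propagates as A C A^T.
corr_binned = correlation(B @ raw_cov @ B.T)
corr_smoothed = correlation(G @ raw_cov @ G.T)

print("binned, neighbouring bins:     ", corr_binned[0, 1])
print("smoothed, neighbouring records:", corr_smoothed[5, 6])
```

The binned outputs stay uncorrelated because no raw record contributes 
to more than one bin, while the smoothed outputs acquire off-diagonal 
terms that a single error column cannot express.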

For the modeller this is a big problem.  It typically calculates the 
probability of the entire observation set for some set of input 
parameters, but to do this it has to know whether all the data points 
are statistically independent, and this is not represented in the input 
data.
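
To make the dependence on the error model concrete, here is a minimal 
sketch of a Gaussian log-likelihood that takes a full covariance matrix. 
It is not OIMODELER code; the function name and the numbers (a random 
error of 0.01 and a fully correlated calibration error of 0.05 on five 
wavelength channels) are assumptions made up for the illustration.  The 
same residuals are evaluated once with the full covariance and once with 
only its diagonal, which is all that a single error column can convey.

```python
import numpy as np

def gaussian_loglike(residual, cov):
    """ln L = -0.5 * (r^T C^-1 r + ln det C + n ln(2 pi))."""
    residual = np.asarray(residual, dtype=float)
    n = residual.size
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    chi2 = residual @ np.linalg.solve(cov, residual)
    return -0.5 * (chi2 + logdet + n * np.log(2.0 * np.pi))

n = 5
random_var = 0.01 ** 2  # assumed uncorrelated noise variance per channel
calib_var = 0.05 ** 2   # assumed calibration variance, common to all channels

# Residuals (data minus model) that are offset by a constant amount,
# as a calibration error would produce.
residual = np.full(n, 0.05)

cov_full = random_var * np.eye(n) + calib_var * np.ones((n, n))
cov_diag = np.diag(np.diag(cov_full))  # same variances, correlations dropped

loglike_full = gaussian_loglike(residual, cov_full)
loglike_diag = gaussian_loglike(residual, cov_diag)
print("full covariance:", loglike_full)
print("diagonal only:  ", loglike_diag)
```

The two values differ even though the per-channel variances are 
identical, so a fit that assumes independence weights the data 
differently from one that knows about the common calibration term.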

For the MATISSE case I have suggested a pragmatic solution: make sure 
that wavelength and time binning is true binning (and not convolutions) 
so no correlations are created, and create a separate column to 
represent the calibration errors, which are almost constant over 
wavelength.
This might be a reason to update the OI_FITS conventions.
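
As a rough sketch of how a modeller could use such a separate column, 
the code below builds a wavelength covariance matrix for one UV-point 
from two error vectors.  The column name VIS2CALERR is hypothetical and 
not part of OI_FITS, the numerical values are invented, and the 
calibration error is assumed to be fully correlated across wavelength 
(correlation coefficient 1), which is a simplification of "almost 
constant".

```python
import numpy as np

def wavelength_covariance(random_err, calib_err):
    """Covariance = diag(random_err^2) + outer(calib_err, calib_err).

    random_err : per-channel uncorrelated error (e.g. VIS2ERR)
    calib_err  : per-channel calibration error (hypothetical VIS2CALERR),
                 assumed fully correlated between channels
    """
    random_err = np.asarray(random_err, dtype=float)
    calib_err = np.asarray(calib_err, dtype=float)
    if random_err.shape != calib_err.shape or random_err.ndim != 1:
        raise ValueError("both inputs must be 1-D arrays of equal length")
    return np.diag(random_err ** 2) + np.outer(calib_err, calib_err)

# Invented values for one UV-point with four wavelength channels.
vis2err = np.array([0.010, 0.012, 0.011, 0.013])
vis2calerr = np.full(4, 0.03)  # hypothetical calibration-error column

cov = wavelength_covariance(vis2err, vis2calerr)
print(cov)
```

The matrix stays positive definite as long as every random error is 
non-zero, so it can be passed directly to a likelihood that accepts a 
full covariance.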

The general problem seems very complicated.  The correlation in 
wavelength can be a complicated function of pixel separation and 
wavelength.  Similarly for correlations in time or between spatial 
pixels.

Has anyone dealt with this problem?  Any suggested solution?

Walter