[fitswcs] Re: WCS activity

Mark Calabretta Mark.Calabretta at atnf.csiro.au
Thu Aug 9 04:21:05 EDT 2001


On Fri 2001/07/06 22:35:32 MST, Doug Tody wrote
in a message to: FITSWCS <fitswcs at NRAO.EDU>

>A report of our various investigations, including some recommendations
>for further minor revisions to the draft WCS standard is given in the
>following document:
>
>    http://iraf.noao.edu/projects/fitswcs/cover.html
>
>The above document is only a summary.  Further elaboration is given in
>the links pointed to at the end of the document.

Greetings,

Eric and I have reviewed the NOAO proposal and my detailed notes concerning
it are appended.

We believe that the DCF extension of CTYPEj is a good generalization of the
currently proposed distortion corrections.  In broad terms we are looking to
simplify Papers I, II, and III by removing distortion-related material (pixel
regularization, TAN+poly, DSS header translation example, Paper III Eq. 9,
etc.) and moving it into Paper IV:

   "Representations of Distortions in FITS World Coordinate Systems". 

This would be presented as "advanced" WCS usage.  Papers I, II, and III would
contain forward references to IV, and in particular, paper I would reserve
usage of keyword names and values for IV.

In broad outline, Paper IV would introduce the formalism in general terms in
much the same way as Paper I.  This would include the DCF CTYPEj convention
and usage of the DVj_ms, etc.

It would then discuss some specific applications: polynomial, spline, pixel
regularization (PRC) and probably others.  This is analogous to Paper II's
introduction of specific celestial projections.  The polynomial would be
Frank's current document, PRC would come from Paper I.  Someone would add
cubic splines and B-splines.

Then, by analogy with Paper II's tutorial section, there would be a section
discussing particular applications of some of these specific DCFs.  Frank's
spectral examples would be here (even better if he could recast them as
splines).  Eric would do an example of wide-field Doppler correction.  I
would repackage Section 4.1.2 from Paper II (TAN+poly) as -TAN-PLY as just
one of many possible ways of handling a plate solution in the more general
formalism.  The DSS example from Paper II would be recast as -TAN-PLY as a
nitty-gritty example.

Thus IV is logically the right place in the sequence of papers since it
builds on what went before, particularly in the tutorial section.  It also
allows each of Papers II and III to defer discussion (with a forward
reference) of somewhat esoteric or instrument-specific problems - plate
solutions, Fabry-Perot, wide-field Doppler correction, etc.

IV is also the correct place in light of the way that Eric and I intend to
conduct this process.  We will produce a detailed list of changes to the text
of I, II, & III plus a first draft of IV which assumes these changes.  Changes
to I, II, & III will mostly be to remove material which would reappear in IV.
While the changes will be subject to discussion in the usual way, no text (or
software) will be rewritten.  It is our intention that the list of changes
will be so detailed as effectively to serve as a sed script from which the new
can be derived from the old, and in that sense the papers could be said to
have been reworked.

If/when the FITSWCS list accepts I, II, & III with the list of changes these
papers will be revised and forwarded for ratification.  Paper IV, with its
list of co-authors, would be open for discussion on FITSWCS.  However, if the
FITSWCS list cannot reach agreement on revisions to Papers I, II, & III then
Paper IV, and all proposed changes to Papers I, II, & III will be abandoned.
Papers I, II, & III would remain in limbo.

Mark Calabretta, ATNF
Eric Greisen, NRAO


>>>



Notes on NOAO DCF proposal documents
------------------------------------

 1) DCF extension of CTYPEj is a good generalization of the currently proposed
    distortion corrections.


 2) Notation
    --------

    The DCF proposal documents should use our carefully defined index
    notation!  E.g. PVj_ms, not PVl_ns.

    I will use our notation.


 3) Location of the distortion correction in the algorithm chain
    ------------------------------------------------------------

    Should the distortion correction come before or after the CD matrix?

    There are good arguments either way:

    A thread on fitswcs 2000/Jan: "Detector distortion correction
    representations in FITS" argues that it should come before, since it's
    related to the physical media.  E.g. distortions in CCD arrays are
    expected to be pretty-well static for each chip, so correct for
    distortions first, then rotation, skewness and scale next.

    On the other hand, higher-order distortions on a Schmidt plate
    (Pat Wallace's BENDXY) are fixed with respect to the focal plane, and if
    the plate is rotated or translated it makes sense to apply these
    corrections via the CD matrix before the distortion function.

    Perhaps there should be a provision for DCFs which come before or after
    the CD matrix!  E.g. PLP before and PLX after.  Similarly SPP/SPX for
    splines and TBP/TBX for table-based corrections.

    A DCF before the CD matrix would require offset parameters (analogous to
    CRPIXi) as well as scaling.

    I will assume that DCFs have offsets as well as scaling parameters.


 4) Form of CTYPEj keyword
    ----------------------

    The form of the DCF keyword must be spelt out explicitly.  Extend CTYPEj
    from "4-3" form to "4-3-3".  Algorithm code and DCF code to be right-
    padded with blanks or "-" as appropriate, e.g.

       "RA---UV     "   or   "RA---UV--XY "
        4444-333-333          4444-333-333

    Distortions applied to simple linear axes would have "8-3" form, e.g.
    "WAVELEN--PLY".
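
    As an illustrative sketch only (the function name and the exact padding
    rules are assumptions, not part of the proposal), a "4-3-3" CTYPEj value
    might be dissected like this:

```c
#include <string.h>

/* Strip trailing '-' or blank padding in place. */
static void strip_pad(char *s)
{
  int n = (int)strlen(s);
  while (n > 0 && (s[n-1] == '-' || s[n-1] == ' ')) s[--n] = '\0';
}

/* Split a 12-character "4-3-3" CTYPEj value into coordinate type,
 * algorithm code, and DCF code, e.g. "RA---UV--XY " -> "RA", "UV", "XY".
 * Hypothetical helper; handles the celestial 4-3-3 form only. */
void split_ctype(const char ctype[13], char type[5], char alg[4], char dcf[4])
{
  memcpy(type, ctype, 4);     type[4] = '\0';  /* chars 1-4:  "RA--" */
  memcpy(alg,  ctype + 5, 3); alg[3]  = '\0';  /* chars 6-8:  "UV-"  */
  memcpy(dcf,  ctype + 9, 3); dcf[3]  = '\0';  /* chars 10-12: "XY " */
  strip_pad(type);  strip_pad(alg);  strip_pad(dcf);
}
```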


 5) Image transposition
    -------------------

    DCF formalism should allow for transposition of axes.  Currently, if I
    have

       CTYPE1 =  'RA---AZP'
       CTYPE2 =  'DEC--AZP'
       PV2_1  =     202.64

    then I can transpose the data array provided that I swap keywords:

       CTYPE1 =  'DEC--AZP'
       CTYPE2 =  'RA---AZP'
       PV1_1  =     202.64

    This is not the case with the DVj_ms as defined because there is an
    implicit ordering on the axis number.


 6) Subimaging
    ----------

    Suppose we have spatial and spectral distortion described by PLY for all
    axes in a cube:

       CTYPE1 = 'RA---TAN-PLY'
       CTYPE2 = 'DEC--TAN-PLY'
       CTYPE3 = 'WAVELEN--PLY'

    Suppose, however, that the spatial distortion was independent of
    wavelength (as in Example 4.4 of the DCF document) and I wanted to extract
    a subimage consisting of a single image plane and put it in a separate
    FITS file.

    In the present formalism this would be a messy job because the numbering
    of the DVj_ms for the spatial axes is based on the existence of that
    third, spectral, axis.


 7) Axis coupling
    -------------

    In the general case, distortion coupling between axes is not as simple as
    the DCF document suggests.

    For example, consider a data cube with two spatial axes and one spectral
    axis and suppose that distortion in the spectral axis was a function of
    position (e.g. a Fabry-Perot interferometer).  Suppose you want to use a
    PLY to correct for wavelength distortion.  This forces you to attach PLY
    to the spatial axes as well:

       CTYPE1 = 'RA---TAN-PLY'
       CTYPE2 = 'DEC--TAN-PLY'
       CTYPE3 = 'WAVELEN--PLY'

    This is so even if there is no spatial distortion, in which case the
    polynomial for the spatial axes would have zero coefficients.  This is
    inefficient.

    However, it is quite possible that there could be a spatial distortion
    which you might want to describe with, say, a spline function.  In that
    case you would also have to attach SPL to the wavelength axis:

       CTYPE1 = 'RA---TAN-SPL'
       CTYPE2 = 'DEC--TAN-SPL'
       CTYPE3 = 'WAVELEN--SPL'

    Then you would have to define a 3D spline fit, an ugly job.

    So you're left with a compromise: you can't use PLY for the spectral axis
    together with SPL for the spatial axes.

    This problem arises because of the dual function of the DCF codes: firstly
    to define an algorithm, and secondly to associate axes.


 8) Explicit axis coupling
    ----------------------

    A simple solution to the problems raised by transposition, subimaging and
    unwanted axis coupling would be to associate axes explicitly.

    For example, if you wanted

       CTYPE1 = 'RA---TAN-SPL'
       CTYPE2 = 'DEC--TAN-SPL'
       CTYPE3 = 'WAVELEN--PLY'

    but with wavelength dependent on the spatial axes, then you could simply
    say so via the DVj_ms parameters.

    For each axis, DVj_0s would say how many other axes the intermediate world
    coordinate for axis j was dependent on (i.e. N); the next N parameters
    would say which axes and in which order; then N offsets; then N
    normalization values; then N coefficients for computing r; then N+1 values
    of the highest power, then the required number of coefficients.

    All DCFs would assign the same meaning to the first 1+N+N+N values of
    DVj_ms.

    This eliminates the need for DCF qualifiers.

    The transposition problem is solved by transposing the values of DVj_1s to
    DVj_Ns appropriately (for each j).

    The subimaging problem above is solved trivially; since the DVj_ms for the
    spatial axes would only refer to each other, the header cards relating to
    the spectral axis could be eliminated (if so desired).
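
    A minimal sketch of how a reader might decode that common prefix (the
    struct and function names are hypothetical; the layout follows the
    description above):

```c
/* Decode the common DVj_ms prefix described above: DVj_0s gives N, the
 * next N values name the associated axes, then come N offsets and N
 * normalization values.  All DCFs would interpret these 1+3N values
 * identically; DCF-specific parameters follow.  Names are hypothetical. */
#define MAXAXES 9

struct dv_prefix {
  int    naxes;            /* N: number of axes axis j depends on */
  int    axes[MAXAXES];    /* which axes, and in which order      */
  double offset[MAXAXES];  /* offsets (analogous to CRPIXi)       */
  double norm[MAXAXES];    /* normalization values                */
};

/* Returns the index of the first DCF-specific parameter, or -1 on error. */
int dv_decode(const double *dv, struct dv_prefix *pfx)
{
  int i, n = (int)dv[0];
  if (n < 0 || n > MAXAXES) return -1;
  pfx->naxes = n;
  for (i = 0; i < n; i++) pfx->axes[i]   = (int)dv[1 + i];
  for (i = 0; i < n; i++) pfx->offset[i] = dv[1 + n + i];
  for (i = 0; i < n; i++) pfx->norm[i]   = dv[1 + 2*n + i];
  return 1 + 3*n;
}
```

    Transposition and subimaging then reduce to rewriting or dropping the
    entries of the axes[] list.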


 9) The PVj_ms, PSj_ms, DVj_ms and DSj_ms keywords
    ----------------------------------------------

    DVj_ms provides for 10,000 parameters (0 - 9999) for single digit axis
    number with no multiple WCS letter.

    However, note that Paper I currently specifies that s is blank for the
    primary axis description.  This would restrict the DVj_ms to 0 - 999.

    A double-digit axis number is unlikely, but a multiple WCS code is not;
    it restricts DVj_ms to 1,000 values.  This may be insufficient, as shown
    below.

    For BINTABLEs containing image arrays or for pixel lists, 10,000 DVj_ms
    values can be obtained but only for single digit axis numbers and single
    digit column numbers with no multiple WCS code (Paper I, Appendix C).

    Is it proposed to extend the number of PVj_ms keywords (currently 0-99) to
    match the DVj_ms (0-9999)?

    Does the number of PSj_ms match that of the DSj_ms (10,000)?

    What is the character length of DSj_ms and PSj_ms?

    Software will have to be prepared to parse PSj_ms whether it's used or
    not.

    Each axis may have up to 10,000 possible PSj_ms and DSj_ms values and
    software systems without dynamic memory allocation will have to reserve
    sufficient space for them ahead of time.  If the PSj_ms and DSj_ms have a
    maximum character length of, say, 25 then 500kb of memory needs to be
    reserved - for each axis!

    To this must be added 80kb for the DVj_ms (plus 80kb for PVj_ms?), per
    axis (assuming the parameters will be maintained as double precision
    variables).

    Thus software without dynamic memory allocation which wants to process up
    to four axes will have to reserve 2.64Mb in array space.  Multiply this by
    N for the number of WCS it expects to have to process simultaneously.  For
    IFS images composed of spectra from many slitlets, N could be in the
    hundreds.

    Clearly the number of PSj_ms and DSj_ms should be restricted!

    No example of the need for PSj_ms has been given.  At least one persuasive
    example of PSj_ms usage should be provided. 


10) The NAXISi=1 question
    ---------------------

    NAXISi=1 is used in both examples in Wells, Greisen & Harten (1981), the
    first example FITS headers published!  Moreover, the first example has
    a degenerate axis sandwiched between two non-degenerate ones.  NAXISi=1 is
    also shown in Greisen & Harten (1981).  

    Usage of NAXISi=1 is deeply entrenched in radioastronomy, more so than
    that of the CDj_i matrix in optical astronomy for which we've already made
    detrimental changes to the spec.

    It would be desirable for extensive past usage of NAXISi=1 in
    radioastronomy (and probably elsewhere) to be supported in future by all
    areas of astronomy.

    NAXISi=1 is only used for transport purposes and can be translated to
    whatever internal representation is required.

    NAXIS comes first in header.  FITS readers know up-front how much memory
    to allocate for WCS parameters.

    New proposal assumes degenerate axes come last.  Subimaging may violate
    this.

    Only a question of the aesthetics of the syntax.  Don't invent new
    conventions which add nothing to the existing ones.

    Simple solution: continue to use NAXISi=1.  FITS interpreters read NAXIS
    value and can reduce it by one for each NAXISi=1 encountered to deduce
    "NDIM" and "WCSDIM".

    Paper I needs to describe the use of NAXISi=1.
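
    A minimal sketch of the suggested bookkeeping (the function name is
    illustrative):

```c
/* Deduce the dimensionality of the data ("NDIM") from NAXIS and the
 * NAXISi values by discounting degenerate axes, as suggested above.
 * The WCS dimensionality ("WCSDIM") remains the full NAXIS. */
int ndim_from_naxis(int naxis, const int *naxisi)
{
  int i, ndim = naxis;
  for (i = 0; i < naxis; i++) {
    if (naxisi[i] == 1) ndim--;   /* degenerate axis: transport only */
  }
  return ndim;
}
```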


11) Errors arising
    --------------

    Our intention was that a FITS interpreter could choose to ignore the PRC
    if it chose to accept the small error arising.

    The DATAMAX, DATAMIN cards recorded the maximum and minimum value of the
    distortion in the image, and MAXCORi recorded the maximum for each axis. 

    This should be propagated to the DCF formalism (but maybe DATAMAX and
    DATAMIN are not the best names).


12) Encapsulation
    -------------

    Encapsulation refers to the separability of the FITS header parser and the
    WCS interpreter; the parser should not need to know any more than the
    generalities, i.e. CDj_i, CRPIXi, CTYPEj, CRVALj, PVj_ms, DVj_ms, DSj_ms.

    This certainly is a desirable design goal.  It means that global
    parameters need to be eliminated as far as possible.  Presently this means
    LONPOLEs, LATPOLEs, RESTFRQs, RESTWAVs, and FPRADIUS.  Arguably it also
    includes RADESYSs and EQUINOXs which qualify the first half of the CTYPEj
    card rather than the second.

    In order for encapsulation to be possible we need a convention on default
    values.  The proposed convention, that the default always be zero, is
    difficult to make workable in general and will lead to artificial
    constructs.  For example, the default for LONPOLEs depends on other
    parameters, sometimes it is zero and other times it is 180 deg.  Similarly
    for LATPOLEs.

    One solution would be that on reading the NAXIS keyword the FITS parser
    should allocate an array to hold the PVj_ms and initialize each element
    with a non-signalling NaN value (any "magic" value will do).  As the
    parser reads each PVj_ms card the corresponding array element is
    initialized; when the parser has finished reading the header, defaulting
    PVj_ms will leave NaNs in the array.  Later, when the WCS interpreter
    software first encounters one of these NaN values it would substitute the
    WCS-specific default value (e.g. this would be done by the wcsset
    routines).  This means that non-zero defaults can be defined and that FITS
    header parsers don't need to know any of the details of the WCS.
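
    A minimal sketch of this convention (the names are illustrative, not a
    proposed API):

```c
#include <math.h>

/* The parser fills the PVj_ms array with NaNs, overwrites those it reads
 * from header cards, and the WCS interpreter later substitutes the
 * algorithm-specific default for any NaN it encounters, as described
 * above.  Any "magic" value would do in place of NaN. */
#define NPVMAX 100

void pv_init(double *pv, int npv)
{
  int m;
  for (m = 0; m < npv; m++) pv[m] = nan("");  /* mark as "not given" */
}

/* Called by the WCS interpreter, e.g. from a wcsset-like routine. */
double pv_get(const double *pv, int m, double wcs_default)
{
  return isnan(pv[m]) ? wcs_default : pv[m];
}
```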

    Should LONPOLEs and LATPOLEs be considered an exception?  Originally they
    were made into separate header cards because they complement the CRVALj in
    defining the three Euler angles for a spherical coordinate transformation.
    However, I think they should probably be included in the PVj_ms, so that
    for all celestial projection types, PVj_0s would contain phi_p (i.e.
    LONPOLEs) and PVj'_0s would contain delta_p (i.e. LATPOLEs), where j and
    j' indicate the longitude and latitude axis respectively.

    With this machinery, I would go further and record (phi_0,theta_0), the
    reference point of the projection, in PVj_1s and PVj'_1s.  This would have
    some very useful applications as discussed elsewhere.

    RESTFRQs and RESTWAVs are so basic that you'd want human readers to be
    able to see them clearly in the header (in fact, RESTFREQ is in the SDFITS
    spec).  Should they be consigned exclusively to the PVj_ms, or perhaps be
    duplicated therein?

    We have a quandary: it is evident that we can't satisfy all of the
    following simultaneously:

       1) encapsulation,
       2) no duplication,
       3) human readability.


13) Inversion of the distortion functions
    -------------------------------------

    There is no mention of the problem of inverting the distortion functions. 
    Needs to be discussed.


14) Specification of distortion functions
    -------------------------------------

    Exact details of PLY, PRC, and spline must be spelt out in Paper IV.  (PLY
    is done, ok.)


15) Regarding the formulation of PLY in the DCF paper
    -------------------------------------------------

    The equations are hard to follow in their current form and really need to
    be rewritten in proper mathematical form (i.e. in LaTeX).

    Normalization constants are defined repeatedly for each x_j for each axis:
    axis j=1 has normalization values for the x_j of all axes, not just x_1;
    axis j=2 has normalization values for the x_j of all axes, not just x_2;
    etc.  I suspect this was unnecessary: it would have been sufficient for
    axis j=1 to contain just the normalization value for x_1.

    However, with explicit axis coupling it certainly will be necessary.

    Likewise, in Eq. 2 when computing r.

    Introduction of D = N + 1 only serves to confuse matters.  Eliminate it.

    The number of coefficients of the power terms is

         5*N + 2 + (DVj_(k+1)s + 1) * (DVj_(k+2)s + 1) * (DVj_(k+3)s + 1) * ...

    where k = 5N + 1.  E.g. to encode the TAN+poly of Paper II with powers up
    to 7 would take

         5*2 + 2 + (8 * 8 * 8)

    i.e. 524 per axis, times 2 axes which is 1048.  TAN+poly has 2 * 40 = 80.

    With 10,000 coefficients and 2 axes, the maximum power possible is 21 in
    each variable.  For 3 axes the maximum power drops to 9 which is starting
    to become uncomfortable.  For 4 axes it is 6.  (In fact, this is a
    compelling argument for explicit axis coupling.)

    With 1,000 coefficients and 2 axes, the maximum power possible is 9 in
    each variable, but for 3 axes it is 5 which is probably unworkable.  For
    4 axes the maximum power is only 3.

    The formula for computing i,j,k,... from n should be given.  However, it
    should be possible to evaluate Eq. 3 without it.
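
    For definiteness, one possible convention (assuming the power of the
    first variable varies fastest, as in the timing program appended below;
    the ordering is an assumption the DCF document would need to state):

```c
/* Recover the powers (i,j,k,...) from the linear coefficient index n,
 * given the number of allowed power values d[v] = DVj_(k+v)s + 1 for
 * each of the nvar variables.  The first variable varies fastest. */
void powers_from_index(int n, int nvar, const int *d, int *p)
{
  int v;
  for (v = 0; v < nvar; v++) {
    p[v] = n % d[v];
    n /= d[v];
  }
}
```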

    Eq. 3 should be expressed in the form of a Horner polynomial.  These are
    faster to compute and reduce rounding error, i.e.

       [(((... + a3)*x + a2)*x + a1)*x] * y^j * r^k

    in which case it would make sense to specify the coefficients of the 
    higher power terms in x first.

    There is allowance for only one r variable in the polynomial, e.g. in a
    spectral cube where you need a 2D r term in the spatial plane you can't
    also have a 3D r term.  (Explicit axis coupling probably minimizes the
    impact of this.)


16) Computational efficiency
    ------------------------

    The 80 term TAN+poly formula of paper II requires 1024 parameters in the
    DCF -TAN-PLY formulation but most of these will be zero.

    What is the relative computational efficiency?  This is most likely to be
    apparent in inverting the distortion function.

    Using the program appended, I determined empirically that the floating
    point unit on a Sun workstation adds and multiplies by a random number as
    quickly as it adds 0.0 or multiplies by 0.0 or 1.0.  (Bizarrely, it even
    seemed a bit quicker at adding non-zero numbers!)

    This means that the zero coefficients of the polynomial slow down the
    calculation in proportion to their number.  Thus the 1024 coefficient form
    of TAN+poly will be slower than the 80 coefficient form by about a factor
    of 13.

    Do we need a special purpose, stripped-down version of -PLY to replicate
    TAN+poly with reasonable efficiency?


17) A better PLY
    ------------

    Given that the proposed PLY representation suffers from the drawbacks
    described, a better representation would be to encode the powers of the
    (x,y,...) in the DVj_ms themselves.

    As before, for each axis, DVj_0s would say how many other axes the
    intermediate world coordinate for axis j was dependent on (i.e. N); the
    next N parameters would say which axes and in which order; then N offsets;
    then N normalization values; then N coefficients for computing r.

    Following those basic DVj_ms, each term in the polynomial would be encoded
    as an (N+2)tuple, the first being the coefficient and the next N+1 the
    power for each (x,y,z,...r).

    The number of parameters required to encode an M-term polynomial would be

       1 + 4*N + M * (N+2)

    For 2 axes and 1,000 coefficients you could have 247 terms.  For 3 axes it
    would be 197 terms, for 4 axes 163 terms, and for 5 axes 139 terms, which
    ought to be enough.

    TAN+poly would take no more than 2 * 169 = 338 parameters; in practical
    cases it would probably take much less - terms with zero coefficients need
    not be encoded, or computed.

    Another nice thing about encoding the powers of PLY in the DVj_ms is that
    there is no limit on the highest power, and you could have fractional and
    negative powers.  Only the number of terms is limited.

    Clearly this idea could be extended; with a (3N+2)tuple, the first
    parameter would be the polynomial coefficient, the next N the power of
    each (x,y,..., z) as they are multiplied together, the next 2N the
    coefficients and powers for each (x,y,..., z) as they are added together,
    and then the power of that sum, e.g. for N=2 a term would be
    A * x^B * y^C * (D*x^E + F*y^H)^I.  Since this would allow the elimination
    of the r variable, the number of parameters required to encode an M-term
    polynomial would be

       1 + 3*N + M * (3N+2)

    which, for TAN+poly, is 2 * 327 = 654 (maximum).

    The repetitious computation of r for TAN+poly in this extended scheme
    suggests an even more compact and efficient but still very general
    encoding.  The first 3N+1 coefficients would be as before, the next
    parameter would define a number, K, of "auxiliary" variables (i.e. r) and
    the next K (2N+1)tuples would define them, i.e. a 2D Euclidean r would be
    (1,2,1,2,0.5).  Of course, K=0 would be allowed.  Then the subsequent
    (N+K+1)tuples would define the polynomial terms.  This scheme uses

       1 + 3*N + 1 + K * (2N+1) + M*(N+K+1)

    coefficients.  TAN+poly (N=2, K=1, M=40) would take 2 * 173 = 346 which
    is only slightly more than the first form, but with greatly increased
    flexibility and without compromising computational efficiency (given that
    the auxiliary variables would be cached).



/*----------------------------------------------------------------------------
* Investigate the time taken to evaluate an 8 x 8 x 8 polynomial when the
* coefficients are zero, unity or non-zero.  Racing version.
*
* Compile with
*
*    gcc -O -o ply7 ply7.c -lm
*
* Execute with
*
*    time ply7
*
* MRC 2001/07/19.
*---------------------------------------------------------------------------*/
  

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int    i, j, k, l;
  double dv[8][8][8], *p, r, s, u, v, w, x, y;

  srand48(37517465L);

  /* Initialize coefficients. */
  p = dv[0][0];
  for (k = 0; k < 8; k++) {
    for (j = 0; j < 8; j++) {
      for (i = 0; i < 8; i++) {
        r = 1.0 + drand48(); 
        /* Activate one of these. */
/*      *(p++) = 0.0; */ 
/*      *(p++) = 1.0; */ 
        *(p++) = r;   
      }
    }
  }


  x =  1.0 - 1.0/7.0;
  y = -1.0 - 1.0/3.0;
  r = sqrt(x*x + y*y); 


  /* Do it a million times. */
  for (l = 0; l < 1000000; l++) {
    /* Evaluate the polynomial in Horner form. */
    p = dv[0][0];
    s = 0.0;

    w = 1.0;
    for (k = 0; k < 8; k++) {

      v = 1.0;
      for (j = 0; j < 8; j++) {

        u = 0.0;
        for (i = 0; i < 8; i++) {
          u *= x;
          u += *(p++);
        }
        s += u*v*w;

        v *= y;
      }

      w *= r;
    }
  }

  printf("%f  %f\n", dv[0][0][0], s);

  return 0;
}


>>>


Notes on the NOAO DCF proposal email logs (additional to notes on the DCF doc)
------------------------------------------------------------------------------


 1) Default values for WCS keywords
    -------------------------------

    The IRAF default for diagonal elements of CD matrix is 0.0 if any off-
    diagonal elements are non-zero => portability problem for IRAF.

    Possible solution: revert to PCj_i + CDELTj?!

    What are default values of CRPIX & CRVAL?  Paper I needs to say.

    Defaults for (phi_0,theta_0) are listed in Table 9 of Paper II.

    Default value of each PVj_ms for ZPN is correctly stated to be 0.0.


 2) Paper II examples
    -----------------

    Illustrate use of the CUBEFACE keyword?

    Illustrate the coordinate conversion recipe at the end of Section 2?

    Compare the use of LONPOLEs for a zenithal projection with a bulk image
    rotation defined by the CD matrix?

    Incorporate these into existing header construction examples and/or create
    new ones?


 3) Paper III
    ---------

    Wide-field Doppler correction can be handled by a DCF.

    In the SPECSYSs discussion, the sky position for which the LSRK-TOP
    correction was computed must be recorded.  Is it meant to be the reference
    point of the sky coordinates?

    Likewise, what sky position is associated with VSOURCEs and ZSOURCEs?

    It seems to me that SPECSYSs defines an implicit coordinate transformation
    which lies outside our CTYPEj model.  For example, a SPECSYSs = 'LSRK-TOP'
    means that what is labelled as an LSRK velocity is really only accurate at
    a particular point in the field.  There is an implied correction (albeit
    small) at other points in the field.  This can and should be handled as a
    DCF.

    Paper III needs some header interpretation/construction examples!


 4) Curved long-slit example
    ------------------------

    Part of the problem of curved slits is to describe the variation of
    celestial coordinates along the length of the slit.

    In fact, any projection can be used because the distortion function gives
    complete freedom to describe a path on the sky.  Suppose wavelength is
    along the first coordinate axis, with RA on the second and Dec on the
    third, degenerate axis.  The CD matrix and distortion function map
    (i,j,k=1) to (s,x,y) where y need not be constant.  The path is defined by
    the variation of (x,y) as a function of (i,j) ((x,y) is then mapped
    to (phi,theta) and thence to (RA,Dec) in the usual way).

    As a simple but realistic example, suppose a long slit is curved in such a
    way that its projection on the sky coincides with the arc of a small
    circle of angular radius beta degrees ("straight" slits are projected onto
    great circles, these have angular radius 90 deg).

    Obviously the celestial coordinates along the length of the slit will
    correspond to those on the path of a small circle.

    This suggests the use of a projection which projects small circles as
    straight lines.  Any cylindrical projection will do, as will any pseudo-
    cylindrical projection with uniformly divided parallels of latitude (SFL,
    PAR and MOL all satisfy).  SFL is the projection of choice here because
    its parallels are scaled true.  To match the slit curvature, we would map
    the slit onto the parallel which corresponds to the angular radius of the
    slit's small circle.  Hence the slit must lie along native latitude
    theta_s = 90 - beta.  

    The distortion function is still also needed to unbend the slit in the
    image plane.  However, with SFL geometry, the CD matrix and distortion
    function map (i,j,k=1) to (s,x,y) where y is now constant (because theta_s
    is constant).  x remains a function of (i,j).

    While SFL is a natural match to the geometry of the problem, if the slit
    is strongly curved the WCS header description for SFL (or other
    cylindricals) may not be straightforward to construct because SFL has the
    reference point at (phi,theta) = (0,0).  This means that the CRPIXi and
    CRVALj may lie well outside the pixel array for strongly curved slits and
    this would hinder human interpretation.  For weakly curved slits, beta
    would be close to 90 so theta_s would be close to 0 and it shouldn't
    matter.  (I have no idea what beta might be in practice.)

    For strongly curved slits, use of the stereographic projection would
    result in a more readily interpretable header.  STG has the property that
    all circles on the sphere are projected as circles in the plane of
    projection.  Thus the projection of the slit in the (x,y) plane would form
    the arc of a circle.  Although y is no longer constant in this geometry
    the locus of (x,y) does have a simple form, and we are free to choose
    CRPIXi and CRVALj so that they correspond to any point in the pixel array.

    A suggestion to encode (phi_0,theta_0) as PVj_1s and PVj'_1s was made
    elsewhere.  If this was adopted, the problem with SFL could be resolved
    readily by resetting theta_0 to theta_s.

    This could be worked up into an excellent example for Paper IV, combining
    the work of II, III & IV.


 5) Use of BINTABLE to store WCS parameters
    ---------------------------------------

    Fibres may be used to map a set of points in a celestial field to a
    linear arrangement at the spectrograph entrance.  The resulting 2D image
    contains spectra from many independent points on the sky each with its own
    WCS.

    It is suggested that a BINTABLE be used to store the WCS header parameters
    for each of the fibres.

    This scheme would require a convention to denote in the primary header
    that such a BINTABLE was being used.  Additional keywords would be
    required in the BINTABLE to indicate how the 2D image was to be dissected.

    This is possible (it's a Paper I issue) but wouldn't it be simpler just to
    slice the data up and put each spectrum into the BINTABLE along with its
    WCS parameters?  This is really what BINTABLE was designed for.  WCS
    keywords for image arrays in the columns of a BINTABLE are already defined
    in Paper I, Appendix C. 

    Note that the celestial part of the WCS in this case does not correspond
    to a projection.  Each spectrum would have degenerate RA and Dec axes and
    the jCTYPn for these would not need a projection code, i.e. just 'RA' and
    'DEC' is sufficient.  However, a projection code might be added to
    integral field spectroscopy (IFS) data to indicate the projection to be
    used when the data was synthesised into a spectral cube.  (IFS attempts to
    fill the field with a regular array of apertures with minimal dead space.)

    WCS BINTABLE could be accommodated via a CTYPEn code in the primary header.
    For example, assuming we have wavelength along the x-axis,
    CTYPE1='WCS--TBL' and CTYPE2='WCS--TBL'.  The PVj_ms would define the axis
    coupling and BINTABLE extension.  There may conceivably be a requirement
    here for PSj_ms (though I don't see it at the moment).

    The proliferation of DVj_ms and DSj_ms (and PVj_ms and PSj_ms?)
    complicates usage of such tables.  Since tables are limited to 999
    columns, and we want at least 1,000 DVj_ms (or even 10,000?) per axis,
    it's obvious that they're not going to fit if each DVj_ms occupies one
    table column.  The only way to do it would be to pack them all into an
    array column with a special label, e.g. something like "DV1_*A" where the
    "*" indicates an array.  However, this could severely bloat the table.
    For example, as shown above, PLY needs 1048 DVj_ms to encode the current
    80 coefficient TAN+poly, so the DVj_*s arrays in the table would consist
    mostly of zeroes.  (This problem is solved with the revised PLY
    formulation.)


 6) Splines
    -------

    Spline distortion functions are needed.

    1D splines will be needed for spectra; cubic splines will suffice.

    B-splines have been used to correct for map plane distortion in the FOC on
    the HST so they should be supported.  For a Java applet, see
    http://www.cs.utah.edu/~sjones/spline.html.




