[evla-sw-discuss] Source repository and related structures

Fri Jun 20 16:17:52 EDT 2003

This is essentially the material I presented at the ASG meeting of
June 16, with additions and modifications based on the comments 
there.

There are four types of code areas, each with different
functions and properties.  There are also documentation areas, of
two different sorts - documentation of the code, much derived from
automatic documentation tools like javadoc; and documentation of
requirements, approaches and designs.  I'll not discuss these
here.  The code areas are my present concern.  The four types 
are briefly discussed below, and at the end of this document, 
proposed file structures will be presented.

1.  The CVS code management repository.  This is primarily source
code stored in history files (with suffixes of ,v).  Although things
other than source code can be stored in history files, there is
little attraction in doing so.  (The VLBA stores
some FPGA personalities under SCCS, but this is primarily a 
mechanism for getting into the common backups.)  When you say
"cvs checkout" or "cvs update" you are causing code to be fetched
from this repository to your current working directory.

2.  The system compilation area.  This is an area in which objects,
classes, and binaries are made.  I think a reasonable way to do this
is to institute a cron job that compiles any new code every night.
The code, objects, classes, and binaries in this area are therefore
more or less up to date, with everything any programmer has committed
to the archive.  This area need not be visible to the outside world,
but there are certain conveniences in having it so.  For instance, it
is an area that one can look at just to see current code, without
having to check it out from the repository, which has a certain 
amount of overhead.  Although this area is not absolutely forced
to have the same structure as the CVS repository, it makes using
CVS much more convenient to do so.
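
A minimal sketch of such a nightly job follows, as a POSIX shell
function.  The script name, paths, log file, and the "all" target are
assumptions for illustration; only "cvs update" and gmake come from the
text above.  The CVS and MAKE variables are hypothetical hooks that let
the commands be swapped out for a dry run.

```shell
#!/bin/sh
# Hypothetical nightly build helper; every path and name here is an assumption.
# A crontab entry such as the following would run it at 02:00 each night:
#   0 2 * * * /home/asg/bin/nightly_build.sh >> /home/asg/nightly.log 2>&1

CVS=${CVS:-cvs}        # CVS client to run
MAKE=${MAKE:-gmake}    # make program to run

# nightly_build DIR: refresh the checked-out tree in DIR, then rebuild it.
# The body runs in a subshell, so the cd does not affect the caller.
nightly_build() (
    cd "$1" || exit 1
    # -q: quiet; update -d: create new directories; -P: prune empty ones.
    "$CVS" -q update -d -P || exit 1
    # make itself decides which objects are out of date.
    "$MAKE" all
)

# The script run from cron would end with a call such as:
#   nightly_build /home/asg/evla/build
```

Since make rebuilds only what is out of date, a night with no commits
should cost little more than the "cvs update" itself.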

When the system becomes quasi operational, we will probably want to
put the real-time part of the system on a "release" basis.  This 
is probably most conveniently done by having a second system compilation
area for the release.  cvs has a "release" facility that would permit 
reusing the system compilation area for compiling both the release 
and the regular, nightly compilation of the current code.  However,
I see no advantage in using this other than a modest saving in the disk
space needed to hold the releases.

It should be emphasized that programmers do not work in the system
compilation area.  This area is updated only by the cron job.  The
release area is updated by whoever has the job of creating the
release.  Programmers work in their own compilation areas.

3.  The system include, library, jar, and binary areas.  The system 
binary areas are where a naive user goes to get code to execute.
This includes MIB system images to be loaded to the MIBs.

The jar area is where java classes are stored.  We may want one big
jar file with everything in it, or there may be reasons for wanting
several.  For instance, we might want to make sure that some code
does not accidentally get into the real-time system (code that 
stops and asks for help if it gets in trouble, for instance).
And we might want to restrict some code from general use, for instance
code with the capability of interfering with the insides of the 
real-time system in such a fashion that careless use might slow it
down intolerably.  The easiest way of managing such things would be
to put them in separate jar files.

The include area would be where all C and C++ include files that are
used in more than one software component live.  There would also be
include files within many software components, which define information
for the various source files but are never accessed outside the
component.  For instance, within the Nucleus Plus system code areas, 
there are a lot of include files that define system structures that
users should never have to deal with, but the file nucleus.h contains
such useful things as the tick rate of the system clock, and
should therefore be moved to the "include" component.

The library areas are where object libraries are stored  (that is, 
compiled code, as stored by ar or an equivalent archive manager).  
The programmer would point his linker to the system library area to
include up-to-date versions of everybody else's code.

Java classes are byte codes that execute on the JVM under any operating
system, under any architecture.  C and C++ object and binary codes 
do not have these nice properties.  The same code, if it is to
be used in different environments, must be recompiled specifically
for the environment.  There are likely to be a few cases in which
the code needs to do slightly different things in the different
environments; these should be handled by conditional compilation
statements.  (It is reasonable and normal to use the same include 
files for C and C++, with conditional compilation statements to take
account of the differences of the two languages.)  Therefore, there
must be OS specific system compilation areas and library/binary areas.
A single include area, and a single jar area for java, should suffice
under all operating systems.
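
Choosing the OS specific area can be automated.  The sketch below maps
the string reported by "uname -s" to an area name.  The function name
evla_os_area is hypothetical, and the area names linux, solaris, and
windows are taken from the directory layout proposed at the end of this
document.  The MIB area is not covered, since MIB code is cross-compiled
and cannot be detected from the host.

```shell
# evla_os_area [NAME]: print the OS specific area for a `uname -s` string.
# With no argument, the string is taken from the running system.
evla_os_area() {
    case "${1:-$(uname -s)}" in
        Linux)           echo linux ;;
        SunOS)           echo solaris ;;     # Solaris reports itself as SunOS
        CYGWIN*|MINGW*)  echo windows ;;     # Unix-like shells on Windows
        *)               echo unknown; return 1 ;;
    esac
}
```

For example, `evla_os_area SunOS` prints `solaris`, and an unrecognized
name prints `unknown` and returns a nonzero status so that a build
script can stop early.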

Unlike the system compilation area, the include, library, and binary
areas have no particular reason to share the file structure of the
CVS repository.

I am advocating a flat structure for the include, library, and 
binary areas.  By a flat structure, I mean that a single file
subdirectory would hold all includes, another all libraries, and
a third all binaries.  This has a disadvantage that there is some
likelihood of a collision of names of classes, functions, or include
files.  In practice I think both the chance and the consequence of 
such a collision are relatively small, especially with the use of 
sensible naming conventions.  The alternative is to take the java
route of building the file structure into the source code, which has
always impressed me as one of the least desirable features of java,
requiring extensive changing of code when one tidies up one's 
housekeeping.  

There is an argument for just placing the jar area at the top of the 
system compilation area, in that one can then look down the file 
structure, and immediately see the name that one must use to extract 
the class from the jar.

As with the system compilation areas, it is appropriate to have separate
include, library, and binary areas for the "compiled every night" stuff
and the "release" software.  Borrowing from VLBA usage, I suggest
the (admittedly not very descriptive) names "new" for the "compiled
every night" area, "code" for the release area, and "old" for the
last superseded release (in case something breaks and we need to 
switch back in a hurry).  (The VLBA correlator also found it very handy 
to have "old" to find out if a bug was introduced in the last release
or if it had been there all along, and just nobody had noticed.)
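
The rotation among these areas might be handled as sketched below.
This is an illustration only: the function names and the idea of
building the release in a separate staging directory are assumptions,
not part of the VLBA practice described above.

```shell
# promote_release ROOT STAGING: rotate code -> old, then install STAGING
# as the new code area.  STAGING must be an absolute path.
promote_release() (
    root=$1
    staging=$2
    [ -d "$staging" ] || { echo "no staged release: $staging" >&2; exit 1; }
    cd "$root" || exit 1
    rm -rf old                               # discard the release before last
    if [ -d code ]; then mv code old; fi     # keep the outgoing release
    mv "$staging" code                       # the staged tree becomes the release
)

# rollback_release ROOT: swap code and old, to switch back in a hurry.
rollback_release() (
    cd "$1" || exit 1
    [ -d code ] && [ -d old ] || exit 1
    mv code code.tmp && mv old code && mv code.tmp old
)
```

Because the rollback is a swap rather than a delete, the faulty release
stays available under "old" for comparison afterwards.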

4.  Programmer compilation areas.  These are like the system compilation
areas, but are located in the programmer's own area, either in his 
file system on /users/ or on the local disk of his own machine.  In this
area, he can compile, link, and load without interfering with the 
operational system.

For C and C++, he will need pointers to the system 'include' and 
'library' areas.  Providing these within a Makefile gets messy, in
part because of the different slope of the slash on Unix and Windows
systems.  I think the most straightforward way to handle this is
to use environment variables to point to these areas.  There also
needs to be a provision within the Makefile to link a main to a 
library or object under development.  This is easily provided through
an environment variable or through a command line option when 'gmake'
is invoked.
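
As a sketch of the environment variable approach, a programmer's login
script could define something like the following.  The names EVLA_INC,
EVLA_LIB, evla_env, and MAIN are hypothetical, and the root path is a
guess based on the layout proposed at the end of this document.

```shell
# evla_env ROOT AREA OS: export variables pointing at the system areas.
evla_env() {
    EVLA_INC="$1/$2/include"     # shared C/C++ headers, any OS
    EVLA_LIB="$1/$2/$3/lib"      # libraries built for one OS
    export EVLA_INC EVLA_LIB
}

# Point at the nightly ("new") Linux areas.
evla_env /home/asg/evla/amcs new linux

# A Makefile could then use -I$(EVLA_INC) and -L$(EVLA_LIB), and the main
# program to link could be named on the gmake command line, for example:
#   gmake MAIN=testmain.o
```

Switching a build from the nightly code to the release then means
changing one argument ("new" to "code") rather than editing Makefiles.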

At the ASG meeting I expressed concerns that the 'make' program 
provided with the Altium toolset is very primitive, and its 
documentation is even more rudimentary.  The program, mktri, 
appears to have no sensible iterator.  The good 'make' programs
allow you to provide a list of source files, for example,
    SOURCES = csc.c dmc.c dmce.c ....
and to derive a list of objects, by specifying the rule by which
sources are converted to objects.
    OBJECTS = $(SOURCES:%.c=%.obj)
mktri does not appear to support this feature.  Due to the extremely
sketchy documentation, I can't really swear the feature doesn't exist,
but I was unable to find it by trial and error.  I did find a way
of phrasing things that caused all SOURCES to be compiled, but
unfortunately, when any source changed, all SOURCES were recompiled.
Also, good 'make' programs are integrated with the C (or C++) 
preprocessor, so that one need not specify a dependence on files
included (with #include statements); make knows about such things.
Following a suggestion by Stephan Witz, I found a way to cause GNU 
make to use the GNU cc preprocessor to make a file listing these 
dependencies, and then to use that file to decide when to call the 
Altium cctri compiler.  Although a bit of a kludge, I consider this a 
much more attractive option than using the Altium mktri, and, unless 
stopped, will take steps to get this implemented.
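
One possible shape of the dependency step is sketched below; the actual
method is not shown in this note, so treat this as an assumption.  It
relies on "gcc -MM", which prints a make rule listing the non-system
headers each source file includes, and renames the ".o" targets to
".obj" to match the OBJECTS rule above.  The names to_obj_deps,
make_deps, DEPCC, EVLA_INC, and depend.mk are hypothetical.

```shell
# to_obj_deps: filter `gcc -MM` output, renaming each "name.o:" target
# to "name.obj:".  Continuation lines start with a space and are untouched.
to_obj_deps() {
    sed 's/^\([^ :]*\)\.o:/\1.obj:/'
}

# make_deps FILE...: write dependency rules for the given C sources to
# stdout.  Only the preprocessor runs here; nothing is compiled.
make_deps() {
    ${DEPCC:-gcc} -MM -I"${EVLA_INC:-.}" "$@" | to_obj_deps
}

# Typical use, with the result read by GNU make's "include" directive:
#   make_deps csc.c dmc.c dmce.c > depend.mk
```

With the rules in depend.mk, GNU make would call the compiler only for
objects whose sources or included headers have changed.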

Proposed Source Archive Directory Structure

It is presumed that code is grouped by functionality, with C, C++, java,
assembler, and other source forms grouped in the same subdirectory.
It is not clear to me whether we want code to reside only in the lowest
leaves of the file tree (the convention that vxWorks uses), or whether
code relevant to all lower leaves should be stored in the parent
directory, alongside the links to the subdirectories.

home/asg/cvsroot
    evla
	amcs			// Array Monitor and Control System
	    include		// C and C++ include files; other ancillary
				// files
	    mib			// MIB related files
		all		// Common software appearing in all MIBs
		comm		// Software for the communication link to MIBs
		f320		// Software specific to the F320 module MIB
		l301		// Software specific to the L301 module MIB
		l302 (etc)
	    cmp			// Software for the CMP interface to current 
				// VLA M/C system
	    cmib		// Correlator MIB software
	    cbe			// Correlator backend software
	    antenna		// Antenna object software
	    obs			// Other observing layer software
	vendor			// Purchased software
	    ati			// Vendor = ATI
		Nucleus_TC11IB	// Product = Nucleus_TC11IB, Plus OS
		    Plus	// Plus operating system core
			(As delivered OS: 71 .c files,
			   31 .h files, 2 .asm files, 6 misc)
		    SysComp	// Additional system components -
				// net stack, ether driver, shell

The suggestion above, that code be found only in the lowest leaf, is 
the origin of the 'all' subdirectory under 'mib'.  This gives the
impression that the executable for, say, the F320 MIB, being built
in the f320 directory, has to reach up and back down to get mib/all
and mib/comm routines.  While true in a logical sense, in a physical
sense, it reaches all the way to the top, and extracts them from the
system library.

Proposed Directory Structure for Libraries and Executables

home/asg/evla/amcs
    new
	include 		// .h files for C or C++, any OS
	mib
	    lib 		// Libraries for building MIB executables
				// Files with names like libPlus.a, 
				// libSys.a, libAll.a, maybe libF320.a
	    bin 		// MIB images, with names like F320mib, etc.
	linux
	    lib 		// libraries compiled for Linux/Intel
	    bin 		// Executables for Linux/Intel
	solaris (structure parallels linux)
	windows (structure parallels linux)
	java			// jar file(s)
				// Might be a symbolic link to the system
				// compilation area (or might be omitted 
				// entirely)
    code (structure parallels 'new')
    old  (structure parallels 'new')
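
The empty skeleton for this structure could be created with a few lines
of shell.  This is a sketch only: the function name is hypothetical,
and it assumes that "solaris" and "windows" each get the same lib and
bin subdirectories as "linux", and that "java" is a plain directory
rather than a symbolic link.

```shell
# make_area_skeleton ROOT: create the empty new/code/old trees under ROOT.
make_area_skeleton() (
    for area in new code old; do
        # One include area and one jar area per release level.
        mkdir -p "$1/$area/include" "$1/$area/java" || exit 1
        # One lib and one bin directory per target environment.
        for os in mib linux solaris windows; do
            mkdir -p "$1/$area/$os/lib" "$1/$area/$os/bin" || exit 1
        done
    done
)

# Example:
#   make_area_skeleton /home/asg/evla/amcs
```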