[daip] Re: AIPS related question

Patrick P. Murphy pmurphy at NRAO.EDU
Wed Aug 23 11:04:39 EDT 2000


On Wed, 23 Aug 2000 16:51:59 +0200, "Arno Schoenmakers" 
   <schoenmakers at nfra.nl> said: 

> I hope you can help us, otherwise sent this mail please to the person
> who can.

The "daip at nrao.edu" address is a good one to use; this goes to all of us
(those who are left at least) and the relevant person with the right
expertise can pick up on the problem that way.

> We have a problem with accessing tape drives on Linux machines in our
> institute using AIPS.  We can start up AIPS and do the 'mount' to mount
> the tape drive (a DAT device). The TPMON programs are then running:

dop10.> TPMON1: Begins on 2000.08.23 15:57:59
dop10.> TPMON2: Begins on 2000.08.23 15:57:59

The TPMON daemons are used to give *remote* computers access to *local*
tape drives.  If you're not trying to do that, then you don't even need to
have them run.

> However, when we try to read the tape using, e.g., PRTTP, we get the
> following message:

dop10.> PRTTP1: Task PRTTP  (release of 15OCT99) begins
dop10.> PRTTP1: ZTPOPN: STILL WAITING FOR FILE = DA00:TPD001001;

This is a file locking problem.  We have seen it too but have not tracked
it down other than noting that the Linux kernel NFS system does not always
remove file locks of dying or dead processes.  If you use the WHOLOCKED
utility program (available in the up-to-date 31DEC99 $SYSUNIX area as
source, and possibly $SYSLOCAL as a binary) you will get the process ID of
the locking task, but I predict that will not match any running process on
the Linux system.  That's what we've seen.
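
To confirm that a reported PID really is stale, it can be tested on the host where the locking task ran. What follows is only a minimal sketch, assuming a Linux host with /proc mounted; `pid_alive` is a hypothetical helper name (not an AIPS utility) and 12345 is a placeholder PID:

```shell
# pid_alive is a hypothetical helper, not an AIPS utility.
# Returns 0 if the PID names a live process on this host, non-zero otherwise.
pid_alive() {
    # kill -0 delivers no signal; it only checks that the process can be
    # signalled.  It also fails for a live process owned by another user,
    # so fall back to looking in /proc (assumption: Linux).
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

pid=12345   # placeholder: substitute the PID reported for the lock
if pid_alive "$pid"; then
    echo "PID $pid is running; the lock may be legitimate"
else
    echo "PID $pid is not running; the lock looks stale"
fi
```

The check only means something on the machine that took the lock; a PID from another NFS client says nothing about local processes.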

The quick and dirty solution is to ensure that the DA00 area for that host
(dop10) is on a purely local file system.  That's the setup the AOC has,
and they have not seen this problem.  In C'ville we have several systems
where the DA00 area ($AIPS_ROOT/DA00/$HOST/) is NFS mounted, and that's
where we see this problem.

The second, faster, quick-and-dirty solution is:

    cd $AIPS_ROOT/DA00/$HOST
    # copy first (keeping mode and timestamps), then replace the original
    cp -p TPD001001\; foo
    /bin/rm -f TPD001001\; && mv foo TPD001001\;

This effectively preserves the file while changing its inode, and thereby
orphans the lock.
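
The inode change can be checked with `ls -i` before and after the swap. A rough sketch of the same copy-and-replace steps wrapped in functions; `swap_inode` and `inode_of` are hypothetical names introduced here for illustration, and the temporary file name is arbitrary:

```shell
# Hypothetical helpers, not part of AIPS.

# Print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Replace a file with a copy of itself.  The copy is created while the
# original still exists, so it is guaranteed to get a different inode.
swap_inode() {
    f=$1
    tmp="$f.swap.$$"
    cp -p "$f" "$tmp" && rm -f "$f" && mv "$tmp" "$f"
}

# Usage, assuming the stuck file named in the ZTPOPN message:
#   cd "$AIPS_ROOT/DA00/$HOST"
#   inode_of 'TPD001001;'
#   swap_inode 'TPD001001;'
#   inode_of 'TPD001001;'     # should now print a different number
```

Quoting the file name avoids having to escape the trailing semicolon by hand.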

> The only way to break the loop is to kill the process PRTTP.EXE.

Nope, that's not true.  See above.

> We are using 15OCT99 on a platform running SuSe6.4 (Linux kernel 2.2.14).

NFRA used to have 31DEC99 and a midnight job; do you know whatever
happened to that?

				- Pat
-- 
  Patrick P. Murphy, Ph.D.            Division Head, Charlottesville Computing
  (804) 296-0372, 296-0236                National Radio Astronomy Observatory
  Home: http://www.chien-noir.com/      Work: http://www.cv.nrao.edu/~pmurphy/
   "Linux is Inevitable."  "Why?"  "Because it's alive!" - John MadDog Hall


