[daip] ZSTRTA.EXE
Carlos De Breuck
debreuck at iap.fr
Thu Mar 7 11:30:37 EST 2002
Hi Eric,
Thanks for your suggestions. You seem to have narrowed down the problem,
but I haven't been able to solve it yet.
I tried un-locking the SPD000000; file as you suggested, without success.
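For reference, a minimal shell sketch of that un-locking step, assuming it means replacing the file with a fresh copy so that any stale lock stays attached to the old file. The function name unlock_by_copy and the ".old" suffix are hypothetical, chosen only for illustration:

```shell
# unlock_by_copy FILE (hypothetical helper, not part of AIPS):
# rename FILE out of the way, then copy it back under the original name.
# The copy is a new file, so a lock held on the original should stay with
# the renamed one.
unlock_by_copy() {
    f=$1
    # Keep the original as FILE.old; stop if the rename fails.
    mv "$f" "$f.old" || return 1
    # Recreate FILE as a new copy, preserving mode and timestamps.
    cp -p "$f.old" "$f" || return 1
}
```

It would be called as unlock_by_copy "$NET0/GISCOURS/SPD000000;" with the name quoted, since the trailing semicolon would otherwise end the shell command.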
Next, I tried to rebuild the whole directory for one machine (called
giscours, but the problem is the same for all of them), using SYSETUP.
This gets me a bit further, but gives the following errors (using notv):
TVDEVS.SH: Starting TPMON daemons on GISCOURS asynchronously...
Starting up 31DEC00 AIPS with normal priority
ZDCHI1: ZERROR: ON FILE DA00:SPD000000;
ZDCHI1: ZERROR: IN ZDAOPN ERRNO = 13 (Permission denied)
ZDCHIN: COULD NOT READ PARAMETER FILE
ZDCHIN: (USING MINIMUM SYSTEM CONFIGURATION)
Begin the one true AIPS number 1 (release of 31DEC00) at priority = 0
DADEVS.PL: This program is untested under Perl version 5.006
AIPS 1: ZERROR: ON FILE DA00:SPD000000;
AIPS 1: ZERROR: IN ZDAOPN ERRNO = 13 (Permission denied)
ZDCHIN: COULD NOT READ PARAMETER FILE
ZDCHIN: (USING MINIMUM SYSTEM CONFIGURATION)
AIPS 1: You are NOT assigned a TV device or server
AIPS 1: You are NOT assigned a graphics device or server
AIPS 1: Enter user ID number
?134
(and then it hangs on process AIPS1)
This seems like a simple permission problem, so I changed all the files in
$NET0/GISCOURS to writeable for all users:
-rw-rw-rw- 1 aips aips 102400 Mar 7 15:45 ACD000000;
lrwxrwxrwx 1 aips aips 44 Mar 7 15:45 GRD000000; -> /home/soft/osf/aips/DA00/TARIQUET/GRD000000;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000001;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000002;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000003;
-rw-rw-rw- 1 aips aips 196608 Mar 7 15:45 MED000004;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000005;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000006;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000007;
-rw-rw-rw- 1 aips aips 561152 Mar 7 15:45 MED000008;
lrwxrwxrwx 1 aips aips 44 Mar 7 15:45 PWD000000; -> /home/soft/osf/aips/DA00/TARIQUET/PWD000000;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 SPD000000;
-rw-rw-rw- 1 aips aips 165888 Mar 7 15:45 TCD000001;
-rw-rw-rw- 1 aips aips 62464 Mar 7 15:45 TDD000004;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 TPD001001;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 TPD001002;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 TPD001003;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 TPD001004;
-rw-rw-rw- 1 aips aips 1024 Mar 7 15:45 TPD001005;
After this change, the errors disappear, but it hangs on ZSTRTA.EXE even
before it gets to asking the user number.
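The permission diagnosis above can be double-checked with a sketch like the following. An ERRNO = 13 on open can come from the file's own mode bits, but also from a parent directory that lacks search (x) permission, so the sketch walks up the path as well. The function name check_access is hypothetical, and the shell's -r/-w/-x tests may not reflect every restriction an NFS server applies:

```shell
# check_access FILE (hypothetical helper): report whether the current user
# can read and write FILE and search every directory leading to it.
# Returns 0 if all checks pass, 1 otherwise.
check_access() {
    file=$1
    status=0
    dir=$(dirname "$file")
    # Walk from the file's directory up to the top of the path.
    while :; do
        [ -x "$dir" ] || { echo "no search permission: $dir"; status=1; }
        parent=$(dirname "$dir")
        # dirname returns its argument unchanged at "/" or ".", so stop there.
        [ "$parent" = "$dir" ] && break
        dir=$parent
    done
    [ -r "$file" ] || { echo "not readable: $file"; status=1; }
    [ -w "$file" ] || { echo "not writable: $file"; status=1; }
    return $status
}
```

Run as the user who starts AIPS, for example check_access "$NET0/GISCOURS/SPD000000;", it prints one line per failed check and nothing when everything is accessible.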
So, I moved to your next option, and edited $AIPGUNIX/ZSTRTA.FOR
to change
CALL ZDCHIN (.TRUE., IBUF)
to
CALL ZDCHIN (.FALSE., IBUF)
This gets me a bit further. It brings up the Copyright 2002 message (which
is strange given that I'm using the 31DEC00 version). It gets to this:
...
TVDEVS.SH: Starting TPMON daemons on GISCOURS asynchronously...
Starting up 31DEC00 AIPS with normal priority
----------------------------------------------------------------------
Copyright (C) 2002
Associated Universities, Inc. Washington DC, USA.
...
Charlottesville, VA 22903-2475 USA
----------------------------------------------------------------------
Begin the one true AIPS number 1 (release of 31DEC00) at priority = 0
DADEVS.PL: This program is untested under Perl version 5.006
STARTPMON: [GISCOURS] Starting TPMON1 with output SUPPRESSED
STARTPMON: [GISCOURS] Starting TPMON2 with output SUPPRESSED
And then it hangs on AIPS1.
I then tried to see how far it got in ZSTRTA.EXE by adding some WRITE
statements. It goes almost to the end:
C     Activate AIPSx (x = MYPOPS).
      AIPSN = 'AIPS '
      WRITE(*,*)'It gets to here...'
      CALL ZACTV8 (AIPSN, MYPOPS, VERSIN, PID, IERR)
      WRITE(*,*)'but not here...'
C     Never expect to get here, but
C     just in case.
      IF (IERR.EQ.0) GO TO 999
      WRITE (MSGTXT,1060) IERR
 990  CALL MSGWRT (8)
C
So, it seems to hang on the CALL ZACTV8.
I haven't been able to get further than that, not even when I run AIPS
from the aips account.
Can you give me some more suggestions?
Thanks,
Carlos
> We have had similar reports occasionally. The most recent one seems
> to have stopped e-mailing me before telling me what if anything they
> found.
>
> In their case, it appeared that some user had decided to "clean up" and
> stopped a variety of processes that appeared dead. Perhaps he
> mistyped the process number or they were not really dead, but the
> computer came to a jarring stop. They got it to go back to a login
> screen eventually by killing more processes from a remote machine.
> But when they logged in, they had what you describe.
>
> My theory is that some one or more of the files in $NET0/<host-name>
> are thought by the operating system to be locked by another process,
> quite possibly over nfs (network file system).
>
> $AIPGUNIX/ZSTRTA.FOR starts by calling ZDCHIN which opens a disk file
> in that area called SPD000000;
>
> That is the most likely file to have this phantom lock.
>
> But I made suggestions like
> mv SPD000000\; SPD.old
> cp SPD.old SPD000000\;
> which should move the lock to SPD.old and leave SPD000000; unlocked
> and they said that did not work.
>
> They mv'd the entire directory and recreated a new one with FILAIP and
> claimed that that did not work.
>
> Of course, they rebooted the computer before all this. I asked about
> other computers that might have been accessing these files.
>
> Of late I have not heard from them and I was not entirely certain that
> they had followed the instructions literally.
>
> You can put print statements into
>
> $AIPGUNIX/ZSTRTA.FOR
>
> (not aips message calls which need ZDCHIN to have worked) ahead of and
> after the call to ZDCHIN and rebuild ZSTRTA with
>
> COMLNK $AIPGUNIX/ZSTRTA
>
> You could also try changing the call to ZDCHIN to
>
> CALL ZDCHIN (.FALSE.)
>
> (the 2nd argument is obsolete) which will skip the disk read. You
> probably do not want to do this long term but to debug what is going
> on...
>
> Let me know what happens,
>
> Eric Greisen
>
>