[daip] aips hangs
Eric Greisen
egreisen at nrao.edu
Wed Aug 18 15:34:23 EDT 2010
Ulrich Hiller wrote:
> Hello,
> I have a problem with a hanging aips. The messages are:
> :~> /aips/START_AIPS tv=local
> START_AIPS: Will use or start first available Unix Socket based TV
>
> You have a choice of 3 printers. These are:
>
> No. [ type ] Description
> -------------------------------------------------------------
> 1. [ PS] laps w
> 2. [ PS] laps i
> 3. [ PS] laps e
> -------------------------------------------------------------
>
> START_AIPS: Enter your choice, or the word QUIT [default is 1]:
> START_AIPS: Your initial AIPS printer is the laps w
> START_AIPS: - system name laps_w, AIPS type PS
>
> START_AIPS: User data area assignments:
> DADEVS.PL: This program is untested under Perl version 5.010
> (Using global default file /rw/aips/DA00/DADEVS.LIST for DADEVS.PL)
> Disk 1 (1) is /home/aips/DATA/AIDA47_1
> Disk 2 (2) is /disk1/aips/DATA/AIDA47_2
> Disk 3 (3) is /disk2/aips/DATA/AIDA47_3
> Disk 4 (4) is /disk3/aips/DATA/AIDA47_4
> Disk 5 (5) is /disk4/aips/DATA/AIDA47_5
> Disk 6 (6) is /disk5/aips/DATA/AIDA47_6
> Disk 7 (7) is /disk6/aips/DATA/AIDA47_7
>
> Tape assignments:
> Tape 1 is REMOTE
> Tape 2 is REMOTE
>
> START_AIPS: Starting TV servers on aida47 asynchronously
> START_AIPS: - WITH Unix Sockets as requested...
> START_AIPS: Starting TPMON daemons on AIDA47 asynchronously...
> Starting up 31DEC05 AIPS with normal priority
> Begin the one true AIPS number 1 (release of 31DEC05) at priority = 0
> AIPS 1: You are not on a local TV device, welcome stranger
> AIPS 1: You are assigned TV device/server 25
> AIPS 1: You are assigned graphics device/server 25
> AIPS 1: Enter user ID number
> ?DADEVS.PL: This program is untested under Perl version 5.010
> UNIXSERVERS: TVSRV1 is already running on host aida47, user linadm
> UNIXSERVERS: Start XAS1 on aida47, DISPLAY localhost:10.0
> XAS: ** TrueColor FOUND!!!
> XAS: Cannot use shared memory on remote XAS link
> XAS: !!! Shared memory not selected !!!
> XAS: Using screen width height 1270 924, max grey level 255
> UNIXSERVERS: Start graphics server TKSRV1 on aida47, display localhost:10.0
> UNIXSERVERS: Start message server MSSRV1 on aida47, display localhost:10.0
> STARTPMON: [AIDA47] Starting TPMON1 with output SUPPRESSED
>
> AIPS 1: Enter user ID number
> ?1000
>
> Then aips hangs. The X-Aips-tv-screen, the aips-teksrv-window and the
> aips-msgsrv-window came up. No error messages.
> The computer is freshly booted (before it did not work either), /tmp is
> cleaned.
> The system is opensuse 11.2 x86_64
>
> This happens also on new aips versions.
>
> How can I debug to know what the problem is?
> I do not know whether it gives a clue, but the aips disks are on a nfs
> mounted disk.
I am trying to figure out what you are really doing. If I read this
correctly, you are sitting in front of computer XXX with a window open
on aida47. In that window, you are executing an ancient version of
aips. Having read the user number (and I am assuming it hangs on all
user numbers??), it then needs to create a message file on AIDA47_1
and user catalogs on AIDA47_n for all n. It seems to be hanging there,
because the copyright messages would come out next and they never
appear. This implies to me that the daemons needed for file locking
over NFS are probably not working, if the disks are not on AIDA47.
Note that we find putting data areas on NFS to be a very bad idea:
NFS reads, and especially writes, are very slow compared to doing
things on a local disk.
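
One way to check the locking theory outside of aips is to request a lock
on a scratch file in each data area and see whether the request ever
returns. The following is a minimal sketch, assuming the data areas are
protected by ordinary POSIX advisory locks (an assumption, not something
established above). The name probe_lock and the 10 second timeout are
made up for illustration. On a hard-mounted NFS disk the process may
block in a way the alarm cannot interrupt, so a hang of the probe itself
is also a useful answer.

```python
import fcntl
import os
import signal
import sys
import tempfile


def probe_lock(directory, timeout=10):
    """Return True if an exclusive lock on a scratch file in `directory`
    is granted within `timeout` seconds, False if it fails or times out.

    Hypothetical helper: it only exercises POSIX advisory locking and
    does not touch any aips files.
    """
    def on_alarm(signum, frame):
        raise TimeoutError("lock request timed out")

    fd, path = tempfile.mkstemp(prefix="lockprobe_", dir=directory)
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(timeout)
    try:
        # Blocking request: this is where a broken lock service would hang.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        # TimeoutError is a subclass of OSError, so both cases land here.
        return False
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Directories to probe are given on the command line.
    for directory in sys.argv[1:]:
        status = "ok" if probe_lock(directory) else "FAILED or timed out"
        print(f"{directory}: locking {status}")
```

Run it on aida47 with the data area directories as arguments, for
example /home/aips/DATA/AIDA47_1 and /disk1/aips/DATA/AIDA47_2. A
directory that reports a failure or timeout is a candidate for the hang.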
In our setup in Socorro, we may have the central data areas for our
many machines on a central file server, but the
$AIPS_ROOT/DA00/<hostname> directories are actually symbolic links to a
directory on <hostname>. Similarly, the data areas for <hostname> are
on <hostname> even if they are reached via symbolic link. This whole
business of file locking is very important in a multi-process
environment, but it requires some mysterious daemon processes to be
running. We have found here that I could read files with locking on
most machines but not on a few. When those few were rebooted, I could
then read the files on them (PRTAC has the option to read all
accounting files in the LAN).
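
To see which directories really live on the local machine and which are
reached over NFS, one can follow the symbolic links and look the result
up in the mount table. The sketch below assumes a Linux system where
/proc/mounts is available. The function names are invented for
illustration, and mount points containing spaces (escaped in
/proc/mounts) are not handled.

```python
import os
import sys


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the file system type of the longest mount point that
    contains `path`, or None if no mount point matches."""
    best_point, best_type = "", None
    for point, fs_type in mounts:
        prefix = point.rstrip("/") + "/"
        inside = path == point or path.startswith(prefix)
        # ">=" lets a later mount on the same point override an earlier one.
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fs_type
    return best_type


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        mounts = parse_mounts(handle.read())
    for path in sys.argv[1:]:
        real = os.path.realpath(path)  # follow symbolic links first
        print(f"{path} -> {real} [{fs_type_for(real, mounts)}]")
```

Given the data area paths as arguments, any line ending in a type such
as nfs or nfs4 marks a directory that depends on remote file locking.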
I seem to remember very similar questions from someone else with
machines named AIDAnn - perhaps you should ask around locally.
Furthermore, 31DEC05 is rather old. We are proud of what we have
accomplished since...
Eric Greisen