[daip] Problem with CC file
Eric Greisen
egreisen at nrao.edu
Tue Feb 9 16:10:56 EST 2010
Jose Afonso wrote:
>
>>
>> I would love to solve the abort - which should never happen - since we
>> try to catch every error condition and do something reasonable. Can
>> you send me some of the messages from a restart and also the IMHEADER
>> of the facet that is causing the problem? Is there a CC with a bad
>> header in that facet or no CC at all? If the former, did you try PRTCC?
>>
> things were actually a little more confusing: I managed to identify at
> least one problem (a trivial and embarrassing one in fact) that might
> have been behind the IMAGR crash and the restart problem. One of the
> disks that held the box files became full... It seems that this could
> have prevented IMAGR from writing new boxes and from cleaning
> deeper - and somehow made it crash...
>
> final message was:
> 1 2 09-FEB-2010 01:18:50 IMAGR QINIT: did a FREE of
> 252641 Kwords, OFF 35097307139645
> 1 2 09-FEB-2010 01:18:50 IMAGR QINIT: did a GET of
> 1491878 Kwords, OFF 35097111902781
> 1 8 09-FEB-2010 01:19:02 IMAGR ZABORS: signal 11 received
> 1 8 09-FEB-2010 01:19:02 IMAGR ABORT!
>
You have me pretty stumped but I will continue to study the problem.
Something went wrong with the imaging causing the task to make the
messages below (trying to clean a facet but having no points to clean)
until it got very upset and decided to switch from overlap 2 to overlap
1 mode. The QINIT message above says that it is taking 5.7 Gbytes of
RAM for the pseudo-AP. Do you have at least 8 Gbytes of RAM in your
machine? Otherwise this is not a good idea. The verb SETMAXAP controls
the upper limit and aips can adapt to smaller memory than it thinks
would be ideal. The abort is probably due to the not heavily tested
switch to overlap 1. I tried it under a number of circumstances, but
probably not all. Re-making the images a second time is an
example of something I would try to avoid (if possible).
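
As a rough check of the 5.7 Gbytes figure, the Kwords count in the QINIT
message can be converted to bytes. This is only a minimal sketch, assuming
4-byte words and 1 Kword = 1024 words (neither is stated in the messages).
The names kwords_to_gib and fits_in_ram are hypothetical helpers, not AIPS
routines, and the 2 GiB headroom is an illustrative choice.

```python
BYTES_PER_WORD = 4       # assumption: 32-bit words
WORDS_PER_KWORD = 1024   # assumption: binary kilo


def kwords_to_gib(kwords, bytes_per_word=BYTES_PER_WORD):
    """Convert a size in Kwords to GiB (2**30 bytes)."""
    return kwords * WORDS_PER_KWORD * bytes_per_word / 2**30


def fits_in_ram(kwords, ram_gib, headroom_gib=2.0):
    """True if the allocation still leaves headroom_gib free (illustrative rule)."""
    return kwords_to_gib(kwords) + headroom_gib <= ram_gib


if __name__ == "__main__":
    ap_kwords = 1491878  # from the "QINIT: did a GET of" message
    print(f"pseudo-AP: {kwords_to_gib(ap_kwords):.2f} GiB")
    print("fits in 8 GiB:", fits_in_ram(ap_kwords, 8.0))
    print("fits in 4 GiB:", fits_in_ram(ap_kwords, 4.0))
```

Under these assumptions the GET of 1491878 Kwords comes out a little under
6 GiB, which is why a machine with less than about 8 Gbytes of RAM would
want a lower limit set with SETMAXAP.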
> many thanks,
> Jose
>
Several thoughts:
1. Are you Cleaning way into the noise? The auto-boxing will stop
making boxes well before that level. Then, all it can do is Clean the
existing boxes until they become zero.
2. On a restart you might wish to use overlap=1 even though overlap 2 was
the better choice when you started the Clean.
3. Try DOTV=1 when you restart and see if you are Cleaning way down into
the noise or wish to add boxes by hand at this point.
4. Is your version of aips and imagr completely up to date? It was last
changed at the end of August 2009.
> before that there were a number of messages:
> 1 3 08-FEB-2010 23:53:21 IMAGR Doing no flagging this time
> 1 2 08-FEB-2010 23:53:21 IMAGR VISDFT: Begin DFT
> component subtraction
> 1 2 08-FEB-2010 23:53:21 IMAGR VISDFT: fields 46 - 46
> chns 1 - 38 in 38 CC models
> 1 3 08-FEB-2010 23:53:21 IMAGR I Polarization model
> processed
> 1 2 08-FEB-2010 23:53:30 IMAGR Imaging fields 8 18
> 1 4 08-FEB-2010 23:54:48 IMAGR Field 8 min = -3.0
> MilliJy,max = 1.4 MilliJy
> 1 4 08-FEB-2010 23:54:48 IMAGR Field 18 min = -2.9
> MilliJy,max = 1.2 MilliJy
> 1 2 08-FEB-2010 23:54:48 IMAGR BGC Clean: using 243
> cell beam + residuals > 698.87 MicroJy
> 1 2 08-FEB-2010 23:54:48 IMAGR 0 Residual map
> points loaded
> 1 6 08-FEB-2010 23:54:48 IMAGR CLEAN NO IMAGE PIXELS -
> TRY AGAIN
This message puzzles me under some circumstances. The histogram of the
residuals is made only in the Clean boxes and so the decision of the
level to load to the ap is based on that and should be lower than the
peak in the Clean boxes. Then the image with the peak residual should
always have some pixels above the cutoff. This may not happen in other
circumstances - primarily overlap 1 mode in which some images will not
be above the cutoff. So I do not see in overlap 2 mode how this can
happen and particularly how it can happen to all facets. I will stare
at it some more (there is a lot of layered code here!).
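
The reasoning above can be illustrated with a toy model. This is not the
IMAGR code: it is a minimal sketch that assumes the cutoff is a fixed
fraction of the peak absolute residual inside the Clean boxes (the real
task derives the level from a histogram), with hypothetical names
in_box_values, choose_cutoff and points_loaded. It shows that whenever the
cutoff is taken below a non-zero in-box peak, at least one pixel must load.

```python
def in_box_values(image, boxes):
    """Absolute residuals of the pixels covered by the boxes.

    image: list of rows; boxes: list of (x0, y0, x1, y1), inclusive corners.
    Pixels covered by overlapping boxes are counted once.
    """
    pixels = set()
    for x0, y0, x1, y1 in boxes:
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                pixels.add((x, y))
    return [abs(image[y][x]) for x, y in pixels]


def choose_cutoff(values, fraction=0.5):
    """Hypothetical rule: a fixed fraction (0 < fraction < 1) of the in-box peak."""
    return fraction * max(values)


def points_loaded(values, cutoff):
    """Number of in-box pixels strictly above the cutoff."""
    return sum(1 for v in values if v > cutoff)


if __name__ == "__main__":
    residuals = [
        [0.1, 0.2, 0.3],
        [0.4, 1.4, 0.2],
        [0.1, -0.2, 0.3],
    ]
    values = in_box_values(residuals, [(1, 1, 2, 2)])
    cutoff = choose_cutoff(values)
    print(points_loaded(values, cutoff))
```

In this model "0 Residual map points loaded" can only occur if the cutoff
is at or above the in-box peak, or if there are no box pixels at all, which
is consistent with the puzzlement expressed above.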
Cheers,
Eric Greisen
> This ended up happening for all 61 facets, then IMAGR remade images for
> all facets but then:
> ...
> 1 4 09-FEB-2010 00:40:20 IMAGR Field 59 min = -537.6
> MicroJy,max = 1.1 MilliJy
> 1 4 09-FEB-2010 00:40:20 IMAGR Field 60 min = -547.8
> MicroJy,max = 474.4 MicroJy
> 1 4 09-FEB-2010 00:40:20 IMAGR Field 61 min = -546.0
> MicroJy,max = 1.1 MilliJy
> 1 2 09-FEB-2010 00:40:23 IMAGR BGC Clean: using 191
> cell beam + residuals > 698.92 MicroJy
> 1 2 09-FEB-2010 00:40:23 IMAGR 0 Residual map
> points loaded
> 1 8 09-FEB-2010 00:40:23 IMAGR
> *****************************************
> 1 8 09-FEB-2010 00:40:23 IMAGR
> *****************************************
> 1 8 09-FEB-2010 00:40:23 IMAGR INFINITE LOOP CONDITION:
> OVERLAP -> 1
>
> beta07 AIPS (31DEC09) 2 09-FEB-2010
> 15:53:31 Page 985
> Pops Prior Date Time Task Messages for user 2
>
> 1 8 09-FEB-2010 00:40:23 IMAGR
> *****************************************
> 1 8 09-FEB-2010 00:40:23 IMAGR
> *****************************************
> 1 2 09-FEB-2010 00:40:23 IMAGR Making images for
> fields 1 through 61
> (and attempted to remake all images again)
> then it reduced the number of boxes in each field:
> ...
> 1 4 09-FEB-2010 01:18:47 IMAGR BOXFIX: Field 39 number
> boxes reduced from 11 to 9
> 1 4 09-FEB-2010 01:18:47 IMAGR BOXFIX: Field 40 number
> boxes reduced from 11 to 10
> 1 4 09-FEB-2010 01:18:47 IMAGR BOXFIX: Field 42 number
> boxes reduced from 11 to 9
> ...
> and managed to keep cleaning somewhat more:
> 1 2 09-FEB-2010 01:18:49 IMAGR BGC Clean: using 185
> cell beam + residuals > 675.18 MicroJy
> 1 2 09-FEB-2010 01:18:49 IMAGR 1283 Residual map
> points loaded
> 1 4 09-FEB-2010 01:18:49 IMAGR Field 15 min algorithm
> flux= -687.901 MicroJy iter= 142
> 1 3 09-FEB-2010 01:18:49 IMAGR Field 15 Clean flux
> density= 57.090 MilliJy 142 comps
> 1 3 09-FEB-2010 01:18:49 IMAGR Total Cleaned flux
> density = 9.487 Jy 14513 comps
> ...
> up to the point when the error message above (the first one) appeared.
>
> perhaps you can see here a desperate attempt from IMAGR to keep going in
> spite of not being able to write the Clean box files...
>
> When I afterwards tried to launch IMAGR, it complained, after
> correctly reading the box files:
> 1 7 09-FEB-2010 11:54:34 IMAGR ZTXIO: FORTRAN WRITE
> ERROR = 38
> 1 7 09-FEB-2010 11:54:34 IMAGR ZERROR: IN ZTXIO ERRNO =
> 38 (Function not implemented)
> 1 8 09-FEB-2010 11:54:34 IMAGR CPBOXF: ERROR 4 DOING
> WRIT
> 1 7 09-FEB-2010 11:54:34 IMAGR ZCLOSE: DOES NOT PERFORM
> TEXT FILE CLOSES AS OF 15OCT87
> 1 7 09-FEB-2010 11:54:34 IMAGR ZCLOSE: DOES NOT PERFORM
> TEXT FILE CLOSES AS OF 15OCT87
> 1 3 09-FEB-2010 11:54:34 IMAGR Purports to die of
> UNNATURAL causes
>
> I have now restarted AIPS to see how this goes...
>
> Apologies for the bogus call for help, it seems it was all due to this
> ridiculous disk problem...
>
> Cheers
> Jose
>