[daip] Problem with CC file

Eric Greisen egreisen at nrao.edu
Tue Feb 9 16:10:56 EST 2010


Jose Afonso wrote:
> 
>>
>> I would love to solve the abort - which should never happen - since we 
>> try to catch every error condition and do something reasonable.  Can 
>> you send me some of the messages from a restart and also the IMHEADER 
>> of the facet that is causing the problem?  Is there a CC with a bad 
>> header in that facet or no CC at all?  If the former, did you try PRTCC?
>>
> things were actually a little more confusing: I managed to identify at 
> least one problem (a trivial and embarrassing one in fact) that might 
> have been behind the IMAGR crash and the restart problem. One of the 
> disks that had the box files became full... It seems that this could 
> have prevented IMAGR from writing new boxes and from cleaning 
> deeper - and somehow it crashed...
> 
> final message was:
>   1    2   09-FEB-2010  01:18:50     IMAGR     QINIT: did a FREE of    
> 252641 Kwords, OFF   35097307139645
>   1    2   09-FEB-2010  01:18:50     IMAGR     QINIT: did a GET  of   
> 1491878 Kwords, OFF   35097111902781
>   1    8   09-FEB-2010  01:19:02     IMAGR     ZABORS: signal 11 received
>   1    8   09-FEB-2010  01:19:02     IMAGR     ABORT!
> 
You have me pretty stumped but I will continue to study the problem. 
Something went wrong with the imaging causing the task to make the 
messages below (trying to clean a facet but having no points to clean)
until it got very upset and decided to switch from overlap 2 to overlap 
1 mode.  The QINIT message above says that it is taking 5.7 Gbytes of 
RAM for the pseudo-AP.  Do you have at least 8 Gbytes of RAM in your 
machine?  Otherwise this is not a good idea.  The verb SETMAXAP controls 
the upper limit and aips can adapt to smaller memory than it thinks 
would be ideal.  The abort is probably due to the switch to overlap 1, 
which has not been heavily tested.  I tried it under a number of 
circumstances, but probably not all.  Re-making the images a second 
time is an example of something I would try to avoid (if possible).
> many thanks,
> Jose
>
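
For reference, the 5.7 Gbyte figure can be checked from the QINIT "GET" 
line.  This is only a sketch of the arithmetic: it assumes that one 
Kword is 1024 words and that a pseudo-AP word is 4 bytes, neither of 
which is stated in the messages themselves, and the function name is 
made up for the illustration.

```python
# Assumptions (not stated in the log): 1 Kword = 1024 words, 4 bytes per word.
WORDS_PER_KWORD = 1024
BYTES_PER_WORD = 4


def kwords_to_gbytes(kwords: int) -> float:
    """Convert a QINIT 'Kwords' count to Gbytes (taken here as 2**30 bytes)."""
    return kwords * WORDS_PER_KWORD * BYTES_PER_WORD / 2**30


# Kwords value copied from the "QINIT: did a GET of" message quoted above.
ap_gbytes = kwords_to_gbytes(1491878)
print(f"pseudo-AP request: {ap_gbytes:.1f} Gbytes")
```

Under those assumptions the request rounds to 5.7 Gbytes, which is why 
a machine with less than about 8 Gbytes of RAM would be uncomfortable.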

Several thoughts:
1. Are you Cleaning way into the noise?  The auto-boxing will stop 
making boxes well before that level.  Then, all it can do is Clean the 
existing boxes until they become zero.
2. A restart might wish to use overlap=1 even though overlap 2 was the 
better choice when you started the Clean.
3. Try DOTV=1 when you restart and see if you are Cleaning way down into 
the noise or wish to add boxes by hand at this point.
4. Is your version of aips and imagr completely up to date?  It was last 
changed at the end of August 2009.

> before that there were a number of messages:
>   1    3   08-FEB-2010  23:53:21     IMAGR     Doing no flagging this time
>   1    2   08-FEB-2010  23:53:21     IMAGR     VISDFT: Begin DFT 
> component subtraction
>   1    2   08-FEB-2010  23:53:21     IMAGR     VISDFT: fields 46 - 46 
> chns 1 - 38 in 38 CC models
>   1    3   08-FEB-2010  23:53:21     IMAGR     I Polarization model 
> processed
>   1    2   08-FEB-2010  23:53:30     IMAGR     Imaging fields    8   18
>   1    4   08-FEB-2010  23:54:48     IMAGR     Field    8 min =   -3.0 
> MilliJy,max =    1.4 MilliJy
>   1    4   08-FEB-2010  23:54:48     IMAGR     Field   18 min =   -2.9 
> MilliJy,max =    1.2 MilliJy
>   1    2   08-FEB-2010  23:54:48     IMAGR     BGC Clean: using  243 
> cell beam + residuals >   698.87 MicroJy
>   1    2   08-FEB-2010  23:54:48     IMAGR            0 Residual map 
> points loaded
>   1    6   08-FEB-2010  23:54:48     IMAGR     CLEAN NO IMAGE PIXELS - 
> TRY AGAIN


This message puzzles me under some circumstances.  The histogram of the 
residuals is made only in the Clean boxes and so the decision of the 
level to load to the ap is based on that and should be lower than the 
peak in the Clean boxes.  Then the image with the peak residual should 
always have some pixels above the cutoff.  This may not happen in other 
circumstances - primarily overlap 1 mode in which some images will not 
be above the cutoff.  So I do not see in overlap 2 mode how this can 
happen and particularly how it can happen to all facets.  I will stare 
at it some more (there is a lot of layered code here!).
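
The argument above can be put as a toy check.  This is not IMAGR code: 
the function names, the one-dimensional "image" and the residual 
values inside the boxes are all invented for the illustration, and 
comparing absolute values against the cutoff is an assumption.  Only 
the 698.87 MicroJy cutoff is taken from the messages.

```python
# Toy model only: names and pixel values are hypothetical, not from IMAGR.
def peak_in_boxes(residuals, in_box):
    """Largest absolute residual among pixels inside the Clean boxes."""
    return max(abs(r) for r, b in zip(residuals, in_box) if b)


def points_loaded(residuals, in_box, cutoff):
    """Count boxed pixels whose absolute residual exceeds the cutoff."""
    return sum(1 for r, b in zip(residuals, in_box) if b and abs(r) > cutoff)


residuals = [1.4e-3, -3.0e-3, 0.5e-3, -0.6e-3, 0.2e-3]  # Jy, invented values
in_box = [False, False, True, True, False]              # Clean-box mask

peak = peak_in_boxes(residuals, in_box)

# A cutoff below the boxed peak always loads at least the peak pixel.
print(points_loaded(residuals, in_box, 0.9 * peak))

# A cutoff above the boxed peak loads nothing, whatever lies outside the boxes.
print(points_loaded(residuals, in_box, 698.87e-6))
```

So "0 Residual map points loaded" needs a cutoff at or above the peak 
in the boxes, which is what should not be possible if the cutoff comes 
from the histogram of those same boxes.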

Cheers,

Eric Greisen

> This ended up happening for all 61 facets, then IMAGR remade images for 
> all facets but then:
> ...
>   1    4   09-FEB-2010  00:40:20     IMAGR     Field   59 min = -537.6 
> MicroJy,max =    1.1 MilliJy
>   1    4   09-FEB-2010  00:40:20     IMAGR     Field   60 min = -547.8 
> MicroJy,max =  474.4 MicroJy
>   1    4   09-FEB-2010  00:40:20     IMAGR     Field   61 min = -546.0 
> MicroJy,max =    1.1 MilliJy
>   1    2   09-FEB-2010  00:40:23     IMAGR     BGC Clean: using  191 
> cell beam + residuals >   698.92 MicroJy
>   1    2   09-FEB-2010  00:40:23     IMAGR            0 Residual map 
> points loaded
>   1    8   09-FEB-2010  00:40:23     IMAGR     
> *****************************************
>   1    8   09-FEB-2010  00:40:23     IMAGR     
> *****************************************
>   1    8   09-FEB-2010  00:40:23     IMAGR     INFINITE LOOP CONDITION: 
> OVERLAP -> 1
> 
> beta07     AIPS (31DEC09)       2               09-FEB-2010   
> 15:53:31               Page  985
> Pops  Prior    Date        Time       Task          Messages for user    2
> 
>   1    8   09-FEB-2010  00:40:23     IMAGR     
> *****************************************
>   1    8   09-FEB-2010  00:40:23     IMAGR     
> *****************************************
>   1    2   09-FEB-2010  00:40:23     IMAGR     Making images for 
> fields    1 through   61
> (and attempted to remake all images again)
> then it reduced the number of boxes in each field:
> ...
>   1    4   09-FEB-2010  01:18:47     IMAGR     BOXFIX: Field   39 number 
> boxes reduced from   11 to    9
>   1    4   09-FEB-2010  01:18:47     IMAGR     BOXFIX: Field   40 number 
> boxes reduced from   11 to   10
>   1    4   09-FEB-2010  01:18:47     IMAGR     BOXFIX: Field   42 number 
> boxes reduced from   11 to    9
> ...
> and managed to keep cleaning somewhat more:
>   1    2   09-FEB-2010  01:18:49     IMAGR     BGC Clean: using  185 
> cell beam + residuals >   675.18 MicroJy
>   1    2   09-FEB-2010  01:18:49     IMAGR         1283 Residual map 
> points loaded
>   1    4   09-FEB-2010  01:18:49     IMAGR     Field   15 min algorithm 
> flux= -687.901 MicroJy iter=      142
>   1    3   09-FEB-2010  01:18:49     IMAGR     Field   15 Clean flux 
> density=   57.090 MilliJy      142 comps
>   1    3   09-FEB-2010  01:18:49     IMAGR     Total Cleaned flux 
> density     =    9.487      Jy    14513 comps
> ...
> up to the point when the error message above (the first one) appeared.
> 
> perhaps you can see here a desperate attempt from IMAGR to keep going in 
> spite of not being able to write the clean box files...
> 
> When I tried to launch IMAGR afterwards, it complained as follows, 
> after correctly reading the box files...
>   1    7   09-FEB-2010  11:54:34     IMAGR     ZTXIO: FORTRAN WRITE 
> ERROR =     38
>   1    7   09-FEB-2010  11:54:34     IMAGR     ZERROR: IN ZTXIO ERRNO = 
> 38 (Function not implemented)
>   1    8   09-FEB-2010  11:54:34     IMAGR     CPBOXF: ERROR    4 DOING 
> WRIT
>   1    7   09-FEB-2010  11:54:34     IMAGR     ZCLOSE: DOES NOT PERFORM 
> TEXT FILE CLOSES AS OF 15OCT87
>   1    7   09-FEB-2010  11:54:34     IMAGR     ZCLOSE: DOES NOT PERFORM 
> TEXT FILE CLOSES AS OF 15OCT87
>   1    3   09-FEB-2010  11:54:34     IMAGR     Purports to die of 
> UNNATURAL causes
> 
> I have now restarted AIPS to see how this goes...
> 
> Apologies for the bogus call for help, it seems it was all due to this 
> ridiculous disk problem...
> 
> Cheers
> Jose
> 



