[bananas] Redhat 6.2 and Aips

James Robnett jrobnett at aoc.nrao.edu
Wed Jun 7 12:28:26 EDT 2000


[Moderator's note: this is relevant to AIPS users with Linux systems,
 especially if you purchased one from Dell with Linux pre-installed.
 - Pat Murphy]

  We have discovered a problem with stock Redhat 6.2 systems.

  According to Redhat:
http://www.redhat.com/support/errata/RHBA-2000013-01.html
  
  The default kernel that ships with Redhat 6.2 (2.2.14-5) and kernels
2.2.13-0.13 and 2.2.14-6.1 all suffer from a context switch bug.

"  Red Hat, Intel and Dell have uncovered a problem with the Red Hat Linux 
 6.2 for the x86 (Intel) processor. This problem has been duplicated and 
 confirmed in our lab, though we have had no reports from customers at 
 large. This problem affects all OEM system manufacturers shipping Red Hat 
 Linux 6.2 preinstalled on x86 processor-based systems.
   Under extremely heavy load, data corruption can occur if a page fault 
 occurs during a task switch between kernel space and user space on x86 
 platforms. This problem may affect customer systems, particularly systems 
 that are swapping to disk very heavily (thrashing), or are otherwise very 
 heavily loaded. Lightly loaded servers are unlikely to be affected. This
 issue affects all x86 compatible systems running the kernels listed below."

[Mod: be careful to check the README if you get the patches]

  We have seen AIPS tasks, in particular IMAGR, as well as the DDT and Y2K
benchmarks, exhibit this bug on P-III 600MHz systems with 128MB of memory.
The bug was not apparent on 466MHz Celeron systems with 128MB of memory.

[Mod: the "Y2K" benchmark is a new, experimental version of the classic
 DDT with an even larger (but not huge) data set; it's not yet ready for
 release.  It takes about an hour on reasonable PCs.]

  If you haven't already (I think CV has on at least artemis) you should
upgrade your kernels.
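
  As a quick way to see whether a machine is running one of the kernels
named above, something like the following could be used.  This is only a
rough sketch: the function name is_affected is made up for illustration,
the list holds the version strings exactly as written in this message, and
the assumption that the reported release may carry a build suffix such as
"smp" is mine, so the string your system actually reports may differ.

```python
import platform

# Kernel versions as written in this message (taken from the Red Hat advisory).
AFFECTED_RELEASES = ("2.2.14-5", "2.2.13-0.13", "2.2.14-6.1")

# Build-flavour suffixes that may follow the version (an assumption).
FLAVOUR_SUFFIXES = ("smp", "enterprise", "BOOT")


def is_affected(release):
    """Return True if a `uname -r` style string names a listed kernel."""
    base = release.strip()
    for suffix in FLAVOUR_SUFFIXES:
        if base.endswith(suffix):
            base = base[: -len(suffix)]
    # Exact match only; a differently formatted release string will not match.
    return base in AFFECTED_RELEASES


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    if is_affected(running):
        print("Kernel %s is on the affected list; upgrade it." % running)
    else:
        print("Kernel %s is not on the list from this message." % running)
```

  A "not on the list" answer only means the string did not match exactly,
so check the errata page itself before trusting it.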

  AIPS and AIPS++ maintainers should probably be on the lookout for problem
reports from non-NRAO clients.  Note that this is a kernel issue: any software
package can exhibit this behavior, not just AIPS or AIPS++.

  I'll leave it up to Eric Greisen to explain in more detail the symptoms
seen under AIPS since I'd probably just make a fool of myself if I tried.
The symptoms will be application specific; basically, be on the lookout
for results that vary widely from one run to the next.

James Robnett

[Mod: Eric tells me the symptoms were NaNs (the IEEE floating point
 format's "not a number" magic value) for the most part, where they did
 not belong.  Running the DDT repeatedly (the old one) would sometimes
 reveal a whole slew of UVSRT differences and each run might be slightly
 different.]
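
  For anyone who wants to look for the same two symptoms in their own
output (stray NaNs, and results that change between identical runs), here
is a small sketch.  It is not how the DDT compares its results; the
function names and the idea of holding each run's output as a plain list
of floats are assumptions made for illustration.

```python
import math


def count_nans(values):
    """Count the NaN entries in a sequence of floats."""
    return sum(1 for v in values if math.isnan(v))


def same_value(a, b):
    """Compare two floats, treating NaN as equal to NaN."""
    # Under IEEE 754, NaN != NaN, so that case needs an explicit check.
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


def differing_indices(run_a, run_b):
    """Return the positions at which two equal-length runs disagree."""
    if len(run_a) != len(run_b):
        raise ValueError("runs must have the same length")
    return [i for i, (a, b) in enumerate(zip(run_a, run_b))
            if not same_value(a, b)]


if __name__ == "__main__":
    first = [1.0, 2.0, 3.0]
    second = [1.0, float("nan"), 3.0]
    print("NaNs in second run:", count_nans(second))
    print("positions that differ:", differing_indices(first, second))
```

  Identical input on a healthy machine should give no NaNs where none are
expected and an empty list of differences from one run to the next.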


