[daip] IMAGR errors on Linux box

Lynn D. Matthews lmatthew at cfa.harvard.edu
Tue May 1 17:55:46 EDT 2007


We installed a new version of the operating system in January, together
with a new RAID. Some time after this (I can't recall quite how long),
messages like the following began to appear at the console occasionally
(roughly once per day):

Message from syslogd@wrest at Fri Apr 27 16:47:20 2007 ...
wrest kernel: EDAC MC0: UE page 0x909e, offset 0x0, grain 4096, row 0,
labels "": i82860 UE

However, I had no known problems with the machine and we were unable to
pinpoint any definite cause.

It is only this past week that I suddenly started to experience the
bizarre behavior with IMAGR, and running any intensive task like IMAGR now
causes the above messages to occur every few minutes. Nothing on the
system has changed during the past week (that we are aware of), so this
makes it all the more puzzling.
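
In case it helps with narrowing this down, here is a minimal Python sketch for tallying these messages by memory controller and row. In EDAC terminology "UE" is an uncorrected memory error, so if the counts pile up on a single row, that would lean toward a hardware fault rather than a kernel problem. The function names (parse_edac, count_by_row) are made up for illustration, and the pattern assumes the wrapped console message above is a single line in the actual log. Where the log lives on a given system is not specified here.

```python
import re

# Matches the kernel EDAC line quoted above, assuming it is on one line.
# Only the fields visible in that message are captured.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<kind>UE|CE) page (?P<page>0x[0-9a-fA-F]+), "
    r"offset (?P<offset>0x[0-9a-fA-F]+), grain (?P<grain>\d+), row (?P<row>\d+)"
)


def parse_edac(line):
    """Return the fields of one EDAC log line as a dict, or None if no match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    return {
        "mc": int(m.group("mc")),
        "kind": m.group("kind"),          # "UE" or "CE"
        "page": int(m.group("page"), 16),
        "offset": int(m.group("offset"), 16),
        "grain": int(m.group("grain")),
        "row": int(m.group("row")),
    }


def count_by_row(lines):
    """Count uncorrected errors per (memory controller, row) pair."""
    counts = {}
    for line in lines:
        rec = parse_edac(line)
        if rec is not None and rec["kind"] == "UE":
            key = (rec["mc"], rec["row"])
            counts[key] = counts.get(key, 0) + 1
    return counts


# The message from this post, joined onto one line.
sample = ('wrest kernel: EDAC MC0: UE page 0x909e, offset 0x0, '
          'grain 4096, row 0, labels "": i82860 UE')

print(parse_edac(sample))
print(count_by_row([sample, sample, "some unrelated log line"]))
```

Feeding it the lines of the system log (instead of the sample list) would give the per-row totals.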

Lynn


On Tue, 1 May 2007, James Robnett wrote:

>
>     Good memory ... almost *exactly* 7 years ago....
>
> http://listmgr.cv.nrao.edu/pipermail/bananas/2000-June/000008.html
>
>      It's possible it's a kernel problem.  We haven't
> done any tests with 2.6.19 kernels.  Did the problem
> start after some recent upgrade?  Your message
> seems to imply that it 'recently' started doing this.
>
> If you haven't upgraded the kernel or OS components
> in that time then certainly hardware is a possibility.
>
> james
> ps: The original Redhat errata referenced in the
> above link seems to be lost to the world.
>
> On May 1, 2007, at 3:06 PM, Eric Greisen wrote:
>
> > This reminds me of a page fault bug that we had in RedHat about 7
> > years ago.  The program would go on computing before the new page was
> > rolled all the way in.  RedHat congratulated themselves that they
> > found the problem before anyone had encountered it - but we were at
> > the time studying issues with IMAGR and FRING.  AIPS beats on
> > computers hard enough that timing problems in disk I/O and in paging
> > will be exposed.  Are any of the disks mounted over NFS?
> >
> > I will forward this to a few other folks who may remember other
> > problems or know of Fedora issues.  Note that the error I am
> > remembering was a software issue in the OS rather than a pure
> > hardware issue.
> >
> > Good luck,
> >
> > Eric Greisen
>



