[evla-sw-discuss] power outage recovery log

James Robnett jrobnett at nrao.edu
Sun Oct 24 12:39:42 EDT 2010


Apologies, entirely my fault.  In the process of enabling the automounter
to control mounting of native lustre filesystems I had to make a slight
change to where clients and servers statically mounted the root lustre
filesystem at boot time.

I thought I'd updated all of them but overlooked the EVLA metadata server(1).

It doesn't mount the lustre filesystem, but the path where it mounts
the metadata target (mdt) was the same one that the automounter now wanted
to control, so the automounter prevented the mount.

James
1) In fact I overlooked the metadata server on all 3 lustre instances:
the AOC, EVLA and test versions would all have failed if their metadata
server rebooted.  I had named their mountpoint /lustre/mdt even though it's
not really a lustre mount.  It's now /export/lustre/mdt.
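
For anyone wanting to check for this kind of collision ahead of a reboot,
here is a minimal sketch.  It assumes the automounter controls /lustre
(my reading of the note above, not a confirmed map entry) and that the
static mountpoints have already been collected into a list; the function
names are hypothetical, not part of any existing tool.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """Return True if path equals root or lies somewhere beneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def conflicting_mounts(static_mounts, automount_roots):
    """List static mountpoints that fall inside an automounter-managed tree."""
    return [
        m for m in static_mounts
        if any(is_under(m, root) for root in automount_roots)
    ]


# Assumption: the automounter manages /lustre.
automount_roots = ["/lustre"]

# Old and new mdt mountpoints from the footnote above.
print(conflicting_mounts(["/lustre/mdt", "/export/lustre/mdt"], automount_roots))
```

The comparison is done on path components rather than string prefixes, so
a sibling directory such as /lustre2 would not be flagged by mistake.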

> Recovery from power outage, October 23, 2010.
>
> 2.  Lustre, according to KScott, was mis-configured.  He fixed that
> and rebooted cbe-master.  CBE, and d10, were then happy.
>
> 4.  Once we were getting data through the system it was apparent that
> telcal was unhappy.  mchammer was not able to see lustre data.  I
> suspect that the archive cannot either and that the problem is the
> same as that with cbe-master.  KScott is looking at it now.  mchammer
> was rebooted.




