[Difx-users] Multiple Heads in same subnet - one waits for the other to finish ?
Stuart Weston
stuart.weston at aut.ac.nz
Wed Apr 19 22:37:53 EDT 2017
I have two head nodes, each head node has 6 workers.
I split the job up into two groups of files, the idea being Head-1 does scans/files 1-6 and Head-2 does scans/files 7-11.
I created two separate input files with different file lists etc., and also two separate thread and machine files appropriate to the two different groups of IP addresses:
head-1, worker-1-1, worker-1-2 ... worker-1-6
head-2, worker-2-1 ... worker-2-6
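For illustration, the two machine files could be generated roughly as follows. This is only a sketch: it assumes the hostnames follow the pattern listed above and that each machine file holds one hostname per line with the head node first. The file names machines-1 and machines-2 match the ones passed to mpirun; the real files may be laid out differently.

```shell
#!/bin/sh
# Sketch: write one machine file per head node, one hostname per line.
# Hostnames are assumed to follow the head-N / worker-N-M pattern above;
# adjust them to whatever the real cluster uses.
for group in 1 2; do
    out="machines-${group}"
    # Head node goes first in the file.
    echo "head-${group}" > "$out"
    # Then the six workers belonging to that head node.
    for w in 1 2 3 4 5 6; do
        echo "worker-${group}-${w}" >> "$out"
    done
done
```

Each file then lists only the hosts of its own group, so neither job should be scheduling processes on the other group's workers.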
So I set the two jobs running in parallel:
Head-1 > mpirun -machinefile machines-1 -np 5 mpifxcorr hw03_1.input
Head-2 > mpirun -machinefile machines-2 -np 5 mpifxcorr hw03_2.input
Now, all the machines are in the same subnet. I am guessing that some communication is going on between the two groups, because Head-2 seems to wait while Head-1 processes files 1-6; once Head-1 has finished, Head-2 gets busy doing files 7-11.
Is there any way to have Head-1 and Head-2 running at the same time, i.e. so that Head-2 doesn't wait for Head-1 to finish?
Stuart Weston Bsc (Hons), MPhil (Hons), MInstP
Mobile: 021 713062
Skype: stuart.d.weston
Email: stuart.weston at aut.ac.nz
http://www.atnf.csiro.au/people/Stuart.Weston/index.html
Software Engineer
Institute for Radio Astronomy & Space Research (IRASR)
School of Computing & Mathematical Sciences
Faculty of Creative Technologies
Auckland University of Technology, New Zealand.
http://www.irasr.aut.ac.nz/