[Difx-users] mpicorrdifx cannot be loaded correctly on more than a single node
Arash Roshanineshat
arashroshani92 at gmail.com
Wed Jun 28 17:34:53 EDT 2017
Hi,
I was able to install DiFX, but it can only be run on a single node.
The *.machines and *.threads files are attached to this email.
Open MPI is installed on all nodes, and the DiFX folder and the data
folder are shared among the nodes over NFS. DiFX works perfectly, with
correct output, when run on a single machine.
Executing "startdifx -v -f e17d05-Sm-Sr_1000.input" returns the
following error:
DIFX_MACHINES -> /home/arash/Shared_Examples/Example2/C.txt
Found modules:
Executing: mpirun -np 6 --hostfile
/home/arash/Shared_Examples/Example2/e17d05-Sm-Sr_1000.machines --mca
mpi_yield_when_idle 1 --mca rmaps seq runmpifxcorr.DiFX-2.5
/home/arash/Shared_Examples/Example2/e17d05-Sm-Sr_1000.input
--------------------------------------------------------------------------
While computing bindings, we found no available cpus on
the following node:
Node: fringes-difx0
Please check your allocation.
--------------------------------------------------------------------------
Elapsed time (s) = 0.50417590141
Executing the following command instead
$ mpirun -np 6 --hostfile
/home/arash/Shared_Examples/Example2/e17d05-Sm-Sr_1000.machines
/home/arash/difx/bin/mpifxcorr
/home/arash/Shared_Examples/Example2/e17d05-Sm-Sr_1000.input
seems to work, but when I watch the CPU usage I see only 6 CPUs in
use: 5 on fringes-difx0 and 1 on fringes-difx1. I expected it to use
as many CPUs as the numbers given in the "*.threads" file.
How can I solve this issue?
The specification of the cluster is: sockets = 2, cores per socket = 10,
and threads per core = 2.
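
As a sanity check on the numbers above, the rank placement can be tallied from the attached *.machines file and compared with the CPU count. This is only a minimal Python sketch: it assumes the quoted socket/core/thread figures describe each node rather than the cluster as a whole, and the variable names are made up for illustration. It does not explain the "no available cpus" message; it only shows that the hostfile asks for far fewer ranks per node than there are logical CPUs.

```python
from collections import Counter

# Contents of the attached *.machines file: one MPI rank per line.
machines_text = """\
fringes-difx0
fringes-difx0
fringes-difx0
fringes-difx0
fringes-difx1
fringes-difx0
"""

# Assumption: these figures apply to each node individually.
sockets, cores_per_socket, threads_per_core = 2, 10, 2
logical_cpus = sockets * cores_per_socket * threads_per_core

# Count how many ranks the hostfile places on each host.
ranks_per_host = Counter(
    line.strip() for line in machines_text.splitlines() if line.strip()
)

for host, ranks in ranks_per_host.items():
    print(f"{host}: {ranks} rank(s) requested, {logical_cpus} logical CPUs")
```

The tally matches the placement seen with the plain mpirun command: 5 processes on fringes-difx0 and 1 on fringes-difx1, 6 in total.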
Best Regards
Arash Roshanineshat
-------------- next part --------------
fringes-difx0
fringes-difx0
fringes-difx0
fringes-difx0
fringes-difx1
fringes-difx0
-------------- next part --------------
NUMBER OF CORES: 2
20
19
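
To make the expectation about the *.threads file concrete, here is a small Python sketch that reads the attachment above. The layout (a "NUMBER OF CORES" header followed by one integer per line) is inferred from this one file, and reading each integer as the thread count of one core process is an assumption; parse_threads is a hypothetical helper, not part of DiFX.

```python
def parse_threads(text):
    """Return the list of per-core thread counts from a *.threads file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # The first line is expected to look like "NUMBER OF CORES: <n>".
    label, _, count = lines[0].partition(":")
    if label.strip() != "NUMBER OF CORES":
        raise ValueError(f"unexpected header: {lines[0]!r}")
    n_cores = int(count)
    # The next n_cores lines each hold one integer.
    threads = [int(value) for value in lines[1:1 + n_cores]]
    if len(threads) != n_cores:
        raise ValueError("fewer thread entries than NUMBER OF CORES")
    return threads


threads_text = """\
NUMBER OF CORES: 2
20
19
"""

threads = parse_threads(threads_text)
print(f"core processes: {len(threads)}, total threads: {sum(threads)}")
```

Under that reading the file asks for 20 + 19 = 39 threads across two core processes. Note that "mpirun -np 6" sets the number of MPI processes only, so the thread counts would have to be applied inside the processes rather than by mpirun itself.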