[Difx-users] Error in running the startdifx command with DiFX software

Adam Deller adeller at astro.swin.edu.au
Sun Jul 9 22:36:28 EDT 2023


Hi De Wu,

That's still a little odd to me, since the header should not differ at all if
the same .im file is used.  (Any difference in the header would normally be
due to different uvw entries resulting from different CALC results.)  Maybe
that is spurious, though.  I think the remaining differences seen in the
visibility contents are probably due to an amplitude scaling effect that would
disappear once the visibilities are normalised by the autocorrelations.  The
reference correlation is so old now that I think it has a couple of
now-corrected errors in it which affected the amplitude scaling at the
sub-percent level.
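
For reference, the normalisation I mean is just dividing each cross-correlation
spectrum by the geometric mean of the two corresponding autocorrelation
spectra, channel by channel; a per-station or common amplitude scaling factor
cancels out of that ratio.  In plain notation:

```
Vnorm_12(f) = V_12(f) / sqrt( A_11(f) * A_22(f) )
```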

Cheers,
Adam



On Mon, 10 Jul 2023 at 10:44, 深空探测 via Difx-users <
difx-users at listmgr.nrao.edu> wrote:

> Hi Adam,
>
> I made the change in the DiFX setup, specifically replacing the
> "example_1.im" file with the "reference_1.im" file.
>
> After performing the file replacement and rerunning the "diffDiFX.py"
> program, I observed results consistent with your previous advice. The
> program displayed the following outcome:
>
> - At the end, 1032 records disagreed on the header.
> - After 1848 records, the mean percentage absolute difference was
> 0.05823689, and the mean difference was -0.01528561 + 0.00029173i.
>
> These results indicate that the differences between the two files have
> indeed become very small. I sincerely appreciate your assistance and
> guidance in this matter.
>
> Best regards,
>
> De Wu
>
> On Sat, 8 Jul 2023 at 15:40, Adam Deller <adeller at astro.swin.edu.au> wrote:
>
>> Hi De Wu,
>>
>> For the differences to be so gross with the rdv70 dataset, I would guess
>> that the discrepancy is caused by a difference in the model being
>> used.  difxcalc will generate a slightly different model compared to the
>> older calcif2.  If you want to compare apples to apples, you can copy the
>> reference .im file to the name wude_1.im and then run the correlation
>> again, ensuring that the .im file is not regenerated (you can add
>> "--dont-calc" to the startdifx invocation to be sure); you should then
>> hopefully see the differences disappear.
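>>
>> Something along these lines should do it (just a sketch; adjust the file
>> names to match whatever your rdv70 job is actually called):
>>
>> ```
>> # use the reference delay model rather than a locally regenerated one
>> cp reference_1.im wude_1.im
>> # rerun the correlation, telling startdifx not to regenerate the .im file
>> startdifx -v -f --dont-calc wude_1.input
>> ```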
>>
>> Because the S/N on the individual visibility points is very low, even a
>> small change in the model leads to a large difference in the result.
>>
>> Cheers,
>> Adam
>>
>> On Fri, 7 Jul 2023 at 12:52, 深空探测 <wude7826580 at gmail.com> wrote:
>>
>>> Hi Adam,
>>>
>>> I wanted to give you an update on my previous confusion regarding the
>>> "--mca" option. I have finally gained a clear understanding, and I can
>>> confirm that the "startdifx" command executed successfully without the
>>> "--mca" option.  I sincerely apologize for not thoroughly comprehending the
>>> implications of that option, which caused the recurring issues with mpirun.
>>>
>>> For testing purposes, I utilized the rdv70 dataset and followed the
>>> instructions outlined in the README file. Specifically, I used the command
>>> "diffDiFX.py reference_1.difx/DIFX_54656_074996.s0000.b0000
>>> example_1.difx/DIFX_54656_074996.s0000.b0000 -i example_1.input" to compare
>>> my own computed results (example data) with the reference data. The final
>>> two lines of the displayed results were as follows:
>>>
>>> "At the end, 1320 records disagreed on the header.
>>> After 1848 records, the mean percentage absolute difference is
>>> 67.67053685, and the mean difference is 2.27732242 + 5.89398558 i."
>>>
>>> Although I did not encounter any issues during the data processing, it
>>> appears that there are substantial differences in the comparison results. I
>>> am uncertain about the specific step that might have caused this problem.
>>>
>>> Furthermore, when I generated the directory for the 1234 files using the
>>> "difx2mark4 -e 1234 example_1.difx" command, I encountered an issue when
>>> executing the command "fourfit -pt -c ../1234 191-2050" within the 1234
>>> directory. The resulting error messages were as follows:
>>>
>>> "fourfit: Invalid $block statement '$STATION A B BR-VLBA AXEL 2.0000
>>> 90.0 ......
>>> fourfit: Failure in locate_blocks()
>>> fourfit: Low-level parse of
>>> '/home/wude/difx/test_data/rdv70/1234/191-2050//4C39_25.2SN1CT' failed
>>> fourfit: The above errors occurred while processing
>>> fourfit: 191-2050//4C39_25.2SN1CT
>>> fourfit: the top-level resolution is as follows: Error reading root for
>>> file 191-2050/, skipping."
>>>
>>> However, when I conducted a test using the tc016a.pulsar dataset and ran
>>> the command "fourfit -pt -c ../1234 No0040," I successfully obtained the
>>> interference fringe image.
>>>
>>> Thank you for your time and support.
>>>
>>> Best regards,
>>>
>>> De Wu
>>>
>>> On Thu, 6 Jul 2023 at 12:19, Adam Deller <adeller at astro.swin.edu.au> wrote:
>>>
>>>> Sorry, I just saw that you had done this (and reported in your second
>>>> email):
>>>>
>>>> *Subsequently, I proceeded to run the command "mpirun -np 8
>>>> -machinefile wude_1.machines mpifxcorr wude_1.input," and I was able to
>>>> obtain the ".difx" files successfully.*
>>>>
>>>> So if you edit the startdifx file and find where mpirun is being
>>>> invoked, and remove those --mca options, you should be fine.
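>>>>
>>>> For example, something like this should show you where to look (just a
>>>> pointer; the exact line will differ between DiFX versions):
>>>>
>>>> ```
>>>> # find the startdifx script and the lines where the --mca options are added
>>>> grep -n "mca" $(which startdifx)
>>>> ```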
>>>>
>>>> Cheers,
>>>> Adam
>>>>
>>>> On Thu, 6 Jul 2023 at 14:16, Adam Deller <adeller at astro.swin.edu.au>
>>>> wrote:
>>>>
>>>>> Hi Wu,
>>>>>
>>>>> calcif2 is the delay-generating program that requires the calcserver
>>>>> to be running (which wasn't the case for you). Setting
>>>>> DIFX_CALC_PROGRAM=difxcalc determines which program will be called by
>>>>> startdifx.  But you were trying to run calcif2 itself from the command
>>>>> line, so naturally this won't work.  If you run difxcalc wude_1.calc, it
>>>>> should work.  And as you saw, if you run startdifx after setting
>>>>> DIFX_CALC_PROGRAM=difxcalc, that also works fine.
>>>>>
>>>>> Once you have run difxcalc (or calcif2), the .im file will be
>>>>> generated. If you try to run difxcalc/calcif2 again once the .im file
>>>>> exists, it won't run unless you force it (since it sees that the .im
>>>>> file has already been generated, so there is no need to regenerate it).
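>>>>>
>>>>> If you do want to regenerate it, the unambiguous way is simply to remove
>>>>> the existing file first, e.g.:
>>>>>
>>>>> ```
>>>>> # delete the existing delay model so it gets recreated from the .calc file
>>>>> rm wude_1.im
>>>>> difxcalc wude_1.calc
>>>>> ```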
>>>>>
>>>>> So your remaining problem now is that MPI seems to think that you
>>>>> don't have any available CPUs on your host.  Once again (I think this is
>>>>> the third time I'm making this suggestion): please try running the mpirun
>>>>> command *without* the --mca options.  I.e.,
>>>>>
>>>>> mpirun -np 4 --hostfile wude_1.machines runmpifxcorr.DiFX-2.6.2
>>>>> wude_1.input
>>>>>
>>>>> You may also have success by adding --oversubscribe to the mpirun
>>>>> command (although that is more of a band-aid to get around the fact that
>>>>> openmpi apparently isn't seeing how many CPUs are available).
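>>>>>
>>>>> For example (combining the two suggestions above; only add
>>>>> --oversubscribe if the plain command still complains about CPUs, and
>>>>> note that older OpenMPI versions may not accept that option):
>>>>>
>>>>> ```
>>>>> mpirun -np 4 --oversubscribe --hostfile wude_1.machines runmpifxcorr.DiFX-2.6.2 wude_1.input
>>>>> ```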
>>>>>
>>>>> If you can figure out which mpirun option is causing the problem, you
>>>>> will then be able to modify startdifx so that it always omits the
>>>>> offending option.
>>>>>
>>>>> Cheers,
>>>>> Adam
>>>>>
>>>>> On Tue, 4 Jul 2023 at 17:30, 深空探测 <wude7826580 at gmail.com> wrote:
>>>>>
>>>>>> Subject: Issue with DiFX Testing - RPC Errors and CPU Allocation
>>>>>>
>>>>>> Hi Adam,
>>>>>>
>>>>>> I apologize for the delay in getting back to you. I've been
>>>>>> conducting tests with DiFX lately, and I encountered a few issues that I
>>>>>> would appreciate your insight on.
>>>>>>
>>>>>> Initially, I faced problems running the `mpirun` command, but I
>>>>>> managed to resolve them by reinstalling DiFX on a new CentOS7 system.
>>>>>> Previously, I had installed `openmpi-1.6.5` in the `/usr/local` directory,
>>>>>> but this time, I used the command `sudo yum install openmpi-devel` to
>>>>>> install `openmpi`, and then I installed DiFX in the
>>>>>> `/home/wude/difx/DIFXROOT` directory. Following this setup, the `mpirun`
>>>>>> command started working correctly. I suspect that the previous installation
>>>>>> in the system directory might have been causing the issues with `mpirun`.
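>>>>>>
>>>>>> As a sanity check on which MPI installation is actually being picked
>>>>>> up, something like the following can be used:
>>>>>>
>>>>>> ```
>>>>>> # show which mpirun is on the PATH and which OpenMPI version it belongs to
>>>>>> which mpirun
>>>>>> mpirun --version
>>>>>> ```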
>>>>>>
>>>>>> However, I encountered a new problem when running the command
>>>>>> `calcif2 wude_1.calc`. The output displayed the following error:
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------------------------
>>>>>> calcif2 processing file 1/1 = wude_1
>>>>>> localhost: RPC: Program not registered
>>>>>> Error: calcif2: RPC clnt_create fails for host: localhost
>>>>>> Error: Cannot initialize CalcParams
>>>>>>
>>>>>> ----------------------------------------------------------------------------------------
>>>>>>
>>>>>> Previously, I resolved a similar error by running the command:
>>>>>> `export DIFX_CALC_PROGRAM=difxcalc`. However, when I tried the same
>>>>>> solution this time, it didn't resolve the issue.
>>>>>>
>>>>>> Additionally, when running the command: `mpirun -np 4 --hostfile
>>>>>> wude_1.machines --mca mpi_yield_when_idle 1 --mca rmaps seq
>>>>>> runmpifxcorr.DiFX-2.6.2 wude_1.input`, the output displayed the following
>>>>>> message:
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>>> While computing bindings, we found no available CPUs on the following
>>>>>> node:
>>>>>>     Node: wude
>>>>>> Please check your allocation.
>>>>>>
>>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>>>
>>>>>> My hostname is "wude", and it seems like there are no available CPUs,
>>>>>> but I can't determine the cause of this issue. Hence, I am reaching out to
>>>>>> seek your guidance on this matter.
>>>>>>
>>>>>> Thank you for your time and support.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> De Wu
>>>>>>
>>>>>> On Mon, 26 Jun 2023 at 07:36, Adam Deller <adeller at astro.swin.edu.au> wrote:
>>>>>>
>>>>>>> Have you tried removing the --mca options from the command? E.g.,
>>>>>>>
>>>>>>> mpirun -np 4 --hostfile /vlbi/aov070/aov070_1.machines
>>>>>>> runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input
>>>>>>>
>>>>>>> I have a suspicion that either the seq or rmaps option is not
>>>>>>> playing nice, but it is easiest to just remove all the options and see if
>>>>>>> that makes any difference.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Adam
>>>>>>>
>>>>>>> On Mon, 26 Jun 2023 at 01:58, 深空探测 <wude7826580 at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Adam,
>>>>>>>>
>>>>>>>> As you suggested, I removed the "| head" from the command, and I
>>>>>>>> was able to run it successfully.
>>>>>>>>
>>>>>>>> However, when I executed the following command: "mpirun -np 4
>>>>>>>> --hostfile /vlbi/aov070/aov070_1.machines --mca mpi_yield_when_idle 1 --mca
>>>>>>>> rmaps seq runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input", the output
>>>>>>>> displayed the following message:
>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> mpirun noticed that the job aborted, but has no info as to the
>>>>>>>> process
>>>>>>>> that caused that situation.
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Additionally, when I ran the command "mpirun -np 4 -H
>>>>>>>> localhost,localhost,localhost,localhost --mca mpi_yield_when_idle 1 --mca
>>>>>>>> rmaps seq runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input", it
>>>>>>>> resulted in the following error message:
>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> There are no nodes allocated to this job.
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> It is quite puzzling that even when specifying only one localhost
>>>>>>>> in the command, I still receive this output. I have been considering the
>>>>>>>> possibility that this issue might be due to limitations in system
>>>>>>>> resources, node access permissions, or node configuration within the
>>>>>>>> CentOS7 virtual machine environment.
>>>>>>>>
>>>>>>>> Thank you for your attention to this matter.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> De Wu
>>>>>>>>
>>>>>>>> On Thu, 22 Jun 2023 at 15:53, Adam Deller <adeller at astro.swin.edu.au> wrote:
>>>>>>>>
>>>>>>>>> Hi De Wu,
>>>>>>>>>
>>>>>>>>> The "SIGPIPE detected on fd 13 - aborting" errors when running
>>>>>>>>> mpispeed are related to piping the output to head.  Remove the "| head" and
>>>>>>>>> you should see it run normally.
>>>>>>>>>
>>>>>>>>> For running mpifxcorr, the obvious difference between your
>>>>>>>>> invocation of mpispeed and mpifxcorr is the use of the various mca
>>>>>>>>> options.  What happens if you add " --mca mpi_yield_when_idle 1 --mca rmaps
>>>>>>>>> seq" to your mpispeed launch (before or after the -H localhost,localhost)?
>>>>>>>>> If it doesn't work, then probably one or the other of those options is the
>>>>>>>>> problem, and you need to change startdifx to get rid of the offending
>>>>>>>>> option when running mpirun.
>>>>>>>>>
>>>>>>>>> If running mpispeed still works with those options, what
>>>>>>>>> about the following:
>>>>>>>>> 1. manually run mpirun -np 4 --hostfile
>>>>>>>>> /vlbi/aov070/aov070_1.machines --mca mpi_yield_when_idle 1 --mca rmaps seq
>>>>>>>>>  runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input, see what output comes
>>>>>>>>> out
>>>>>>>>> 2. manually run mpirun -np 4 -H
>>>>>>>>> localhost,localhost,localhost,localhost --mca mpi_yield_when_idle 1 --mca
>>>>>>>>> rmaps seq  runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input, see what
>>>>>>>>> output comes out
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Adam
>>>>>>>>>
>>>>>>>>> On Mon, 19 Jun 2023 at 18:02, 深空探测 via Difx-users <
>>>>>>>>> difx-users at listmgr.nrao.edu> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I recently reinstalled OpenMPI-1.6.5 and successfully ran the
>>>>>>>>>> example program provided within the OpenMPI package. By executing the
>>>>>>>>>> command "mpiexec -n 6 ./hello_c," I obtained the following output:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> wude at wude DiFX-2.6.2 examples> mpiexec -n 6 ./hello_c
>>>>>>>>>> Hello, world, I am 4 of 6
>>>>>>>>>> Hello, world, I am 2 of 6
>>>>>>>>>> Hello, world, I am 0 of 6
>>>>>>>>>> Hello, world, I am 1 of 6
>>>>>>>>>> Hello, world, I am 3 of 6
>>>>>>>>>> Hello, world, I am 5 of 6
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> The program executed without any issues, displaying the expected
>>>>>>>>>> output. Each line represents a separate process, showing the process number
>>>>>>>>>> and the total number of processes involved.
>>>>>>>>>>
>>>>>>>>>> However, I encountered some difficulties when running the command
>>>>>>>>>> "mpirun -H localhost,localhost mpispeed 1000 10s 1 | head." Although both
>>>>>>>>>> nodes seem to run properly, there appear to be some errors in the output.
>>>>>>>>>> Below is the output I received, with "wude" being my username:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> wude at wude DiFX-2.6.2 ~> mpirun -H localhost,localhost mpispeed
>>>>>>>>>> 1000 10s 1 | head
>>>>>>>>>> Processor = wude
>>>>>>>>>> Rank = 0/2
>>>>>>>>>> [0] Starting
>>>>>>>>>> Processor = wude
>>>>>>>>>> Rank = 1/2
>>>>>>>>>> [1] Starting
>>>>>>>>>> [1] Recvd 0 -> 0 : 2740.66 Mbps curr : 2740.66 Mbps mean
>>>>>>>>>> [1] Recvd 1 -> 0 : 60830.52 Mbps curr : 5245.02 Mbps mean
>>>>>>>>>> [1] Recvd 2 -> 0 : 69260.57 Mbps curr : 7580.50 Mbps mean
>>>>>>>>>> [1] Recvd 3 -> 0 : 68545.44 Mbps curr : 9747.65 Mbps mean
>>>>>>>>>> [wude:05649] mpirun: SIGPIPE detected on fd 13 - aborting
>>>>>>>>>> mpirun: killing job...
>>>>>>>>>>
>>>>>>>>>> [wude:05649] mpirun: SIGPIPE detected on fd 13 - aborting
>>>>>>>>>> mpirun: killing job...
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> I'm unsure whether you experience the same "mpirun: SIGPIPE
>>>>>>>>>> detected on fd 13 - aborting mpirun: killing job..." message when running
>>>>>>>>>> this command on your computer.
>>>>>>>>>>
>>>>>>>>>> Furthermore, when I ran the command "startdifx -v -f -n
>>>>>>>>>> aov070.joblist," the .difx file was not generated. Could you please provide
>>>>>>>>>> some guidance or suggestions to help me troubleshoot this issue?
>>>>>>>>>>
>>>>>>>>>> Here is the output I received when running the command:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> wude at wude DiFX-2.6.2 aov070> startdifx -v -f -n aov070.joblist
>>>>>>>>>> No errors with input file /vlbi/aov070/aov070_1.input
>>>>>>>>>>
>>>>>>>>>> Executing:  mpirun -np 4 --hostfile
>>>>>>>>>> /vlbi/aov070/aov070_1.machines --mca mpi_yield_when_idle 1 --mca rmaps seq
>>>>>>>>>>  runmpifxcorr.DiFX-2.6.2 /vlbi/aov070/aov070_1.input
>>>>>>>>>>
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> mpirun noticed that the job aborted, but has no info as to the
>>>>>>>>>> process
>>>>>>>>>> that caused that situation.
>>>>>>>>>>
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> Elapsed time (s) = 82.2610619068
>>>>>>>>>> ```
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> De Wu
>>>>>>>>>>
>>>>>>>>>> On Thu, 25 May 2023 at 08:42, Adam Deller <adeller at astro.swin.edu.au> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi De Wu,
>>>>>>>>>>>
>>>>>>>>>>> If I run
>>>>>>>>>>>
>>>>>>>>>>> mpirun -H localhost,localhost mpispeed 1000 10s 1
>>>>>>>>>>>
>>>>>>>>>>> it runs correctly as follows:
>>>>>>>>>>>
>>>>>>>>>>> adeller at ar313-adeller trunk Downloads> mpirun -H
>>>>>>>>>>> localhost,localhost mpispeed 1000 10s 1 | head
>>>>>>>>>>> Processor = <my host name>
>>>>>>>>>>> Rank = 0/2
>>>>>>>>>>> [0] Starting
>>>>>>>>>>> Processor =<my host name>
>>>>>>>>>>> Rank = 1/2
>>>>>>>>>>> [1] Starting
>>>>>>>>>>>
>>>>>>>>>>> It seems like in your case, MPI is looking at the two identical
>>>>>>>>>>> host names you've given and is deciding to only start one process, rather
>>>>>>>>>>> than two. What if you run
>>>>>>>>>>>
>>>>>>>>>>> mpirun -n 2 -H wude,wude mpispeed 1000 10s 1
>>>>>>>>>>>
>>>>>>>>>>> ?
>>>>>>>>>>>
>>>>>>>>>>> I think the issue is with your MPI installation / the parameters
>>>>>>>>>>> being passed to mpirun. Unfortunately as I've mentioned previously the
>>>>>>>>>>> behaviour of MPI with default parameters seems to change from
>>>>>>>>>>> implementation to implementation and version to version - you just need to
>>>>>>>>>>> track down what is needed to make sure it actually runs the number of
>>>>>>>>>>> processes you want on the nodes you want!
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Adam
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 24 May 2023 at 18:30, 深空探测 via Difx-users <
>>>>>>>>>>> difx-users at listmgr.nrao.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi  All,
>>>>>>>>>>>>
>>>>>>>>>>>> I am writing to seek assistance regarding an issue I
>>>>>>>>>>>> encountered while working with MPI on a CentOS 7 virtual machine.
>>>>>>>>>>>>
>>>>>>>>>>>> I have successfully installed openmpi-1.6.5 on the CentOS 7
>>>>>>>>>>>> virtual machine. However, when I attempted to execute the command
>>>>>>>>>>>> "startdifx -f -n -v aov070.joblist," I received the following error message:
>>>>>>>>>>>>
>>>>>>>>>>>> "Environment variable DIFX_CALC_PROGRAM was set, so
>>>>>>>>>>>> Using specified calc program: difxcalc
>>>>>>>>>>>>
>>>>>>>>>>>> No errors with input file /vlbi/corr/aov070/aov070_1.input
>>>>>>>>>>>>
>>>>>>>>>>>> Executing: mpirun -np 4 --hostfile
>>>>>>>>>>>> /vlbi/corr/aov070/aov070_1.machines --mca mpi_yield_when_idle 1 --mca rmaps
>>>>>>>>>>>> seq runmpifxcorr.DiFX-2.6.2 /vlbi/corr/aov070/aov070_1.input
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>> mpirun noticed that the job aborted, but has no info as to the
>>>>>>>>>>>> process that caused that situation.
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------"
>>>>>>>>>>>>
>>>>>>>>>>>> To further investigate the MPI functionality, I wrote a Python
>>>>>>>>>>>> program “mpi_hello_world.py” as follows:
>>>>>>>>>>>>
>>>>>>>>>>>> from mpi4py import MPI
>>>>>>>>>>>>
>>>>>>>>>>>> comm = MPI.COMM_WORLD
>>>>>>>>>>>> rank = comm.Get_rank()
>>>>>>>>>>>> size = comm.Get_size()
>>>>>>>>>>>>
>>>>>>>>>>>> print("Hello from rank", rank, "of", size)
>>>>>>>>>>>>
>>>>>>>>>>>> When I executed the command "mpiexec -n 4 python
>>>>>>>>>>>> mpi_hello_world.py," the output was as follows:
>>>>>>>>>>>>
>>>>>>>>>>>> ('Hello from rank', 0, 'of', 1)
>>>>>>>>>>>> ('Hello from rank', 0, 'of', 1)
>>>>>>>>>>>> ('Hello from rank', 0, 'of', 1)
>>>>>>>>>>>> ('Hello from rank', 0, 'of', 1)
>>>>>>>>>>>>
>>>>>>>>>>>> Additionally, I attempted to test the MPI functionality using
>>>>>>>>>>>> the "mpispeed" command with the following execution command: "mpirun -H
>>>>>>>>>>>> wude,wude mpispeed 1000 10s 1".  “wude” is my hostname. However, I
>>>>>>>>>>>> encountered the following error message:
>>>>>>>>>>>>
>>>>>>>>>>>> "Processor = wude
>>>>>>>>>>>> Rank = 0/1
>>>>>>>>>>>> Sorry, must run with an even number of processes
>>>>>>>>>>>> This program should be invoked in a manner similar to:
>>>>>>>>>>>> mpirun -H host1,host2,...,hostN mpispeed
>>>>>>>>>>>> [<numSends>|<timeSend>s] [<sendSizeMByte>]
>>>>>>>>>>>>     where
>>>>>>>>>>>>         numSends: number of blocks to send (e.g., 256), or
>>>>>>>>>>>>         timeSend: duration in seconds to send (e.g., 100s)
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>> mpirun noticed that the job aborted, but has no info as to the
>>>>>>>>>>>> process that caused that situation.
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------"
>>>>>>>>>>>>
>>>>>>>>>>>> I am uncertain about the source of these issues and would
>>>>>>>>>>>> greatly appreciate your guidance in resolving them. If you have any
>>>>>>>>>>>> insights or suggestions regarding the aforementioned errors and how I can
>>>>>>>>>>>> rectify them, please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your time and assistance.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>> De Wu
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Difx-users mailing list
>>>>>>>>>>>> Difx-users at listmgr.nrao.edu
>>>>>>>>>>>> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> !=============================================================!
>>>>>>>>>>> Prof. Adam Deller
>>>>>>>>>>> Centre for Astrophysics & Supercomputing
>>>>>>>>>>> Swinburne University of Technology
>>>>>>>>>>> John St, Hawthorn VIC 3122 Australia
>>>>>>>>>>> phone: +61 3 9214 5307
>>>>>>>>>>> fax: +61 3 9214 8797
>>>>>>>>>>> !=============================================================!
>>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Difx-users mailing list
>>>>>>>>>> Difx-users at listmgr.nrao.edu
>>>>>>>>>> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> !=============================================================!
>>>>>>>>> Prof. Adam Deller
>>>>>>>>> Centre for Astrophysics & Supercomputing
>>>>>>>>> Swinburne University of Technology
>>>>>>>>> John St, Hawthorn VIC 3122 Australia
>>>>>>>>> phone: +61 3 9214 5307
>>>>>>>>> fax: +61 3 9214 8797
>>>>>>>>> !=============================================================!
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> !=============================================================!
>>>>>>> Prof. Adam Deller
>>>>>>> Centre for Astrophysics & Supercomputing
>>>>>>> Swinburne University of Technology
>>>>>>> John St, Hawthorn VIC 3122 Australia
>>>>>>> phone: +61 3 9214 5307
>>>>>>> fax: +61 3 9214 8797
>>>>>>> !=============================================================!
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> !=============================================================!
>>>>> Prof. Adam Deller
>>>>> Centre for Astrophysics & Supercomputing
>>>>> Swinburne University of Technology
>>>>> John St, Hawthorn VIC 3122 Australia
>>>>> phone: +61 3 9214 5307
>>>>> fax: +61 3 9214 8797
>>>>> !=============================================================!
>>>>>
>>>>
>>>>
>>>> --
>>>> !=============================================================!
>>>> Prof. Adam Deller
>>>> Centre for Astrophysics & Supercomputing
>>>> Swinburne University of Technology
>>>> John St, Hawthorn VIC 3122 Australia
>>>> phone: +61 3 9214 5307
>>>> fax: +61 3 9214 8797
>>>> !=============================================================!
>>>>
>>>
>>
>> --
>> !=============================================================!
>> Prof. Adam Deller
>> Centre for Astrophysics & Supercomputing
>> Swinburne University of Technology
>> John St, Hawthorn VIC 3122 Australia
>> phone: +61 3 9214 5307
>> fax: +61 3 9214 8797
>> !=============================================================!
>>
> _______________________________________________
> Difx-users mailing list
> Difx-users at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>


-- 
!=============================================================!
Prof. Adam Deller
Centre for Astrophysics & Supercomputing
Swinburne University of Technology
John St, Hawthorn VIC 3122 Australia
phone: +61 3 9214 5307
fax: +61 3 9214 8797
!=============================================================!