[Difx-users] HPC using Ethernet vs Infiniband

Phillips, Chris (S&A, Marsfield) Chris.Phillips at csiro.au
Mon Sep 20 19:48:39 EDT 2021


Hi Walter,

I know Keith Bannister has been looking into using RoCE for ASKAP CRAFT-related work. Before you dive too deeply into trying RoCE, I would suggest considering:

	- How much benefit does it have for *TCP*? (Most applications I have heard of are for UDP.) I am assuming MPI only uses TCP.
	- Are you actually network (rather than CPU) limited? It could be a lot of effort for no real gain.
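One quick way to answer both questions is to benchmark the link with MPI pinned to plain TCP, then compare against an RDMA transport and a raw iperf3 run. A rough sketch (the mpirun flags are standard Open MPI MCA options; the interface and device names, and the osu_bw benchmark binary, are examples you would substitute for your own cluster):

```shell
# Baseline: force Open MPI onto plain TCP over a named interface
# (eth0 is a placeholder for your actual NIC):
mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -np 2 ./osu_bw

# With a UCX-enabled Open MPI build, RoCE/RDMA is typically selected via the
# UCX PML; mlx5_0:1 is an example device:port name (see ibv_devinfo):
mpirun --mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 -np 2 ./osu_bw

# Raw link check independent of MPI, with 4 parallel TCP streams:
iperf3 -c <server-hostname> -P 4
```

If the TCP baseline already saturates the link while CPUs stay mostly idle, RoCE is unlikely to buy much; if CPU time in the network stack is the bottleneck, the RDMA numbers should show it.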

If you are keen, I can put you in contact with Keith, if you don't already know him.

Cheers
Chris

> On Sep 21, 2021, at 5:28 AM, Walter Brisken via Difx-users <difx-users at listmgr.nrao.edu> wrote:
> 
> 
> Hi DiFX Users,
> 
> In the not-so-distant future we at VLBA may be in the position to upgrade the network backbone of the VLBA correlator.  Currently we have a 40 Gbps Infiniband system dating back about 10 years.  At the time we installed that system, Infiniband showed clear advantages, likely driven by RDMA capability, which offloads a significant amount of work from the CPU.  Now it seems Ethernet has RoCE (RDMA over Converged Ethernet), which aims to do the same thing.
> 
> 1. Does anyone have experience with RoCE?  If so, is this as easy to configure as the OpenMPI page suggests?  Any drawbacks of using it?
> 
> 2. Has anyone else gone through this decision process recently?  If so, any thoughts or advice?
> 
> 3. Has anyone run DiFX on a RoCE-based network?
> 
> 	-Walter
> 
> -------------------------
> Walter Brisken
> NRAO
> Deputy Assistant Director for VLBA Development
> (505)-234-5912 (cell)
> (575)-835-7133 (office; not useful during COVID times)
> 
> _______________________________________________
> Difx-users mailing list
> Difx-users at listmgr.nrao.edu
> https://listmgr.nrao.edu/mailman/listinfo/difx-users



