[Difx-users] Difx on XEON Scalable Platform CPUs?

Tue Apr 10 11:26:29 EDT 2018

Hi Peter,

Thanks for chiming in.  Benchmarks on such a system would indeed be 
useful.  For the numbers to be meaningful, I find the system under test 
needs to sit in a cluster environment, with other machines serving as the 
"datastream nodes" and with enough network backbone that I/O is not 
the limiting factor.

Your information about the execution units vs. CPU series is quite good to 
know.  An important thing to determine is if I/O (either system-wide, or 
even just RAM<-->Cache<-->CPU) becomes the limiting factor in these 
systems for the calculations we perform.
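
As a rough way to frame that question, here is a back-of-envelope sketch
(Python) comparing peak AVX-512 arithmetic throughput with memory
bandwidth for a streaming complex multiply-accumulate.  The core count,
clock and bandwidth in the usage lines are placeholders made up for
illustration, not measurements of any real part, and the function names
are hypothetical.  The 16 single-precision lanes per 512-bit unit and the
2 flops per fused multiply-add are the usual counting conventions.

```python
# Back-of-envelope roofline estimate; all hardware numbers are placeholders.

LANES_F32 = 16      # 512 bits / 32 bits per single-precision float
FLOPS_PER_FMA = 2   # one fused multiply-add counts as 2 flops


def peak_gflops(cores, ghz, avx512_units):
    """Theoretical single-precision peak in GFLOP/s, assuming every
    AVX-512 unit issues one FMA per cycle (an optimistic upper bound)."""
    return cores * ghz * avx512_units * LANES_F32 * FLOPS_PER_FMA


def cmac_intensity():
    """Flops per byte for acc[i] += a[i] * b[i] on complex float32,
    assuming nothing is reused from cache (pure streaming)."""
    flops = 8            # 4 multiplies + 4 adds per complex multiply-accumulate
    bytes_moved = 4 * 8  # read a, b, acc and write acc; 8 bytes each
    return flops / bytes_moved


def limiting_factor(cores, ghz, avx512_units, mem_gbytes_per_s):
    """Return 'memory' or 'compute' for the streaming case."""
    attainable = mem_gbytes_per_s * cmac_intensity()  # GFLOP/s memory can feed
    peak = peak_gflops(cores, ghz, avx512_units)
    return "memory" if attainable < peak else "compute"


if __name__ == "__main__":
    # Placeholder machine: 16 cores, 2.0 GHz, 100 GB/s memory bandwidth.
    for units in (1, 2):
        print(units, peak_gflops(16, 2.0, units),
              limiting_factor(16, 2.0, units, 100.0))
```

With these placeholder numbers the pure-streaming case is memory-limited
for either unit count, so what matters in practice is how much of the
working set stays in cache.  Only a real benchmark can settle that.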

 	-W

On Tue, 10 Apr 2018, Kerney, Peter wrote:

> I did send this to only Walter and Chris but I’ll give it a wider audience.
>
>
>
> The list of AVX-512 optimised IPP routines is maintained here: https://software.intel.com/en-us/articles/intel-ipp-functions-optimized-for-intel-avx-512 It now has a section on Skylake "Intel® Xeon® Processor Scalable Family".
>
> The list also keeps growing; see the release notes here: https://software.intel.com/en-us/articles/intel-ipp-release-notes-and-new-features
>
> If there is any particular IPP routine missing that you need then let me know and I can contact the developers.
>
>
>
> Just a note on the "bronze (31xx)", "silver (41xx)", "gold (51xx, 61xx)", and "platinum (81xx)" varieties: the 61xx and 81xx CPUs have two AVX-512 execution units per core, as opposed to one on the 31xx/41xx/51xx, so they will perform better on AVX-512 optimised workloads.
>
>
>
> Do you have a "recommended configuration" or an idea of what you would need to run a benchmark? If no one has access to a system I can try and arrange this.
>
>
>
> Regards, Peter Kerney.
>
>
>
> ----------------------------------------------
> Peter Kerney, Enterprise Solution Architect
> Intel Australia Pty Ltd
> Level 17, 111 Pacific Highway
> North Sydney NSW 2060 Australia
> peter.kerney at intel.com
> Ph: +61299375981
> Mb: 0407013230 (+61407013230)
> ----------------------------------------------
>
>
> From: Difx-users [mailto:difx-users-bounces at listmgr.nrao.edu] On Behalf Of Richard Dodson
> Sent: Tuesday, April 10, 2018 12:19 PM
> To: Chris Phillips <Chris.Phillips at csiro.au>
> Cc: difxusers <difx-users at nrao.edu>
> Subject: Re: [Difx-users] Difx on XEON Scalable Platform CPUs?
>
> Hi Walter
>
> When I had DiFX running on early versions of Xeon Phi (2012?) I had to bypass the IPP libraries (at the time Intel said they were not keen to carry those forward). So all functionality was included in GENERIC. Some things had to be inline functions, but I could use the MKL libraries for others.
> Has IPP been ported now?
>
>
> If not (and maybe if so) it could be worth looking to see where the speed-up would be (it was dominated by the complex multiply-accumulate at the time) and whether that could be improved. For example, the intrinsic for genericAddProduct_32fc could be replaced with the code example in:
>
> https://software.intel.com/en-us/forums/intel-isa-extensions/topic/74737
>
> The comments include:
>
> SSE3 (included in SSE4.2) has specific instructions which the compiler will use to vectorize complex without requiring intrinsics or shuffle.  In order to use fully AVX2 or AVX512, as John said, it will be necessary to split the data. Compilers will not attempt to evaluate whether a mixture of SSE3 and AVX2 will prove faster.  The statistics produced by opt-report may help to evaluate this, and might lead you to use some SSE3 intrinsics in case the overhead of splitting the data would be significant.
>
>
>
> Page 5-26 of https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf may have useful examples as well
>
>       Richard
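
To make the "split the data" remark above concrete, here is a minimal
Python sketch of a complex multiply-accumulate written two ways: on
interleaved (re, im, re, im, ...) data and on separate real and imaginary
arrays.  It only illustrates the arithmetic and the data layout and says
nothing about speed.  It is not the DiFX genericAddProduct_32fc code, nor
the code from the forum thread; all function names are hypothetical.

```python
def cmac_interleaved(a, b, acc):
    """acc += a * b on interleaved [re0, im0, re1, im1, ...] lists, in place."""
    for i in range(0, len(acc), 2):
        ar, ai = a[i], a[i + 1]
        br, bi = b[i], b[i + 1]
        acc[i] += ar * br - ai * bi        # real part
        acc[i + 1] += ar * bi + ai * br    # imaginary part


def cmac_split(a_re, a_im, b_re, b_im, acc_re, acc_im):
    """Same operation on split arrays: every statement is a plain
    elementwise multiply-add over contiguous data, with no shuffles."""
    for i in range(len(acc_re)):
        acc_re[i] += a_re[i] * b_re[i] - a_im[i] * b_im[i]
        acc_im[i] += a_re[i] * b_im[i] + a_im[i] * b_re[i]


def split(x):
    """De-interleave; this conversion is the 'overhead of splitting the data'."""
    return x[0::2], x[1::2]


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, -1.0]   # (1+2i), (3-1i)
    b = [0.5, -1.0, 2.0, 4.0]   # (0.5-1i), (2+4i)
    acc = [0.0] * 4
    cmac_interleaved(a, b, acc)

    a_re, a_im = split(a)
    b_re, b_im = split(b)
    acc_re, acc_im = [0.0, 0.0], [0.0, 0.0]
    cmac_split(a_re, a_im, b_re, b_im, acc_re, acc_im)
    print(acc, acc_re, acc_im)
```

Whether the split layout pays off depends on whether the data can stay
split across many accumulations, or has to be converted on every call.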
>
> On Tue, Apr 10, 2018 at 9:20 AM, <Chris.Phillips at csiro.au> wrote:
> Hi Walter
>
> No, I haven’t, but just a note that when I tried DiFX on Knights Landing a year or so ago, IPP had not been ported to AVX512 at that stage.
>
> Cheers
> Chris
>
>
>
>> On 10 Apr 2018, at 11:15, Walter Brisken <wbrisken at lbo.us> wrote:
>>
>>
>> Hi DiFX Users,
>>
>> I'm wondering if anyone on this list has tried DiFX on the Intel XEON CPUs called "Scalable Platform".  These come in "bronze", "silver", "gold", and "platinum" varieties and are also labeled "Skylake-SP" processors.  If anyone has such experience, I'd like to know whether any benchmarks are available.  I'm also wondering whether anyone has attempted to evaluate the improvement the AVX512 instructions offer in DiFX computing.
>>
>> Thanks,
>>
>> Walter
>>
>> --
>> -------------------------
>> Walter Brisken
>> Director
>> Long Baseline Observatory
>> (575)-835-7133 (office)
>> (505)-234-5912 (cell)
>>
>> _______________________________________________
>> Difx-users mailing list
>> Difx-users at listmgr.nrao.edu<mailto:Difx-users at listmgr.nrao.edu>
>> https://listmgr.nrao.edu/mailman/listinfo/difx-users
>
>
> --
> -------------------------
> Dr Richard Dodson,
> International Centre for Radio Astronomy Research
> University of Western Australia
> P: +8 6488 7842 E: richard.dodson at icrar.org
>

-- 
-------------------------
Walter Brisken
Director
Long Baseline Observatory
(575)-835-7133 (office)
(505)-234-5912 (cell)