Third Generation Intel® Xeon® Scalable processor family: The Deep Dive
From performance to security and AI acceleration, the new generation of Intel® Xeon Scalable delivers speed and innovation where it counts
In a way, Intel’s new 3rd Generation Intel® Xeon® Scalable processors deliver exactly what you’d expect: a die-shrink down to 10nm, more cores, more cache and more performance compared to the previous generation. Yet the latest Xeon® Scalable line-up also promises so much more. These are processors designed to handle cutting-edge server, network and high-performance workloads. They’re the foundation of a platform built to enhance security and accelerate data-intensive applications and AI. What’s more, the 40 new CPUs scale upwards and downwards to cover everything from cloud and data centre servers through to systems powering the intelligent edge.
More cores, faster connectivity
Certainly, the shift down to the 10nm node has enabled Intel to increase core counts and cache within the same power envelope as the previous generation. The new flagship Intel® Xeon® Platinum 8380 features 40 cores running 80 threads, where the second-generation Intel® Xeon® Platinum topped out at 28 cores running 56 threads. Where the old Intel® Xeon® Platinum had a maximum 38.5MB of L3 cache, the new line takes that up to 60MB. Nor is the good stuff restricted to the high-end Platinum line. The Intel® Xeon® Gold 6330 is a 28-core CPU with 42MB of cache, where the old 6230 has 20 cores and 27.5MB. The new Intel® Xeon® Silver 4310 has 12 cores and 18MB to the old Intel® Xeon® Silver 4210’s 10 cores and 13.75MB.
Drill down into the specs of the new CPUs and you’ll notice that maximum clock speeds are a little lower than the second-generation equivalents. Where the old Intel® Xeon® Platinum 8280 reached 4GHz from a 2.7GHz base clock, the Intel® Xeon® Platinum 8380 hits 3.4GHz with a 2.3GHz base clock. Yet architectural improvements, including enhanced, higher-capacity branch prediction, pre-fetch and look-aside buffers, alongside new allocation and execution resources, mean a 20% improvement in instructions per cycle (IPC) per core*. Put this together with the increased core count and additional cache, and there are dramatic improvements in real-world performance over the previous generation, with an average 46% improvement across a range of common data centre workloads**.
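To see how a lower clock speed can still add up to more throughput, here is a back-of-the-envelope sketch in Python. It is an illustration only, not Intel's benchmark methodology: it assumes throughput scales linearly with IPC, base clock and core count, and ignores cache, memory and turbo behaviour. The function name and the scaling model are assumptions made for this example; the input figures are the ones quoted above for the Platinum 8380 and 8280.

```python
def naive_throughput_ratio(ipc_gain, new_base_ghz, old_base_ghz, new_cores, old_cores):
    """Rough multi-core throughput ratio, assuming IPC x clock x cores scaling.

    This is a simplified model for illustration; real workloads rarely
    scale linearly with any of these three factors.
    """
    per_core = (1.0 + ipc_gain) * (new_base_ghz / old_base_ghz)
    return per_core * (new_cores / old_cores)


# Figures quoted in the article: 20% IPC gain, 2.3GHz vs 2.7GHz base, 40 vs 28 cores.
ratio = naive_throughput_ratio(0.20, 2.3, 2.7, 40, 28)
print(f"Estimated throughput ratio: {ratio:.2f}x")
```

Under these assumptions the estimate comes out at roughly 1.46x. The IPC gain roughly cancels out the lower base clock, leaving the extra cores to do most of the work. That the result lands close to the quoted 46% average may be coincidental, since the measured figure comes from a mix of real workloads.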
However, these aren’t the only significant changes. As a platform, Intel® Xeon® Scalable processors now support up to 6TB of system memory per socket, and up to eight channels of DDR4-3200 RAM – a big step on from the previous generation’s maximum six channels and DDR4-2933. The platform has also moved on to support PCIe Gen 4 connectivity with 64 lanes, giving you more bandwidth than PCIe Gen 3 and an extra 24 lanes to work with. And where the old architecture supported up to four Ultra Path Interconnect channels to move data between multiple CPUs, the new architecture takes that up to six, improving the bandwidth between CPUs and increasing platform scalability for I/O-intensive workloads. With 3rd Generation Intel® Xeon® Scalable processors in an 8-socket configuration, you can have up to 224 cores at work in a single system.
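To put the memory upgrade into numbers, the following Python sketch estimates theoretical peak memory bandwidth per socket. It assumes the standard 64-bit (8-byte) data bus per DDR4 channel and ignores real-world efficiency losses, so treat the results as upper bounds rather than measured figures. The function and constant names are illustrative.

```python
BYTES_PER_TRANSFER = 8  # assumption: one DDR4 channel moves 64 bits per transfer


def peak_bandwidth_gbs(channels, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s for a given channel count and DDR4 speed."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


new_gen = peak_bandwidth_gbs(8, 3200)  # eight channels of DDR4-3200
old_gen = peak_bandwidth_gbs(6, 2933)  # six channels of DDR4-2933

print(f"3rd Gen: {new_gen:.1f} GB/s, 2nd Gen: {old_gen:.1f} GB/s")
print(f"Uplift: {new_gen / old_gen:.2f}x")
```

On these assumptions the new platform peaks at around 205 GB/s per socket against roughly 141 GB/s for the previous generation, an uplift of about 1.45x.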
Yet 3rd Gen Intel® Xeon® Scalable isn’t just about raw power, but also control. It features the latest version of Intel® Speed Select Technology (SST), allowing IT to fine-tune and optimise how the cores are used for specific workloads. Where applications demand high clock speeds, the CPU can deliver them by running the affected cores at a higher base or turbo clock speed, while reducing the base or turbo frequency of cores running lower-priority applications. Meanwhile, Intel® Resource Director Technology (RDT) provides similarly fine-grained control over last-level cache and memory bandwidth. Together, these technologies can help to enhance visibility, stability and performance on systems running multiple applications, containers or virtual machines, and give IT teams real flexibility.
Standard data centre workloads are one thing, but Xeon Scalable’s greatest strength lies in its built-in acceleration for AI workloads. The first-generation Xeon Scalable architecture introduced a new set of AVX-512 instructions designed to accelerate high-performance workloads such as cryptography and AI. The second generation pushed this further by adding Deep Learning Boost (DL Boost) – a technology built around a set of AVX-512 Vector Neural Network Instructions (VNNI) that can significantly enhance performance during inferencing.
Inferencing is the stage in a Deep Learning workflow where the neural network built up through training applies that learning to data that it hasn’t seen before, to identify, say, a pattern or an object. VNNI allows key operations that would previously have taken three instructions to be completed with only one. What’s more, it enables 32-bit operations to be performed at 8-bit precision, which can reduce their complexity and speed up the calculations, with little impact on their accuracy.
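The idea can be sketched in a few lines of Python. This is a simplified emulation of a single 32-bit lane, not real AVX-512 code: the fused operation multiplies four unsigned 8-bit values by four signed 8-bit values and adds the sum to a 32-bit accumulator in one step, where older code needed separate multiply, widen and add instructions. The function names, the simple symmetric quantisation scheme and the sample values are all assumptions made for illustration.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers using a simple symmetric scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dot4_accumulate(acc, a_u8, b_s8):
    """Emulate one 32-bit lane of a VNNI-style step: acc += sum(a[i] * b[i])."""
    assert len(a_u8) == len(b_s8) == 4
    assert all(0 <= a <= 255 for a in a_u8)      # unsigned 8-bit inputs
    assert all(-128 <= b <= 127 for b in b_s8)   # signed 8-bit weights
    return acc + sum(a * b for a, b in zip(a_u8, b_s8))


weights_fp32 = [0.5, -0.25, 1.0, -1.0]  # hypothetical trained weights
activations = [10, 20, 30, 40]          # hypothetical inputs, already 8-bit
scale = 0.01                            # hypothetical quantisation scale

weights_int8 = quantize(weights_fp32, scale)
acc = dot4_accumulate(0, activations, weights_int8)

# Rescale the integer result and compare with the full-precision answer.
approx = acc * scale
exact = sum(a * w for a, w in zip(activations, weights_fp32))
print(acc, approx, exact)
```

In this small case the 8-bit result, once rescaled, matches the full-precision answer. With real models some rounding error creeps in, which is why the impact on accuracy is described as small rather than zero.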
Now 3rd Generation Xeon Scalable takes this further, with additional architectural improvements aimed at optimising resources and minimising bottlenecks, plus support for Brain Floating Point 16-bit (bfloat16), a low-precision numeric format widely used to accelerate deep learning and machine learning in libraries such as Google’s TensorFlow. Put it all together, and 3rd Gen Intel® Xeon® Scalable can deliver 74% faster AI performance than 2nd Gen Xeon Scalable processors†, and up to 1.5 times the performance of AMD’s EPYC 7763 CPU across a mix of 20 popular AI workloads††.
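For a feel of what bfloat16 gives up and what it keeps, here is a minimal Python sketch. A bfloat16 value is the top 16 bits of a 32-bit float: it keeps the sign bit and all eight exponent bits, but only seven of the 23 mantissa bits. This example converts by simple truncation; real hardware and libraries typically round to nearest instead, so take it as an illustration of the format rather than of any particular implementation. The function names are made up for this example.

```python
import struct


def float32_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x, using truncation (no rounding)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16  # keep sign, 8 exponent bits and the top 7 mantissa bits


def bfloat16_bits_to_float(b):
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


for value in (1.0, 3.14159, 1.0e30):
    stored = float32_to_bfloat16_bits(value)
    print(f"{value!r} -> 0x{stored:04X} -> {bfloat16_bits_to_float(stored)!r}")
```

Here 3.14159 comes back as 3.140625, so only two or three significant decimal digits survive. Very large values such as 1.0e30 still fit, though, because the exponent range is the same as for 32-bit floats. That trade-off suits neural networks, which tend to tolerate coarse precision better than a narrow range.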
Memory and Storage
Of course, these cutting-edge DL and ML workloads also tend to involve huge datasets, where memory constraints or slow, high-latency storage can create significant performance bottlenecks. Here, the 3rd Gen Xeon Scalable architecture can help by working with the new Intel® Optane Persistent Memory 200 Series. Optane Persistent Memory enhances performance in data-intensive applications by keeping more data closer to the CPU and holding it there when the system or application restarts, so that it doesn’t have to be reloaded from a slower SSD. The 200 Series has 32% more bandwidth than the first generation. It enables even businesses without vast budgets to run massive datasets in-memory or handle larger numbers of virtualised or containerised workloads without throttling performance.
Meanwhile, 3rd Gen Xeon Scalable also plays well with Intel’s latest Optane and SSD technologies, so that your second tier of storage won’t restrict the performance of your workloads. Intel’s new Optane SSD P5800X offers 40% lower 4KB random read average latency than the previous generation drive, along with improved random read/write IOPS and 67% higher endurance, using the new bandwidth released by the move to PCIe Gen 4. And if you need capacity, Intel’s SSD D5-P5316 can accelerate access to stored data by up to 25x‡ and provide up to double the sequential read performance of previous generation Intel SSDs‡‡ in capacities that allow you to fit 1 petabyte of storage in a single 1U rack space.
Security is paramount in every environment where you’ll find a Xeon Scalable CPU, so it’s no surprise that the 3rd generation integrates Intel’s latest technology. Intel® Software Guard Extensions (SGX) allow you to run your sensitive code and data within a protected enclave in System RAM, where it’s isolated from other workloads and even the OS. With the 3rd Gen Xeon Scalable processors, this enclave can store and process up to 1TB of code and data.
To add another layer, Intel® Total Memory Encryption (Intel® TME) encrypts data stored in memory using AES-XTS encryption and a key produced by a dedicated hardened random number generator in the CPU. Finally, Intel® Platform Firmware Resilience (Intel® PFR) technology uses an Intel field-programmable gate array to validate firmware components before execution, helping to protect the firmware from attack.
Encryption has never been as necessary or as ubiquitous in business computing, but that doesn’t mean it has to limit performance or slow productivity down. The new Xeon Scalable line-up also includes built-in Intel® Crypto Acceleration technology, designed to accelerate performance across a range of cryptographic algorithms and speed up encryption-intensive workloads. It’s the kind of thing you’ll notice if, say, you’re a retail organisation processing millions of customer transactions every day.
All in all, that’s a lot of technology, but 3rd Gen Xeon Scalable architecture brings it to a 40-strong range of CPUs covering everything from High Performance Computing powerhouses to cloud servers, wired and 5G network infrastructure devices and even the micro-servers being deployed to handle AI, IoT deployments or image analytics at the edge. Whenever and wherever you need performance and agility, you can find what you’re looking for in this line. 3rd Gen Xeon Scalable isn’t the only cutting-edge server, network and HPC processor platform in town, but by focusing its energy and innovation where it counts, Intel might well have ensured that it’s the best.
Notices & Disclaimers
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See [backup] for configuration details. No product or component can be absolutely secure.
Your costs and results may vary.
Intel technologies may require enabled hardware, software or service activation.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
*20% IPC improvement: 3rd Gen Xeon Scalable processor: 1-node, 2x 28-core 3rd Gen Intel Xeon Scalable processor, Wilson City platform, 512GB (16 slots / 32GB / 3200) total DDR4 memory, HT on, ucode=x270, RHEL 8.0, Kernel Version 4.18.0-80.el8.x86_64, test by Intel on 3/30/2021. 2nd Gen Intel Xeon Scalable processor: 1-node, 2x 28-core 2nd Gen Intel Xeon Scalable processor, Neon City platform, 384GB (12 slots / 32GB / 2933) total DDR4 memory, HT on, ucode=x2f00, RHEL 8.0, Kernel Version 4.18.0-80.el8.x86_64, test by Intel on 3/30/2021. SPECrate2017_int_base (est). Tests at equal frequency, equal uncore frequency, equal compiler. Results may vary.
**See  at www.intel.com/3gen-xeon-config. Results may vary.
†See  at www.intel.com/3gen-xeon-config. Results may vary.
††See  at www.intel.com/3gen-xeon-config. Results may vary.
‡Sequential read performance based on Intel® SSD D5-P5316 compared to Seagate Exos X18 (seagate.com/files/www-content/datasheets/pdfs/exos-x18-channel-DS2045-1-2007GB-en_SG.pdf).
‡‡Up to 2x higher sequential read - Comparing 128K sequential read bandwidth between Intel® SSD D5-P5316 15.36TB (7.0 GB/s) and Intel® SSD D5-P4326 15.36TB (3.2 GB/s).