Posted on 22 March, 2022
The AMD Instinct™ MI200 series accelerators are AMD's newest datacentre GPUs, designed to power discoveries at Exascale, enabling scientists to tackle our most pressing challenges, from climate change to vaccine research. With MI200 accelerators and the ROCm™ 5.0 software ecosystem, innovators can tap the power of the world's most powerful HPC and AI datacentre GPUs to accelerate their time to science and discovery.1
World's Fastest HPC & AI Accelerator1:
Based on the 2nd Gen AMD CDNA™ architecture, AMD Instinct™ MI200 accelerators deliver a quantum leap in HPC and AI performance over existing datacentre GPUs. With a dramatic 4x advantage in HPC performance, the MI250X accelerator delivers exceptional performance for a broad set of HPC applications.1 The MI200 accelerator is also built to accelerate deep learning training, with the MI250X surpassing 380 teraflops of peak theoretical FP16 performance, offering users a powerful platform to fuel the convergence of HPC and AI.1
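For context, peak theoretical figures like these follow directly from clock speed, compute-unit count, and per-compute-unit throughput. The short sketch below reproduces the MI250X peaks quoted in the footnote; the 220 compute units and the FLOPs-per-clock rates are assumptions based on publicly documented CDNA 2 specifications, not figures stated in this article:

```python
def peak_tflops(compute_units, flops_per_cu_per_clock, clock_hz):
    """Peak theoretical throughput = CUs x FLOPs/CU/clock x clock, in TFLOPS."""
    return compute_units * flops_per_cu_per_clock * clock_hz / 1e12

# MI250X: assumed 220 compute units at the 1,700 MHz peak boost clock cited
# in the footnote. Per-CU FLOPs/clock rates (128 vector FP64, 256 matrix
# FP64, 1024 FP16) are assumptions that match AMD's published peaks.
MI250X_CUS = 220
BOOST_HZ = 1.7e9

print(f"FP64 vector: {peak_tflops(MI250X_CUS, 128, BOOST_HZ):.1f} TFLOPS")   # 47.9
print(f"FP64 matrix: {peak_tflops(MI250X_CUS, 256, BOOST_HZ):.1f} TFLOPS")   # 95.7
print(f"FP16:        {peak_tflops(MI250X_CUS, 1024, BOOST_HZ):.1f} TFLOPS")  # 383.0
```

These are upper bounds on arithmetic throughput at the boost clock; sustained application performance depends on memory bandwidth, clocks under load, and kernel efficiency.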
Innovations Delivering Performance Leadership:
AMD innovations in architecture, packaging and integration are pushing the boundaries of computing by unifying the most important processors in the datacentre: the CPU and the GPU accelerator. With industry-first multi-chip GPU modules and 3rd Gen AMD Infinity Architecture, AMD is delivering leadership performance, efficiency and overall system throughput for HPC and AI using AMD EPYC™ CPUs and AMD Instinct™ MI200 series accelerators.
Ecosystem without Borders:
AMD ROCm™ is an open software platform allowing researchers to tap the power of AMD Instinct™ accelerators to drive scientific discoveries. The ROCm platform is built on the foundation of open portability, supporting environments across multiple accelerator vendors and architectures. With ROCm™ 5.0, AMD extends its platform powering top HPC and AI applications with AMD Instinct™ MI200 series accelerators, increasing accessibility of ROCm for developers and delivering outstanding performance across key workloads.
Get in touch with our Sales team to find out more about the AMD Instinct™ MI200 series accelerators by calling us on 01727 876100 or emailing us at [email protected].
1. MI200-01 - World's fastest data center GPU is the AMD Instinct™ MI250X. Calculations conducted by AMD Performance Labs as of Sep 15, 2021, for the AMD Instinct™ MI250X (128GB HBM2e OAM module) accelerator at 1,700 MHz peak boost engine clock resulted in 95.7 TFLOPS peak theoretical double precision matrix (FP64 Matrix), 47.9 TFLOPS peak theoretical double precision (FP64), 95.7 TFLOPS peak theoretical single precision matrix (FP32 Matrix), 47.9 TFLOPS peak theoretical single precision (FP32), 383.0 TFLOPS peak theoretical half precision (FP16), and 383.0 TFLOPS peak theoretical Bfloat16 format precision (BF16) floating-point performance. Calculations conducted by AMD Performance Labs as of Sep 18, 2020, for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak theoretical double precision (FP64), 46.1 TFLOPS peak theoretical single precision matrix (FP32 Matrix), 23.1 TFLOPS peak theoretical single precision (FP32), and 184.6 TFLOPS peak theoretical half precision (FP16) floating-point performance. Published results on the NVIDIA Ampere A100 (80GB) GPU accelerator, boost engine clock of 1,410 MHz, resulted in 19.5 TFLOPS peak double precision tensor cores (FP64 Tensor Core), 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16), 312 TFLOPS peak half precision (FP16 Tensor Core), 39 TFLOPS peak Bfloat16 (BF16), and 312 TFLOPS peak Bfloat16 format precision (BF16 Tensor Core) theoretical floating-point performance. The TF32 data format is not IEEE compliant and not included in this comparison.
https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf, page 15, Table 1. MI200-01