March 2, 2022 Suppliers News

Unprecedented Acceleration at Every Scale

The NVIDIA A100 Tensor Core GPU delivers unprecedentedacceleration – at every scale – to power the world’s highestperforming elastic data centers for AI, data analytics, and highperformance computing (HPC) applications. As the engine ofthe NVIDIA data center platform, A100 provides up to 20X higherperformance over the prior NVIDIA Volta™ generation. A100 canefficiently scale up or be partitioned into seven isolated GPUinstances with Multi-Instance GPU (MIG), providing a unifiedplatform that enables elastic data centers to dynamically adjustto shifting workload demands.

NVIDIA A100 Tensor Core technology supports a broad rangeof math precisions, providing a single accelerator for everyworkload. The latest generation A100 80GB doubles GPU memoryand debuts the world’s fastest memory bandwidth at 2 terabytesper second (TB/s), speeding time to solution for the largestmodels and most massive datasets.

A100 is part of the complete NVIDIA data center solution thatincorporates building blocks across hardware, networking,software, libraries, and optimized AI models and applicationsfrom the NVIDIA NGC™ catalog. Representing the most powerfulend-to-end AI and HPC platform for data centers, it allowsresearchers to deliver real-world results and deploy solutionsinto production at scale.

Groundbreaking Innovations

NVIDIA AMPERE ARCHITECTURE

Whether using MIG to partition anA100 GPU into smaller instancesor NVLink to connect multipleGPUs to speed large-scale workloads, A100 canreadily handle different-sized acceleration needs,from the smallest job to the biggest multi-nodeworkload. A100’s versatility means IT managerscan maximize the utility of every GPU in their datacenter, around the clock.

THIRD-GENERATION TENSOR CORES

NVIDIA A100 delivers 312teraFLOPS (TFLOPS) of deeplearning performance. That’s 20Xthe Tensor floating-point operations per second(FLOPS) for deep learning training and 20X theTensor tera operations per second (TOPS) fordeep learning inference compared to NVIDIAVolta GPUs.

NEXT-GENERATION NVLINK

NVIDIA NVLink in A100 delivers2X higher throughput comparedto the previous generation. Whencombined with NVIDIA NVSwitch™,up to 16 A100 GPUs can be interconnected at upto 600 gigabytes per second (GB/sec), unleashingthe highest application performance possible ona single server. NVLink is available in A100 SXMGPUs via HGX A100 server boards and in PCIeGPUs via an NVLink Bridge for up to 2 GPUs.

MULTI-INSTANCE GPU (MIG)

An A100 GPU can be partitionedinto as many as seven GPUinstances, fully isolated atthe hardware level with theirown high-bandwidth memory, cache, andcompute cores. MIG gives developers accessto breakthrough acceleration for all theirapplications, and IT administrators can offerright-sized GPU acceleration for every job,optimizing utilization and expanding access toevery user and application.

HIGH-BANDWIDTH MEMORY(HBM2E)

With up to 80 gigabytes ofHBM2e, A100 delivers the world’sfastest GPU memory bandwidthof over 2TB/s, as well as a dynamic randomaccess memory (DRAM) utilization efficiencyof 95%. A100 delivers 1.7X higher memorybandwidth over the previous generation.

STRUCTURAL SPARSITY

AI networks have millions tobillions of parameters. Not all ofthese parameters are needed foraccurate predictions, and somecan be converted to zeros, making the models“sparse” without compromising accuracy.Tensor Cores in A100 can provide up to 2Xhigher performance for sparse models. Whilethe sparsity feature more readily benefits AIinference, it can also improve the performanceof model training.

The NVIDIA A100 Tensor Core GPU is the flagship product of the NVIDIA data centerplatform for deep learning, HPC, and data analytics. The platform accelerates over 2,000 applications, including every major deep learning framework. A100 is availableeverywhere, from desktops to servers to cloud services, delivering both dramaticperformance gains and cost-saving opportunities.