At the heart of NVIDIA’s A100 GPU is the NVIDIA Ampere architecture, which introduces double-precision Tensor Cores delivering more than 2x the throughput of the V100, a significant reduction in simulation run times. Standard FP64 performance is 9.7 TFLOPS, and with Tensor Cores this doubles to 19.5 TFLOPS. FP32 performance is 19.5 TFLOPS, and with the new Tensor Float 32 (TF32) precision this rises to 156 TFLOPS, roughly 20x the previous-generation V100. TF32 is a hybrid of the FP16 and FP32 formats: it uses the same 10-bit mantissa as FP16 and the 8-bit exponent of FP32, trading a little mantissa precision for large speedups on suitable workloads.
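The effect of TF32's shorter mantissa can be illustrated in plain Python. The sketch below (a simplification: it truncates rather than rounds, whereas real hardware rounds to nearest) reinterprets a float32's bits and zeroes the low 13 of its 23 mantissa bits, leaving the 10 mantissa bits TF32 keeps:

```python
import struct

def to_tf32(x: float) -> float:
    """Reduce a value to TF32 precision: keep FP32's 8-bit exponent,
    truncate the 23-bit mantissa to TF32's 10 bits.
    Illustrative truncation only; hardware uses round-to-nearest."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 -> uint32 bits
    bits &= ~((1 << 13) - 1)                             # clear low 13 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_tf32(1.0))   # exactly representable, unchanged: 1.0
print(to_tf32(1 / 3)) # 1/3 loses its low mantissa bits: 0.333251953125
```

Because the exponent field is untouched, TF32 covers the same numeric range as FP32, which is why frameworks can use it as a drop-in accelerated mode for FP32 matrix math.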
Furthermore, the base A100 carries 40GB of high-bandwidth memory (HBM2) delivering up to 1.6TB/s of memory bandwidth, 1.7x more than the previous-generation V100; the 80GB PCIe variant specified below raises both capacity and bandwidth further.
| Specification | A100 80GB PCIe |
|---|---|
| FP64 Tensor Core | 19.5 TFLOPS |
| Tensor Float 32 (TF32) | 156 TFLOPS \| 312 TFLOPS* |
| BFLOAT16 Tensor Core | 312 TFLOPS \| 624 TFLOPS* |
| FP16 Tensor Core | 312 TFLOPS \| 624 TFLOPS* |
| INT8 Tensor Core | 624 TOPS \| 1,248 TOPS* |
| GPU Memory | 80GB HBM2e |
| GPU Memory Bandwidth | 2,039 GB/s |
| Max Thermal Design Power (TDP) | 400W |
| Multi-Instance GPU | Up to 7 MIGs @ 10GB |
| Interconnect | NVLink: 600 GB/s; PCIe Gen4: 64 GB/s |
| Server Options | NVIDIA HGX™ A100-Partner and NVIDIA-Certified Systems with 4, 8, or 16 GPUs; NVIDIA DGX™ A100 with 8 GPUs |

\* With sparsity
In addition to the product specification improvements noted above, the NVIDIA A100 introduces three key new features that further accelerate High-Performance Computing (HPC), training, and Artificial Intelligence (AI) inference workloads:
- 3rd Generation NVIDIA NVLink™ – The new generation of NVLink™ offers 2x the GPU-to-GPU throughput of the previous-generation V100.
- Multi-Instance GPU (MIG) – This feature enables a single A100 GPU to be partitioned into as many as seven separate GPUs, which benefits cloud users running AI inference and data analytics workloads.
- Structural Sparsity – This feature supports sparse matrix operations in the Tensor Cores and increases their throughput by 2x (see Figure 6).
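The sparsity pattern the A100 accelerates is 2:4 structured sparsity: in every group of four consecutive weights, at most two are nonzero. A minimal Python sketch of the pruning idea (the function name is illustrative; in practice NVIDIA's tooling, e.g. cuSPARSELt-based workflows, performs the pruning):

```python
def prune_2_4(weights):
    """Prune a flat list of weights to the 2:4 pattern: in each group
    of 4 consecutive values, keep the 2 largest magnitudes, zero the rest.
    Illustrative sketch, not NVIDIA's actual pruning algorithm."""
    out = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # indices of the two largest-magnitude entries in this group
        keep = sorted(range(len(group)), key=lambda j: abs(group[j]),
                      reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

print(prune_2_4([0.9, -0.1, 0.4, 0.05, 1.2, 0.3, -0.2, 0.8]))
# keeps 0.9 and 0.4 in the first group, 1.2 and 0.8 in the second
```

Because exactly half the values in every group can be skipped, the sparse Tensor Cores can process such matrices at up to twice the dense rate, which is where the doubled (starred) throughput figures in the table come from.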
FREE Shipping on Orders $50 and Over
Our goal is to offer you the best shipping options, no matter where you live.
We work with DHL, EMS, Aramex, UPS, FedEx, and other couriers, and you can choose the shipping method you prefer. If you have special shipping requirements, contact us for a detailed fee quote and arrangements.
- United States: 2-7 business days
- Canada: 5-15 business days
- Other Countries: 1-3 weeks, depending on the destination.
Return & Refund
One Year Warranty
If your product malfunctions or fails due to a manufacturing defect within one year of the purchase date, we will repair or replace it for free.
30 Day Refund
If you are not happy with your purchase, you can return it to us within 30 days of the delivery date for a full refund.
Payment & Security
Your payment information is processed securely. We do not store credit card details nor have access to your credit card information.