Find out how NVIDIA created the new A800 GPU to bypass the US ban on the sale of advanced chips to China!
NVIDIA Offers A800 GPU To Bypass US Ban On China!
Two months after it was banned by the US government from selling high-performance AI chips to China, NVIDIA launched a new A800 GPU designed to bypass those restrictions.
The new NVIDIA A800 is based on the same Ampere microarchitecture as the A100, which was used as the performance baseline by the US government.
Despite its numerically larger model number (the lucky number 8 was probably picked to appeal to the Chinese market), it is a detuned part, with slightly reduced performance to meet export control limitations.
The NVIDIA A800 GPU, which went into production in Q3, is another alternative product to the NVIDIA A100 GPU for customers in China.
The A800 meets the U.S. government's clear test for reduced export control and cannot be programmed to exceed it.
NVIDIA is likely hoping that the slightly slower NVIDIA A800 GPU will allow it to continue supplying China with A100-level chips, which are used to power supercomputers and high-performance datacenters for artificial intelligence applications.
As I'll show you in the next section, except in very high-end applications, there won't be any really significant performance difference between the A800 and the A100. So NVIDIA customers who want or need the A100 should have no issue picking the A800 instead.
However, this may only be a stopgap fix, as NVIDIA is stuck selling A100-level chips to China unless and until the US government changes its mind.
Read more : AMD, NVIDIA Banned From Selling AI Chips To China!
How Fast Is The NVIDIA A800 GPU?
The US government considers the NVIDIA A100 as the performance baseline for its export control restrictions on China.
Any chip equal to or faster than that Ampere-based chip, which was launched on May 14, 2020, is forbidden to be sold or exported to China. But as they say, the devil is in the details.
The US government did not specify just how much slower chips must be to qualify for export to China. So NVIDIA could technically get away with slightly detuning the A100, while offering almost the same performance level.
And that was what NVIDIA did with the A800 – it is basically the A100 with a 33% slower NVLink interconnect speed. NVIDIA also limited the maximum number of GPUs supported in a single server to eight.
That only slightly reduces the performance of A800 servers, compared to A100 servers, while offering the same amount of GPU compute performance. Most users will not notice the difference.
The only significant impediment is at the very high end – Chinese companies are now limited to a maximum of eight GPUs per server, instead of up to sixteen.
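That 33% figure comes straight from the NVLink bandwidth numbers in NVIDIA's spec sheets. A quick arithmetic check in Python:

```python
# Sanity check on the claimed NVLink slowdown, using the
# bandwidth figures from NVIDIA's published spec sheets.
A100_NVLINK_GBPS = 600  # A100 NVLink interconnect bandwidth (GB/s)
A800_NVLINK_GBPS = 400  # A800 NVLink interconnect bandwidth (GB/s)

reduction = (A100_NVLINK_GBPS - A800_NVLINK_GBPS) / A100_NVLINK_GBPS
print(f"NVLink bandwidth reduction: {reduction:.1%}")
```

This prints a reduction of 33.3%, matching the roughly one-third cut described above. Note that only the NVLink figure changes; PCIe Gen4 bandwidth and all compute throughput figures are identical on both parts.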
To show you what I mean, I dug into the A800 specifications, and compared them to the A100 below:
NVIDIA A100 vs A800 : 80GB PCIe Version

| Specifications | A100 80GB PCIe | A800 80GB PCIe |
|---|---|---|
| FP64 | 9.7 TFLOPS | 9.7 TFLOPS |
| FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS |
| FP32 | 19.5 TFLOPS | 19.5 TFLOPS |
| Tensor Float 32 | 156 TFLOPS | 156 TFLOPS |
| BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| INT8 Tensor Core | 624 TOPS | 624 TOPS |
| GPU Memory | 80 GB HBM2 | 80 GB HBM2 |
| GPU Memory Bandwidth | 1,935 GB/s | 1,935 GB/s |
| TDP | 300 W | 300 W |
| Multi-Instance GPU | Up to 7 MIGs @ 10 GB | Up to 7 MIGs @ 10 GB |
| Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s |
| Server Options | 1-8 GPUs | 1-8 GPUs |
NVIDIA A100 vs A800 : 80GB SXM Version

| Specifications | A100 80GB SXM | A800 80GB SXM |
|---|---|---|
| FP64 | 9.7 TFLOPS | 9.7 TFLOPS |
| FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS |
| FP32 | 19.5 TFLOPS | 19.5 TFLOPS |
| Tensor Float 32 | 156 TFLOPS | 156 TFLOPS |
| BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| INT8 Tensor Core | 624 TOPS | 624 TOPS |
| GPU Memory | 80 GB HBM2 | 80 GB HBM2 |
| GPU Memory Bandwidth | 2,039 GB/s | 2,039 GB/s |
| TDP | 400 W | 400 W |
| Multi-Instance GPU | Up to 7 MIGs @ 10 GB | Up to 7 MIGs @ 10 GB |
| Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s |
| Server Options | 4 / 8 / 16 GPUs | 4 / 8 GPUs |
NVIDIA A100 vs A800 : 40GB PCIe Version

| Specifications | A100 40GB PCIe | A800 40GB PCIe |
|---|---|---|
| FP64 | 9.7 TFLOPS | 9.7 TFLOPS |
| FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS |
| FP32 | 19.5 TFLOPS | 19.5 TFLOPS |
| Tensor Float 32 | 156 TFLOPS | 156 TFLOPS |
| BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS |
| INT8 Tensor Core | 624 TOPS | 624 TOPS |
| GPU Memory | 40 GB HBM2 | 40 GB HBM2 |
| GPU Memory Bandwidth | 1,555 GB/s | 1,555 GB/s |
| TDP | 250 W | 250 W |
| Multi-Instance GPU | Up to 7 MIGs @ 10 GB | Up to 7 MIGs @ 10 GB |
| Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s |
| Server Options | 1-8 GPUs | 1-8 GPUs |
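The pattern across all three tables can be summarised programmatically. Here is a minimal sketch, with values transcribed from the 80GB SXM table above, that diffs the two spec sheets and prints only the rows that differ:

```python
# Compare A100 vs A800 (80GB SXM) specs and report only the differences.
# Values are transcribed from the comparison table above.
a100_sxm = {
    "FP64": "9.7 TFLOPS",
    "FP32": "19.5 TFLOPS",
    "FP16 Tensor Core": "312 TFLOPS",
    "GPU Memory Bandwidth": "2,039 GB/s",
    "NVLink": "600 GB/s",
    "Max GPUs per server": 16,
}
# The A800 inherits every A100 spec except these two entries.
a800_sxm = dict(a100_sxm, **{"NVLink": "400 GB/s", "Max GPUs per server": 8})

diffs = {k: (a100_sxm[k], a800_sxm[k]) for k in a100_sxm if a100_sxm[k] != a800_sxm[k]}
for spec, (a100, a800) in diffs.items():
    print(f"{spec}: A100 = {a100}, A800 = {a800}")
```

Only the NVLink bandwidth and the maximum GPU count per server come out different; every compute and memory figure is identical, which is the whole point of the detuning.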
Please Support My Work!
Support my work through a bank transfer / PayPal / credit card!
Name : Adrian Wong
Bank Transfer : CIMB 7064555917 (Swift Code : CIBBMYKL)
Credit Card / PayPal : https://paypal.me/techarp
Dr. Adrian Wong has been writing about tech and science since 1997, even publishing a book with Prentice Hall called Breaking Through The BIOS Barrier (ISBN 978-0131455368) while in medical school.
He continues to dedicate countless hours every day writing about tech, medicine and science, in his pursuit of facts in a post-truth world.