This is the part about ASICs from the "Hardware for Deep Learning" series. The content of the series is here. As of the beginning of 2021, ASICs are the only real alternative to GPUs for
1) deep learning…
AWS

Amazon has its own solutions for both training and inference.

AWS Inferentia

AWS Inferentia was announced in November 2018. It was designed by Annapurna Labs, a subsidiary of Amazon.

Each AWS Inferentia chip contains four NeuronCores. Each NeuronCore implements a high-performance systolic-array matrix-multiply engine (like the Google TPU; a toy simulation of this dataflow is sketched below). NeuronCores are also equipped with a large on-chip cache (the exact size is undisclosed). [source]

AWS Inferentia supports the FP16, BF16, and INT8 data types. Furthermore, Inferentia can take a model trained in 32-bit precision and automatically run it at the speed of a 16-bit model using BF16 (what this cast costs in precision is sketched below). Each chip can deliver 64 TFLOPS on FP16 and BF16 data, and 128 TOPS on INT8 data. (source)

You can have up to 16 Inferentia chips per EC2 Inf1 instance. Inferentia is optimized for maximizing throughput at small batch sizes, which is beneficial for applications with strict latency requirements.

The AWS Neuron SDK consists of a compiler, a runtime, and profiling tools. It enables complex neural network models created and trained in popular frameworks such as TensorFlow, PyTorch, and MXNet to be executed on Inf1 instances (see the compilation sketch below). AWS Neuron also supports splitting large models for execution across multiple Inferentia chips using a high-speed physical chip-to-chip interconnect.

Technical details on Inferentia are very scarce.

AWS Trainium

On December 1st, 2020, Amazon announced its AWS Trainium chip. AWS Trainium is the second custom machine learning chip designed by AWS, and it targets training models in the cloud.

AWS Trainium shares the same AWS Neuron SDK as AWS Inferentia, so it is integrated with TensorFlow, PyTorch, and MXNet. AWS Trainium will become available in 2021. For now, almost no technical details are available.

Huawei Ascend

Huawei has its own solutions for both training and inference as well. Its lineup of AI products is pretty vast, but we'll focus on accelerator cards…
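NeuronCore internals are not public, but "systolic array" has a standard meaning: a grid of multiply-accumulate cells through which operands flow one hop per clock cycle, so each value is fetched from memory once and then reused as it travels across the grid. Here is a toy NumPy simulation of an output-stationary array computing A @ B; it models only the dataflow timing, not AWS's (undisclosed) design:

```python
import numpy as np

def systolic_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Simulate an output-stationary systolic array computing A @ B.

    Cell (i, j) accumulates C[i, j]. Operands enter the grid skewed:
    A[i, s] reaches row i at tick s + i, and B[s, j] reaches column j
    at tick s + j, so the matching pair meets in cell (i, j) at tick
    t = s + i + j, one hop per clock.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m))
    for t in range(n + m + k - 2):          # clock ticks
        for i in range(n):                  # grid rows
            for j in range(m):              # grid columns
                s = t - i - j               # operand pair arriving now
                if 0 <= s < k:
                    C[i, j] += A[i, s] * B[s, j]
    return C

A = np.random.rand(3, 4)
B = np.random.rand(4, 5)
assert np.allclose(systolic_matmul(A, B), A @ B)
```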
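The automatic FP32-to-BF16 trick works because bfloat16 keeps FP32's full 8-bit exponent (so the dynamic range is unchanged and no rescaling is needed) and simply cuts the mantissa from 23 bits to 7. A minimal, hardware-independent Python sketch of that cast, illustrating the format rather than Inferentia's actual implementation:

```python
import struct

def fp32_to_bf16(x: float) -> float:
    """Cast an FP32 value to BF16 and back, rounding to nearest even.

    BF16 is just the top 16 bits of the FP32 word: 1 sign bit, the
    same 8 exponent bits, and 7 of the 23 mantissa bits.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)   # round-to-nearest-even
    bf16_bits = ((bits + rounding_bias) >> 16) & 0xFFFF
    (y,) = struct.unpack("<f", struct.pack("<I", bf16_bits << 16))
    return y

for v in (3.141592653589793, 0.1, 65504.0):
    print(f"fp32 {v!r:>22} -> bf16 {fp32_to_bf16(v)!r}")
# pi becomes 3.140625: the range survives, the low mantissa bits do not.
```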
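Finally, the compilation sketch for the Neuron SDK workflow itself. It follows the torch-neuron flow from AWS's documentation of that time: the model is traced and compiled ahead of time (by the neuron-cc compiler) into a regular TorchScript artifact, which then runs on the NeuronCores of an Inf1 instance. Package names and call signatures depend on the Neuron SDK version:

```python
import torch
import torch_neuron  # noqa: F401 -- registers the torch.neuron namespace
from torchvision import models

# Any trained FP32 model plus an example input with the expected shape.
model = models.resnet50(pretrained=True).eval()
example = torch.rand(1, 3, 224, 224)

# Ahead-of-time compilation for NeuronCores; operators that neuron-cc
# cannot compile are left to run on the CPU.
model_neuron = torch.neuron.trace(model, example_inputs=[example])

# The result is ordinary TorchScript and is saved/loaded as such.
model_neuron.save("resnet50_neuron.pt")

# On an Inf1 instance this load dispatches execution to Inferentia.
restored = torch.jit.load("resnet50_neuron.pt")
print(restored(example).shape)  # torch.Size([1, 1000])
```

According to the AWS documentation, the compilation step does not itself require Inferentia hardware; only running the compiled model does.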