We have verified that inputs of up to 5,120 tokens fit on a single NVIDIA A100 80 GB or a single NVIDIA H100 80 GB GPU, and we have verified numerical accuracy on both GPUs.
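The A100 comes with 3,456 FP64 CUDA cores, 6,912 FP32 CUDA cores, 432 Tensor Cores, 108 streaming multiprocessors, and, in its original configuration, 40 GB of GPU memory within a 400-watt power envelope; the 80 GB variant referenced above has the same core counts with more memory. With the A100 already in ...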
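
The A100 figures quoted above imply a fixed number of each unit per streaming multiprocessor (SM). The short Python sketch below is only an illustrative sanity check of that arithmetic: the dictionary keys and the `per_sm` helper are hypothetical names introduced here, and the numbers are copied from the text rather than queried from a device.

```python
# A100 totals as quoted in the text above (not read from hardware).
A100 = {
    "fp64_cuda_cores": 3456,
    "fp32_cuda_cores": 6912,
    "tensor_cores": 432,
    "streaming_multiprocessors": 108,
}


def per_sm(spec):
    """Return each unit count divided by the SM count.

    Hypothetical helper for illustration; raises ValueError if a total
    does not divide evenly across the SMs.
    """
    sms = spec["streaming_multiprocessors"]
    breakdown = {}
    for name, total in spec.items():
        if name == "streaming_multiprocessors":
            continue
        if total % sms != 0:
            raise ValueError(f"{name}={total} is not a multiple of {sms} SMs")
        breakdown[name] = total // sms
    return breakdown


if __name__ == "__main__":
    for name, count in per_sm(A100).items():
        print(f"{name}: {count} per SM")
```

Under these figures each SM works out to 32 FP64 cores, 64 FP32 cores, and 4 Tensor Cores, so the quoted totals are consistent with one another.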