Accelerate GPT-J inference with DeepSpeed-Inference on GPUs
September 13, 2022 — GPTJ, DeepSpeed, HuggingFace, Optimization
August 30, 2022 — BERT, TensorFlow, HuggingFace, Keras
August 24, 2022 — BERT, Habana, HuggingFace, Optimum
August 16, 2022 — BERT, DeepSpeed, HuggingFace, Optimization
August 2, 2022 — BERT, OnnxRuntime, HuggingFace, Optimization
July 26, 2022 — BERT, Habana, HuggingFace, AWS
July 19, 2022 — ViT, OnnxRuntime, HuggingFace, Optimization
July 13, 2022 — BERT, OnnxRuntime, HuggingFace, Optimization
July 5, 2022 — BERT, Habana, HuggingFace, Optimum
June 30, 2022 — BERT, OnnxRuntime, HuggingFace, Optimization
June 21, 2022 — BERT, HuggingFace, ONNX, Optimum
June 14, 2022 — BERT, Habana, HuggingFace, Optimum
June 7, 2022 — BERT, OnnxRuntime, HuggingFace, Quantization