
Inferentia

Published on
November 21, 2023
Deploy Embedding Models on AWS inferentia2 with Amazon SageMaker
#GenerativeAI #Embeddings #SageMaker #Inferentia
In this blog post, you will learn how to compile and deploy Embedding Models on AWS Inferentia2.
Published on
November 14, 2023
Deploy Llama 2 7B on AWS inferentia2 with Amazon SageMaker
#GenerativeAI #Llama #SageMaker #Inferentia
In this blog post, you will learn how to compile and deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker.
Published on
November 7, 2023
Deploy Stable Diffusion XL on AWS inferentia2 with Amazon SageMaker
#GenerativeAI #SDXL #SageMaker #Inferentia
In this blog post, you will learn how to compile and deploy Stable Diffusion XL on AWS Inferentia2 with Amazon SageMaker.
Published on
June 28, 2023
Optimize & Deploy BERT on AWS inferentia2
#Inferentia #HuggingFace #BERT #NLP
Learn how to optimize and deploy BERT on AWS Inferentia2.
Published on
April 19, 2022
Accelerated document embeddings with Hugging Face Transformers and AWS Inferentia
#HuggingFace #AWS #BERT #Inferentia
Learn how to accelerate Sentence Transformers inference using Hugging Face Transformers and AWS Inferentia.
Published on
March 16, 2022
Speed up BERT inference with Hugging Face Transformers and AWS Inferentia
#HuggingFace #AWS #BERT #Inferentia
Learn how to accelerate BERT and Transformers inference using Hugging Face Transformers and AWS Inferentia.
