
Llama

Published on
January 11, 2024
Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints
#LLAMA #HuggingFace #LLM #SageMaker
In this blog post, you will learn how to increase the throughput of Llama 13B on Amazon SageMaker using single-instance, multi-replica endpoints.
Published on
November 14, 2023
Deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker
#GenerativeAI #Llama #SageMaker #Inferentia
In this blog post, you will learn how to compile and deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker.
Published on
September 26, 2023
Llama 2 on Amazon SageMaker: A Benchmark
#LLAMA #HuggingFace #LLM #SageMaker
A benchmark of Llama 2 at several model sizes, run on a range of Amazon EC2 instance types under different load levels, measuring latency (ms per token) and throughput (tokens per second).
Published on
July 26, 2023
Extended Guide: Instruction-tune Llama 2
#GenerativeAI #HuggingFace #LLM #Llama
This blog post is an extended guide on instruction-tuning Llama 2 from Meta AI.
Published on
July 21, 2023
LLaMA 2 - Every Resource you need
#GenerativeAI #HuggingFace #LLM #LLaMA
All resources for LLaMA 2: how to test, train, and deploy it.
Published on
July 18, 2023
Fine-tune LLaMA 2 (7-70B) on Amazon SageMaker
#LLAMA #HuggingFace #LLM #SageMaker
Learn how to train LLaMA 2 using QLoRA and Hugging Face Transformers on Amazon SageMaker.