Going Production: Auto-scaling Hugging Face Transformers with Amazon SageMaker
Learn how to add auto-scaling to your Hugging Face Transformers SageMaker Endpoints.
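The auto-scaling setup can be sketched with boto3's Application Auto Scaling client: register the endpoint variant as a scalable target, then attach a target-tracking policy on invocations per instance. The endpoint name, capacity bounds, and thresholds below are illustrative assumptions, not values from the post.

```python
# Sketch: auto-scaling a SageMaker endpoint via Application Auto Scaling.
# Endpoint name, capacity bounds, and thresholds are illustrative assumptions.

def scaling_policy_config(target_invocations_per_instance: int = 10,
                          scale_in_cooldown: int = 300,
                          scale_out_cooldown: int = 60) -> dict:
    """Build a target-tracking policy config for a SageMaker endpoint variant."""
    return {
        "TargetValue": float(target_invocations_per_instance),
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": scale_in_cooldown,
        "ScaleOutCooldown": scale_out_cooldown,
    }

# With boto3 (not executed here), the config would be applied like:
# import boto3
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(
#     ServiceNamespace="sagemaker",
#     ResourceId="endpoint/my-hf-endpoint/variant/AllTraffic",  # hypothetical name
#     ScalableDimension="sagemaker:variant:DesiredInstanceCount",
#     MinCapacity=1,
#     MaxCapacity=4,
# )
# client.put_scaling_policy(
#     PolicyName="hf-target-tracking",
#     ServiceNamespace="sagemaker",
#     ResourceId="endpoint/my-hf-endpoint/variant/AllTraffic",
#     ScalableDimension="sagemaker:variant:DesiredInstanceCount",
#     PolicyType="TargetTrackingScaling",
#     TargetTrackingScalingPolicyConfiguration=scaling_policy_config(),
# )
```

The target-tracking policy scales the variant's instance count up or down so that the average number of invocations per instance stays near the chosen target.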
🌸 BigScience released their first modeling paper introducing T0, which outperforms GPT-3 on many zero-shot tasks while being 16x smaller! Deploy the 3 billion parameter version of BigScience's T0 (T0_3B) to Amazon SageMaker with a few lines of code to run a scalable production workload!
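The "few lines of code" follow the usual `HuggingFaceModel` pattern from the sagemaker SDK: point the inference toolkit at a Hub model via environment variables and deploy. The instance type, framework versions, and IAM role below are illustrative assumptions.

```python
# Sketch: deploy bigscience/T0_3B from the Hugging Face Hub to SageMaker.
# Instance type, versions, and the execution role are illustrative assumptions.

def hub_env(model_id: str = "bigscience/T0_3B",
            task: str = "text2text-generation") -> dict:
    """Environment variables telling the Hugging Face inference toolkit
    which Hub model to load and which pipeline task to serve."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}

# With the sagemaker SDK (not executed here):
# from sagemaker.huggingface import HuggingFaceModel
# model = HuggingFaceModel(
#     env=hub_env(),
#     role="arn:aws:iam::123456789012:role/sagemaker-role",  # hypothetical role
#     transformers_version="4.12",
#     pytorch_version="1.9",
#     py_version="py38",
# )
# predictor = model.deploy(
#     initial_instance_count=1,
#     instance_type="ml.g4dn.xlarge",  # assumed instance type
# )
# predictor.predict({"inputs": "Is this review positive or negative? Review: great!"})
```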
Deploy Hugging Face Transformers to Amazon SageMaker and create an API for the Endpoint using AWS Lambda, API Gateway and AWS CDK.
The latest developments in NLP show that you can overcome this limitation by providing a few examples at inference time with a large language model - a technique known as Few-Shot Learning. In this blog post, we'll explain what Few-Shot Learning is and explore how a large language model called GPT-Neo can be used with it.
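The core of Few-Shot Learning is prompt construction: a handful of labeled examples is concatenated in front of the new query so the model can infer the task at inference time. A minimal sketch, where the "Input/Output" template and the example pairs are illustrative assumptions:

```python
# Sketch: building a few-shot prompt for a generative model such as GPT-Neo.
# The prompt template and the example pairs are illustrative assumptions.

def few_shot_prompt(examples, query):
    """Concatenate labeled examples and the new query into one prompt."""
    lines = []
    for text, label in examples:
        lines.append(f"Input: {text}\nOutput: {label}")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

examples = [
    ("I loved this movie!", "positive"),
    ("The food was terrible.", "negative"),
]
prompt = few_shot_prompt(examples, "What a fantastic day!")
```

The resulting prompt would then be passed to a text-generation model (for example `EleutherAI/gpt-neo-1.3B` via transformers' `pipeline("text-generation")`), which completes the trailing "Output:" with its prediction.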
Learn how to train distributed models for summarization using Hugging Face Transformers and Amazon SageMaker and upload them afterwards to huggingface.co.
Learn how to build a Multilingual Serverless BERT Question Answering API with a model size of more than 2GB and then test it in German and French.
Learn how to use the newest cutting-edge computing power of AWS with the benefits of serverless architectures to leverage Google's "State-of-the-Art" NLP model.
Learn how to build and deploy an AWS Lambda function with a custom Python Docker container as its runtime, using Amazon ECR.
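A custom Lambda container typically starts from the AWS-provided Lambda Python base image; the file names and handler below are illustrative assumptions:

```dockerfile
# Sketch of a Lambda container image (file names and handler are assumptions).
FROM public.ecr.aws/lambda/python:3.9

# Install dependencies into the Lambda task root.
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the function code and set the handler as "module.function".
COPY app.py .
CMD ["app.handler"]
```

The built image is then pushed to a repository in Amazon ECR and referenced as the function's image when the Lambda is created.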