Open-Assistant is a project to build an open-source ChatGPT. The most recently released models have gone through instruction tuning and are available on the Hugging Face Hub, in sizes ranging from 1.4B to 20B parameters.
News & Announcements 📣
Hugging Face released a new library called PEFT, or Parameter-Efficient Fine-Tuning. PEFT approaches fine-tune only a small number of (extra) model parameters while freezing most parameters of the pre-trained LLM. Check out the 🤗 PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware blog to learn more.
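As a rough illustration of the approach, the sketch below wraps a pre-trained model with LoRA adapters via PEFT so that only a small fraction of parameters is trained; the model ID and hyperparameters are illustrative choices, not taken from the blog post.

```python
# Minimal PEFT/LoRA sketch: freeze the base model and train only small
# low-rank adapter matrices. Model ID and hyperparameters are illustrative.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,              # rank of the low-rank update matrices
    lora_alpha=32,    # scaling factor for the adapter weights
    lora_dropout=0.1,
)

# get_peft_model freezes the base weights and injects trainable adapters
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% trainable
```

The wrapped model can then be passed to a regular Trainer, and only the adapter weights are updated during fine-tuning.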
Runway introduced Gen-1, a new model that uses language and images to generate new videos out of existing ones.
Writer open-sourced Palmyra, a language model trained on business and marketing writing. The model comes in three sizes, from 128 million to 20 billion parameters, all available on Hugging Face.
Tutorials & Demos 📝
I wrote a blog post on how to deploy FLAN-T5-XXL on Amazon SageMaker for inference.
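For a rough idea of what such a deployment looks like, here is a hedged sketch using the SageMaker Python SDK; the instance type, container versions, and IAM role setup are illustrative assumptions rather than the exact configuration from the post.

```python
# Illustrative sketch: deploy a Hugging Face model to a real-time SageMaker
# endpoint. Versions and instance type are assumptions, not from the post.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role

huggingface_model = HuggingFaceModel(
    env={"HF_MODEL_ID": "google/flan-t5-xxl", "HF_TASK": "text2text-generation"},
    role=role,
    transformers_version="4.26",  # illustrative container versions
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # a multi-GPU instance for an 11B model
)

print(predictor.predict({"inputs": "Translate to German: Hello, world!"}))
```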
Emily Webber shared how she trained Stable Diffusion on 10TB of images using Amazon SageMaker.
Moshe Wasserblat created an example of using GPT-2 for data augmentation, making it possible to train smaller models that reach the same accuracy.
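To make the idea concrete, here is a hedged sketch of one simple form of GPT-2 data augmentation: sampling continuations of labeled seed sentences and reusing them as extra training examples. The prompt format and labels are hypothetical and not taken from the original example.

```python
# Hypothetical GPT-2 augmentation sketch: sample variations of labeled seed
# texts to enlarge a small training set for a smaller downstream model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

seed_examples = [("The battery life of this laptop is fantastic", "positive")]

augmented = []
for text, label in seed_examples:
    outputs = generator(
        text,
        max_new_tokens=20,
        num_return_sequences=3,
        do_sample=True,
        temperature=0.9,
    )
    # Each sampled continuation becomes a new example with the same label
    for out in outputs:
        augmented.append((out["generated_text"], label))

print(augmented)
```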
Salesforce shared a Gradio demo for BLIP-2 for image-to-text generation.
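The model can also be run directly with transformers instead of the demo; below is a minimal sketch using the public Salesforce checkpoint, with a placeholder image URL.

```python
# Minimal BLIP-2 image-to-text sketch; the image URL is a placeholder.
import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

image = Image.open(requests.get("https://example.com/image.jpg", stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
generated_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```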
Reads & Papers 📚
Meta AI introduced Toolformer, a language model that teaches itself to use various tools in a self-supervised way. The model learned to use a calculator or call an external API service.
Google Research wrote a blog post about the Flan Collection: Advancing open source methods for instruction tuning, giving insights into why the FLAN-T5 models outperform previous instruction-tuned models.
Pierre Guillou wrote a blog post about Document AI, focusing on a line-level document understanding model built with LiLT, Tesseract, and the DocLayNet dataset.
Sebastian Raschka created a transformative reading list for better understanding Large Language Models.
Raza Habib explored whether it is worth fine-tuning LLMs or whether smaller models can deliver the same results.
Multimodal Chain-of-Thought Reasoning in Language Models incorporates vision features for CoT, outperforming existing multimodal models.
Benchmarking Large Language Models for News Summarization shares good research on how to improve abstractive summarization.
See you next week 👋🏻👋🏻