IGEL - first German instruction-tuned Large Language Model - April 11, 2023
Introducing IGEL, an instruction-tuned German Large Language Model!
IGEL is an LLM designed for German language understanding tasks, including sentiment analysis, translation, and question answering. The first version of IGEL is built on top of BigScience's BLOOM and adapted to German.
News & Announcements
Meta AI announced and released Segment Anything (SAM), the first foundation model for image segmentation, together with a dataset of 11 million images and 1 billion masks.
Open Assistant released their first ChatGPT-style model via their application. You can test it here.
Together released a new version of their chat-tuned GPT-NeoX 20B model, improved by fine-tuning on user feedback.
Microsoft released VALL-E X, a model for cross-lingual speech synthesis.
UC Berkeley released Koala-13B, an open-source chatbot trained by fine-tuning LLaMA on web dialogue. In evaluations, about 50% of its responses were rated as comparable to ChatGPT's.
Tutorials & Demos
The research team at Hugging Face created StackLLaMA, an end-to-end tutorial for training LLaMA with RLHF on preference data such as StackExchange questions!
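The preference-data step of an RLHF pipeline like this typically starts by training a reward model on ranked answer pairs. A minimal sketch of the standard pairwise loss, assuming illustrative names (`score_chosen`, `score_rejected` are not from the StackLLaMA code itself):

```python
import math

def pairwise_reward_loss(score_chosen: float, score_rejected: float) -> float:
    """-log(sigmoid(chosen - rejected)): the loss is small when the reward
    model already scores the human-preferred answer above the rejected one."""
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the margin between the preferred and rejected answer grows, which is what pushes the reward model to reproduce the human ranking.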
Regis from Hugging Face published a tutorial on deploying BLOOMZ (176B) on Habana Gaudi2, outperforming NVIDIA A100s.
LangChain created a template for building AI Plugins for LLMs.
Reads & Papers
CAMEL proposes a role-playing agent framework for generating synthetic data for LLMs with LLMs.
Harm de Vries explains in detail why we should train smaller LLMs on more tokens.
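The smaller-models-more-tokens argument builds on the standard compute approximations. A hedged sketch of the arithmetic (the 6ND rule and the ~20-tokens-per-parameter heuristic are from the scaling-law literature, not from the post itself):

```python
def training_flops(params: float, tokens: float) -> float:
    # Common approximation: training compute C ≈ 6 * N * D FLOPs,
    # for N parameters trained on D tokens.
    return 6.0 * params * tokens

def chinchilla_optimal_tokens(params: float) -> float:
    # Chinchilla rule of thumb: roughly 20 training tokens per parameter.
    return 20.0 * params
```

For a 7B model this gives ~140B compute-optimal tokens; training it on far more (e.g. 1T tokens) costs extra training compute but yields a small model that is much cheaper to serve, which is the trade-off the post analyzes.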
Samuel R. Bowman published a survey paper, Eight Things to Know about Large Language Models.
LLMs can iteratively refine their own outputs (Self-Refine).
Microsoft shows that using GPT-4 for data generation can help improve smaller models.
Bloomberg released a paper on their experience training a 50B GPT model specialized in financial data.
HuggingGPT from Microsoft presents a new method to use LLMs as "routers" for requests to smaller fine-tuned LMs.
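The routing idea can be sketched as a task-to-specialist dispatch table. This is a simplification under stated assumptions: in HuggingGPT the controller LLM itself plans and picks the model, rather than consulting a static lookup, and the registry below uses illustrative model ids:

```python
# Illustrative registry of specialist models (ids are examples, not
# the set HuggingGPT actually uses).
SPECIALISTS = {
    "image-segmentation": "facebook/sam-vit-base",
    "translation": "Helsinki-NLP/opus-mt-en-de",
    "summarization": "facebook/bart-large-cnn",
}

def route(task: str) -> str:
    """Return the specialist model a controller would dispatch this task to."""
    if task not in SPECIALISTS:
        raise ValueError(f"no specialist registered for task {task!r}")
    return SPECIALISTS[task]
```

The appeal of the pattern is that the large LLM only handles planning and response synthesis, while each sub-task runs on a cheaper, purpose-trained model.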
I hope you enjoyed this newsletter. If you have any questions or are interested in collaborating, feel free to contact me on Twitter or LinkedIn.
See you next week!