PyTorch 2.0 is here - March 28, 2023

Disclaimer: This content is generated by AI using my social media posts. Make sure to follow.

PyTorch 2.0 has officially been released: it is faster, more Pythonic, and stays as dynamic as before. If you want to learn how to use PyTorch 2.0 with Hugging Face Transformers, check out my updated blog post "How to get started with PyTorch 2.0 and Hugging Face Transformers".
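To give a feel for the new release, here is a minimal sketch of wrapping a Hugging Face Transformers model with `torch.compile`; the model id and example input are illustrative assumptions, not taken from the blog post.

```python
# Minimal sketch: speeding up a Transformers model with PyTorch 2.0's torch.compile.
# The model id and example input are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "bert-base-uncased"  # hypothetical model choice for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# torch.compile captures the model graph with TorchDynamo and compiles it with TorchInductor
compiled_model = torch.compile(model)

inputs = tokenizer("PyTorch 2.0 is here!", return_tensors="pt")
with torch.no_grad():
    outputs = compiled_model(**inputs)
print(outputs.logits.shape)
```

The first call is slower because compilation happens on the fly; subsequent calls with similar input shapes reuse the compiled graph.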

News & Announcements 📣

Google released a new conversational dataset with over 550K multilingual conversations between humans and virtual assistants.

BigCode released v1.2 of The Stack, the biggest open-source code dataset, now including GitHub issues and metadata.

EleutherAI released v2.0 of GPT-NeoX, an open-source library based on Megatron and DeepSpeed for training large language models.

Microsoft shared the pre-training script for DeBERTa-v3.

DeepMind released the weights and modeling code for the Hierarchical Perceiver. Perceivers can process arbitrary modalities in any combination and can handle up to a few hundred thousand inputs.

LangChain enables you to use AI Plugins (the ones ChatGPT uses) with ANY language model.

Microsoft open-sourced BEiT-3, which achieves 98.3% top-5 accuracy on ImageNet.

Tutorials & Demos 📝

Hugging Face created a tutorial on how to train your own ControlNet with diffusers 🧨.
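As a rough sketch of where a trained ControlNet ends up, this is how a ControlNet checkpoint can be plugged into a Stable Diffusion pipeline with diffusers; the checkpoint names and the blank conditioning image are placeholders, and the actual training workflow is covered in the linked tutorial.

```python
# Hedged sketch: using a trained ControlNet with a Stable Diffusion pipeline in diffusers.
# Checkpoint names and the blank conditioning image are illustrative placeholders.
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# In practice this would be an edge map (e.g. Canny edges) extracted from a real image.
conditioning_image = Image.fromarray(np.zeros((512, 512, 3), dtype=np.uint8))

image = pipe(
    "a futuristic city at night",
    image=conditioning_image,
    num_inference_steps=20,
).images[0]
image.save("controlnet_sample.png")
```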

I created an example of how to deploy FLAN-UL2 (20B) to Amazon SageMaker for real-time inference.
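A hedged sketch of what such a real-time deployment can look like with the SageMaker Hugging Face toolkit is shown below; the container versions, instance type, and environment values are assumptions, and the actual example may use a different setup.

```python
# Hedged sketch: deploying a Hugging Face Hub model to a SageMaker real-time endpoint.
# Container versions, instance type, and env values are assumptions for illustration.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes execution inside a SageMaker environment

hub = {
    "HF_MODEL_ID": "google/flan-ul2",   # model to load from the Hugging Face Hub
    "HF_TASK": "text2text-generation",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",  # assumed container versions, check the SageMaker docs
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # a 20B model needs a large multi-GPU instance
)

print(predictor.predict({"inputs": "Summarize: PyTorch 2.0 was released in March 2023."}))
```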

We also looked into how to efficiently train large language models with LoRA and Hugging Face PEFT, where we managed to train FLAN-T5 XXL (11B) on a single GPU.
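The core idea is to wrap the base model with a LoRA configuration from PEFT so that only a small set of adapter weights is trained. Below is a minimal sketch; the hyperparameters and the smaller stand-in model are illustrative assumptions, not the exact setup from the blog post.

```python
# Hedged sketch: attaching LoRA adapters to a seq2seq model with Hugging Face PEFT.
# Hyperparameters and the smaller stand-in model are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# A small stand-in; the blog post fine-tunes FLAN-T5 XXL (11B).
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # rank of the low-rank update matrices
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5 attention query/value projections
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of parameters is trainable
```

Because only the adapter weights receive gradients, optimizer state and gradient memory shrink dramatically, which is what makes single-GPU fine-tuning of an 11B model feasible (typically combined with loading the frozen base weights in 8-bit).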

Reads & Papers 📚

The Langchain team wrote a blog post about the difficulty of evaluating large language models.

Fine-tuned Language Models are Continual Learners: a team of researchers looked into how to extend the knowledge and abilities of models without forgetting previously learned skills.

CoLT5: Faster Long-Range Transformers with Conditional Computation introduces a new method that allows Transformer models to process up to 64,000 tokens, outperforming previous long-context transformers like LongT5.

Reflexion: an autonomous agent with dynamic memory and self-reflection investigates whether and how models can learn from previous mistakes through self-reflection.


I hope you enjoyed this newsletter. 🤗 If you have any questions or are interested in collaborating, feel free to contact me on Twitter or LinkedIn.

See you next week 👋🏻👋🏻