PyTorch 2.0 has officially been released: it is faster, more Pythonic, and as dynamic as ever. If you want to learn how to use PyTorch 2.0 with Hugging Face Transformers, check out my updated blog post “how to get started with PyTorch 2.0 and Hugging Face Transformers”.
News & Announcements 📣
Google released a new conversational dataset with over 550K multilingual conversations between humans and virtual assistants.
BigCode released v1.2 of The Stack, the biggest open-source code dataset, now including GitHub issues and metadata.
EleutherAI released v2.0 of GPT-NeoX, an open-source Megatron-DeepSpeed-based library used to train large language models.
Microsoft shared the pre-training script for DeBERTa-v3.
LangChain enables you to use AI plugins (the ones ChatGPT uses) with ANY language model.
Microsoft open-sourced BEiT-3, which achieves 98.3% top-5 accuracy on ImageNet.
Tutorials & Demos 📝
Hugging Face created a tutorial on how to train your ControlNet with diffusers 🧨.
I created an example of how to deploy FLAN-UL2 (20B) to Amazon SageMaker for real-time inference.
We also looked into how to efficiently train large language models with LoRA and Hugging Face PEFT, where we managed to train FLAN-T5 XXL (11B) on a single GPU.
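The reason LoRA makes this possible is arithmetic: instead of updating a full d×k weight matrix, it freezes the weight and trains a low-rank pair B (d×r) and A (r×k). A back-of-the-envelope sketch in plain Python (the dimensions below are illustrative, not the exact FLAN-T5 XXL shapes):

```python
# LoRA freezes the original weight W (d x k) and trains only two small
# matrices: B (d x r) and A (r x k), with rank r << min(d, k).
d, k = 4096, 4096   # illustrative layer size
r = 8               # illustrative LoRA rank

full_params = d * k            # trainable params when fine-tuning W directly
lora_params = d * r + r * k    # trainable params with LoRA

print(full_params)                          # 16777216
print(lora_params)                          # 65536
print(f"{lora_params / full_params:.2%}")   # 0.39%
```

Training well under 1% of the parameters per layer is what shrinks optimizer state and gradients enough to fit an 11B model on a single GPU.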
Reads & Papers 📚
The Langchain team wrote a blog post about the difficulty of evaluating large language models.
Fine-tuned Language Models are Continual Learners. A team of researchers looked into how to extend the knowledge and abilities of models without forgetting previous skills.
CoLT5: Faster Long-Range Transformers with Conditional Computation implements a new method that allows Transformer models to process up to 64,000 tokens, outperforming previous long-context Transformers like LongT5.
Reflexion: an autonomous agent with dynamic memory and self-reflection investigates whether and how models can learn from previous mistakes through self-reflection.
See you next week 👋🏻👋🏻