Issue 11: Unveiling SimPO, Chameleon Insights, and AWS Inferentia2 Updates - May 26, 2024

Disclaimer: This content is generated by AI from my social media posts. Make sure to follow me there for the originals.

This week's highlights include the release of SimPO, Meta's new Chameleon model, and AWS Inferentia2 support for Hugging Face models.


SimPO: A New Approach to RLHF
SimPO (Simple Preference Optimization) has been released as a simpler, more stable method for offline preference tuning, outperforming both DPO and ORPO. Like DPO, SimPO is reward-model-free, but it uses the length-normalized average log probability of a sequence as the implicit reward, cutting training time by roughly 20% and GPU memory usage by roughly 10%. The team at Princeton University has released the code and all model checkpoints from the ablations on GitHub.
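The objective is compact enough to sketch in a few lines. Below is a minimal, framework-free illustration of the SimPO loss, where the implicit reward of a response is its average per-token log probability scaled by a factor beta, and gamma is a target reward margin; the per-token log-probabilities and hyperparameter values are invented for illustration.

```python
import math

def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (minimal sketch).

    The implicit reward of a response is its average (length-normalized)
    per-token log probability under the policy, scaled by beta; gamma is
    the margin by which the chosen reward should beat the rejected one.
    """
    reward_chosen = beta * sum(chosen_logps) / len(chosen_logps)
    reward_rejected = beta * sum(rejected_logps) / len(rejected_logps)
    margin = reward_chosen - reward_rejected - gamma
    # -log(sigmoid(margin)), computed in a numerically stable way
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Toy per-token log-probs: the chosen response is clearly preferred,
# so the loss is small.
loss = simpo_loss([-0.1, -0.2, -0.15], [-0.9, -1.1], beta=2.0, gamma=0.5)
```

Because the reward is length-normalized, longer responses get no free advantage, which is part of what makes SimPO stable without a reference model.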

Meta Llama 3 on AWS Inferentia2
Deploy Meta's Llama 3 70B on AWS Inferentia2 using Hugging Face Optimum. With an easy setup, you can run Llama 3 70B on an inf2.48xlarge instance with an interactive Gradio demo, with throughput and latency benchmarked using llmperf. This deployment offers a cost-effective alternative to GPUs for developers and data scientists.

NVIDIA A100 & H100 GPUs for Hugging Face Inference
New NVIDIA A100 & H100 GPUs are now available for Hugging Face inference endpoints on Google Cloud. Deploy models swiftly and efficiently with access to powerful GPUs in the us-east4 region. This expansion promises enhanced performance and scalability for AI applications.

Mistral 7B: New Base and Instruct Models
Mistral AI releases its new 7B base and instruct models on Hugging Face. These models feature an extended vocabulary and function-calling support, licensed under Apache 2.0. Although no benchmarks were shared, these models promise to deliver powerful AI capabilities for various applications.
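As a sketch of what function-calling support looks like from the caller's side, here is a tool definition in the JSON-schema style that tool-aware chat templates generally accept; the `get_weather` function and all of its fields are hypothetical, not taken from Mistral's release notes.

```python
# Hypothetical "get_weather" tool in JSON-schema style; every name and
# field below is illustrative, not from Mistral's release.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# A tool-aware model is given such schemas alongside the conversation
# and, when appropriate, emits a structured call such as:
example_call = {"name": "get_weather", "arguments": {"city": "Paris"}}
```

The application then executes the named function with the generated arguments and feeds the result back into the conversation.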


Chameleon: Meta Llama 4's Future?
Meta introduces “Chameleon: Mixed-Modal Early-Fusion Foundation Models”, a unified approach that represents images and text as tokens in a single sequence. Chameleon-34B, trained on a mix of text and image data, outperforms Llama2-70B and competes closely with models like GPT-4V and Flamingo-80B. With improvements in multimodal understanding, Chameleon aims to be a significant player in the future of AI.
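The early-fusion idea can be illustrated with a toy example: image patches are quantized into discrete codes and spliced into the same token stream as text, with image IDs shifted into a range disjoint from the text vocabulary. All IDs and sizes below are invented for illustration; Chameleon's actual tokenizer differs.

```python
TEXT_VOCAB = 32000          # assumed text vocabulary size (illustrative)
BOI, EOI = 32000, 32001     # begin/end-of-image marker IDs (illustrative)
IMG_OFFSET = 32002          # image codes mapped above text + special IDs

def fuse(text_ids, image_codes):
    """Splice quantized image codes into the text token stream.

    Image codes are shifted into a disjoint ID range, so one flat
    sequence over one shared vocabulary carries both modalities and a
    single transformer can model (and generate) either.
    """
    return [BOI] + [IMG_OFFSET + c for c in image_codes] + [EOI] + text_ids

# Two image codes followed by three text tokens, fused into one sequence
seq = fuse([5, 17, 42], [0, 3])
```

Because both modalities live in one token space, generation can freely interleave text and image output, which is what "mixed-modal" refers to.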

Uni-MoE: A Unified Multimodal Language Model
Uni-MoE proposes a unified Multimodal Large Language Model (MLLM) architecture handling audio, speech, image, text, and video. Utilizing a three-phase training strategy, Uni-MoE achieves efficiency and high performance across multiple tasks. It matches or outperforms existing MLLMs on various benchmarks, marking a significant advancement in multimodal AI.
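The Mixture-of-Experts mechanism at the heart of Uni-MoE can be sketched generically: a router scores the experts, only the top-k are evaluated, and their outputs are combined with softmax weights. This is a standard sparse-MoE sketch with toy scalar experts, not Uni-MoE's actual router.

```python
import math

def top_k_gate(logits, k=2):
    """Softmax gating restricted to the top-k scoring experts."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}  # weights over chosen experts

def moe_forward(x, experts, router_logits, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    gate = top_k_gate(router_logits, k)
    return sum(w * experts[i](x) for i, w in gate.items())

# Four toy scalar "experts"; the router picks and mixes the top two
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3, lambda x: x * x]
y = moe_forward(3.0, experts, router_logits=[0.1, 2.0, -1.0, 1.0], k=2)
```

Only k experts run per input, which is how MoE models keep inference cost far below that of a dense model with the same parameter count.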

Microsoft Phi-3 Models Released
Microsoft introduces Phi-3 small and medium models under the MIT license, claiming that they outperform Meta's Llama 3 and Mistral. The models score highly on benchmarks like MMLU and AGIEval. Although no base models were released, Phi-3 promises significant advancements in AI capabilities.


AWS Inferentia2: A New Era for Model Deployment
AWS Inferentia2 support is now available for over 100,000 open models on Hugging Face via Amazon SageMaker. With cost-effective Inferentia2 instances, you can deploy Meta Llama 3 models with just one click. This development offers a robust alternative to GPUs, bringing efficiency and scalability to AI model deployment.

Microsoft and Hugging Face Partnership Expansion
Microsoft and Hugging Face are expanding their partnership to make open models and open-source AI more accessible. This expansion includes new AMD GPU support, optimized containers for Azure, and local inference through WebGPU. These advancements aim to streamline AI development and deployment across different platforms.

Dell Enterprise Hub for On-Premise AI
Together with Dell, Hugging Face launches the Dell Enterprise Hub, providing an enterprise experience for training and deploying open models on-premise. This hub reduces setup time and enhances data protection, supporting various Dell platforms with top open models like Meta Llama 3 and Mistral's Mixtral.

I hope you enjoyed this newsletter. 🤗 If you have any questions or are interested in collaborating, feel free to contact me on Twitter or LinkedIn.

See you next week 👋🏻👋🏻