<?xml version="1.0" encoding="UTF-8" ?>
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>philschmid.de - RSS feed</title>
        <link>https://www.philschmid.de</link>
        <description>RSS feed for my blog www.philschmid.de</description>
  <item>
  <title>How to correctly use MCP servers with your AI Agents</title>
  <link>https://www.philschmid.de/use-mcp-servers</link>
  <guid>https://www.philschmid.de/use-mcp-servers</guid>
  <description>MCP servers are not dead. Blindly enabling them bloats your context, which leads to higher cost and worse performance. Here are two proven patterns on how to correctly use MCP servers and avoid the bloat.</description>
  <pubDate>Mon, 27 Apr 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>8 Tips for Writing Agent Skills</title>
  <link>https://www.philschmid.de/agent-skills-tips</link>
  <guid>https://www.philschmid.de/agent-skills-tips</guid>
  <description>8 Tips for Writing Agent Skills. Know What a Skill Is, Nail the Description, Write Instructions, Keep It Lean, Set the Right Level of Freedom, Don't Skip Negative Cases, Test It Before You Ship It, Know When to Retire a Skill.</description>
  <pubDate>Mon, 13 Apr 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to use Gemma 4 with the Gemini API and Google AI Studio</title>
  <link>https://www.philschmid.de/gemma-4-gemini-api</link>
  <guid>https://www.philschmid.de/gemma-4-gemini-api</guid>
  <description>Learn how to use Gemma 4 with the Gemini API and Google AI Studio.</description>
  <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How Kimi, Cursor, and Chroma Train Agentic Models with RL</title>
  <link>https://www.philschmid.de/kimi-composer-context</link>
  <guid>https://www.philschmid.de/kimi-composer-context</guid>
  <description>Learn the unique ways how Kimi, Cursor, and Chroma train agentic models with RL.</description>
  <pubDate>Sat, 28 Mar 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Combine Built-in Tools and Function Calling in the Gemini Interactions API</title>
  <link>https://www.philschmid.de/tool-combo</link>
  <guid>https://www.philschmid.de/tool-combo</guid>
  <description>Learn how to combine built-in tools and function calling in the Gemini Interactions API.</description>
  <pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Developer Guide: Nano Banana 2 with the Gemini Interactions API</title>
  <link>https://www.philschmid.de/nano-banana-2-interactions-api</link>
  <guid>https://www.philschmid.de/nano-banana-2-interactions-api</guid>
  <description>Learn how to use the Gemini Interactions API to build a personalized Japan travel brochure with Nano Banana 2.</description>
  <pubDate>Mon, 16 Mar 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How Autoresearch will change Small Language Models adoption</title>
  <link>https://www.philschmid.de/autoresearch</link>
  <guid>https://www.philschmid.de/autoresearch</guid>
  <description>Autoresearch lets an AI agent run hundreds of model training experiments overnight. Learn how it works, early results from Karpathy and Shopify, and how to apply it.</description>
  <pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Practical Guide to Evaluating and Testing Agent Skills</title>
  <link>https://www.philschmid.de/testing-skills</link>
  <guid>https://www.philschmid.de/testing-skills</guid>
  <description>Learn how to systematically test and improve agent skills using deterministic checks and a real-world Gemini API example.</description>
  <pubDate>Wed, 04 Mar 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Writing a Good AGENTS.md</title>
  <link>https://www.philschmid.de/writing-good-agents</link>
  <guid>https://www.philschmid.de/writing-good-agents</guid>
  <description>Learn what to include, what to skip, and how to structure your AGENTS.md for best results.</description>
  <pubDate>Tue, 24 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Agents: Inner Loop vs Outer Loop</title>
  <link>https://www.philschmid.de/inner-loop-vs-outer-loop</link>
  <guid>https://www.philschmid.de/inner-loop-vs-outer-loop</guid>
  <description>Most agent frameworks share the same hardcoded tool loop; what differs is how the model uses it. This post explains the inner loop—an agent verifying its own work within a task—and the outer loop—an agent carrying lessons across tasks via persistent memory, skills, and rules files—and why both are needed for agents that feel reliable and get smarter over time.</description>
  <pubDate>Fri, 20 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Can We Close the Loop in 2026?</title>
  <link>https://www.philschmid.de/closing-the-loop</link>
  <guid>https://www.philschmid.de/closing-the-loop</guid>
  <description>What makes some AI agents feel like collaborators while others need constant babysitting? Two capabilities matter: self-awareness — does the agent understand what it is and how to use its tools — and closing the loop — can it verify its own work before responding. This post breaks down where agents stand today, how production systems like Spotify scaffold verification, and what needs to improve for agents to earn real autonomy in 2026.</description>
  <pubDate>Tue, 17 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multimodal Function Calling with Gemini 3 and Interactions API</title>
  <link>https://www.philschmid.de/interactions-multimodal-fc</link>
  <guid>https://www.philschmid.de/interactions-multimodal-fc</guid>
  <description>Multimodal function calling allows tools to return images the model can process natively, similar to how you pass images in prompts. Instead of describing what's in a file, your tool returns the actual image and Gemini 3 processes it natively.</description>
  <pubDate>Fri, 13 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Getting Started with Gemini Deep Research API</title>
  <link>https://www.philschmid.de/gemini-deep-research-getting-started</link>
  <guid>https://www.philschmid.de/gemini-deep-research-getting-started</guid>
  <description>Learn how to use the new Gemini Deep Research agent via the Interactions API to perform complex research tasks, generate images based on the findings, and translate the results.</description>
  <pubDate>Mon, 02 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The Agent Client Protocol Overview</title>
  <link>https://www.philschmid.de/acp-overview</link>
  <guid>https://www.philschmid.de/acp-overview</guid>
  <description>The Agent Client Protocol (ACP) is an open standard that abstracts the events and outputs of AI agents and provides a common interface for editors to interact with them. Similar to MCP, but for agent-to-client (UI) communication.</description>
  <pubDate>Sun, 01 Feb 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Gemini Interactions API Quick Start</title>
  <link>https://www.philschmid.de/interactions-api-quickstart</link>
  <guid>https://www.philschmid.de/interactions-api-quickstart</guid>
  <description>The Interactions API is a unified interface for building with Gemini models and agents. It simplifies the development of agentic applications by handling server-side state management, tool orchestration, and long-running tasks.</description>
  <pubDate>Thu, 22 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>MCP is Not the Problem, It's your Server: Best Practices for Building MCP Servers</title>
  <link>https://www.philschmid.de/mcp-best-practices</link>
  <guid>https://www.philschmid.de/mcp-best-practices</guid>
  <description>The Model Context Protocol (MCP) exploded roughly a year ago, and everyone rushed to build MCP servers. The hype was real. Yet most MCP servers disappoint, and most developers blame the protocol, which now feels like it's dying on social media.</description>
  <pubDate>Wed, 21 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Transparent PNG Stickers with Nano Banana Pro and Gemini interactions API</title>
  <link>https://www.philschmid.de/generate-stickers</link>
  <guid>https://www.philschmid.de/generate-stickers</guid>
  <description>Learn how to generate transparent PNG stickers using Nano Banana Pro and the Gemini Interactions API, featuring chromakey green background removal with HSV detection.</description>
  <pubDate>Mon, 19 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Building Agents with the Gemini Interactions API</title>
  <link>https://www.philschmid.de/building-agents-interactions-api</link>
  <guid>https://www.philschmid.de/building-agents-interactions-api</guid>
  <description>Learn how to build AI agents using the new Gemini Interactions API, featuring server-side state management and simplified tool orchestration.</description>
  <pubDate>Wed, 14 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Introducing MCP CLI: A way to call MCP Servers Efficiently</title>
  <link>https://www.philschmid.de/mcp-cli</link>
  <guid>https://www.philschmid.de/mcp-cli</guid>
  <description>mcp-cli is a lightweight CLI that allows dynamic discovery of MCP servers, reducing token consumption while making tool interactions more efficient for AI coding agents.</description>
  <pubDate>Fri, 09 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The importance of Agent Harness in 2026</title>
  <link>https://www.philschmid.de/agent-harness-2026</link>
  <guid>https://www.philschmid.de/agent-harness-2026</guid>
  <description>In 2026, Agent Harnesses will become essential for building reliable AI systems that can handle complex, multi-day tasks.</description>
  <pubDate>Mon, 05 Jan 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>8 Predictions for 2026. What comes next in AI?</title>
  <link>https://www.philschmid.de/2026-predictions</link>
  <guid>https://www.philschmid.de/2026-predictions</guid>
  <description>8 Predictions for 2026, exploring the future of AI, personal agents, smart homes, and more.</description>
  <pubDate>Wed, 31 Dec 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Context Engineering for AI Agents: Part 2</title>
  <link>https://www.philschmid.de/context-engineering-part-2</link>
  <guid>https://www.philschmid.de/context-engineering-part-2</guid>
  <description>Building on the foundations of Context Engineering, this post explores advanced strategies to manage context rot, multi-agent coordination, and action space optimization for AI agents.</description>
  <pubDate>Thu, 04 Dec 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Why (Senior) Engineers Struggle to Build AI Agents</title>
  <link>https://www.philschmid.de/why-engineers-struggle-building-agents</link>
  <guid>https://www.philschmid.de/why-engineers-struggle-building-agents</guid>
  <description>Traditional software engineering is deterministic, while AI agents operate probabilistically. This fundamental difference creates challenges for engineers accustomed to strict interfaces and predictable outcomes.</description>
  <pubDate>Wed, 26 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Practical Guide on how to build an Agent from scratch with Gemini 3</title>
  <link>https://www.philschmid.de/building-agents</link>
  <guid>https://www.philschmid.de/building-agents</guid>
  <description>A step-by-step practical guide on building AI agents using Gemini 3 Pro, covering tool integration, context management, and best practices for creating effective and reliable agents.</description>
  <pubDate>Fri, 21 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Gemini 3 Prompting: Best Practices for General Usage</title>
  <link>https://www.philschmid.de/gemini-3-prompt-practices</link>
  <guid>https://www.philschmid.de/gemini-3-prompt-practices</guid>
  <description>A comprehensive guide on best practices for prompting Gemini 3, focusing on clarity, structure, reasoning, and agentic tool use to maximize model performance across various domains.</description>
  <pubDate>Wed, 19 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Gemini API File Search: A Web Developer Tutorial</title>
  <link>https://www.philschmid.de/gemini-file-search-javascript</link>
  <guid>https://www.philschmid.de/gemini-file-search-javascript</guid>
  <description>Learn how to use the Gemini API File Search tool with JavaScript/TypeScript to build a Retrieval-Augmented Generation (RAG) system.</description>
  <pubDate>Fri, 07 Nov 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Build your first AI Agent with Gemini, n8n and Google Cloud Run</title>
  <link>https://www.philschmid.de/n8n-cloud-run-gemini</link>
  <guid>https://www.philschmid.de/n8n-cloud-run-gemini</guid>
  <description>Learn how to deploy n8n on Google Cloud Run with PostgreSQL and create an AI Agent using Google Gemini 2.5.</description>
  <pubDate>Thu, 30 Oct 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>AI Agent Benchmark Compendium</title>
  <link>https://www.philschmid.de/benchmark-compedium</link>
  <guid>https://www.philschmid.de/benchmark-compedium</guid>
  <description>An extensive compendium of over 50 benchmarks for evaluating AI agents, categorized into Function Calling and Tool Use, General Assistant and Reasoning, Coding and Software Engineering, and Computer Interaction.</description>
  <pubDate>Wed, 15 Oct 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Agents 2.0: From Shallow Loops to Deep Agents</title>
  <link>https://www.philschmid.de/agents-2.0-deep-agents</link>
  <guid>https://www.philschmid.de/agents-2.0-deep-agents</guid>
  <description>An overview of the architectural shift from Shallow Agents (Agent 1.0) to Deep Agents (Agent 2.0) and how to build complex AI agents that can handle multi-step tasks over extended periods.</description>
  <pubDate>Sun, 12 Oct 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The Rise of Subagents</title>
  <link>https://www.philschmid.de/the-rise-of-subagents</link>
  <guid>https://www.philschmid.de/the-rise-of-subagents</guid>
  <description>Subagents are on the rise in the AI community: we are seeing more and more use of subagents to reliably handle specific user goals.</description>
  <pubDate>Mon, 15 Sep 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The 10 Steps for product AI generation with Gemini 2.5 Flash</title>
  <link>https://www.philschmid.de/gemini-image-generation-product</link>
  <guid>https://www.philschmid.de/gemini-image-generation-product</guid>
  <description>Learn how to use Gemini 2.5 Flash for product image generation.</description>
  <pubDate>Wed, 27 Aug 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Memory in Agents: Make LLMs Remember</title>
  <link>https://www.philschmid.de/memory-in-agents</link>
  <guid>https://www.philschmid.de/memory-in-agents</guid>
  <description>Learn how to engineer long-term memory into stateless AI agents to overcome their biggest limitation and unlock true personalization.</description>
  <pubDate>Mon, 04 Aug 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Google Gemini CLI Cheatsheet</title>
  <link>https://www.philschmid.de/gemini-cli-cheatsheet</link>
  <guid>https://www.philschmid.de/gemini-cli-cheatsheet</guid>
  <description>A comprehensive cheatsheet on using Google's Gemini CLI, covering installation, authentication, configuration, and core commands.</description>
  <pubDate>Thu, 24 Jul 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Code Sandbox MCP: A Simple Code Interpreter for Your AI Agents</title>
  <link>https://www.philschmid.de/code-sandbox-mcp</link>
  <guid>https://www.philschmid.de/code-sandbox-mcp</guid>
  <description>Code Sandbox MCP is a simple, self-hosted code interpreter for your AI agents. It allows you to execute code snippets in containerized environments.</description>
  <pubDate>Tue, 22 Jul 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Integrating Long-Term Memory with Gemini 2.5</title>
  <link>https://www.philschmid.de/gemini-with-memory</link>
  <guid>https://www.philschmid.de/gemini-with-memory</guid>
  <description>This guide shows you how to add long-term memory to your Gemini 2.5 chatbot using the Gemini API and Mem0.</description>
  <pubDate>Thu, 03 Jul 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>The New Skill in AI is Not Prompting, It's Context Engineering</title>
  <link>https://www.philschmid.de/context-engineering</link>
  <guid>https://www.philschmid.de/context-engineering</guid>
  <description>Context Engineering is the new skill in AI. It is about providing the right information and tools, in the right format, at the right time.</description>
  <pubDate>Mon, 30 Jun 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Single vs Multi-Agent System?</title>
  <link>https://www.philschmid.de/single-vs-multi-agents</link>
  <guid>https://www.philschmid.de/single-vs-multi-agents</guid>
  <description>Single vs. multi-agent? The real secret to building AI agents is 'read vs. write'. Learn which to use for your task and build reliable systems.</description>
  <pubDate>Fri, 20 Jun 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Zero to One: Learning Agentic Patterns</title>
  <link>https://www.philschmid.de/agentic-pattern</link>
  <guid>https://www.philschmid.de/agentic-pattern</guid>
  <description>Learn common agentic design patterns and workflows for building robust, scalable AI applications, understanding when to use each.</description>
  <pubDate>Mon, 05 May 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Google Gemini LangChain Cheatsheet</title>
  <link>https://www.philschmid.de/gemini-langchain-cheatsheet</link>
  <guid>https://www.philschmid.de/gemini-langchain-cheatsheet</guid>
  <description>A comprehensive cheatsheet on using Google's Gemini within LangChain, covering chat functionalities with multimodal inputs, tool usage, structured data generation, and text embedding techniques.</description>
  <pubDate>Mon, 28 Apr 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>OpenAI Codex CLI, how does it work?</title>
  <link>https://www.philschmid.de/openai-codex-cli</link>
  <guid>https://www.philschmid.de/openai-codex-cli</guid>
  <description>I used Gemini 2.5 Pro to better understand the OpenAI Codex CLI, a tool that allows you to interact with an AI model directly in your terminal to perform coding tasks.</description>
  <pubDate>Thu, 17 Apr 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Model Context Protocol (MCP) an overview</title>
  <link>https://www.philschmid.de/mcp-introduction</link>
  <guid>https://www.philschmid.de/mcp-introduction</guid>
  <description>An overview of the Model Context Protocol (MCP): how it works, what MCP servers and clients are, and how to use it.</description>
  <pubDate>Thu, 03 Apr 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>ReAct agent from scratch with Gemini 2.5 and LangGraph</title>
  <link>https://www.philschmid.de/langgraph-gemini-2-5-react-agent</link>
  <guid>https://www.philschmid.de/langgraph-gemini-2-5-react-agent</guid>
  <description>Build a ReAct agent from scratch with Gemini 2.5 and LangGraph.</description>
  <pubDate>Mon, 31 Mar 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Pass@k vs Pass^k: Understanding Agent Reliability</title>
  <link>https://www.philschmid.de/agents-pass-at-k-pass-power-k</link>
  <guid>https://www.philschmid.de/agents-pass-at-k-pass-power-k</guid>
  <description>Production agents need to be reliable. Why pass^k is a better metric than pass@k.</description>
  <pubDate>Mon, 24 Mar 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Google Gemma 3 Function Calling Example</title>
  <link>https://www.philschmid.de/gemma-function-calling</link>
  <guid>https://www.philschmid.de/gemma-function-calling</guid>
  <description>Learn how to use function calling with Google DeepMind Gemma 3 27B It.</description>
  <pubDate>Fri, 14 Mar 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Function Calling Guide: Google DeepMind Gemini 2.0 Flash</title>
  <link>https://www.philschmid.de/gemini-function-calling</link>
  <guid>https://www.philschmid.de/gemini-function-calling</guid>
  <description>Learn how to use function calling with Google DeepMind Gemini 2.0 Flash.</description>
  <pubDate>Wed, 05 Mar 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>From PDFs to Insights: Structured Outputs from PDFs with Gemini 2.0</title>
  <link>https://www.philschmid.de/gemini-pdf-to-data</link>
  <guid>https://www.philschmid.de/gemini-pdf-to-data</guid>
  <description>Learn how to extract structured data from PDFs with Gemini 2.0 and Pydantic.</description>
  <pubDate>Fri, 07 Feb 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Mini-R1: Reproduce the Deepseek R1 "aha moment", an RL tutorial</title>
  <link>https://www.philschmid.de/mini-deepseek-r1</link>
  <guid>https://www.philschmid.de/mini-deepseek-r1</guid>
  <description>Reproduce the Deepseek R1 "aha moment" and train an open model using reinforcement learning, trying to teach it self-verification and search abilities on its own to solve the Countdown Game.</description>
  <pubDate>Thu, 30 Jan 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to align open LLMs in 2025 with DPO and synthetic data</title>
  <link>https://www.philschmid.de/rl-with-llms-in-2025-dpo</link>
  <guid>https://www.philschmid.de/rl-with-llms-in-2025-dpo</guid>
  <description>Learn how to align LLMs using Hugging Face TRL and RLHF through Direct Preference Optimization (DPO) and on-policy synthetic data.</description>
  <pubDate>Thu, 23 Jan 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Bite: How Deepseek R1 was trained</title>
  <link>https://www.philschmid.de/deepseek-r1</link>
  <guid>https://www.philschmid.de/deepseek-r1</guid>
  <description>A 5-minute read on how Deepseek R1 was trained using Group Relative Policy Optimization (GRPO) and an RL-focused multi-stage training approach.</description>
  <pubDate>Fri, 17 Jan 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to use Anthropic MCP Server with open LLMs, OpenAI or Google Gemini</title>
  <link>https://www.philschmid.de/mcp-example-llama</link>
  <guid>https://www.philschmid.de/mcp-example-llama</guid>
  <description>How to use Anthropic MCP Server with open LLMs, OpenAI or Google Gemini</description>
  <pubDate>Fri, 17 Jan 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune classifier with ModernBERT in 2025</title>
  <link>https://www.philschmid.de/fine-tune-modern-bert-in-2025</link>
  <guid>https://www.philschmid.de/fine-tune-modern-bert-in-2025</guid>
  <description>Modern updated guide on how to fine-tune BERT models for classification tasks in 2025.</description>
  <pubDate>Wed, 25 Dec 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to fine-tune open LLMs in 2025 with Hugging Face</title>
  <link>https://www.philschmid.de/fine-tune-llms-in-2025</link>
  <guid>https://www.philschmid.de/fine-tune-llms-in-2025</guid>
  <description>The only guide you need to fine-tune open LLMs in 2025, including QLoRA, Spectrum, Flash Attention, Liger Kernels and more.</description>
  <pubDate>Fri, 20 Dec 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy QwQ-32B-Preview the best open Reasoning Model on AWS with Hugging Face</title>
  <link>https://www.philschmid.de/sagemaker-deploy-qwq</link>
  <guid>https://www.philschmid.de/sagemaker-deploy-qwq</guid>
  <description>Qwen's QwQ-32B-Preview is the best open model for mathematical and programming reasoning, directly competing with OpenAI o1.</description>
  <pubDate>Tue, 03 Dec 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 3.2 Vision on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-llama32-vision</link>
  <guid>https://www.philschmid.de/sagemaker-llama32-vision</guid>
  <description>Learn how to deploy Llama 3.2 Vision on Amazon SageMaker and run inference with it.</description>
  <pubDate>Thu, 17 Oct 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to Fine-Tune Multimodal Models or VLMs with Hugging Face TRL</title>
  <link>https://www.philschmid.de/fine-tune-multimodal-llms-with-trl</link>
  <guid>https://www.philschmid.de/fine-tune-multimodal-llms-with-trl</guid>
  <description>Learn how to fine-tune multimodal models like Llama 3.2 Vision or Qwen 2 VL to create custom image-to-text generation models.</description>
  <pubDate>Mon, 30 Sep 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Evaluate open LLMs with Vertex AI and Gemini</title>
  <link>https://www.philschmid.de/evaluate-llm-with-gemini</link>
  <guid>https://www.philschmid.de/evaluate-llm-with-gemini</guid>
  <description>Evaluate Llama 3.1 8B on Vertex AI with Gemini 1.5 Pro as LLM as a Judge using the Gen AI Evaluation Service.</description>
  <pubDate>Tue, 24 Sep 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Evaluate LLMs using Evaluation Harness and Hugging Face TGI/vLLM</title>
  <link>https://www.philschmid.de/evaluate-llms-with-lm-eval-and-tgi-vllm</link>
  <guid>https://www.philschmid.de/evaluate-llms-with-lm-eval-and-tgi-vllm</guid>
  <description>Evaluate Llama 3.1 8B Instruct on IFEval and GSM8K benchmarks with Chain of Thought reasoning using Evaluation Harness and Hugging Face TGI/vLLM.</description>
  <pubDate>Thu, 19 Sep 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy open LLMs with Terraform and Amazon SageMaker</title>
  <link>https://www.philschmid.de/terraform-llm-sagemaker</link>
  <guid>https://www.philschmid.de/terraform-llm-sagemaker</guid>
  <description>Learn how to deploy open large language models (LLMs) like Llama 3.1 8B Instruct to Amazon SageMaker using Terraform Infrastructure as Code (IaC).</description>
  <pubDate>Mon, 05 Aug 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>LLM Evaluation doesn't need to be complicated</title>
  <link>https://www.philschmid.de/llm-evaluation</link>
  <guid>https://www.philschmid.de/llm-evaluation</guid>
  <description>LLM Evaluation doesn't need to be complicated. You don't need complex pipelines, databases or infrastructure components to get started building an effective evaluation pipeline.</description>
  <pubDate>Thu, 11 Jul 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Evaluating Open LLMs with MixEval: The Closest Benchmark to LMSYS Chatbot Arena</title>
  <link>https://www.philschmid.de/evaluate-llm-mixeval</link>
  <guid>https://www.philschmid.de/evaluate-llm-mixeval</guid>
  <description>Evaluate open LLMs with MixEval, the closest benchmark to LMSYS Chatbot Arena for a fraction of the cost and time.</description>
  <pubDate>Fri, 28 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Train and Deploy open Embedding Models on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-train-deploy-embedding-models</link>
  <guid>https://www.philschmid.de/sagemaker-train-deploy-embedding-models</guid>
  <description>Learn how to fine-tune and deploy a custom embedding model on Amazon SageMaker using the new Hugging Face Embedding Container.</description>
  <pubDate>Tue, 25 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Mixtral 8x7B on AWS Inferentia2 with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/inferentia2-mixtral-8x7b</link>
  <guid>https://www.philschmid.de/inferentia2-mixtral-8x7b</guid>
  <description>Deploy Mixtral 8x7B to AWS Inferentia2 with Hugging Face Optimum on Amazon SageMaker and benchmark it with llmperf.</description>
  <pubDate>Tue, 18 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune Llama 3 with PyTorch FSDP and Q-Lora on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-train-deploy-llama3</link>
  <guid>https://www.philschmid.de/sagemaker-train-deploy-llama3</guid>
  <description>Train and deploy Llama 3 on Amazon SageMaker using PyTorch FSDP and Q-Lora</description>
  <pubDate>Tue, 11 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune Embedding models for Retrieval Augmented Generation (RAG)</title>
  <link>https://www.philschmid.de/fine-tune-embedding-model-for-rag</link>
  <guid>https://www.philschmid.de/fine-tune-embedding-model-for-rag</guid>
  <description>Customizing embedding models for domain-specific data can significantly boost the retrieval performance of your RAG Application.</description>
  <pubDate>Tue, 04 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Understanding the Cost of Generative AI Models in Production</title>
  <link>https://www.philschmid.de/cost-generative-ai</link>
  <guid>https://www.philschmid.de/cost-generative-ai</guid>
  <description>Discussion of the cost of deploying Generative AI models is often shallow: many people are fixated on raw compute pricing. But in reality, the cost is much more complex.</description>
  <pubDate>Mon, 27 May 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 3 70B on AWS Inferentia2 with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/inferentia2-llama3-70b</link>
  <guid>https://www.philschmid.de/inferentia2-llama3-70b</guid>
  <description>Learn how to deploy Llama 3 70B on AWS Inferentia2 with Hugging Face Optimum on Amazon SageMaker.</description>
  <pubDate>Thu, 23 May 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy open LLMs with vLLM on Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/vllm-inference-endpoints</link>
  <guid>https://www.philschmid.de/vllm-inference-endpoints</guid>
  <description>In this blog post, we will show you how to deploy open LLMs with vLLM on Hugging Face Inference Endpoints.</description>
  <pubDate>Thu, 02 May 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora</title>
  <link>https://www.philschmid.de/fsdp-qlora-llama3</link>
  <guid>https://www.philschmid.de/fsdp-qlora-llama3</guid>
  <description>Learn how to fine-tune Llama 3 70b with PyTorch FSDP and Q-Lora using Hugging Face TRL, Transformers, PEFT and Datasets.</description>
  <pubDate>Mon, 22 Apr 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 3 on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-llama3</link>
  <guid>https://www.philschmid.de/sagemaker-llama3</guid>
  <description>In this blog post you will learn how to deploy Llama 3 70B to Amazon SageMaker.</description>
  <pubDate>Thu, 18 Apr 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-awq-medusa</link>
  <guid>https://www.philschmid.de/sagemaker-awq-medusa</guid>
  <description>In this blog post you will learn how to accelerate Mixtral using Speculative Decoding (Medusa) and Quantization (AWQ).</description>
  <pubDate>Tue, 02 Apr 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/inferentia2-llama-70b-inference</link>
  <guid>https://www.philschmid.de/inferentia2-llama-70b-inference</guid>
  <description>In this blog post you will learn how to deploy Meta Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum on Amazon SageMaker.</description>
  <pubDate>Tue, 26 Mar 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-Tune and Evaluate LLMs in 2024 with Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-train-evalaute-llms-2024</link>
  <guid>https://www.philschmid.de/sagemaker-train-evalaute-llms-2024</guid>
  <description>In this blog post you will learn how to fine-tune open LLMs from Hugging Face using Amazon SageMaker.</description>
  <pubDate>Tue, 12 Mar 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-evaluate-llm-lighteval</link>
  <guid>https://www.philschmid.de/sagemaker-evaluate-llm-lighteval</guid>
  <description>In this blog post you will learn how to evaluate LLMs using Hugging Face lighteval on Amazon SageMaker.</description>
  <pubDate>Tue, 05 Mar 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to fine-tune Google Gemma with ChatML and Hugging Face TRL</title>
  <link>https://www.philschmid.de/fine-tune-google-gemma</link>
  <guid>https://www.philschmid.de/fine-tune-google-gemma</guid>
  <description>In this blog post you will learn how to fine tune Google Gemma using Hugging Face Transformers, Datasets and TRL.</description>
  <pubDate>Fri, 01 Mar 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>RLHF in 2024 with DPO and Hugging Face</title>
  <link>https://www.philschmid.de/dpo-align-llms-in-2024-with-trl</link>
  <guid>https://www.philschmid.de/dpo-align-llms-in-2024-with-trl</guid>
  <description>In this blog post you will learn how to align LLMs using Hugging Face TRL and RLHF through Direct Preference Optimization (DPO).</description>
  <pubDate>Tue, 23 Jan 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to Fine-Tune LLMs in 2024 with Hugging Face</title>
  <link>https://www.philschmid.de/fine-tune-llms-in-2024-with-trl</link>
  <guid>https://www.philschmid.de/fine-tune-llms-in-2024-with-trl</guid>
  <description>In this blog post you will learn how to fine-tune LLMs using Hugging Face TRL, Transformers and Datasets in 2024. We will fine-tune an LLM on a text-to-SQL dataset.</description>
  <pubDate>Tue, 23 Jan 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints</title>
  <link>https://www.philschmid.de/sagemaker-multi-replica</link>
  <guid>https://www.philschmid.de/sagemaker-multi-replica</guid>
  <description>In this blog post you will learn how to increase the throughput of Llama 13B on Amazon SageMaker using single instance multi-replica endpoints.</description>
  <pubDate>Thu, 11 Jan 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune Llama 7B on AWS Trainium</title>
  <link>https://www.philschmid.de/fine-tune-llama-7b-trainium</link>
  <guid>https://www.philschmid.de/fine-tune-llama-7b-trainium</guid>
  <description>In this blog post you will learn how to fine-tune Llama 7B on AWS Trainium using the Hugging Face Optimum Neuron library.</description>
  <pubDate>Thu, 21 Dec 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Programmatically manage 🤗 Inference Endpoints</title>
  <link>https://www.philschmid.de/inference-endpoints-iac</link>
  <guid>https://www.philschmid.de/inference-endpoints-iac</guid>
  <description>In this blog post you will learn how to use the huggingface_hub library to create, send requests to, pause, and delete Hugging Face Inference Endpoints.</description>
  <pubDate>Wed, 20 Dec 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Mixtral 8x7B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-deploy-mixtral</link>
  <guid>https://www.philschmid.de/sagemaker-deploy-mixtral</guid>
  <description>In this blog post you will learn how to deploy Mixtral 8x7B to Amazon SageMaker.</description>
  <pubDate>Tue, 12 Dec 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Embedding Models on AWS Inferentia2 with Amazon SageMaker</title>
  <link>https://www.philschmid.de/inferentia2-embeddings</link>
  <guid>https://www.philschmid.de/inferentia2-embeddings</guid>
  <description>In this blog post, you will learn how to compile and deploy Embedding Models on AWS Inferentia2.</description>
  <pubDate>Tue, 21 Nov 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker</title>
  <link>https://www.philschmid.de/inferentia2-llama-7b</link>
  <guid>https://www.philschmid.de/inferentia2-llama-7b</guid>
  <description>In this blog post, you will learn how to compile and deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker.</description>
  <pubDate>Tue, 14 Nov 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Stable Diffusion XL on AWS Inferentia2 with Amazon SageMaker</title>
  <link>https://www.philschmid.de/inferentia2-stable-diffusion-xl</link>
  <guid>https://www.philschmid.de/inferentia2-stable-diffusion-xl</guid>
  <description>In this blog post, you will learn how to compile and deploy Stable Diffusion XL on AWS Inferentia2 with Amazon SageMaker.</description>
  <pubDate>Tue, 07 Nov 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Amazon Bedrock: How good (bad) is Titan Embeddings?</title>
  <link>https://www.philschmid.de/amazon-titan-embeddings</link>
  <guid>https://www.philschmid.de/amazon-titan-embeddings</guid>
  <description>In this blog post I take a closer look at the Amazon Bedrock Titan embeddings model and how well (or badly) it performs.</description>
  <pubDate>Fri, 03 Nov 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Evaluate LLMs and RAG a practical example using Langchain and Hugging Face</title>
  <link>https://www.philschmid.de/evaluate-llm</link>
  <guid>https://www.philschmid.de/evaluate-llm</guid>
  <description>Learn how to evaluate LLMs and RAG pipelines using Langchain and Hugging Face</description>
  <pubDate>Mon, 30 Oct 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Idefics 9B and 80B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-idefics</link>
  <guid>https://www.philschmid.de/sagemaker-idefics</guid>
  <description>Learn how to deploy Hugging Face Idefics 9B and 80B to Amazon SageMaker and send requests with images and text to the model.</description>
  <pubDate>Thu, 12 Oct 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-mistral</link>
  <guid>https://www.philschmid.de/sagemaker-mistral</guid>
  <description>Learn how to fine-tune and deploy Mistral 7B with Hugging Face on Amazon SageMaker and leverage techniques like QLoRA, Flash Attention, and response streaming.</description>
  <pubDate>Thu, 05 Oct 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Llama 2 on Amazon SageMaker a Benchmark</title>
  <link>https://www.philschmid.de/sagemaker-llama-benchmark</link>
  <guid>https://www.philschmid.de/sagemaker-llama-benchmark</guid>
  <description>Benchmark evaluating varying sizes of Llama 2 on a range of Amazon EC2 instance types with different load levels on latency (ms per token), and throughput (tokens per second).</description>
  <pubDate>Tue, 26 Sep 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA and Flash Attention</title>
  <link>https://www.philschmid.de/deepspeed-lora-flash-attention</link>
  <guid>https://www.philschmid.de/deepspeed-lora-flash-attention</guid>
  <description>In this example we will show how to fine-tune Falcon 180B using DeepSpeed, Hugging Face Transformers, LoRA with Flash Attention on a multi-GPU machine.</description>
  <pubDate>Wed, 20 Sep 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune Falcon 180B with QLoRA and Flash Attention on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-falcon-180b-qlora</link>
  <guid>https://www.philschmid.de/sagemaker-falcon-180b-qlora</guid>
  <description>Learn how to fine-tune Falcon 180B with QLoRA and Flash Attention on Amazon SageMaker.</description>
  <pubDate>Tue, 12 Sep 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Falcon 180B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-falcon-180b</link>
  <guid>https://www.philschmid.de/sagemaker-falcon-180b</guid>
  <description>Learn how to deploy Falcon 180B to Amazon SageMaker and how to create a chatbot with streaming inference.</description>
  <pubDate>Thu, 07 Sep 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Optimize open LLMs using GPTQ and Hugging Face Optimum</title>
  <link>https://www.philschmid.de/gptq-llama</link>
  <guid>https://www.philschmid.de/gptq-llama</guid>
  <description>Learn how to quantize Llama 2 7B with GPTQ to use 4x less memory.</description>
  <pubDate>Thu, 31 Aug 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>LLMOps: Deploy Open LLMs using Infrastructure as Code with AWS CDK</title>
  <link>https://www.philschmid.de/cdk-llama2</link>
  <guid>https://www.philschmid.de/cdk-llama2</guid>
  <description>Learn how to use Infrastructure as Code with AWS CDK to deploy and manage Llama 2.</description>
  <pubDate>Tue, 15 Aug 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Llama 2 7B/13B/70B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-llama-llm</link>
  <guid>https://www.philschmid.de/sagemaker-llama-llm</guid>
  <description>Learn how to deploy Llama 2 models (7B - 70B) to Amazon SageMaker using the Hugging Face LLM Inference DLC.</description>
  <pubDate>Mon, 07 Aug 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Introducing EasyLLM - streamline open LLMs</title>
  <link>https://www.philschmid.de/introducing-easyllm</link>
  <guid>https://www.philschmid.de/introducing-easyllm</guid>
  <description>EasyLLM is an open-source project that provides helpful tools and methods for working with large language models (LLMs).</description>
  <pubDate>Thu, 03 Aug 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Extended Guide: Instruction-tune Llama 2</title>
  <link>https://www.philschmid.de/instruction-tune-llama-2</link>
  <guid>https://www.philschmid.de/instruction-tune-llama-2</guid>
  <description>This blog post is an extended guide on instruction-tuning Llama 2 from Meta AI</description>
  <pubDate>Wed, 26 Jul 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>LLaMA 2 - Every Resource you need</title>
  <link>https://www.philschmid.de/llama-2</link>
  <guid>https://www.philschmid.de/llama-2</guid>
  <description>All resources for LLaMA 2: how to test, train, and deploy it.</description>
  <pubDate>Fri, 21 Jul 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune LLaMA 2 (7-70B) on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-llama2-qlora</link>
  <guid>https://www.philschmid.de/sagemaker-llama2-qlora</guid>
  <description>Learn how to train LLaMA 2 using QLoRA and Hugging Face Transformers on Amazon SageMaker.</description>
  <pubDate>Tue, 18 Jul 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Train LLMs using QLoRA on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-falcon-qlora</link>
  <guid>https://www.philschmid.de/sagemaker-falcon-qlora</guid>
  <description>Learn how to train LLMs using QLoRA on Amazon SageMaker</description>
  <pubDate>Thu, 13 Jul 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy LLMs with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/endpoints-llm</link>
  <guid>https://www.philschmid.de/endpoints-llm</guid>
  <description>Learn how to deploy LLMs using Hugging Face Inference Endpoints</description>
  <pubDate>Tue, 04 Jul 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Optimize and Deploy BERT on AWS Inferentia2</title>
  <link>https://www.philschmid.de/optimize-deploy-bert-inf2</link>
  <guid>https://www.philschmid.de/optimize-deploy-bert-inf2</guid>
  <description>Learn how to optimize and deploy BERT on AWS Inferentia2</description>
  <pubDate>Wed, 28 Jun 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Securely deploy LLMs inside VPCs with Hugging Face and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-llm-vpc</link>
  <guid>https://www.philschmid.de/sagemaker-llm-vpc</guid>
  <description>Learn how to deploy LLMs into VPCs from S3 with Amazon SageMaker using the new Hugging Face LLM Inference DLC.</description>
  <pubDate>Tue, 20 Jun 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy Falcon 7B and 40B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-falcon-llm</link>
  <guid>https://www.philschmid.de/sagemaker-falcon-llm</guid>
  <description>Learn how to deploy Falcon 40B to Amazon SageMaker using the new Hugging Face LLM Inference DLC.</description>
  <pubDate>Wed, 07 Jun 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune BERT for Text Classification on AWS Trainium</title>
  <link>https://www.philschmid.de/getting-started-trainium</link>
  <guid>https://www.philschmid.de/getting-started-trainium</guid>
  <description>Learn how to fine-tune Hugging Face Transformers using AWS Trainium.</description>
  <pubDate>Tue, 06 Jun 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Introducing the Hugging Face LLM Inference Container for Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-huggingface-llm</link>
  <guid>https://www.philschmid.de/sagemaker-huggingface-llm</guid>
  <description>Learn how to deploy open-source LLMs, like BLOOM, to Amazon SageMaker for inference using the new Hugging Face LLM Inference Container.</description>
  <pubDate>Wed, 31 May 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Generative AI for Document Understanding with Hugging Face and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-donut</link>
  <guid>https://www.philschmid.de/sagemaker-donut</guid>
  <description>Learn how to fine-tune Donut-base, a generative AI model for document understanding/document parsing, using Hugging Face Transformers and Amazon SageMaker.</description>
  <pubDate>Tue, 23 May 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to scale LLM workloads to 20B+ with Amazon SageMaker using Hugging Face and PyTorch FSDP</title>
  <link>https://www.philschmid.de/sagemaker-fsdp-gpt</link>
  <guid>https://www.philschmid.de/sagemaker-fsdp-gpt</guid>
  <description>Learn how to fine-tune LLMs on multi-node setups using Amazon SageMaker and Hugging Face Transformers with PyTorch FSDP</description>
  <pubDate>Tue, 02 May 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Setting up AWS Trainium for Hugging Face Transformers</title>
  <link>https://www.philschmid.de/setup-aws-trainium</link>
  <guid>https://www.philschmid.de/setup-aws-trainium</guid>
  <description>Learn how to quickly set up an AWS Trainium instance using the Hugging Face Neuron Deep Learning AMI and fine-tune BERT.</description>
  <pubDate>Tue, 25 Apr 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Train and Deploy BLOOM with Amazon SageMaker and PEFT</title>
  <link>https://www.philschmid.de/bloom-sagemaker-peft</link>
  <guid>https://www.philschmid.de/bloom-sagemaker-peft</guid>
  <description>Learn how to fine-tune BLOOMZ 7B with Amazon SageMaker on a single GPU using LoRA and Hugging Face Transformers.</description>
  <pubDate>Thu, 13 Apr 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Introducing IGEL an instruction-tuned German large Language Model</title>
  <link>https://www.philschmid.de/introducing-igel</link>
  <guid>https://www.philschmid.de/introducing-igel</guid>
  <description>IGEL (Instruction-based German Language Model) is an LLM designed for German language understanding tasks, including sentiment analysis, language translation, and question answering.</description>
  <pubDate>Tue, 04 Apr 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Efficient Large Language Model training with LoRA and Hugging Face</title>
  <link>https://www.philschmid.de/fine-tune-flan-t5-peft</link>
  <guid>https://www.philschmid.de/fine-tune-flan-t5-peft</guid>
  <description>Learn how to fine-tune Google's FLAN-T5 XXL on a Single GPU using LoRA And Hugging Face Transformers.</description>
  <pubDate>Thu, 23 Mar 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy FLAN-UL2 20B on Amazon SageMaker</title>
  <link>https://www.philschmid.de/deploy-flan-ul2-sagemaker</link>
  <guid>https://www.philschmid.de/deploy-flan-ul2-sagemaker</guid>
  <description>Learn how to deploy Google's FLAN-UL2 20B on Amazon SageMaker for inference.</description>
  <pubDate>Mon, 20 Mar 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Getting started with PyTorch 2.0 and Hugging Face Transformers</title>
  <link>https://www.philschmid.de/getting-started-pytorch-2-0-transformers</link>
  <guid>https://www.philschmid.de/getting-started-pytorch-2-0-transformers</guid>
  <description>Learn how to get started with PyTorch 2.0 and Hugging Face Transformers and reduce your training time by up to 2x.</description>
  <pubDate>Thu, 16 Mar 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Controlled text-to-image generation with ControlNet on Inference Endpoints</title>
  <link>https://www.philschmid.de/stable-diffusion-controlnet-endpoint</link>
  <guid>https://www.philschmid.de/stable-diffusion-controlnet-endpoint</guid>
  <description>Learn how to deploy ControlNet Stable Diffusion Pipeline on Hugging Face Inference Endpoints to generate controlled images.</description>
  <pubDate>Fri, 03 Mar 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Combine Amazon SageMaker and DeepSpeed to fine-tune FLAN-T5 XXL</title>
  <link>https://www.philschmid.de/sagemaker-deepspeed</link>
  <guid>https://www.philschmid.de/sagemaker-deepspeed</guid>
  <description>Learn how to fine-tune Google's FLAN-T5 XXL on Amazon SageMaker using DeepSpeed and Hugging Face Transformers.</description>
  <pubDate>Wed, 22 Feb 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune FLAN-T5 XL/XXL using DeepSpeed and Hugging Face Transformers</title>
  <link>https://www.philschmid.de/fine-tune-flan-t5-deepspeed</link>
  <guid>https://www.philschmid.de/fine-tune-flan-t5-deepspeed</guid>
  <description>Learn how to fine-tune Google's FLAN-T5 XXL using DeepSpeed and Hugging Face Transformers.</description>
  <pubDate>Thu, 16 Feb 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy FLAN-T5 XXL on Amazon SageMaker</title>
  <link>https://www.philschmid.de/deploy-flan-t5-sagemaker</link>
  <guid>https://www.philschmid.de/deploy-flan-t5-sagemaker</guid>
  <description>Learn how to deploy Google's FLAN-T5 XXL on Amazon SageMaker for inference.</description>
  <pubDate>Wed, 08 Feb 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Hugging Face Transformers Examples</title>
  <link>https://www.philschmid.de/huggingface-transformers-examples</link>
  <guid>https://www.philschmid.de/huggingface-transformers-examples</guid>
  <description>Learn how to leverage Hugging Face Transformers to easily fine-tune your models.</description>
  <pubDate>Thu, 26 Jan 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Getting started with Transformers and TPU using PyTorch</title>
  <link>https://www.philschmid.de/getting-started-tpu-transformers</link>
  <guid>https://www.philschmid.de/getting-started-tpu-transformers</guid>
  <description>Learn how to get started with Hugging Face Transformers and TPUs using PyTorch, and fine-tune a BERT model for text classification using the newest Google Cloud TPUs.</description>
  <pubDate>Mon, 16 Jan 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune FLAN-T5 for chat and dialogue summarization</title>
  <link>https://www.philschmid.de/fine-tune-flan-t5</link>
  <guid>https://www.philschmid.de/fine-tune-flan-t5</guid>
  <description>Learn how to fine-tune Google's FLAN-T5 for chat and dialogue summarization using Hugging Face Transformers.</description>
  <pubDate>Tue, 27 Dec 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Managed Transcription with OpenAI Whisper and Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/whisper-inference-endpoints</link>
  <guid>https://www.philschmid.de/whisper-inference-endpoints</guid>
  <description>Learn how to deploy OpenAI Whisper for speech recognition and transcription using Hugging Face Inference Endpoints.</description>
  <pubDate>Tue, 20 Dec 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Stable Diffusion Inpainting example with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/stable-diffusion-inpainting-inference-endpoints</link>
  <guid>https://www.philschmid.de/stable-diffusion-inpainting-inference-endpoints</guid>
  <description>Learn how to deploy Stable Diffusion 2.0 Inpainting on Hugging Face Inference Endpoints to manipulate images.</description>
  <pubDate>Thu, 15 Dec 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Stable Diffusion with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/stable-diffusion-inference-endpoints</link>
  <guid>https://www.philschmid.de/stable-diffusion-inference-endpoints</guid>
  <description>Learn how to deploy Stable Diffusion 2.0 on Hugging Face Inference Endpoints to generate images from text.</description>
  <pubDate>Mon, 28 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Document AI: LiLT, a better language-agnostic LayoutLM model</title>
  <link>https://www.philschmid.de/fine-tuning-lilt</link>
  <guid>https://www.philschmid.de/fine-tuning-lilt</guid>
  <description>Learn how to fine-tune LiLT (Language-independent Layout Transformer) for document understanding/document parsing using Hugging Face Transformers.</description>
  <pubDate>Tue, 22 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multi-Model GPU Inference with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/multi-model-inference-endpoints</link>
  <guid>https://www.philschmid.de/multi-model-inference-endpoints</guid>
  <description>Learn how to deploy multiple models to a single GPU with Hugging Face multi-model inference endpoints.</description>
  <pubDate>Thu, 17 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Serverless Machine Learning Applications with Hugging Face Gradio and AWS Lambda</title>
  <link>https://www.philschmid.de/serverless-gradio</link>
  <guid>https://www.philschmid.de/serverless-gradio</guid>
  <description>Learn how to deploy a Hugging Face Gradio Application using Hugging Face Transformers to AWS Lambda for serverless workloads.</description>
  <pubDate>Tue, 15 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate Stable Diffusion inference with DeepSpeed-Inference on GPUs</title>
  <link>https://www.philschmid.de/stable-diffusion-deepspeed-inference</link>
  <guid>https://www.philschmid.de/stable-diffusion-deepspeed-inference</guid>
  <description>Learn how to optimize Stable Diffusion for GPU inference with one line of code using Hugging Face Diffusers and DeepSpeed.</description>
  <pubDate>Tue, 08 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Stable Diffusion on Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-stable-diffusion</link>
  <guid>https://www.philschmid.de/sagemaker-stable-diffusion</guid>
  <description>Learn how to deploy Stable Diffusion to Amazon SageMaker to generate images.</description>
  <pubDate>Tue, 01 Nov 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy T5 11B for inference for less than $500</title>
  <link>https://www.philschmid.de/deploy-t5-11b</link>
  <guid>https://www.philschmid.de/deploy-t5-11b</guid>
  <description>Learn how to deploy T5 11B on a single GPU using Hugging Face Inference Endpoints.</description>
  <pubDate>Tue, 25 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Outperform OpenAI GPT-3 with SetFit for text-classification</title>
  <link>https://www.philschmid.de/getting-started-setfit</link>
  <guid>https://www.philschmid.de/getting-started-setfit</guid>
  <description>Learn how to use SetFit to create a text-classification model with only `8` labeled samples per class, or `32` samples in total. You will also learn how to improve your model using hyperparameter tuning.</description>
  <pubDate>Tue, 18 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tuning LayoutLM for document-understanding using Keras and Hugging Face Transformers</title>
  <link>https://www.philschmid.de/fine-tuning-layoutlm-keras</link>
  <guid>https://www.philschmid.de/fine-tuning-layoutlm-keras</guid>
  <description>Learn how to fine-tune LayoutLM for document understanding using Keras and Hugging Face Transformers.</description>
  <pubDate>Thu, 13 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy LayoutLM with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/inference-endpoints-layoutlm</link>
  <guid>https://www.philschmid.de/inference-endpoints-layoutlm</guid>
  <description>Learn how to deploy LayoutLM for document understanding using Hugging Face Inference Endpoints.</description>
  <pubDate>Thu, 06 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Document AI: Fine-tuning LayoutLM for document-understanding using Hugging Face Transformers</title>
  <link>https://www.philschmid.de/fine-tuning-layoutlm</link>
  <guid>https://www.philschmid.de/fine-tuning-layoutlm</guid>
  <description>Learn how to fine-tune LayoutLM for document understanding using Hugging Face Transformers. LayoutLM is a transformer for document image understanding and information extraction.</description>
  <pubDate>Tue, 04 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Custom Inference with Hugging Face Inference Endpoints</title>
  <link>https://www.philschmid.de/custom-inference-handler</link>
  <guid>https://www.philschmid.de/custom-inference-handler</guid>
  <description>Welcome to this tutorial on how to create a custom inference handler for Hugging Face Inference Endpoints.</description>
  <pubDate>Thu, 29 Sep 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate GPT-J inference with DeepSpeed-Inference on GPUs</title>
  <link>https://www.philschmid.de/gptj-deepspeed-inference</link>
  <guid>https://www.philschmid.de/gptj-deepspeed-inference</guid>
  <description>Learn how to optimize GPT-J for GPU inference with one line of code using Hugging Face Transformers and DeepSpeed.</description>
  <pubDate>Tue, 13 Sep 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Document AI: Fine-tuning Donut for document-parsing using Hugging Face Transformers</title>
  <link>https://www.philschmid.de/fine-tuning-donut</link>
  <guid>https://www.philschmid.de/fine-tuning-donut</guid>
  <description>Learn how to fine-tune Donut-base for document understanding/document parsing using Hugging Face Transformers. Donut is a new document-understanding model that achieves state-of-the-art performance and can be used for commercial applications.</description>
  <pubDate>Tue, 06 Sep 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Use Sentence Transformers with TensorFlow</title>
  <link>https://www.philschmid.de/tensorflow-sentence-transformers</link>
  <guid>https://www.philschmid.de/tensorflow-sentence-transformers</guid>
  <description>Learn how to use a Sentence Transformers model with TensorFlow and Keras to create document embeddings.</description>
  <pubDate>Tue, 30 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Pre-Training BERT with Hugging Face Transformers and Habana Gaudi</title>
  <link>https://www.philschmid.de/pre-training-bert-habana</link>
  <guid>https://www.philschmid.de/pre-training-bert-habana</guid>
  <description>Learn how to pre-train BERT from scratch using Hugging Face Transformers and Habana Gaudi.</description>
  <pubDate>Wed, 24 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate BERT inference with DeepSpeed-Inference on GPUs</title>
  <link>https://www.philschmid.de/bert-deepspeed-inference</link>
  <guid>https://www.philschmid.de/bert-deepspeed-inference</guid>
  <description>Learn how to optimize BERT for GPU inference with one line of code using Hugging Face Transformers and DeepSpeed.</description>
  <pubDate>Tue, 16 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate Sentence Transformers with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/optimize-sentence-transformers</link>
  <guid>https://www.philschmid.de/optimize-sentence-transformers</guid>
  <description>Learn how to optimize Sentence Transformers using Hugging Face Optimum. You will learn how to dynamically quantize and optimize a Sentence Transformer for ONNX Runtime.</description>
  <pubDate>Tue, 02 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deep Learning setup made easy with EC2 Remote Runner and Habana Gaudi</title>
  <link>https://www.philschmid.de/habana-gaudi-ec2-runner</link>
  <guid>https://www.philschmid.de/habana-gaudi-ec2-runner</guid>
  <description>Learn how to migrate your training jobs to a Habana Gaudi-based DL1 instance on AWS using EC2 Remote Runner.</description>
  <pubDate>Tue, 26 Jul 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerate Vision Transformer (ViT) with Quantization using Optimum</title>
  <link>https://www.philschmid.de/optimizing-vision-transformer</link>
  <guid>https://www.philschmid.de/optimizing-vision-transformer</guid>
  <description>Learn how to optimize Vision Transformer (ViT) using Hugging Face Optimum. You will learn how to dynamically quantize a ViT model for ONNX Runtime.</description>
  <pubDate>Tue, 19 Jul 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Optimizing Transformers for GPUs with Optimum</title>
  <link>https://www.philschmid.de/optimizing-transformers-with-optimum-gpu</link>
  <guid>https://www.philschmid.de/optimizing-transformers-with-optimum-gpu</guid>
  <description>Learn how to optimize Hugging Face Transformers models for NVIDIA GPUs using Optimum. You will learn how to optimize a DistilBERT model for ONNX Runtime.</description>
  <pubDate>Wed, 13 Jul 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Hugging Face Transformers and Habana Gaudi AWS DL1 Instances</title>
  <link>https://www.philschmid.de/habana-distributed-training</link>
  <guid>https://www.philschmid.de/habana-distributed-training</guid>
  <description>Learn how to fine-tune XLM-RoBERTa for multilingual multi-class text classification using a Habana Gaudi-based DL1 instance.</description>
  <pubDate>Tue, 05 Jul 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Optimizing Transformers with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/optimizing-transformers-with-optimum</link>
  <guid>https://www.philschmid.de/optimizing-transformers-with-optimum</guid>
  <description>Learn how to optimize Hugging Face Transformers models using Optimum. The session will show you how to dynamically quantize and optimize a DistilBERT model using Hugging Face Optimum and ONNX Runtime. Hugging Face Optimum is an extension of 🤗 Transformers, providing a set of performance optimization tools enabling maximum efficiency to train and run models on targeted hardware.</description>
  <pubDate>Thu, 30 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Convert Transformers to ONNX with Hugging Face Optimum</title>
  <link>https://www.philschmid.de/convert-transformers-to-onnx</link>
  <guid>https://www.philschmid.de/convert-transformers-to-onnx</guid>
  <description>Introduction guide about ONNX and Transformers. Learn how to convert transformers like BERT to ONNX and what you can do with it.</description>
  <pubDate>Tue, 21 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Setup Deep Learning environment for Hugging Face Transformers with Habana Gaudi on AWS</title>
  <link>https://www.philschmid.de/getting-started-habana-gaudi</link>
  <guid>https://www.philschmid.de/getting-started-habana-gaudi</guid>
  <description>Learn how to set up a Deep Learning environment for Hugging Face Transformers with Habana Gaudi on AWS using the DL1 instance type.</description>
  <pubDate>Tue, 14 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Static Quantization with Hugging Face `optimum` for ~3x latency improvements</title>
  <link>https://www.philschmid.de/static-quantization-optimum</link>
  <guid>https://www.philschmid.de/static-quantization-optimum</guid>
  <description>Learn how to do post-training static quantization on Hugging Face Transformers model with `optimum` to achieve up to 3x latency improvements.</description>
  <pubDate>Tue, 07 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Advanced PII detection and anonymization with Hugging Face Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/pii-huggingface-sagemaker</link>
  <guid>https://www.philschmid.de/pii-huggingface-sagemaker</guid>
  <description>Learn how to do advanced PII detection and anonymization with Hugging Face Transformers and Amazon SageMaker.</description>
  <pubDate>Tue, 31 May 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>An Amazon SageMaker Inference comparison with Hugging Face Transformers</title>
  <link>https://www.philschmid.de/sagemaker-inference-comparison</link>
  <guid>https://www.philschmid.de/sagemaker-inference-comparison</guid>
  <description>Learn about the different existing Amazon SageMaker Inference options and how to use them.</description>
  <pubDate>Tue, 17 May 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Semantic Segmentation with Hugging Face's Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/image-segmentation-sagemaker</link>
  <guid>https://www.philschmid.de/image-segmentation-sagemaker</guid>
  <description>Learn how to do image segmentation with Hugging Face Transformers, SegFormer and Amazon SageMaker.</description>
  <pubDate>Tue, 03 May 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Automatic Speech Recognition with Hugging Face's Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/automatic-speech-recognition-sagemaker</link>
  <guid>https://www.philschmid.de/automatic-speech-recognition-sagemaker</guid>
  <description>Learn how to do automatic speech recognition/speech-to-text with Hugging Face Transformers, Wav2vec2 and Amazon SageMaker.</description>
  <pubDate>Thu, 28 Apr 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Serverless Inference with Hugging Face's Transformers, DistilBERT and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-serverless-huggingface-distilbert</link>
  <guid>https://www.philschmid.de/sagemaker-serverless-huggingface-distilbert</guid>
  <description>Learn how to deploy a Transformer model like BERT to Amazon SageMaker Serverless using the Python SageMaker SDK.</description>
  <pubDate>Thu, 21 Apr 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Accelerated document embeddings with Hugging Face Transformers and AWS Inferentia</title>
  <link>https://www.philschmid.de/huggingface-sentence-transformers-aws-inferentia</link>
  <guid>https://www.philschmid.de/huggingface-sentence-transformers-aws-inferentia</guid>
  <description>Learn how to accelerate Sentence Transformers inference using Hugging Face Transformers and AWS Inferentia.</description>
  <pubDate>Tue, 19 Apr 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Save up to 90% training cost with AWS Spot Instances and Hugging Face Transformers</title>
  <link>https://www.philschmid.de/sagemaker-spot-instance</link>
  <guid>https://www.philschmid.de/sagemaker-spot-instance</guid>
  <description>Learn how to leverage AWS Spot Instances when training Hugging Face Transformers with Amazon SageMaker to save up to 90% training cost.</description>
  <pubDate>Tue, 22 Mar 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Speed up BERT inference with Hugging Face Transformers and AWS Inferentia</title>
  <link>https://www.philschmid.de/huggingface-bert-aws-inferentia</link>
  <guid>https://www.philschmid.de/huggingface-bert-aws-inferentia</guid>
  <description>Learn how to accelerate BERT and Transformers inference using Hugging Face Transformers and AWS Inferentia.</description>
  <pubDate>Wed, 16 Mar 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Creating document embeddings with Hugging Face's Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/custom-inference-huggingface-sagemaker</link>
  <guid>https://www.philschmid.de/custom-inference-huggingface-sagemaker</guid>
  <description>Learn how to use a custom Inference script for creating document embeddings with Hugging Face's Transformers, Amazon SageMaker, and Sentence Transformers.</description>
  <pubDate>Tue, 08 Mar 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Autoscaling BERT with Hugging Face Transformers, Amazon SageMaker and Terraform module</title>
  <link>https://www.philschmid.de/terraform-huggingface-amazon-sagemaker-advanced</link>
  <guid>https://www.philschmid.de/terraform-huggingface-amazon-sagemaker-advanced</guid>
  <description>Learn how to apply autoscaling to Hugging Face Transformers and Amazon SageMaker using Terraform.</description>
  <pubDate>Tue, 01 Mar 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multi-Container Endpoints with Hugging Face Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-huggingface-multi-container-endpoint</link>
  <guid>https://www.philschmid.de/sagemaker-huggingface-multi-container-endpoint</guid>
  <description>Learn how to deploy multiple Hugging Face Transformers for inference with Amazon SageMaker and Multi-Container Endpoints.</description>
  <pubDate>Tue, 22 Feb 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Asynchronous Inference with Hugging Face Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-huggingface-async-inference</link>
  <guid>https://www.philschmid.de/sagemaker-huggingface-async-inference</guid>
  <description>Learn how to deploy an Asynchronous Inference model with Hugging Face Transformers and Amazon SageMaker, with autoscaling to zero.</description>
  <pubDate>Tue, 15 Feb 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy BERT with Hugging Face Transformers, Amazon SageMaker and Terraform module</title>
  <link>https://www.philschmid.de/terraform-huggingface-amazon-sagemaker</link>
  <guid>https://www.philschmid.de/terraform-huggingface-amazon-sagemaker</guid>
  <description>Learn how to deploy BERT/DistilBERT with Hugging Face Transformers using Amazon SageMaker and Terraform module.</description>
  <pubDate>Tue, 08 Feb 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Task-specific knowledge distillation for BERT using Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/knowledge-distillation-bert-transformers</link>
  <guid>https://www.philschmid.de/knowledge-distillation-bert-transformers</guid>
  <description>Learn how to apply task-specific knowledge distillation for BERT and text classification using Hugging Face Transformers and Amazon SageMaker, including hyperparameter search.</description>
  <pubDate>Tue, 01 Feb 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Distributed training on multilingual BERT with Hugging Face Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/pytorch-distributed-training-transformers</link>
  <guid>https://www.philschmid.de/pytorch-distributed-training-transformers</guid>
  <description>Learn how to run large-scale distributed training using multilingual BERT on over 1 million data points with Hugging Face Transformers and Amazon SageMaker</description>
  <pubDate>Tue, 25 Jan 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Financial Text Summarization with Hugging Face Transformers, Keras and Amazon SageMaker</title>
  <link>https://www.philschmid.de/financial-summarizatio-huggingface-keras</link>
  <guid>https://www.philschmid.de/financial-summarizatio-huggingface-keras</guid>
  <description>Learn how to fine-tune a Hugging Face Transformer for financial text summarization using vanilla `Keras`, `TensorFlow`, `Transformers`, `Datasets`, and Amazon SageMaker.</description>
  <pubDate>Wed, 19 Jan 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/deploy-gptj-sagemaker</link>
  <guid>https://www.philschmid.de/deploy-gptj-sagemaker</guid>
  <description>Learn how to deploy EleutherAI's GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker.</description>
  <pubDate>Tue, 11 Jan 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Image Classification with Hugging Face Transformers and `Keras`</title>
  <link>https://www.philschmid.de/image-classification-huggingface-transformers-keras</link>
  <guid>https://www.philschmid.de/image-classification-huggingface-transformers-keras</guid>
  <description>Learn how to fine-tune a Vision Transformer for Image Classification Example using vanilla `Keras`, `Transformers`, `Datasets`.</description>
  <pubDate>Tue, 04 Jan 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Workshop: Enterprise-Scale NLP with Hugging Face and Amazon SageMaker</title>
  <link>https://www.philschmid.de/hugginface-sagemaker-workshop</link>
  <guid>https://www.philschmid.de/hugginface-sagemaker-workshop</guid>
  <description>In October and November, we held a workshop series on “Enterprise-Scale NLP with Hugging Face and Amazon SageMaker”. The series consisted of 3 parts covering: Getting Started, Going to Production, and MLOps.</description>
  <pubDate>Wed, 29 Dec 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Hugging Face Transformers with Keras: Fine-tune a non-English BERT for Named Entity Recognition</title>
  <link>https://www.philschmid.de/huggingface-transformers-keras-tf</link>
  <guid>https://www.philschmid.de/huggingface-transformers-keras-tf</guid>
  <description>Learn how to fine-tune a non-English BERT for Named Entity Recognition using Hugging Face Transformers, Keras/TensorFlow, and Datasets.</description>
  <pubDate>Tue, 21 Dec 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>New Serverless Transformers using Amazon SageMaker Serverless Inference and Hugging Face</title>
  <link>https://www.philschmid.de/serverless-transformers-sagemaker-huggingface</link>
  <guid>https://www.philschmid.de/serverless-transformers-sagemaker-huggingface</guid>
  <description>Learn how to deploy Hugging Face Transformers serverless using the new Amazon SageMaker Serverless Inference.</description>
  <pubDate>Wed, 15 Dec 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Hugging Face Transformers BERT fine-tuning using Amazon SageMaker and Training Compiler</title>
  <link>https://www.philschmid.de/huggingface-amazon-sagemaker-training-compiler</link>
  <guid>https://www.philschmid.de/huggingface-amazon-sagemaker-training-compiler</guid>
  <description>Learn how to Compile and fine-tune a Multi-Class Classification Transformers with `Trainer` and `emotion` dataset using Amazon SageMaker Training Compiler.</description>
  <pubDate>Tue, 07 Dec 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>MLOps: Using the Hugging Face Hub as model registry with Amazon SageMaker</title>
  <link>https://www.philschmid.de/huggingface-hub-amazon-sagemaker</link>
  <guid>https://www.philschmid.de/huggingface-hub-amazon-sagemaker</guid>
  <description>Learn how to automatically save your model weights, logs, and artifacts to the Hugging Face Hub using Amazon SageMaker and how to deploy the model afterwards for inference.</description>
  <pubDate>Tue, 16 Nov 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>A remote guide to re:Invent 2021 machine learning sessions</title>
  <link>https://www.philschmid.de/re-invent-2021</link>
  <guid>https://www.philschmid.de/re-invent-2021</guid>
  <description>If you are like me, you are not from the USA and cannot easily travel to Las Vegas. Here is the perfect remote guide for a virtual re:Invent 2021 focused on NLP and machine learning.</description>
  <pubDate>Thu, 11 Nov 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>MLOps: End-to-End Hugging Face Transformers with the Hub and SageMaker Pipelines</title>
  <link>https://www.philschmid.de/mlops-sagemaker-huggingface-transformers</link>
  <guid>https://www.philschmid.de/mlops-sagemaker-huggingface-transformers</guid>
  <description>Learn how to build an End-to-End MLOps Pipeline for Hugging Face Transformers from training to production using Amazon SageMaker.</description>
  <pubDate>Wed, 10 Nov 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Going Production: Auto-scaling Hugging Face Transformers with Amazon SageMaker</title>
  <link>https://www.philschmid.de/auto-scaling-sagemaker-huggingface</link>
  <guid>https://www.philschmid.de/auto-scaling-sagemaker-huggingface</guid>
  <description>Learn how to add auto-scaling to your Hugging Face Transformers SageMaker Endpoints.</description>
  <pubDate>Fri, 29 Oct 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Deploy BigScience T0_3B to AWS and Amazon SageMaker</title>
  <link>https://www.philschmid.de/deploy-bigscience-t0-3b-to-aws-and-amazon-sagemaker</link>
  <guid>https://www.philschmid.de/deploy-bigscience-t0-3b-to-aws-and-amazon-sagemaker</guid>
  <description>🌸 BigScience released their first modeling paper introducing T0, which outperforms GPT-3 on many zero-shot tasks while being 16x smaller! Deploy the 3-billion-parameter version (T0_3B) to Amazon SageMaker with a few lines of code to run a scalable production workload!</description>
  <pubDate>Wed, 20 Oct 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Scalable, Secure Hugging Face Transformer Endpoints with Amazon SageMaker, AWS Lambda, and CDK</title>
  <link>https://www.philschmid.de/huggingface-transformers-cdk-sagemaker-lambda</link>
  <guid>https://www.philschmid.de/huggingface-transformers-cdk-sagemaker-lambda</guid>
  <description>Deploy Hugging Face Transformers to Amazon SageMaker and create an API for the Endpoint using AWS Lambda, API Gateway and AWS CDK.</description>
  <pubDate>Wed, 06 Oct 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Few-shot learning in practice with GPT-Neo</title>
  <link>https://www.philschmid.de/few-shot-learning-gpt-neo</link>
  <guid>https://www.philschmid.de/few-shot-learning-gpt-neo</guid>
  <description>The latest developments in NLP show that you can overcome this limitation by providing a few examples at inference time with a large language model - a technique known as Few-Shot Learning. In this blog post, we'll explain what Few-Shot Learning is and explore how to use it with a large language model called GPT-Neo.</description>
  <pubDate>Sat, 05 Jun 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker</title>
  <link>https://www.philschmid.de/sagemaker-distributed-training</link>
  <guid>https://www.philschmid.de/sagemaker-distributed-training</guid>
  <description>Learn how to train distributed models for summarization using Hugging Face Transformers and Amazon SageMaker and upload them afterwards to huggingface.co.</description>
  <pubDate>Fri, 09 Apr 2021 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multilingual Serverless XLM RoBERTa with HuggingFace, AWS Lambda</title>
  <link>https://www.philschmid.de/multilingual-serverless-xlm-roberta-with-huggingface</link>
  <guid>https://www.philschmid.de/multilingual-serverless-xlm-roberta-with-huggingface</guid>
  <description>Learn how to build a multilingual serverless BERT Question Answering API with a model size of more than 2GB and then test it in German and French.</description>
  <pubDate>Thu, 17 Dec 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Serverless BERT with HuggingFace, AWS Lambda, and Docker</title>
  <link>https://www.philschmid.de/serverless-bert-with-huggingface-aws-lambda-docker</link>
  <guid>https://www.philschmid.de/serverless-bert-with-huggingface-aws-lambda-docker</guid>
  <description>Learn how to combine the newest cutting-edge computing power of AWS with the benefits of serverless architectures to leverage Google's "State-of-the-Art" NLP model.</description>
  <pubDate>Sun, 06 Dec 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>AWS Lambda with custom docker images as runtime</title>
  <link>https://www.philschmid.de/aws-lambda-with-custom-docker-image</link>
  <guid>https://www.philschmid.de/aws-lambda-with-custom-docker-image</guid>
  <description>Learn how to build and deploy an AWS Lambda function with a custom Python Docker container as runtime using Amazon ECR.</description>
  <pubDate>Wed, 02 Dec 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>New Serverless BERT with Huggingface, AWS Lambda, and AWS EFS</title>
  <link>https://www.philschmid.de/new-serverless-bert-with-huggingface-aws-lambda</link>
  <guid>https://www.philschmid.de/new-serverless-bert-with-huggingface-aws-lambda</guid>
  <description>Build a serverless Question-Answering API using the Serverless Framework, AWS Lambda, AWS EFS, efsync, Terraform, the transformers Library from HuggingFace, and a `mobileBert` model from Google fine-tuned on SQuADv2.</description>
  <pubDate>Sun, 15 Nov 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>efsync my first open-source MLOps toolkit</title>
  <link>https://www.philschmid.de/efsync-my-first-open-source-mlops-toolkit</link>
  <guid>https://www.philschmid.de/efsync-my-first-open-source-mlops-toolkit</guid>
  <description>efsync is a CLI/SDK tool that automatically syncs files from S3 or a local filesystem to AWS EFS and enables you to install dependencies with the AWS Lambda runtime directly into your EFS filesystem.</description>
  <pubDate>Wed, 04 Nov 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>My path to become a certified solution architect</title>
  <link>https://www.philschmid.de/my-path-to-become-a-certified-solution-architect</link>
  <guid>https://www.philschmid.de/my-path-to-become-a-certified-solution-architect</guid>
  <description>This is the story of how I became a certified solution architect within 28 hours of preparation.</description>
  <pubDate>Sat, 24 Oct 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Create a custom GitHub Action in 4 steps</title>
  <link>https://www.philschmid.de/create-custom-github-action-in-4-steps</link>
  <guid>https://www.philschmid.de/create-custom-github-action-in-4-steps</guid>
  <description>Create a custom GitHub Action in 4 steps. Also learn how to test it offline and publish it in the GitHub Actions Marketplace.</description>
  <pubDate>Fri, 25 Sep 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Fine-tune a non-English GPT-2 Model with Huggingface</title>
  <link>https://www.philschmid.de/fine-tune-a-non-english-gpt-2-model-with-huggingface</link>
  <guid>https://www.philschmid.de/fine-tune-a-non-english-gpt-2-model-with-huggingface</guid>
  <description>Fine-tune a non-English, German GPT-2 model with Huggingface on German recipes, using their Trainer class and Pipeline objects.</description>
  <pubDate>Sun, 06 Sep 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Mount your AWS EFS volume into AWS Lambda with the Serverless Framework</title>
  <link>https://www.philschmid.de/mount-your-aws-efs-volume-into-aws-lambda-with-the-serverless-framework</link>
  <guid>https://www.philschmid.de/mount-your-aws-efs-volume-into-aws-lambda-with-the-serverless-framework</guid>
  <description>Leverage your serverless architectures by mounting your AWS EFS volume into your AWS Lambda function with the Serverless Framework.</description>
  <pubDate>Wed, 12 Aug 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Serverless BERT with HuggingFace and AWS Lambda</title>
  <link>https://www.philschmid.de/serverless-bert-with-huggingface-and-aws-lambda</link>
  <guid>https://www.philschmid.de/serverless-bert-with-huggingface-and-aws-lambda</guid>
  <description>Build a serverless question-answering API with BERT, HuggingFace, the Serverless Framework and AWS Lambda.</description>
  <pubDate>Tue, 30 Jun 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to use Google Tag Manager and Google Analytics without Cookies</title>
  <link>https://www.philschmid.de/how-to-use-google-tag-manager-and-google-analytics-without-cookies</link>
  <guid>https://www.philschmid.de/how-to-use-google-tag-manager-and-google-analytics-without-cookies</guid>
  <description>Connect your user behavior with technical insights without using cookies to improve your customer experience.</description>
  <pubDate>Sat, 06 Jun 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>BERT Text Classification in a different language</title>
  <link>https://www.philschmid.de/bert-text-classification-in-a-different-language</link>
  <guid>https://www.philschmid.de/bert-text-classification-in-a-different-language</guid>
  <description>Build a non-English (German) BERT multi-class text classification model with HuggingFace and Simple Transformers.</description>
  <pubDate>Fri, 22 May 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Scaling Machine Learning from ZERO to HERO</title>
  <link>https://www.philschmid.de/scaling-machine-learning-from-zero-to-hero</link>
  <guid>https://www.philschmid.de/scaling-machine-learning-from-zero-to-hero</guid>
  <description>Scale your machine learning models using AWS Lambda, the Serverless Framework, and PyTorch. I will show you how to build scalable deep learning inference architectures.</description>
  <pubDate>Fri, 08 May 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Getting Started with AutoML and AWS AutoGluon</title>
  <link>https://www.philschmid.de/getting-started-with-automl-and-aws-autogluon</link>
  <guid>https://www.philschmid.de/getting-started-with-automl-and-aws-autogluon</guid>
  <description>Build an object detection model with AWS's AutoML library AutoGluon.</description>
  <pubDate>Mon, 20 Apr 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>K-Fold as Cross-Validation with a BERT Text-Classification Example</title>
  <link>https://www.philschmid.de/k-fold-as-cross-validation-with-a-bert-text-classification-example</link>
  <guid>https://www.philschmid.de/k-fold-as-cross-validation-with-a-bert-text-classification-example</guid>
  <description>Use K-Fold Cross-Validation to improve your Transformers model validation, shown with a BERT text-classification example.</description>
  <pubDate>Tue, 07 Apr 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>How to Set Up a CI/CD Pipeline for AWS Lambda With GitHub Actions and Serverless</title>
  <link>https://www.philschmid.de/how-to-set-up-a-ci-cd-pipeline-for-aws-lambda-with-github-actions-and-serverless</link>
  <guid>https://www.philschmid.de/how-to-set-up-a-ci-cd-pipeline-for-aws-lambda-with-github-actions-and-serverless</guid>
  <description>Automatically deploy your Python function with dependencies in less than five minutes</description>
  <pubDate>Wed, 01 Apr 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Set up a CI/CD Pipeline for your Web app on AWS with GitHub Actions</title>
  <link>https://www.philschmid.de/set-up-a-ci-cd-pipeline-for-your-web-app-on-aws-s3-with-github-actions</link>
  <guid>https://www.philschmid.de/set-up-a-ci-cd-pipeline-for-your-web-app-on-aws-s3-with-github-actions</guid>
  <description>Automatically deploy your React, Vue, Angular, or Svelte app on S3 and create a cache invalidation with GitHub Actions.</description>
  <pubDate>Wed, 25 Mar 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Getting started with CNNs by calculating LeNet-Layer manually</title>
  <link>https://www.philschmid.de/getting-started-with-cnn-by-calculating-lenet-layer-manually</link>
  <guid>https://www.philschmid.de/getting-started-with-cnn-by-calculating-lenet-layer-manually</guid>
  <description>A getting-started explanation of CNNs by calculating Yann LeCun's LeNet-5 manually for handwritten digits, learning about padding and stride along the way.</description>
  <pubDate>Fri, 28 Feb 2020 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Google Colab the free GPU/TPU Jupyter Notebook Service</title>
  <link>https://www.philschmid.de/google-cola-the-free-gpu-jupyter</link>
  <guid>https://www.philschmid.de/google-cola-the-free-gpu-jupyter</guid>
  <description>A short introduction to Google Colab, a free Jupyter notebook service from Google. Learn how to use accelerated hardware like GPUs and TPUs to run your machine learning workloads completely free in the cloud.</description>
  <pubDate>Wed, 26 Feb 2020 00:00:00 GMT</pubDate>
</item>
    </channel>
  </rss>