# How to use Deep Research with the Gemini API
The Gemini Deep Research Agent autonomously plans, searches, and synthesizes long-horizon research tasks into detailed, cited reports.
Deep Research handles long-running tasks by executing in the background. It is available exclusively through the Interactions API, not `generate_content`.
Two new versions are available:

- `deep-research-preview-04-2026`: Designed for speed and efficiency, ideal for streaming back to a client UI
- `deep-research-max-preview-04-2026`: Maximum comprehensiveness for automated context gathering and synthesis
## What's new
- Collaborative planning: Review and refine the research plan before execution
- Native charts & infographics: Agent-generated charts, graphs, and infographics
- Remote MCP server: Connect external tools via the Model Context Protocol
- Extended tooling: Google Search, URL Context, Code Execution, MCP, and File Search
- Multimodal research grounding: Pass images, PDFs, and audio as research context
## Setup
Install the Python SDK:

```shell
pip install google-genai
```

Set your API key as an environment variable. You can create one at aistudio.google.com/apikey.

```shell
export GEMINI_API_KEY="your-api-key"
```

## Run your first Deep Research task
Start a research task with `background=True` and poll for the result. Deep Research is asynchronous because tasks can take several minutes to complete.
```python
import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-preview-04-2026",
    background=True,
)

while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)
```

## Collaborative planning
Set `collaborative_planning=True` to get a research plan back instead of running immediately. Iterate on the plan with `previous_interaction_id`, then set `collaborative_planning=False` to execute.
**Step 1: Request a plan**
```python
import time
from google import genai

client = genai.Client()

plan = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Research Google TPUs vs competitor hardware.",
    agent_config={"type": "deep-research", "collaborative_planning": True},
    background=True,
)

while (result := client.interactions.get(id=plan.id)).status != "completed":
    time.sleep(5)

print(result.outputs[-1].text)
```

**Refine the plan:** Use `previous_interaction_id` to continue the conversation. Keep `collaborative_planning=True` to stay in planning mode. Repeat as needed.
```python
refined = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Add a section comparing power efficiency.",
    agent_config={"type": "deep-research", "collaborative_planning": True},
    previous_interaction_id=plan.id,
    background=True,
)

while (result := client.interactions.get(id=refined.id)).status != "completed":
    time.sleep(5)

print(result.outputs[-1].text)
```

**Approve and execute:** Set `collaborative_planning=False` to approve the plan and start the research.
**Important:** You must explicitly set `collaborative_planning=False` on the final turn. Simply sending "go ahead" without flipping the flag will not trigger report generation.
```python
report = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Plan looks good!",
    agent_config={"type": "deep-research", "collaborative_planning": False},
    previous_interaction_id=refined.id,
    background=True,
)

while (result := client.interactions.get(id=report.id)).status != "completed":
    time.sleep(5)

print(result.outputs[-1].text)
```

## Native charts and infographics
Set `visualization="auto"` and ask for visuals in your prompt. The agent generates charts and infographics returned as base64-encoded images.
```python
import base64
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Analyze global semiconductor market trends. Include charts showing market share changes.",
    agent_config={"type": "deep-research", "visualization": "auto"},
    background=True,
)

while (result := client.interactions.get(id=interaction.id)).status != "completed":
    time.sleep(5)

for output in result.outputs:
    if output.type == "text":
        print(output.text)
    elif output.type == "image" and output.data:
        image_bytes = base64.b64decode(output.data)
        # display(Image(data=image_bytes))  # Jupyter
```

**Tip:** Setting `visualization="auto"` enables the capability, but best results are achieved by explicitly asking for what you want.
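Outside a notebook, the base64 payloads can be written straight to disk instead of displayed. A minimal sketch, assuming only the `type`/`data` output attributes shown above (the filename pattern is an arbitrary choice, not part of the API):

```python
import base64
import pathlib

def save_images(outputs, prefix="chart"):
    """Decode base64-encoded image outputs and write them as PNG files.

    `outputs` is any iterable of objects exposing `.type` and `.data`
    attributes, as on a completed interaction's outputs.
    """
    paths = []
    for i, output in enumerate(outputs):
        if getattr(output, "type", None) == "image" and output.data:
            path = pathlib.Path(f"{prefix}_{i}.png")
            path.write_bytes(base64.b64decode(output.data))
            paths.append(path)
    return paths
```

Call it as `save_images(result.outputs)` after the polling loop completes.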
## Remote MCP servers
Connect remote MCP servers to give the agent access to external tools. Pass the server `name`, `url`, and optional auth `headers`.
```python
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Research how recent geopolitical events influenced USD interest rates",
    tools=[
        {
            "type": "mcp_server",
            "name": "Finance Data Provider",
            "url": "https://finance.example.com/mcp",
            "headers": {"Authorization": "Bearer my-token"},
        }
    ],
    background=True,
)
```

MCP servers support no-auth, bearer token, and OAuth. For OAuth, fetch the token with a library like `google-auth` and pass it in `headers`. Use `allowed_tools` to restrict which tools the agent can call from the server.
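The tool entry can be factored into a small helper that adds the optional pieces only when present. A sketch under the description above (the tool names in `allowed_tools` are hypothetical, chosen only to illustrate the allow-list):

```python
def mcp_tool(name, url, token=None, allowed_tools=None):
    """Build an mcp_server tool entry, optionally with a bearer
    token and an allow-list of callable tools."""
    tool = {"type": "mcp_server", "name": name, "url": url}
    if token:
        tool["headers"] = {"Authorization": f"Bearer {token}"}
    if allowed_tools:
        tool["allowed_tools"] = allowed_tools
    return tool

tools = [
    mcp_tool(
        "Finance Data Provider",
        "https://finance.example.com/mcp",
        token="my-token",
        # Hypothetical tool names on the example server:
        allowed_tools=["get_fx_rates", "get_bond_yields"],
    )
]
```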
## Tool configuration
By default the agent uses Google Search, URL Context, and Code Execution. You can customize the tools the agent can use by providing a list of tools, similar to models. This allows you to, for example, only search the web (via `google_search` and `url_context`), only search private sources (via `file_search` and custom MCP servers), or search a mix of both.
| Tool | Type | Default | Description |
|---|---|---|---|
| Google Search | google_search | ✅ | Search the public web |
| URL Context | url_context | ✅ | Read and summarize web pages |
| Code Execution | code_execution | ✅ | Run code for calculations and data analysis |
| MCP Server | mcp_server | — | Connect remote MCP servers |
| File Search | file_search | — | Search uploaded document corpora |
```python
# Only web search allowed
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Latest developments in quantum computing.",
    tools=[{"type": "google_search"}],
    background=True,
)
```

**Note:** Passing no tools at all keeps the defaults: Google Search, URL Context, and Code Execution.
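Conversely, a private-sources-only configuration combines File Search with an MCP server, as described above. A sketch (the server name and URL are placeholders, and `file_search` may need additional store configuration not shown here):

```python
# Private-sources-only research: File Search over uploaded documents
# plus a remote MCP server. Because google_search and url_context are
# omitted, the agent cannot search the public web.
private_tools = [
    {"type": "file_search"},  # may require store configuration (not shown)
    {
        "type": "mcp_server",
        "name": "Internal Wiki",                # placeholder name
        "url": "https://wiki.example.com/mcp",  # placeholder URL
    },
]
```

Pass `tools=private_tools` to `client.interactions.create` as in the web-only example.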
## Multimodal research grounding
Pass images, PDFs, and documents alongside your text prompt to ground the research.
```python
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input=[
        {"type": "text", "text": "What has been the impact of this research paper?"},
        {"type": "document", "uri": "https://arxiv.org/pdf/1706.03762", "mime_type": "application/pdf"},
    ],
    background=True,
)
```

## Real-time streaming with visuals and thought summaries
Stream research progress in real time. Enable `thinking_summaries="auto"` to receive the agent's intermediate reasoning alongside text and generated images.
```python
import base64
from google import genai
from IPython.display import Image, display

client = genai.Client()

interaction_id = None
last_event_id = None
is_complete = False

def process_stream(stream):
    global interaction_id, last_event_id, is_complete
    for chunk in stream:
        if chunk.event_type == "interaction.start":
            interaction_id = chunk.interaction.id
        if chunk.event_id:
            last_event_id = chunk.event_id
        if chunk.event_type == "content.delta":
            if chunk.delta.type == "text":
                print(chunk.delta.text, end="", flush=True)
            elif chunk.delta.type == "thought_summary":
                print(f"\n💭 {chunk.delta.content.text}", flush=True)
            elif chunk.delta.type == "image" and chunk.delta.data:
                image_bytes = base64.b64decode(chunk.delta.data)
                display(Image(data=image_bytes))
        elif chunk.event_type in ("interaction.complete", "error"):
            is_complete = True
            if chunk.event_type == "interaction.complete":
                print("\n✅ Research Complete")

stream = client.interactions.create(
    input="Research AI chip market trends. Include charts comparing vendors.",
    agent="deep-research-preview-04-2026",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto",
        "visualization": "auto",
    },
)
process_stream(stream)

# Reconnect if the connection drops
while not is_complete and interaction_id:
    status = client.interactions.get(interaction_id)
    if status.status != "in_progress":
        break
    stream = client.interactions.get(
        id=interaction_id, stream=True, last_event_id=last_event_id,
    )
    process_stream(stream)
```
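If reconnection attempts themselves can fail, pacing them with exponential backoff avoids hammering the API. A generic sketch, independent of the SDK (the delay parameters are arbitrary choices):

```python
def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=5):
    """Yield exponentially increasing delays in seconds, capped at `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor

# Usage sketch: sleep between reconnection attempts, e.g.
# for delay in backoff_delays():
#     try:
#         stream = client.interactions.get(
#             id=interaction_id, stream=True, last_event_id=last_event_id,
#         )
#         process_stream(stream)
#         break
#     except ConnectionError:
#         time.sleep(delay)
```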