OpenAI Codex CLI, how does it work?
OpenAI Codex is an open-source CLI released alongside OpenAI o3/o4-mini, positioned as a "chat-driven development" tool. It lets developers use AI models via the API directly in their terminal to perform coding tasks. Unlike a simple chatbot, it can read files, write files (via patches), execute shell commands (often sandboxed), and iterate based on the results and user feedback.
Note: This overview was generated with Gemini 2.5 Pro and then collaboratively iterated on by Gemini 2.5 Pro and me.
Core Components & Workflow
User Interface (UI)
- The interactive terminal UI is built using `ink` and `react`, offering a richer experience than plain text. Key components reside in `src/components/`, particularly within `src/components/chat/`.
- The application entry point is `src/cli.tsx` (using `meow` for argument parsing), which sets up the main `TerminalChat` component via `src/app.tsx`. `TerminalChat` manages the overall display, including history, input prompts, loading states, and overlays.
- User input is handled by `TerminalChatInput` (or `TerminalChatNewInput`), supporting command history and slash commands.
- The conversation history is displayed by `TerminalMessageHistory`, using components like `TerminalChatResponseItem` to render different message types.
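To give a feel for the `ink`/`react` pattern, here is a minimal, self-contained sketch of a chat-style terminal UI. It is illustrative only: `MiniChat` and its state handling are invented for this example and are far simpler than the real `TerminalChat`.

```tsx
// Minimal ink + react terminal UI sketch, loosely in the style of TerminalChat.
import React, { useState } from "react";
import { render, Box, Text, useInput } from "ink";

function MiniChat() {
  const [history, setHistory] = useState<string[]>([]);
  const [input, setInput] = useState("");

  // ink's useInput hook receives raw keypresses from the terminal.
  useInput((char, key) => {
    if (key.return) {
      setHistory((h) => [...h, input]);
      setInput("");
    } else if (key.backspace || key.delete) {
      setInput((s) => s.slice(0, -1));
    } else {
      setInput((s) => s + char);
    }
  });

  return (
    <Box flexDirection="column">
      {history.map((line, i) => (
        <Text key={i}>• {line}</Text>
      ))}
      <Text color="cyan">› {input}</Text>
    </Box>
  );
}

render(<MiniChat />);
```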
Agent Loop
- The core logic resides in `src/utils/agent/agent-loop.ts`.
- The `AgentLoop` class manages the interaction cycle with the OpenAI API.
- It takes the user's input, combines it with conversation history and instructions, and sends it to the model.
- It uses the `openai` Node.js library (v4+) and specifically calls `openai.responses.create`, indicating use of the `/responses` endpoint, which supports streaming and tool use.
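As a rough sketch of that cycle (not the real `AgentLoop`, which also handles cancellation, retries, and UI callbacks), one streaming turn looks roughly like this, assuming the official `openai` v4+ client:

```ts
// Minimal sketch of one turn against the /responses endpoint. Tool
// definitions and error handling are omitted; see the shell tool sketch below.
import OpenAI from "openai";

const openai = new OpenAI();

async function runTurn(
  input: OpenAI.Responses.ResponseInput,
  previousResponseId?: string,
): Promise<string | undefined> {
  const stream = await openai.responses.create({
    model: "o4-mini",
    input,
    stream: true,
    previous_response_id: previousResponseId,
  });

  for await (const event of stream) {
    if (event.type === "response.output_item.done") {
      // A completed item: message, function_call, or reasoning.
      console.log("item:", event.item);
    } else if (event.type === "response.completed") {
      // Remember the response id so the next turn can chain onto it.
      return event.response.id;
    }
  }
}
```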
Model Interaction
- The `AgentLoop` sends the context (history, instructions, user input) to the specified model (default `o4-mini`, configurable via `--model` or the config file).
- It requests a streaming response.
- It handles different response item types (`message`, `function_call`, `function_call_output`, `reasoning`).
- `src/utils/model-utils.ts` handles fetching available models and checking compatibility.
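A condensed sketch of dispatching on those item types (the type names follow the Responses API; the dispatch itself is illustrative):

```ts
// Illustrative dispatch over response output item types. The real loop also
// updates UI state and records each item in the conversation history.
type ResponseItemSketch = { type: string; [key: string]: unknown };

function handleItem(item: ResponseItemSketch): void {
  switch (item.type) {
    case "message":
      // Assistant text: rendered into the chat history.
      break;
    case "function_call":
      // A tool request (e.g. shell): routed to command handling.
      break;
    case "function_call_output":
      // A prior tool result, echoed back as context on later turns.
      break;
    case "reasoning":
      // Reasoning items from o-series models: surfaced as "thinking" output.
      break;
  }
}
```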
Tools & Execution
- The primary "tool" defined is
shell(orcontainer.exec), allowing the model to request shell command execution. See thetoolsarray insrc/utils/agent/agent-loop.ts.
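A sketch of what such a function tool definition looks like in Responses API format. The exact JSON schema in `agent-loop.ts` may differ; the `workdir` and `timeout` parameters here are assumptions:

```ts
// Sketch of a shell function tool definition for the /responses endpoint.
const tools = [
  {
    type: "function" as const,
    name: "shell",
    description: "Run a shell command and return its output.",
    strict: false,
    parameters: {
      type: "object",
      properties: {
        cmd: {
          type: "array",
          items: { type: "string" },
          description: 'Command and arguments, e.g. ["cat", "utils.ts"]',
        },
        workdir: { type: "string", description: "Working directory (assumed parameter)" },
        timeout: { type: "number", description: "Timeout in ms (assumed parameter)" },
      },
      required: ["cmd"],
    },
  },
];
```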
Command Execution
- When the model emits a `function_call` for `shell`, the `AgentLoop` invokes `handleExecCommand` (`src/utils/agent/handle-exec-command.ts`).
- This function checks the approval policy (`suggest`, `auto-edit`, `full-auto`). `src/approvals.ts` (`canAutoApprove`) determines whether the command is known-safe or needs user confirmation under that policy.
- If confirmation is needed, the UI (`TerminalChatCommandReview`) prompts the user.
- If approved (or auto-approved), the command is executed via `src/utils/agent/exec.ts`.
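Condensed into TypeScript, the flow is roughly as follows. The function shapes are simplified stand-ins for the real ones, and `SafetyAssessment` is abbreviated from what `canAutoApprove` actually returns:

```ts
// Simplified sketch of the approve-then-execute flow in handleExecCommand.
type SafetyAssessment =
  | { type: "auto-approve"; runInSandbox: boolean }
  | { type: "ask-user" }
  | { type: "reject" };

type ExecResult = { stdout: string; stderr: string; exitCode: number };

async function handleExecSketch(
  cmd: string[],
  policy: "suggest" | "auto-edit" | "full-auto",
  canAutoApprove: (cmd: string[], policy: string) => SafetyAssessment,
  askUser: (cmd: string[]) => Promise<boolean>,
  exec: (cmd: string[], sandboxed: boolean) => Promise<ExecResult>,
): Promise<ExecResult> {
  const safety = canAutoApprove(cmd, policy);
  if (safety.type === "reject") {
    return { stdout: "", stderr: "command rejected", exitCode: 1 };
  }
  if (safety.type === "ask-user" && !(await askUser(cmd))) {
    return { stdout: "", stderr: "user declined", exitCode: 1 };
  }
  // Known-safe commands can run raw; otherwise full-auto implies a sandbox.
  const sandboxed =
    safety.type === "auto-approve" ? safety.runInSandbox : policy === "full-auto";
  return exec(cmd, sandboxed);
}
```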
Sandboxing
- The execution logic in `handleExecCommand` decides how to run the command based on the approval policy and safety assessment. `full-auto` mode implies sandboxing.
- `src/utils/agent/sandbox/` contains the sandboxing implementations:
  - `macos-seatbelt.ts`: Uses macOS's `sandbox-exec` to restrict file system access and block network calls (`READ_ONLY_SEATBELT_POLICY`). Writable paths are whitelisted.
  - `raw-exec.ts`: Executes commands directly without sandboxing (used when sandboxing isn't needed or available).
- Linux: The `README.md`, `Dockerfile`, and `scripts/` indicate a Docker-based approach. The CLI runs inside a minimal container where `scripts/init_firewall.sh` uses `iptables`/`ipset` to restrict network access to only the OpenAI API. The user's project directory is mounted into the container.
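For a flavor of the macOS path, a toy wrapper around `sandbox-exec` might look like this. The inline policy string is a simplified stand-in, not the actual `READ_ONLY_SEATBELT_POLICY`:

```ts
// Toy sketch: invoke sandbox-exec with an inline SBPL policy that denies
// writes and network access, then whitelists specific writable roots.
import { spawn } from "node:child_process";

function execUnderSeatbelt(cmd: string[], writableRoots: string[]) {
  const allowWrites = writableRoots
    .map((root) => `(allow file-write* (subpath "${root}"))`)
    .join("\n");
  const policy = [
    "(version 1)",
    "(allow default)",
    "(deny network*)",
    "(deny file-write*)",
    allowWrites,
  ].join("\n");
  // sandbox-exec -p <policy-string> <command...>
  return spawn("sandbox-exec", ["-p", policy, ...cmd], { stdio: "inherit" });
}

execUnderSeatbelt(["ls", "-la"], [process.cwd()]);
```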
File Patching (apply_patch)
- The model is instructed (via the system prompt in `src/utils/agent/agent-loop.ts`) to use a specific format like `{"cmd":["apply_patch","*** Begin Patch..."]}` when it wants to edit files. `handleExecCommand` detects this pattern.
- Instead of running `apply_patch` as a shell command, it uses `execApplyPatch` (`src/utils/agent/exec.ts`), which calls `process_patch` from `src/utils/agent/apply-patch.ts`.
- `src/utils/agent/apply-patch.ts` parses the patch format and uses Node.js `fs` calls to modify files directly on the host system (or within the container on Linux).
- `parse-apply-patch.ts` (likely used by the UI) helps render diffs for user review.
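To make the envelope format concrete, here is a toy detector for `apply_patch` invocations. The real `process_patch` goes much further, parsing Add/Update/Delete/Move hunks and editing files via `fs`:

```ts
// Toy sketch: recognize an apply_patch invocation and pull out the patch body.
function extractPatch(cmd: string[]): string | null {
  if (cmd.length !== 2 || cmd[0] !== "apply_patch") return null;
  const body = cmd[1];
  const wellFormed =
    body.startsWith("*** Begin Patch") && body.trimEnd().endsWith("*** End Patch");
  return wellFormed ? body : null;
}

const patch = extractPatch([
  "apply_patch",
  [
    "*** Begin Patch",
    "*** Update File: utils.ts",
    "@@",
    "-function add(a, b) { return a + b; }",
    "+const add = (a, b) => a + b;",
    "*** End Patch",
  ].join("\n"),
]);
console.log(patch !== null ? "patch detected" : "not a patch");
```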
Prompts & Context Awareness
- System Prompt: A long, detailed system prompt is hardcoded as `prefix` within `src/utils/agent/agent-loop.ts`. It tells the model about its role as the Codex CLI, its capabilities (shell, patching), constraints (sandboxing), and coding guidelines.
- User Instructions: Instructions are gathered from both global (`~/.codex/instructions.md`) and project-specific (`codex.md` or similar, discovered via logic in `src/utils/config.ts`) files. These combined instructions are prepended to the conversation history sent to the model.
- Conversation History: The `items` array (containing `ResponseItem` objects such as user messages, assistant messages, tool calls, and tool outputs) is passed back to the model on each turn, providing conversational context. `src/utils/approximate-tokens-used.ts` estimates context window usage.
- File Context (Standard Mode): The agent doesn't automatically read project files. It gains file context only when the model explicitly requests to read a file (e.g., via `cat`) or when file content appears in the output of a previous command (e.g., `git diff`).
- File Context (Experimental `--full-context` Mode): This mode uses a distinct flow (see `src/cli_singlepass.tsx`, `src/utils/singlepass/`). It involves:
  - Walking the directory, reading, and caching files via `src/utils/singlepass/context_files.ts`.
  - Formatting the prompt, directory structure, and file contents into a single large request using `src/utils/singlepass/context.ts`.
  - Expecting the model to return all file changes (creations, updates, deletes, moves) in a specific Zod schema defined in `src/utils/singlepass/file_ops.ts` (a sketch follows this list).
- Configuration: Stores the default model, approval mode settings, etc. Managed by `src/utils/config.ts`; loaded from `~/.codex/config.yaml` (or `.yml`/`.json`), which is not in the repo.
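As a hypothetical sketch of what such a file-operations schema could look like (the field names here are guesses, not the actual `file_ops.ts` definitions):

```ts
// Hypothetical Zod schema for full-context file operations; the real schema
// in src/utils/singlepass/file_ops.ts may use different names and structure.
import { z } from "zod";

const FileOperation = z.object({
  path: z.string(),                            // file the operation targets
  updated_full_content: z.string().optional(), // full new content (create/update)
  delete: z.boolean().optional(),              // true => remove the file
  move_to: z.string().optional(),              // destination path for a move
});

const EditedFiles = z.object({
  ops: z.array(FileOperation),
});

type EditedFilesT = z.infer<typeof EditedFiles>;
```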
Step-by-Step Manual Walkthrough (Simulating the CLI)
Let's imagine the user runs `codex "Refactor utils.ts to use arrow functions"` in the directory `/home/user/myproject`.
1. Initialization (`cli.tsx`, `app.tsx`):
   - Parse arguments: Prompt is "Refactor...", model is default (`o4-mini`), approval mode is default (`suggest`).
   - Load config (`loadConfig` in `src/utils/config.ts`): Read `~/.codex/config.yaml` and `~/.codex/instructions.md`.
   - Discover and load project docs (`loadProjectDoc` in `src/utils/config.ts`): Find `/home/user/myproject/codex.md` and read its content.
   - Combine instructions: Merge user instructions and project docs.
   - Check Git status (`checkInGit` in `src/utils/check-in-git.ts`): Confirm `/home/user/myproject` is a Git repo.
   - Render the main UI (`TerminalChat`).
2. First API Call (`AgentLoop.run` in `src/utils/agent/agent-loop.ts`):
   - Create initial input: `[{ role: "user", content: [{ type: "input_text", text: "Refactor..." }] }]`.
   - Construct API request payload: Include the system prompt (from `prefix`), combined instructions, and the user input message. Set `model: "o4-mini"`, `stream: true`, `tools: [...]`. No `previous_response_id`. (A sketch of the payload follows this step.)
   - Send request: Call `openai.responses.create(...)` (using the `openai` library). UI shows "Thinking...".
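   A sketch of roughly what that first payload could contain. The field names follow the Responses API, but the instruction strings and their merging are placeholders:

   ```ts
   // Illustrative first-turn payload; the exact assembly in AgentLoop differs.
   const systemPrompt = "You are Codex CLI ...";             // the hardcoded prefix
   const userInstructions = "<instructions.md + codex.md>";  // merged docs

   const firstRequest = {
     model: "o4-mini",
     instructions: `${systemPrompt}\n${userInstructions}`,   // hypothetical merge
     input: [
       {
         role: "user" as const,
         content: [
           {
             type: "input_text" as const,
             text: "Refactor utils.ts to use arrow functions",
           },
         ],
       },
     ],
     tools: [/* the shell function tool sketched earlier */],
     stream: true as const,
     // no previous_response_id on the first turn
   };
   ```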
3. Model Response (Stream):
   - Assume the model decides it needs to read the file first.
   - Stream event 1: `response.output_item.done` with `item: { type: "function_call", name: "shell", arguments: '{"cmd": ["cat", "utils.ts"]}', call_id: "call_1" }`.
   - Stream event 2: `response.completed` with `output: [...]` containing the same function call, `id: "resp_1"`.
   - Agent receives the function call. `onLastResponseId` is called with `"resp_1"`.
4. Tool Call Handling (`handleExecCommand` in `src/utils/agent/handle-exec-command.ts`):
   - Parse arguments: `cmd = ["cat", "utils.ts"]`.
   - Check approval: `canAutoApprove(["cat", "utils.ts"], "suggest", ["/home/user/myproject"])` (in `src/approvals.ts`) returns `{ type: "auto-approve", reason: "View file contents", group: "Reading files", runInSandbox: false }`.
   - Execute command (`execCommand` in `src/utils/agent/handle-exec-command.ts`): Run `cat utils.ts` directly, since no sandbox is needed for safe commands (see the sketch after this step). Note: this example assumes `utils.ts` exists at the project root; in reality, the model might need to specify a path like `src/utils.ts`.
   - Simulate result: `stdout = "/* content of utils.ts */", stderr = "", exitCode = 0`.
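   Executing the command and packaging its result might look roughly like this. `runCommand` is an invented stand-in for the real `exec` helpers; only `execFile` from Node's standard library is real:

   ```ts
   // Sketch: run an approved command with Node's execFile and capture output.
   import { execFile } from "node:child_process";

   type ExecResult = { stdout: string; stderr: string; exitCode: number };

   function runCommand(cmd: string[]): Promise<ExecResult> {
     const [bin, ...args] = cmd;
     return new Promise((resolve) => {
       execFile(bin, args, (error, stdout, stderr) => {
         // On a non-zero exit, error.code holds the exit code (or a string
         // like "ENOENT" if the binary was not found).
         const code = error ? (typeof error.code === "number" ? error.code : 1) : 0;
         resolve({ stdout, stderr, exitCode: code });
       });
     });
   }

   runCommand(["cat", "utils.ts"]).then(console.log);
   ```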
5. Second API Call (`AgentLoop.run` continues):
   - Format tool result: Create a `function_call_output` item like `{ type: "function_call_output", call_id: "call_1", output: '{"output": "/* content ... */", "metadata": {"exit_code": 0, ...}}' }` (sketched after this step).
   - Construct API request payload: Include the system prompt, combined instructions, and the entire history so far (user message, assistant function call request, function call output); set `previous_response_id: "resp_1"`.
   - Send request. UI shows "Thinking...".
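   For concreteness, building that item could look like this (the `metadata` fields beyond `exit_code` are illustrative):

   ```ts
   // Sketch: package a tool result as a function_call_output input item.
   const toolResult = {
     type: "function_call_output" as const,
     call_id: "call_1",
     output: JSON.stringify({
       output: "/* content of utils.ts */",
       metadata: { exit_code: 0, duration_seconds: 0.02 }, // duration is invented
     }),
   };
   ```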
6. Model Response (Stream):
   - Assume the model generates the refactored code and decides to apply it.
   - Stream event 1: `response.output_item.done` with `item: { type: "function_call", name: "shell", arguments: '{"cmd": ["apply_patch", "*** Begin Patch\n*** Update File: utils.ts\n@@ ... -old +new ...\n*** End Patch"]}', call_id: "call_2" }`.
   - Stream event 2: `response.completed` with `output: [...]` containing the patch function call, `id: "resp_2"`.
   - Agent receives the patch function call. `onLastResponseId` is called with `"resp_2"`.
7. Tool Call Handling (Patch):
   - Parse arguments: Identify `apply_patch` and extract the patch text.
   - Check approval: `canAutoApprove(["apply_patch", "..."], "suggest", ["/home/user/myproject"])`. Since the policy is `suggest`, this returns `{ type: "ask-user", applyPatch: { patch: "..." } }`.
   - Request confirmation (`requestConfirmation`): The UI (`TerminalChatCommandReview`) displays the patch diff and asks "Allow command? [y/N/e/a]".
   - User reviews and presses 'y'. `submitConfirmation` is called with `{ decision: ReviewDecision.YES }`.
   - Execute patch (`execApplyPatch` in `src/utils/agent/exec.ts` -> `process_patch` in `src/utils/agent/apply-patch.ts`): Reads `utils.ts`, applies the diff logic, and writes the modified content back using Node.js `fs.writeFileSync`.
   - Simulate result: `stdout = "Done!", stderr = "", exitCode = 0`.
8. Third API Call:
   - Format tool result: Create a `function_call_output` item for the patch, `{ call_id: "call_2", output: '{"output": "Done!", ...}' }`.
   - Construct API request: Include history + patch result, `previous_response_id: "resp_2"`.
   - Send request.
9. Model Response (Final):
   - Assume the model confirms the refactoring is done.
   - Stream event 1: `response.output_item.done` with `item: { type: "message", role: "assistant", content: [{ type: "output_text", text: "OK, I've refactored utils.ts to use arrow functions." }] }`.
   - Stream event 2: `response.completed`, `id: "resp_3"`.
   - Agent receives the message. `onLastResponseId` is called with `"resp_3"`.
   - No more tool calls. The loop finishes for this turn. UI stops showing "Thinking...".
10. User Interaction:
    - The user sees the final message and an updated prompt, ready for the next command. The file `utils.ts` on their disk has been modified.