System translated (Gemini)

A Deep Dive into Claude Code (Part 1): The Evolution of its Product Form and Technical Architecture

Introduction: Anthropic’s Claude Code is one of the most capable native terminal agents available today, and it is far more than a thin wrapper around a large model. By reverse-engineering its roughly 330 tool files, 100+ slash commands, and 146 UI components (written entirely in TypeScript and React, running on Bun), we can see how a production-grade terminal agent is built from the ground up.

This article goes beyond a superficial feature review. We will conduct a deep-dive analysis of Claude Code’s industrial-grade design from the perspectives of its product architecture and core technical foundation.


1. Product Architecture: A Terminal Operating System Beyond REPL

In its product definition, Claude Code completely breaks away from the monotonous “request-response (blocking REPL)” model of traditional CLI tools. Its core product architecture can be divided into the following key areas:

1.1 A Rich Application Disguised as a Terminal (Terminal as a React App)

This is the most distinctive part of Claude Code’s interaction design. Instead of printing text with plain console.log calls, it runs a complete React renderer and the Yoga layout engine (Facebook’s implementation of Flexbox) directly within the terminal environment.

  • The streaming JSON returned by the large model is not dumped to the screen as raw characters; it is mapped onto a virtual DOM in the terminal (e.g., ink-box, ink-text, ink-link).
  • It supports double buffering and a dirty-flag cascade, so a new frame can be composed off-screen and only the parts that changed need to be redrawn. This lets it render scrollable, highlighted code blocks, smoothly updating progress bars, and even rich interactions such as selecting text with a mouse drag, all within a standard terminal.
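To make the double-buffering idea concrete, here is a minimal sketch, not Claude Code's actual renderer. It assumes a frame is simply an array of row strings; the names `Frame`, `RowPatch`, `diffFrames`, `DoubleBufferedScreen`, and `toAnsi` are all hypothetical. The new frame is compared against what is already on screen, and only the changed rows are turned into terminal writes.

```typescript
// Hypothetical sketch: none of these names come from Claude Code.
type Frame = string[]; // one entry per terminal row

interface RowPatch {
  row: number;  // zero-based row index
  text: string; // new content for that row
}

// Compare the front buffer (currently on screen) with the back buffer
// (the freshly rendered frame) and return only the rows that differ.
function diffFrames(front: Frame, back: Frame): RowPatch[] {
  const patches: RowPatch[] = [];
  const height = Math.max(front.length, back.length);
  for (let row = 0; row < height; row++) {
    const next = back[row] ?? ""; // rows missing from the new frame are cleared
    if (front[row] !== next) {
      patches.push({ row, text: next });
    }
  }
  return patches;
}

class DoubleBufferedScreen {
  private front: Frame = [];

  // Present a new frame: compute the minimal patches, then swap buffers.
  present(back: Frame): RowPatch[] {
    const patches = diffFrames(this.front, back);
    this.front = back.slice();
    return patches;
  }
}

// Turn a patch into ANSI output: move the cursor to the row (1-based),
// erase the whole line, then write the new text.
function toAnsi(patch: RowPatch): string {
  return `\x1b[${patch.row + 1};1H\x1b[2K${patch.text}`;
}
```

A caller would write `screen.present(nextFrame).map(toAnsi).join("")` to stdout once per frame. Unchanged rows produce no output, which is what keeps progress bars and streaming text from flickering.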

1.2 A Blazing-Fast Cold Start Architecture (The Race Against Time)

Because developers run this command many times a day, Claude Code’s startup path is aggressively optimized:

  • When you type claude in the terminal, the main entry point main.tsx spawns a child process to read the OAuth token from the macOS Keychain before the TypeScript module imports have even finished parsing (which takes about 135ms).
  • The slow I/O therefore overlaps with code parsing instead of adding to it, which makes startup feel instantaneous.
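The overlap described above follows a general pattern: start the slow I/O first, do not await it, and join on it only once the other startup work is done. The sketch below illustrates that pattern and is not the code in main.tsx. The `boot` function, its parameters, and the event log are hypothetical. In a real CLI `readToken` might wrap a child process and `loadModules` might wrap dynamic imports.

```typescript
// Hypothetical sketch: function names and the event log are illustrative only.
type AsyncTask<T> = () => Promise<T>;

async function boot(
  readToken: AsyncTask<string>, // slow I/O, e.g. a child process querying a keychain
  loadModules: AsyncTask<void>, // the rest of startup, e.g. dynamic imports
  events: string[] = [],
): Promise<string> {
  // Kick off the slow read first and deliberately do not await it yet.
  const tokenPromise = readToken();
  events.push("token read started");

  // Module loading now runs while the token read is still in flight.
  events.push("module load started");
  const modulesPromise = loadModules();

  // Join both. Promise.all attaches handlers immediately, so a failed
  // token read surfaces here instead of as an unhandled rejection.
  const [token] = await Promise.all([tokenPromise, modulesPromise]);
  events.push("startup complete");
  return token;
}
```

With this shape, total startup time approaches the longer of the two tasks rather than their sum.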

1.3 KAIROS & ULTRAPLAN: The Persistent Background Brain

Claude Code is not merely a passive, reactive tool. In the reverse-engineered code, we discovered two highly ambitious project codenames:

  • KAIROS: A persistent assistant mode. It allows Claude to “Auto-Dream” in the background. When idle, it periodically launches background sub-agents to automatically scan and consolidate conversation history (Memory Consolidation), creating a permanent indexed log.
  • ULTRAPLAN: A cloud-based planning mode. When faced with a refactoring task involving tens of thousands of lines of code, the CLI offloads the task to a cloud cluster (CCR) for up to 30 minutes of silent exploration, without tying up your local terminal. Once the exploration is complete, the plan is “teleported” back to the local client for execution.

2. Technical Architecture: A State Machine to Counter Uncertainty

Large models are nondeterministic and prone to drifting off course. Claude Code’s technical architecture is essentially a series of deterministic interceptors and compressors designed to “tame” this behavior.

2.1 The Core Engine: Query Engine and Query Loop

The heart of the conversation lives in QueryEngine.ts and query.ts. Together they implement a persistent state machine that is responsible for handling edge cases:

  • Seamless Resumption: Before making an API request, the engine forces the current context (Transcript) to be written to disk. Even if you kill the process with Ctrl+C midway, the conversation can be perfectly restored on the next launch.
  • Budget Control and Retries (Token Budget Continuation): When the model is generating long code output and is about to exhaust the token limit, the engine doesn’t interrupt the user. Instead, it silently injects an instruction the user never sees, “Continue directly, do not apologize, do not summarize,” so generation carries on past the maximum token limit without a visible break.
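The two behaviors above can be sketched as one small loop: persist the transcript before every request, and append a hidden continuation message whenever a reply is cut off by the token limit. This is a simplified illustration under stated assumptions, not the contents of query.ts. The types, the `"meta"` role, the `hidden` flag, the stop-reason strings, and the `maxContinuations` cap are all hypothetical. The model call is synchronous only to keep the example short.

```typescript
// Hypothetical sketch: types, names, and the retry cap are assumptions.
type Role = "user" | "assistant" | "meta";
interface Message { role: Role; text: string; hidden?: boolean }
interface ModelReply { text: string; stopReason: "end_turn" | "max_tokens" }

type CallModel = (transcript: Message[]) => ModelReply;
type Persist = (transcript: Message[]) => void;

// Continuation text as quoted in the article.
const CONTINUE_PROMPT = "Continue directly, do not apologize, do not summarize";

function runTurn(
  transcript: Message[],
  callModel: CallModel,
  persist: Persist,
  maxContinuations = 3,
): string {
  let output = "";
  for (let attempt = 0; attempt <= maxContinuations; attempt++) {
    // Write the transcript out before every request so a killed
    // process can resume from the last saved state.
    persist(transcript);

    const reply = callModel(transcript);
    output += reply.text;
    transcript.push({ role: "assistant", text: reply.text });

    // Stop when the model finished on its own or the retry cap is reached.
    if (reply.stopReason !== "max_tokens" || attempt === maxContinuations) break;

    // The reply was cut off: add a hidden instruction and loop again.
    transcript.push({ role: "meta", text: CONTINUE_PROMPT, hidden: true });
  }
  return output;
}
```

The cap on continuations is a design choice added for the sketch. Without some bound, a model that never finishes would keep the loop running indefinitely.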

2.2 Context Compaction Mechanisms

To cope with context window limits and the resulting memory loss, Claude Code has built an elaborate three-layer compaction defense:

  1. Micro-compaction: Sets a Time-To-Live (TTL) or character threshold for large file readouts and command-line grep results, truncating them immediately once the limit is exceeded.
  2. Snip Compaction: Crudely erases distant historical dialogue but always preserves a “Protected Tail” of the model’s most recent interactions.
  3. Proactive Summarization (Auto-Compaction): When the total token count comes within roughly 13k tokens of the context limit, the system automatically triggers a silent background request that asks the model to condense the earlier, lengthy discussion into a concise summary (CompactBoundaryMessage).
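The three layers can be sketched as three small functions. This is an illustrative simplification rather than Claude Code's implementation. The `Turn` type, the function names, the truncation marker, and the four-characters-per-token estimate are assumptions. Only the 13k reserve figure comes from the description above.

```typescript
// Hypothetical sketch: names, thresholds, and the token estimate are assumptions.
interface Turn { role: "user" | "assistant" | "tool"; text: string }

// Layer 1, micro-compaction: truncate oversized tool output at a character cap.
function microCompact(turn: Turn, maxChars: number): Turn {
  if (turn.role !== "tool" || turn.text.length <= maxChars) return turn;
  return { ...turn, text: turn.text.slice(0, maxChars) + "\n[truncated]" };
}

// Layer 2, snip compaction: drop old turns but always keep the protected tail.
function snipCompact(history: Turn[], protectedTail: number): Turn[] {
  if (history.length <= protectedTail) return history;
  return history.slice(history.length - protectedTail);
}

// Layer 3, auto-compaction trigger: fire once the remaining headroom
// falls to the reserve (about 13k tokens, per the article).
function shouldAutoCompact(
  usedTokens: number,
  contextLimit: number,
  reserve = 13_000,
): boolean {
  return contextLimit - usedTokens <= reserve;
}

// Rough stand-in for a real tokenizer: about four characters per token.
function estimateTokens(history: Turn[]): number {
  const chars = history.reduce((sum, turn) => sum + turn.text.length, 0);
  return Math.ceil(chars / 4);
}
```

The layers run from cheapest to most expensive. Truncation and snipping are local string and array operations, while the third layer costs an extra model request, so it makes sense to reserve it for when the window is nearly full.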

2.4 Extensibility: Skills, Plugins, and the MCP Protocol

As a next-generation entry point, Claude Code possesses powerful extension capabilities:

  • Its Tool Registry performs deferred loading (Deferred Tool Discovery) based on the model’s current needs. Heavy tools like LSP (for syntax tree completion) are kept on standby, and their schemas are only injected when the model explicitly states, “I need to analyze syntax.”
  • It deeply integrates MCP (Model Context Protocol), which exposes the capabilities of external systems (such as local database queries and cloud APIs) over stdio or network transports as tools the model can call.
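Deferred tool discovery can be pictured as a registry that keeps a cheap one-line hint for every tool and builds the full schema only when a tool is activated. The sketch below is a minimal illustration and does not reproduce Claude Code's Tool Registry. `ToolSchema`, `ToolEntry`, `ToolRegistry`, and their methods are hypothetical names.

```typescript
// Hypothetical sketch: this registry API is illustrative, not Claude Code's.
interface ToolSchema {
  name: string;
  description: string;
  parameters: Record<string, string>;
}

interface ToolEntry {
  summary: string;              // cheap one-line hint, always available
  deferred: boolean;            // true: schema is built only on demand
  loadSchema: () => ToolSchema; // potentially expensive full schema
}

class ToolRegistry {
  private entries = new Map<string, ToolEntry>();
  private loaded = new Map<string, ToolSchema>();

  register(name: string, entry: ToolEntry): void {
    this.entries.set(name, entry);
    // Eager tools load immediately; deferred tools wait for activate().
    if (!entry.deferred) this.loaded.set(name, entry.loadSchema());
  }

  // Called when the model asks for a deferred tool by name.
  activate(name: string): ToolSchema {
    const cached = this.loaded.get(name);
    if (cached) return cached;
    const entry = this.entries.get(name);
    if (!entry) throw new Error(`Unknown tool: ${name}`);
    const schema = entry.loadSchema();
    this.loaded.set(name, schema);
    return schema;
  }

  // Schemas to include in the next request: only the loaded ones.
  activeSchemas(): ToolSchema[] {
    return Array.from(this.loaded.values());
  }

  // Short hints for tools that have not been activated yet.
  deferredHints(): string[] {
    return Array.from(this.entries.entries())
      .filter(([name]) => !this.loaded.has(name))
      .map(([name, entry]) => `${name}: ${entry.summary}`);
  }
}
```

Sending only the hints for inactive tools keeps the prompt small. The full schema of a heavy tool costs context tokens only in conversations that actually need it.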

In the next installment, “Deep Dive into Claude Code (Part 2): Code Skeleton and Core Logic,” we will look more closely at the source code, examining how it uses BashTool and a permission system to prevent the model from taking catastrophic actions such as deleting databases, as well as its mechanism for spawning nested sub-agents (Multi-Agent).