Core Concepts
This page explains what you get when you use Matrix — the features and workflows from a user's perspective. For internal implementation details, see Architecture.
Agent Basics
An AI agent is a Large Language Model (LLM) running in a loop with access to tools. You give it a goal, and it autonomously observes, reasons, acts, and repeats until the goal is achieved.
This is different from a chatbot. A chatbot handles a single request-response exchange. An agent keeps going autonomously — observing, thinking, acting — until it reaches its goal or gets stuck.
Tools
LLMs can only generate text. Tools bridge text generation to real-world actions. The LLM generates a structured "tool call" (e.g., bash("npm test")), the system executes it, and the output is fed back to the LLM for the next decision.
Tools aren't just convenience — they're how AI gets grounded in reality. Every tool call produces an objective result that the AI cannot hallucinate: a test passes or fails, a file exists or doesn't, a command succeeds or errors. This is what makes autonomous AI programming possible — tools are physical reality for AI.
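In code, this loop is roughly the following — a minimal Python sketch, where `call_llm` and the `bash` executor are stubbed stand-ins for illustration, not Matrix's actual API:

```python
# Minimal agent loop: the LLM proposes tool calls, the runtime executes
# them and feeds the result back, until the LLM signals completion.
# `call_llm` is a hypothetical stand-in for a real model API.

def call_llm(history):
    # Fake model for illustration: runs the tests once, then finishes.
    if not any(msg["role"] == "tool" for msg in history):
        return {"tool": "bash", "args": {"cmd": "npm test"}}
    return {"tool": "done", "args": {"status": "passed"}}

TOOLS = {
    "bash": lambda cmd: f"$ {cmd}\nAll tests passed",  # stub executor
}

def run_agent(goal, max_steps=10):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        action = call_llm(history)                          # reason
        if action["tool"] == "done":                        # goal reached
            return action["args"]["status"], history
        result = TOOLS[action["tool"]](**action["args"])    # act
        history.append({"role": "tool", "content": result}) # observe
    return "failed", history

status, history = run_agent("Fix the failing test")
print(status)  # "passed" after one tool call in this stub
```

The key property is the feedback edge: every tool result lands back in the history, so the next decision is grounded in what actually happened.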
Matrix gives agents the same tools a human developer uses:
| Category | Tools | Purpose |
|---|---|---|
| File operations | read_file, write_file, edit_file, list_files, search | Read and modify code |
| Commands | bash, background | Run tests, build, install deps |
| Task management | create_task, update_task, get_tree, get_task, done | Manage the task tree |
| Communication | send_message, yield, clarify | Coordinate with other agents and users |
| Cross-project | list_projects, send_message_to_project | Talk to agents in other projects |
| Lifecycle | fork_task_context, close_task, reset_task, delete_task, reorder_tasks | Manage agent lifecycle |
Plus any external tools connected via MCP (Model Context Protocol) servers.
Task Tree
When you give Matrix a goal, it doesn't just hand it to a single agent. Complex work gets decomposed into a task tree — a hierarchy of tasks that agents work on in parallel. Each worker runs tests independently in its own worktree, so the test-driven feedback loop operates at every level of the tree simultaneously.
Here's how it works:
- A root orchestrator receives your goal and reads the codebase to plan an approach.
- It creates sub-tasks, each assigned to a separate agent on its own git branch.
- Workers run in parallel — no waiting for one to finish before starting the next.
- When a worker finishes, the orchestrator reviews and merges its branch.
- If a worker fails, the orchestrator can retry with new instructions or restructure the approach.
Recursive Decomposition
The task tree is genuinely recursive — any agent can create sub-agents. A worker assigned "build the authentication system" can decide it's too complex and become a sub-orchestrator, spawning its own workers for JWT middleware, login endpoints, and session management. There's no hard limit on nesting depth.
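A minimal sketch of what this recursion implies for the data model — a hypothetical `Task` class, not Matrix's internal one:

```python
from dataclasses import dataclass, field

# Hypothetical task-tree node: any node can grow children, so a
# worker can become a sub-orchestrator at any depth.

@dataclass
class Task:
    title: str
    status: str = "pending"
    children: list = field(default_factory=list)

    def spawn(self, title):
        child = Task(title)
        self.children.append(child)
        return child

    def depth(self):
        return 1 + max((c.depth() for c in self.children), default=0)

root = Task("Build the app")
auth = root.spawn("Build the authentication system")
# The auth worker decides its task is too complex and sub-delegates:
auth.spawn("JWT middleware")
auth.spawn("Login endpoints")
auth.spawn("Session management")
print(root.depth())  # 3
```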
Task Lifecycle
Tasks flow through these statuses:
| Status | Meaning |
|---|---|
| draft | Idea captured, not ready for execution. Cannot be started. |
| pending | Ready to execute, waiting to be started. |
| in_progress | Agent is actively working. |
| passed | Agent called done("passed") — work complete. |
| failed | Agent called done("failed") or was interrupted. Can be retried with new instructions. |
| closed | Branch merged, worktree cleaned up. Node preserved in tree for history. |
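Read as a state machine, the table suggests transitions like the following. This is an inference from this page, not the tool's authoritative rule set:

```python
# Status transitions inferred from the table above -- an illustrative
# assumption, not Matrix's authoritative state machine.
TRANSITIONS = {
    "draft":       {"pending"},                # promoted when ready
    "pending":     {"in_progress"},            # an agent picks it up
    "in_progress": {"passed", "failed"},       # done("passed"/"failed")
    "passed":      {"closed"},                 # branch merged, cleaned up
    "failed":      {"in_progress", "closed"},  # retried or abandoned
    "closed":      set(),                      # terminal (non-persistent)
}

def advance(status, new_status):
    if new_status not in TRANSITIONS[status]:
        raise ValueError(f"illegal transition {status} -> {new_status}")
    return new_status

s = "draft"
for step in ("pending", "in_progress", "passed", "closed"):
    s = advance(s, step)
print(s)  # closed
```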
Persistent Tasks
Some tasks are designed to run periodically rather than once. Persistent tasks have their definitions stored in .mxd/tasks/<id>.json (git-tracked), and closing them resets their status to pending instead of closed — so they can run again in the next cycle.
Persistent tasks have two modes:
- reset — Clean start each cycle. Session history is deleted on close, so the agent starts fresh.
- continue — Resume with context. Session history is kept on close, so the agent picks up where it left off.
This is useful for recurring quality checks, periodic code audits, or any task that should run regularly with the same definition.
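The close-time behavior described above can be sketched like this. The semantics are assumed from this page, and the `mode` and `session_history` field names are invented for illustration:

```python
# Sketch of close semantics for persistent tasks (assumed behavior):
# closing re-arms the task instead of retiring it, and the mode
# decides whether session history survives into the next cycle.

def close_task(task):
    if task.get("persistent"):
        task["status"] = "pending"        # re-armed for the next cycle
        if task["mode"] == "reset":
            task["session_history"] = []  # fresh start each cycle
        # mode == "continue": history kept, agent resumes with context
    else:
        task["status"] = "closed"         # one-shot task retires
    return task

audit = {"persistent": True, "mode": "reset",
         "status": "in_progress", "session_history": ["..."]}
close_task(audit)
print(audit["status"], audit["session_history"])  # pending []
```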
Incremental Merge
The orchestrator doesn't have to wait for a worker to call done() before using its work. Workers are encouraged to commit early and commit often, sending progress updates via send_message. The orchestrator can merge individual commits from a worker's branch at any time — cherry-picking completed pieces while the worker continues on the rest.
This matters for large tasks. Instead of an all-or-nothing merge at the end, work flows upward continuously. If a worker completes 3 out of 5 sub-features and then fails on the 4th, the orchestrator already has the first 3 merged. It can assign the remaining work to a new agent without losing progress.
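The cherry-picking flow is plain git underneath. This throwaway-repo sketch (branch and file names invented) shows an orchestrator merging a worker's finished commit onto main while the worker's branch keeps moving:

```python
import subprocess, tempfile, os
from pathlib import Path

def git(*args, cwd):
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout.strip()

repo = tempfile.mkdtemp()
git("init", "-b", "main", cwd=repo)
git("config", "user.email", "demo@example.com", cwd=repo)
git("config", "user.name", "Demo", cwd=repo)
Path(repo, "README").write_text("base\n")
git("add", ".", cwd=repo); git("commit", "-m", "base", cwd=repo)

# Worker branch: commits feature 1, keeps working on feature 2.
git("checkout", "-b", "task-A", cwd=repo)
Path(repo, "feature1.txt").write_text("done\n")
git("add", ".", cwd=repo); git("commit", "-m", "feature 1", cwd=repo)
feature1 = git("rev-parse", "HEAD", cwd=repo)
Path(repo, "feature2.txt").write_text("wip\n")
git("add", ".", cwd=repo); git("commit", "-m", "feature 2 (wip)", cwd=repo)

# Orchestrator cherry-picks the finished commit right away; the
# unfinished feature 2 commit stays on the worker's branch.
git("checkout", "main", cwd=repo)
git("cherry-pick", feature1, cwd=repo)
print(os.path.exists(os.path.join(repo, "feature1.txt")),
      os.path.exists(os.path.join(repo, "feature2.txt")))  # True False
```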
Failure Handling
When a worker agent fails:
- The worker explains why — not a stack trace, but a description written by an agent that understands the problem.
- The orchestrator decides what to do — resume with new instructions, reset for a fresh approach, or restructure the task tree.
- Context is preserved — a resumed worker keeps its full conversation history. No cold start.
Git Worktree Isolation
When multiple agents work in parallel, they need isolation. If two agents edit the same file at the same time in the same directory, they'll overwrite each other's changes.
Each agent gets its own git worktree — a separate directory linked to the same repository, with its own branch:
```
Repository (shared)
├── .worktrees/
│   ├── task-A-jwt-middleware/   ← Agent 1's worktree (branch A)
│   ├── task-B-login-endpoint/   ← Agent 2's worktree (branch B)
│   └── task-C-auth-tests/       ← Agent 3's worktree (branch C)
└── main working tree/           ← Orchestrator's worktree (main)
```

Agents read and write freely without conflicting with each other. When done, branches merge back — just like developers using feature branches. No custom sync protocol needed — just git.
The base branch is configurable — it's stored on the root task node at project initialization. Worktrees are created from this base branch, and the system prompt is branch-agnostic (no hardcoded main assumption).
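Worktree isolation is standard git. This sketch builds a throwaway repo with two worktrees (names invented) to show that edits in one directory leave the others untouched:

```python
import subprocess, tempfile
from pathlib import Path

def git(*args, cwd):
    subprocess.run(["git", *args], cwd=cwd, check=True, capture_output=True)

repo = Path(tempfile.mkdtemp())
git("init", "-b", "main", cwd=repo)
git("config", "user.email", "demo@example.com", cwd=repo)
git("config", "user.name", "Demo", cwd=repo)
(repo / "app.py").write_text("print('hello')\n")
git("add", ".", cwd=repo); git("commit", "-m", "base", cwd=repo)

# Two agents, two worktrees, two branches -- one shared repository.
(repo / ".worktrees").mkdir()
git("worktree", "add", "-b", "task-A", ".worktrees/task-A", cwd=repo)
git("worktree", "add", "-b", "task-B", ".worktrees/task-B", cwd=repo)

# Each agent edits the same file in its own directory; no interference.
(repo / ".worktrees/task-A/app.py").write_text("print('A')\n")
(repo / ".worktrees/task-B/app.py").write_text("print('B')\n")
print((repo / "app.py").read_text().strip())  # print('hello') -- untouched
```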
Memory System
.mxd/memory.md is a file in your project that persists institutional knowledge across agent sessions.
Agents write to memory during their work — pitfalls discovered, API quirks, architectural decisions, patterns that worked. When branches merge, memory merges through git. Higher-level agents curate the merged result, floating important knowledge to the top and trimming the noise.
If the test suite defines what the software does, memory explains why — why a particular approach was chosen, why an alternative was rejected, what the pitfalls are. Together, tests and memory form the project's complete institutional knowledge: one executable, one narrative.
Why This Matters
- Survives across sessions: Stop and restart — agents load memory on startup.
- Survives compaction: When conversations get compressed, memory stays intact on disk and gets re-read.
- Grows with the project: Every task adds new discoveries. Over time, memory becomes your project's accumulated wisdom.
- Natural selection: Sub-agents write freely. Parent agents curate at merge. By the time knowledge reaches the main branch, it's been filtered through multiple levels — the same way institutional knowledge works in human organizations.
Analogy
Memory is the team wiki that agents actually read and update. Not documentation that goes stale — living knowledge maintained by the agents who use it every day.
Context Compaction
Every LLM has a finite context window — the maximum amount of text it can "see" at once (roughly 200K tokens for Claude). Long sessions fill this up: system prompt, tool definitions, file contents, test output, dozens of back-and-forth cycles.
When the context gets too large, Matrix compresses the conversation into a structured checkpoint containing:
- What the current task is and what's been accomplished
- Key decisions made and why
- What still needs to happen
- Known issues and blockers
The conversation is replaced with this checkpoint, freeing up context for more work. The agent loses the exact wording of early messages but retains the essential knowledge to continue.
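A toy sketch of the mechanism, assuming the budget-then-summarize behavior described above. The token heuristic and the checkpoint text are invented examples:

```python
# Sketch of context compaction (assumed mechanics): when the estimated
# token count exceeds a budget, the transcript collapses into one
# structured checkpoint message and work continues from there.

def estimate_tokens(messages):
    return sum(len(m["content"]) // 4 for m in messages)  # rough heuristic

def compact(messages, budget=50):
    if estimate_tokens(messages) <= budget:
        return messages  # still fits; nothing to do
    checkpoint = (
        "CHECKPOINT\n"
        "Task: refactor payments to Stripe\n"      # current task + progress
        "Decisions: keep PayPal types until v3\n"  # key decisions and why
        "Next: update webhook handlers\n"          # what still needs to happen
        "Blockers: sandbox API key missing\n"      # known issues
    )
    return [{"role": "system", "content": checkpoint}]

history = [{"role": "user", "content": "x" * 300}]
history = compact(history)
print(len(history))  # 1
```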
Analogy
Like a developer writing notes before taking a break — when they come back, they read their notes instead of trying to remember every detail of their morning session.
After compaction, memory.md is re-read from disk, so any updates the agent made during the session are included in the fresh context. For implementation details on the 7-section checkpoint format, see Architecture.
Cross-Project Communication
This is the feature that sets Matrix apart. Most AI coding tools are project-scoped — your API server and your frontend are separate universes. Matrix connects them.
One Daemon, All Projects
A single Matrix daemon (port 7433) manages every project on your machine. Register projects with mxd init, and the daemon tracks them all.
```bash
mxd init /path/to/api-server
mxd init /path/to/web-frontend
mxd init /path/to/shared-library
# All three projects, one daemon, one UI

# Target any project from anywhere with -p
mxd send -p api-server "Add rate limiting to all endpoints"
```

Every project has its own task tree, agents, and memory. But they all live under the same roof.
Talking Across Projects
An agent in one project can message an agent in another:
```
send_message_to_project(projectId, "What's the API endpoint format for user authentication?")
```

The message arrives in the other project's orchestrator queue. If no agent is running, one is automatically launched to respond. This isn't a request-response API — it's a conversation. The receiving agent can ask follow-ups, consult its own memory, or relay the question to its sub-tasks.
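A hypothetical sketch of that routing. The `Daemon` class and its fields are invented for illustration:

```python
# Hypothetical sketch of cross-project routing: the daemon keeps one
# message queue per project and launches an orchestrator on demand.

class Daemon:
    def __init__(self):
        self.projects = {}  # project id -> {"queue": [...], "agent": bool}

    def register(self, project_id):
        self.projects[project_id] = {"queue": [], "agent": False}

    def send_message_to_project(self, project_id, message):
        project = self.projects[project_id]
        project["queue"].append(message)
        if not project["agent"]:      # nobody running? launch one to respond
            project["agent"] = True
        return len(project["queue"])

daemon = Daemon()
daemon.register("api-server")
daemon.send_message_to_project(
    "api-server", "What's the endpoint format for user auth?")
print(daemon.projects["api-server"]["agent"])  # True
```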
What This Enables
- Coordinated releases: An agent in one project triggers dependent updates across downstream projects.
- Knowledge sharing: Instead of maintaining API docs that go stale, the API project's agent IS the documentation. Other projects ask it directly.
- Multi-repo refactoring: Rename a concept across your entire codebase — not just one repository, but every repository that touches it.
- Cross-project awareness: When your shared library deprecates a function, downstream projects' agents start adapting their code in parallel.
The Dashboard
The web UI at localhost:7433 shows all projects side by side. Switch between them seamlessly. Watch agents across your entire codebase working simultaneously — the API server's agents writing endpoints while the frontend's agents build components that consume them.
Fork Context
When a new agent spawns for a sub-task, it normally starts cold — no knowledge of what the parent agent discovered. Context forking solves this.
fork_task_context copies the parent agent's full conversation history into the child's session. The child starts with everything the parent knows: files read, patterns discovered, decisions made. It has its own identity and task, but it doesn't waste time re-reading files the parent already explored.
This is especially powerful combined with compaction — a parent that has worked on a long task can fork its accumulated knowledge to a new child, even after multiple compaction cycles.
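The semantics, as assumed from this page, can be sketched like this: the child deep-copies the parent's history, so later additions on either side don't leak into the other:

```python
import copy, uuid

# Sketch of fork_task_context (assumed semantics): the child session
# starts as a deep copy of the parent's conversation history, but has
# its own identity and its own task.

def fork_task_context(parent, child_task):
    return {
        "id": str(uuid.uuid4()),                      # new identity
        "task": child_task,                           # own assignment
        "history": copy.deepcopy(parent["history"]),  # inherited knowledge
    }

parent = {"id": "p1", "task": "Build auth",
          "history": [{"role": "tool", "content": "read_file: jwt.py"}]}
child = fork_task_context(parent, "JWT middleware")
child["history"].append({"role": "user", "content": "start"})
print(len(parent["history"]), len(child["history"]))  # 1 2
```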
What It Looks Like in Practice
Simple Task
```bash
mxd send "Fix the bug in user authentication"
# Agent works autonomously — analyzes code, makes changes, runs tests, commits
```

Complex Multi-Task
```bash
mxd send "Refactor the payment module to use Stripe instead of PayPal"
# The orchestrator will:
# 1. Analyze the codebase
# 2. Create sub-tasks (remove PayPal, add Stripe types, implement API, update tests)
# 3. Spawn worker agents in parallel on separate git worktrees
# 4. Merge results and run the full test suite
```

Interactive Collaboration
```bash
mxd send "Build the new dashboard page"
# ... agent starts working ...
mxd send "Use Recharts instead of Chart.js, and make all charts responsive"
# Agent receives the message and adjusts its approach
```

Cross-Project Coordination
```bash
# In the API library project — make breaking changes
mxd send "Release v3: rename getUserById to getUser, change to camelCase"

# In each frontend project — migrate
mxd send -p web-frontend "Migrate from api-lib v2 to v3 — ask the api-lib project for details"

# Frontend agents use send_message_to_project to get migration details
# from the agent that actually made the changes
```

For how these features are built internally, see Architecture. For why Matrix takes this approach, see Why Matrix. Ready to set up? See Getting Started.