Why Every AI Agent Is Quietly Becoming a File System, Part 1: The Convergence Nobody's Talking About

This is Part 1 of "Why Every AI Agent Is Becoming a File System" — a two-part series on the oldest abstraction in computing quietly winning the AI infrastructure race, and what that means for the future of agent products.

I've been building an agent platform for the past year. During that time I've studied how dozens of teams — from Anthropic to solo open-source developers — solve the same fundamental problem: how does an AI agent know who it is, what it's learned, and what it's supposed to do?

The answers vary wildly on the surface. Different databases, different architectures, different product philosophies. But underneath all of it, I keep seeing the same pattern emerge. And it's one that nobody seems to be talking about.

Everyone is converging on files.

The Pattern That Keeps Showing Up

Claude Code — Anthropic's developer agent — stores project context in a file called CLAUDE.md. It sits in the root of your project. It's plain text, human-readable, and it tells the agent everything it needs to know about how to work in that codebase. Conventions, patterns, what to avoid, what matters. When you start a session, the agent reads the file. When it learns something new, the file gets updated.

OpenClaw — Jason Calacanis's open-source agent framework — does the same thing with different file names. SOUL.md holds the agent's identity and behavioral directives. MEMORY.md holds what the agent has learned over time. INSTRUCTIONS.md holds task-specific guidance. Three files in a directory. That's the agent.

Turso — the database company — shipped something called AgentFS. It's a virtual filesystem that gives agents persistent, file-based storage. The pitch is explicit: agents think in terms of reading and writing files, not querying databases.

Google's Agent-to-Agent protocol describes each agent as a JSON card — essentially a file that declares what the agent can do, how to talk to it, and what it knows about. The entire identity of an agent, expressed as a single structured document.

Anthropic's Model Context Protocol treats external data sources as resources — things an agent can read, like files in a filesystem. The abstraction isn't "query this API." It's "read this resource."

These are not adjacent teams copying each other. These are independent groups — big companies, small startups, solo developers — arriving at the same answer from completely different starting points.

Why This Is Surprising

If you asked most people in AI infrastructure what the state management layer for agents should look like, they'd say vector databases. Or knowledge graphs. Or structured SQL schemas with embeddings. The conversation for the past two years has been dominated by sophisticated data infrastructure — RAG pipelines, embedding models, retrieval strategies, chunking algorithms.

And none of that is wrong. You need databases. You need embeddings. You need retrieval. But those are the infrastructure layer — the engine under the hood. What's converging is the interface layer. The abstraction that agents actually interact with.

A recent survey by Arize AI found that 80% of agent failures in production are state management problems, not prompt quality problems. The agents are smart enough. They lose track of what they know. They forget context between runs. They can't find the thing they learned yesterday. The failure mode isn't intelligence — it's memory architecture.

And the solution that keeps emerging isn't a better database. It's a better abstraction over the database. An abstraction that looks exactly like what we've been using for 55 years: files in directories.

What Makes Files Win

This convergence isn't nostalgic. Nobody is choosing files because they miss the 1970s. Files keep winning because they have properties that agents desperately need — properties that more sophisticated-sounding alternatives don't provide as cleanly.

Files are human-readable. You can open CLAUDE.md in any text editor and see exactly what the agent knows. You can't do that with a vector embedding. When an agent makes a mistake because its context is wrong, you want to be able to read the context, understand the problem, and fix it. Files make that trivial. Databases make it an investigation.

Directories are natural scoping. An agent's workspace is a directory. Its memory is a subdirectory. Its knowledge about Slack is in one folder, its knowledge about email is in another. This isn't an imposed ontology — it's the one every developer already understands. When you need to organize information with clear boundaries and hierarchies, the directory tree is the oldest and most intuitive tool we have.

Files compose. You can version them with git. You can diff them. You can merge them. You can copy an agent's entire identity by copying a directory. You can fork an agent by forking its workspace. None of this requires special tooling. The entire ecosystem of developer tools — built over half a century — works out of the box.

Files are tool-agnostic. Every programming language, every framework, every operating system can read a file. There's no SDK to install, no client library to maintain, no API versioning to manage. The interface is read and write. It's the lowest common denominator in the best possible sense.

The Three-File Pattern

Across the implementations I've studied, a remarkably consistent structure keeps appearing. I've started calling it the Three-File Pattern, because that's what it comes down to regardless of what different teams name the files.

File one: Identity. Who is the agent? What are its behavioral constraints? What persona does it adopt? In Claude Code, this is CLAUDE.md. In OpenClaw, it's SOUL.md. In our own system, it's AGENT.md. The name changes. The function is identical — a plain-text document that defines what the agent is.

File two: Memory. What has the agent learned? What does it remember from previous runs? What patterns has it observed? This is the accumulated intelligence — the thing that makes an agent that's been running for three months more useful than one that started today.

File three: Task context. What is the agent working on right now? What's the current objective? What inputs are relevant to this specific run? This is the ephemeral layer — it changes every time the agent executes.

Identity. Memory. Task. Three files. That's the minimum viable agent state, and it keeps appearing independently across unrelated projects. When multiple teams converge on the same minimal abstraction without coordination, that's usually a signal that the abstraction is fundamental — not just convenient.

What This Isn't

I want to be clear about what I'm not arguing. I'm not saying agents should store everything in flat text files on a local disk. The infrastructure underneath can be anything — Postgres, SQLite, S3, a vector store with full-text search. The convergence is on the interface, not the infrastructure.

Think of it this way: your laptop presents everything as files and folders. Underneath, it's a complex filesystem with journaling, caching, block allocation, and indexing. You never think about any of that. You think about files and folders because that's the abstraction that works for humans.

The same pattern is emerging for agents. The agent thinks in files and directories. The infrastructure underneath can be as sophisticated as it needs to be. But the interface — the thing the agent reads and writes, the thing the developer inspects and debugs, the thing that carries identity and memory — is a file.

In Part 2: The Agent Operating System, I extend this pattern from individual agents to shared knowledge and explain why the result looks less like a database and more like an operating system.

Why Every AI Agent Is Quietly Becoming a File System, Part 1: The Convergence Nobody's Talking About

At a Glance

The Pattern That Keeps Showing Up

Why This Is Surprising

What Makes Files Win

The Three-File Pattern

What This Isn't

Series Navigation

Related Reading