The Agent Convergence, Part 1: Everyone's Shipping Agents — That's Not the Interesting Part

This is Part 1 of "The Agent Convergence" — a three-part series on why every company shipping agents simultaneously is less interesting than the architectural question underneath: what actually makes an agent useful for your specific work?

Jason Calacanis posted something this week that caught my attention. He noted that Notion, Google, and Microsoft all dropped respectable agents in the same week — and said he's going to focus on cross-platform, open source instead. His take: the big prize for startups is "owning your data, the front end, your corporate memory and refining proprietary skills."

He's right about the convergence. He's right about where the value lives. And I think he's asking exactly the right question — just from the wrong end.

The Week Everything Converged

Let me set the scene, because the timing matters.

March 9, 2026: Satya Nadella announces Microsoft Copilot Cowork. You hand off a task, it turns your request into a plan, executes it across your apps and files, grounded in your work data. Anthropic's Claude is the underlying model. Enterprise pricing at $99/user on the E7 tier.

Same week: Notion ships agents. Google ships agents. Notion's work within Notion. Google's work within Google Workspace. Microsoft's work within M365. Each one does roughly the same thing — takes a task, reasons about it, executes across connected tools, returns a result.

This is not a coincidence. The industry has converged on a single insight: the future of AI at work isn't answering questions. It's doing work. Not "what should I write in this email?" but "write the email, check my calendar, prep the meeting brief, and update the project tracker."

The form is settled. Agents are the work abstraction. The debate about whether AI should be a chatbot or an agent is over. Everyone chose agent.

That's Not the Interesting Part

Here's what I think gets missed in the excitement: convergence on the form tells you nothing about convergence on the architecture. And architecture is where the actual question lives.

Every one of these agent products has a fundamentally different answer to the same set of questions: How does the agent know about your work? How does it remember what happened last time? How does it decide what to do without being asked? How does it get better over time?

Microsoft's answer: Work IQ — their graph of your M365 data. The agent reads your files, emails, and calendar through Microsoft Graph, reasons about them, and acts. It's powerful if you live entirely in M365. It's a walled garden if you don't.

Notion's answer: your Notion workspace. The agent knows what's in Notion. That's it. If the relevant context lives in Slack or Gmail or Google Drive, the agent is blind to it.

Google's answer: your Google Workspace data. Same pattern, same limitation.

Jason's answer with OpenClaw: cross-platform, open source, local model. An always-on persistent agent with identity and memory. The bet is that the model runs locally, the data stays yours, and the connections span your whole stack.

Each of these is a bet on a different architecture. And I think the architecture question is more important than the model question, the UI question, or the pricing question.

The AC/DC Question

My take — and I'll be upfront that I'm building in this space, so I'm biased — is that AI is electricity. That's not a metaphor. It's a structural observation.

When electricity was new, the debate was about AC vs. DC — alternating current vs. direct current. Edison backed DC. Westinghouse and Tesla backed AC. Both worked. Both powered things. The question wasn't "will electricity win?" — that was obvious. The question was which standard would emerge for distributing it.

I think agents are in the same moment. The debate isn't "will agents replace chatbots for work?" — that's settled. The debate is about the distribution architecture. How does the power get from the model to your actual work?

Microsoft is building a closed grid: everything flows through M365. Google is building a closed grid: everything flows through Workspace. Jason is building an open grid: cross-platform, open source, local model.

But there's a question underneath all of those: what makes the electricity actually useful at the point of consumption? It's not the generator. It's not the wires. It's the transformer — the thing that takes raw power and converts it to the right voltage for the right device.

In the agent world, that transformer is context. Not generic context — not "here's your email inbox." Specific, accumulated, per-task context that makes the agent's output actually relevant to your specific work, your specific clients, your specific decisions.

What Jason Gets Right

Jason's instinct about where the value lives is sharp: "owning your data, the front end, your corporate memory and refining proprietary skills." Let me translate that into the architectural question:

Owning your data: The perception layer. How agents sense the external world. If your agent can only see one platform's data, your agent is only as smart as that one platform allows. Cross-platform perception isn't a feature — it's a prerequisite for agents that actually understand your work.

Corporate memory: Persistence. Not session memory (what happened in this chat) but operational memory (what happened across every execution, every interaction, every correction). Memory that compounds.

Proprietary skills: Per-agent specialization. Not one generic agent that does everything — specialized agents that each do one thing well, with their own instructions, their own memory, their own learned preferences.

This is the real question the convergence raises. Everyone has agents now. The differentiation isn't the agent — it's the infrastructure underneath the agent.

The Future Is Still Open

Here's why I think this matters for anyone building or buying in this space right now: the agent form factor is settled, but the architecture is not. We're in 1892, not 1920. AC and DC are both live standards. The winner hasn't been determined.

That means the interesting question for any company evaluating agents isn't "which agent should we use?" It's "which architecture is making the right bets about persistence, memory, cross-platform perception, and specialization?"

The model will change. GPT-5, Claude 4, whatever comes next — the raw intelligence keeps improving and the cost keeps dropping. The UI will change. Today it's a chat interface; tomorrow it might be ambient. The pricing will change.

But the architecture? The decisions about how agents persist, how they remember, how they sense the world, how they specialize? Those compound. And they compound in different directions depending on which bets you make.

In Part 2, I'll get into the specific architectural question I think matters most: why the difference between a session-based agent and a persistent agent isn't an incremental improvement — it's a category difference. And why most of the agents shipping this week are optimizing for the wrong thing.

This is Part 1 of "The Agent Convergence." Part 2: The Sleep-Wake Architecture explores why persistent agents that sleep between executions will outperform always-on or session-based alternatives. Part 3: Knowledge Accumulation as Competitive Moat makes the business case for why the 50th run should be meaningfully better than the 1st.

The Agent Convergence, Part 1: Everyone's Shipping Agents — That's Not the Interesting Part

At a Glance

The Week Everything Converged

That's Not the Interesting Part

The AC/DC Question

What Jason Gets Right

The Future Is Still Open

Series Navigation

Related Reading