The self-improving company

There is an idea going around right now that I cannot stop thinking about: the self-improving company. Tom Blomfield laid out the clearest version of it in a Y Combinator talk, and AI Jason has a good, practical companion to it on building proactive, self-improving agents. I want to write down what landed for me, because it is the direction I am already building FlatNine in, and putting it into words helped me see what I was missing.

The short version: most companies are adopting AI by bolting copilots onto the org chart. That is the wrong mental model. The interesting move is to build the company itself as a set of loops that get better on their own.

Copilots are the wrong goal

The standard AI plan inside a company is "make our engineers 20% more productive" or "add a copilot to the existing workflow." Blomfield's point, and I think he is right, is that this is aiming far too low. You are taking the existing structure as fixed and sanding 20% off the edges.

The bigger prize is not productivity, it is capability. Not "do the same work a bit faster," but "build an organization that can do things the old shape could not do at all." Once you frame it that way, the copilot stops being the goal. The loop becomes the goal.

What the loop actually is

A self-improving loop is not magic. It is five plain parts wired together:

Sensors. Everything the company does generates signal: support tickets, emails, cancellations, code changes, product telemetry. You capture it.
A policy layer. Clear rules for what the system is allowed to do on its own versus what needs a human to sign off.
Tools. The deterministic things the system can actually call: query the database, send the email, open the pull request.
A quality gate. Evals, safety checks, and human review on the high-stakes stuff, so the loop cannot quietly do damage.
A learning mechanism. When something fails, the system looks at why, and feeds the fix back in so the same failure does not happen twice.

The magic is only in closing it. A loop that observes, acts, notices what went wrong, and updates itself is a fundamentally different object from a chatbot that waits for you to prompt it. AI Jason frames the same shift as reactive versus proactive: most agents sit there until you poke them, while a proactive agent monitors, decides, and acts, and reviews its own logs on a schedule even when nothing has broken.
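To make the five parts concrete, here is a minimal Python sketch of one pass through such a loop. It is an illustration under stated assumptions, not how Blomfield, AI Jason, or FlatNine implement it: the names Signal, Loop, step, and lessons are all hypothetical, and the "learning mechanism" here only records a note about each failure. A real system would feed those notes back into its prompts, tools, or policy.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Signal:
    """One captured event from a sensor (hypothetical shape)."""
    kind: str      # e.g. "support_ticket" or "cancellation"
    payload: str


@dataclass
class Loop:
    tools: dict[str, Callable[[str], str]]   # tools, keyed by the signal kind they handle
    autonomous: set[str]                     # policy: kinds allowed to run without sign-off
    gate: Callable[[str], bool]              # quality gate applied to every tool result
    lessons: list[str] = field(default_factory=list)  # learning: notes on what failed

    def step(self, signal: Signal) -> str:
        # Policy layer: anything not explicitly allowed goes to a human.
        if signal.kind not in self.autonomous:
            return "needs_human"
        tool = self.tools.get(signal.kind)
        if tool is None:
            # A capability gap is itself a signal worth recording.
            self.lessons.append(f"missing tool for {signal.kind}")
            return "failed"
        result = tool(signal.payload)
        # Quality gate: a rejected result is recorded, never shipped.
        if not self.gate(result):
            self.lessons.append(f"gate rejected {signal.kind} output")
            return "rejected"
        return "done"


loop = Loop(
    tools={"support_ticket": lambda text: f"Draft reply to: {text}"},
    autonomous={"support_ticket", "cancellation"},
    gate=lambda result: len(result) < 200,
)
print(loop.step(Signal("support_ticket", "password reset")))    # done
print(loop.step(Signal("refund", "order 123")))                 # needs_human
print(loop.step(Signal("cancellation", "plan too expensive")))  # failed
print(loop.lessons)  # ['missing tool for cancellation']
```

The point of the sketch is the shape: every path either finishes, escalates to a person, or leaves a note behind, so a failure never disappears silently.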

The prerequisite nobody wants to do: make everything legible

Here is the part that is unglamorous and also the whole game. None of this works if the company's knowledge lives in people's heads and ephemeral conversations.

Blomfield's line for it is blunt: if it is recorded, it happened to the AI. If it is not recorded, as far as the system is concerned, it never happened. So you record the office hours, the Slack threads, the decisions, the outcomes. Then you compress that firehose into something usable: he described taking two thousand hours of recorded office hours and regenerating YC's user manual from them as a living document.

I find this clarifying because it reframes "documentation" from a chore into the actual substrate the company runs on. The software you generate on top is disposable. Dashboards, scripts, internal tools: throw them away, regenerate them tomorrow. The durable asset is the data and the context underneath. Tools are cheap now. Context is the moat.

Burn tokens, not headcount

The economic version of all this is the phrase that stuck with me: burn tokens, not headcount. If your API bill does not make you a little uncomfortable, you are probably under-using the most leveraged tool you have ever had access to.

This is the part that maps cleanly onto how I already run things. I am mostly a solo operator. I do not have the option of throwing ten people at a problem, and I have stopped wanting it. The question I ask now is not "who do I hire for this," it is "what loop do I build so this runs without me."

The part I am already living

This stopped being theory for me a while ago. The clearest example: my agents write their own tools.

FlatNine runs on an ensemble of agents (Carla, my AI chief of staff, and a handful of others with their own jobs and their own memory). A while back one of them hit a wall: it was asked to summarize a YouTube video and had no way to pull a transcript. Instead of failing and waiting for me, it wrote the tool. There is a file in my system whose header literally says it was auto-generated by an agent, with the original request that triggered it recorded right there in the comments. It has been part of the toolkit ever since.

That is the loop, in miniature. A gap appeared, the system noticed, it built the missing capability, and the next time the task came around it just worked. No human in the middle. Multiply that across a year and you do not have a company that does the same things faster. You have a company that can do things this week that it could not do last week, and that nobody explicitly built.

The FlatNine Intelligence feeds are the same instinct pointed outward: standing sensors on the news, on new product launches, on what builders are starring, all feeding context back into the system continuously rather than me going to look.
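The tool-writing story above can be sketched as a toolkit that fills its own gaps and records where each tool came from. This is a hedged illustration, not FlatNine's actual code: Tool, Toolkit, build_tool, and the field names are hypothetical, and build_tool is a plain function standing in for the step where an agent writes and validates new code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]
    generated_by: str                       # "human" or "agent" (hypothetical labels)
    original_request: Optional[str] = None  # the request that triggered creation


class Toolkit:
    def __init__(self, build_tool: Callable[[str], Callable[[str], str]]):
        # build_tool stands in for an agent generating a tool on demand.
        self.tools: dict[str, Tool] = {}
        self._build_tool = build_tool

    def call(self, name: str, arg: str, request: str) -> str:
        if name not in self.tools:
            # Gap detected: build the missing capability once and keep
            # the provenance next to it, like the comment in a file header.
            self.tools[name] = Tool(
                name=name,
                run=self._build_tool(name),
                generated_by="agent",
                original_request=request,
            )
        return self.tools[name].run(arg)
```

The first call for an unknown tool pays the cost of building it. Every later call finds it already in the toolkit, which is the "it just worked the next time" behavior described above. In practice the generated tool should pass a quality gate before it is registered.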

Why this rhymes with orchestration

I wrote a while ago about lossless orchestration. The argument there was that most multi-agent setups just copy the corporate org chart (a CEO agent delegating to manager agents delegating to workers), and that this hierarchy is a human crutch we do not actually need when memory and context can be shared losslessly.

The self-improving company is the same idea, seen from a higher altitude. If you stop recreating the org chart in software, what do you build instead? You build loops. The org chart was about routing scarce human attention. Once attention is no longer the bottleneck, the thing worth designing is not who reports to whom, it is how the system senses, acts, checks itself, and learns.

Where the humans go

This is not a story about no people. It is a story about people moving to the edge.

In a company shaped like this, the humans are not in the middle relaying information, which is most of what middle management actually does. They are at the boundary: the novel situations the loop has never seen, the high-stakes calls, the ethical judgment, the taste. Blomfield's framing is individual contributors who bring working prototypes instead of ideas, and a directly responsible individual for every outcome so accountability never gets diffuse. I would add one thing: someone still has to decide what is worth building a loop around at all. That part is not going anywhere.

The honest caveats

I do not want to oversell this. A loop that learns can also compound mistakes if the quality gate is weak, so the boring safety work is not optional. Legibility is genuinely hard and genuinely tedious, and it is the step everyone wants to skip. And most of what gets called a self-improving company today, mine included, is a handful of real loops surrounded by a lot of ordinary work that has not been turned into a loop yet.

But the direction is right, and it is the most exciting thing about building a company right now. You are no longer just building the product. You are building the thing that builds the product, and then teaching it to improve itself.

That is what I am pointing FlatNine at.

This is the kind of thing I think about while building FlatNine. The talks that prompted this: Tom Blomfield's "How to build a self-improving company with AI" and AI Jason's "How to build proactive agents and a self-improving company." I post the builds @mikerubini.
