How We Built The Phaser Game Agent With Claude Managed Agents And Superserve

And yet, if you watch people talk to an AI about making games — and we have, a lot — that's exactly how they ask. "I want my character to shoot out a grappling hook and swing from the platform." Not "I'd like a distance joint with a raycast attachment point." A grappling hook. A thing they can picture.

Every game framework ever written, including the one we've spent over a decade building, was designed for humans — people who read docs, build a mental model over weeks, and learn to translate "grappling hook" into joints and raycasts themselves. That translation step is exactly what large language models are bad at when you hand them a sprawling, human-shaped API and wish them luck.

So we did something that felt slightly unhinged at the time: we took everything we'd learned building Phaser and rebuilt it from scratch, for an audience that isn't human.

That framework is Phaser AE — the Arcade Engine — and this post is about how it, a content pipeline, a fleet of Claude managed agents, and Superserve sandboxes fit together into the Phaser Game Agent: a thing you can ask for a game, and get one.

A framework shaped like the question

The core design decision behind Phaser AE is simple to state: the API maps to how an agent thinks about what it's doing, not how a renderer thinks about what it's drawing. An agent reasons in verbs and intentions — the ship shoots, the pickup bobs, the boss flies in, fights, dies — so that's the surface we built. A parallax starfield is new Starfield(this). A sprite flashing when hit is sprite.flash(). An enemy flying a loop-the-loop is a Path made of plain data. The genuinely hard, bug-prone stuff — swept collision, pooled sprites, input handling — lives inside the engine, behind a surface small enough for an agent to hold in its head all at once.

That smallness matters more than any feature. A big API doesn't just slow an agent down, it degrades it: too many functions, too little signal, hallucinated lookalikes. Small surface, one consistent idiom, and documentation written as part of the API — because the docs are what the model actually reads.

"Make me a game like Gradius"

The engine got us a long way. Then came the more interesting problem.

When people ask for a game, they overwhelmingly anchor on games they know: make me a Flappy Bird, a Breakout, a Gradius. We've watched that pattern for years — it's the default way humans communicate game ideas. The obvious-but-wrong answer is to teach the framework those games. Do that and the API footprint explodes, and you're right back to the context problem that kills agents on every big framework.

So we drew a hard line: the engine never learns games. The library does.

When we want the system to know a game, we write a game manifest — a structural decomposition of what the game is. Gradius isn't 10,000 lines of shooter code; it's a free-flying ship, scrolling lethal terrain, enemy waves on authored flight paths, a banked power-up meter, ground cannons, phased bosses, and a campaign script tying them together. Each piece is a block, named by its capabilities, never by the game it came from — there's no "Gradius ship" in our database, just actor.avatar.flight, ready for any horizontal shooter anyone ever asks for.

Blocks carry real, known-correct code, annotated with the gotchas it gets right — the kind you only find by building games, like a pooled bullet that must reset every field on launch or a recycled laser is still piercing. They're deliberately built as highly configurable leaf modules — data tables in, themed behaviour out — so an agent searching the library by capability ("the player shoots spreads of bullets") gets working components delivered straight into its sandbox as files, and its job collapses to the part it's genuinely good at: writing the glue that makes this game this game.

The internal loop: agents building the library

The library isn't hand-curated by us typing components into a database like monks. It's grown by a loop where Claude managed agents do most of the work and a human does the one thing humans are irreplaceable for:

An agent authors the manifest from a brief and reference screenshots, searching the library by capability so it reuses blocks instead of inventing duplicates. When we ran this for Gradius — completely cold, no hints — it found all five of the shooter-stack blocks we'd built and composed them correctly.
Agents build the missing components, each developed inside a playable test rig with headless checks gating the mechanics.
An agent builds the whole game from the manifest plus components — the honest test of both.
A human plays it. Automated checks can prove a meter wraps correctly; they can't tell you the landscape scrolls in ugly chunky steps while the sprites glide past it. (A real finding, caught in minutes of play, now permanently encoded in that block's gotcha list.)
We harvest — the reusable pieces are scrubbed of anything game-specific and published to the database, where every future game gets them for free.

The compounding is the reward. Our first game needed everything written from scratch; by Gradius, five of its nine major systems came straight off the shelf, and the build fed three improvements back into the blocks. The knowledge accumulates in exactly one place, designed for accumulation. The framework never grows.

The external loop: Superserve, and where the user fits

The part users see: they talk to the Phaser Game Agent, ask for a game, and a few minutes later they're playing it.

Every conversation runs inside a Superserve sandbox — an isolated VM baked from a template that already contains everything: Phaser AE prebuilt, the project scaffold ready, the toolchain installed. The agent wakes up inside a project that already compiles and does zero setup; every token goes on the game. Isolation buys safety in both directions — the agent can run anything it likes without touching anything that matters, and the sandboxes are deliberately blinkered: a build sandbox can read the content library but can never write to it. Publishing back is a separate, human-gated path.

Inside that sandbox the agent pulls components, generates all the art procedurally — sprites and tiles drawn from canvas primitives, fantastically and deliberately retro — synthesises its sound effects, commissions a chiptune soundtrack, wires it all together, and runs the game's acceptance tests. When it's green, the game publishes, and there's a URL you can play.

The bet

If there's one idea underneath all of this, it's the line we kept redrawing every time it would've been convenient to cheat: don't expand the framework to infinity — expand the components.

Twenty years of building a framework for people taught us what game code needs to get right. Almost all of that knowledge survives the jump to an audience of machines. You just have to stop organising it like a reference manual, and start organising it like an answer to the question someone's about to ask.

Even when that question is about grappling hooks.

The Phaser Game Agent is in active beta and due for release in June 2026. Contact us if you'd like to learn more.

Filter News By:

A framework shaped like the question

"Make me a game like Gradius"

The internal loop: agents building the library

The external loop: Superserve, and where the user fits

The bet