Everyone is talking about how agents can build anything now. And they mostly can. But an agent is only ever as good as the surface you point it at, and most platforms give it a terrible one: a dashboard full of buttons it can’t press, a “coming soon” page, a feature locked behind a UI that no agent will ever click through.
Neon is different, and that difference is the whole reason I keep reaching for it when I’m building with Claude Code.
Neon is API-first. In my experience, new capabilities show up in their API long before they’re a nice button somewhere, and a lot of things never become a button at all. The console you click around in is just one client of that API. Which means: if a primitive exists, your agent can reach it, read the docs for it, and compose whatever it needs out of it, before anyone ships you the polished version.
This post is me proving that to myself, out loud. I had an itch Neon doesn’t officially scratch, so instead of waiting, I pointed Claude Code at the API and built the missing feature. Twice in this post there’s a little interactive thing you can play with.. they’re not screenshots, they’re running the actual logic in your browser.
What “API-first” actually means for an agent
When I say a platform is good for agentic coding, I mean something very specific: the thing my agent wants to do is expressible as an API call, and the docs to do it are readable.
That’s it. That’s the whole bar, and it’s shocking how many tools fail it. An agent can’t “open settings and toggle the thing.” It can’t watch a launch video. It reads docs and it makes requests. So the best platform for an agent is the one whose entire surface is an API, with reference docs that read like a contract.
Neon is exactly that. Projects, branches, compute endpoints, autoscaling bounds, consumption metrics, quotas, billing.. all of it sits behind api-docs.neon.tech as plain REST. I leaned on this hard in Agentic Workflows: Branched Environments, where the entire preview-environment trick is just the agent calling the branch API on every pull request. No clicking. No waiting for a feature. Just primitives, composed.
So when I hit a wall recently, my instinct wasn’t “I wish Neon had this.” It was “the primitive is probably already in the API, let me go build the wall-climbing tool myself.”
How serverless compute actually bills you
Before the wall, I need you to understand why Neon is even billed the way it is, because it’s the thing that makes the rest of this post matter.
Most databases you’ve rented are billed flat. You pick a box (some vCPUs, some RAM, an RDS instance, a droplet, whatever), and you pay for that box every hour of every day whether you touch it or not. You provision for your peak, because you have to survive your worst Tuesday, and then you pay for that peak 24/7 even though your worst Tuesday happens for about twenty minutes a month. The other 729 hours, you’re paying for an idle machine to sit there and be ready. It’s a hotel room you booked for a year so you’d have somewhere to sleep three nights.
Serverless compute flips that. Neon doesn’t sell you a box, it sells you what you actually consume. Compute is metered in CU-hours (a CU, a “compute unit”, is about 1 vCPU and 4 GB of RAM), it autoscales up when load arrives and back down when it leaves, and crucially it can scale to zero when nothing is connected. Storage and egress are metered separately, on their own dials. You’re billed for the area under the curve, not for the height of the tallest spike.
Here’s the part people miss: done right, this is cheaper than flat, often dramatically. Think about what your databases actually do. A staging environment that gets used three hours a workday. A side project nobody visits at 4am. A preview branch that lives for the length of a pull request and then dies. A little agent experiment that runs in bursts. Under flat billing you pay for all of those as if they were busy 24/7. Under serverless they cost roughly what you use, and the idle time trends toward free. The same workload that needs an always-on box sized for peak might bill a fraction of that when only the active hours count.
But (and you can feel the “but” coming) the exact thing that makes it cheap makes it dangerous. Flat billing has a ceiling built in for free: the box can only do so much, and you already paid for it, so the worst case is “it gets slow.” Usage billing has no ceiling. A runaway query, a retry loop, an agent that decides to reindex everything, a traffic spike, a bit of egress someone’s scraping.. none of that hits a wall. It just scales up, serves the load, and bills you for it. The curve that trends toward free when you’re idle trends toward terrifying when you’re not.
So to safely capture the “cheaper than flat” win, you want two things: a floor (scale to zero, so idle is free) and a ceiling (a hard cap, so a bad night can’t bankrupt you). Neon gives you the floor for free. The ceiling is where my wall is.
The itch: a “Coming soon” I didn’t want to wait for
I run a lot of autonomous-ish things on Neon. If you’ve read the Clopus posts you know I’m comfortable handing a Claude Code instance the keys and letting it run. And given everything I just said about unbounded usage billing, you can see exactly what keeps me up at night: it isn’t the agent doing something weird, it’s the bill. An autoscaling Postgres that an agent (or a traffic spike, or a runaway loop) can push to 8 CU while I’m asleep is a genuinely scary object to own.
So I went looking for the ceiling: a hard spending limit. And Neon does have a spending limit. You set a monthly dollar number, and when you cross 80% and then 100% of it, your org admins get an email. Genuinely useful.. but it’s the soft version. It never actually stops anything. Your database keeps happily burning money while you read the email.
Here’s the part I want to be fair about, because it’s the whole mood of this post: the hard version is clearly on the way. Open the spending-limit dialog and there’s a “Suspend projects” toggle sitting right there.. wearing a “Coming soon” badge. They’re building exactly the feature I want.
I just didn’t want to wait for the changelog.
That’s the honest framing here. This isn’t “Neon is missing something, boo.” The feature is coming. But I wanted it today, running tonight, before my next autonomous experiment, and an alert is not a cap. I wanted a number that, when crossed, turns the compute off. And because Neon is API-first, “coming soon” for the dashboard doesn’t mean “coming soon” for me. So I went and looked at what the API already exposes.
The levers the API hands you
This is the part that made me grin. The console gives you exactly one spend control, and it’s the toothless one. The API gives you the levers to build a real one. Click through these, it’s the actual difference that made the tool possible:
Read that left-to-right and the story tells itself. The console: an alert. The API: a way to read true spend, a way to set a per-project cap that suspends compute, and a way to flip an endpoint off the instant you decide to. The feature didn’t exist, but every ingredient for it did. That gap, between “the product doesn’t have it” and “the API absolutely lets you build it,” is the gap agentic coding lives in.
So I didn’t file a feature request. I opened Claude Code.
Building the thing
The opening prompt was basically just me describing the itch and telling it where to read:
Neon’s native spending limit is alert-only, it emails you but never stops compute. I want a small tool that reads a project’s real usage from the Neon API, converts it to dollars, and actually suspends compute when I cross a monthly budget I set. Read the API reference first (api-docs.neon.tech), figure out which endpoints expose usage and which ones can stop a project, then build it.
It read the docs, found the consumption endpoints and the quota field, and had a working first pass fast. Convert usage to dollars, set a compute_time_seconds quota equal to the dollars-of-compute I had left, let Neon’s own metering pull the trigger. Clean.
Then I actually stress-tested it, and learned the lesson the hard way.
The per-project usage field you get from GET /projects/{id} (the obvious, no-org-id-required one) lags, and can freeze badly. In live testing it stuck at one value while real usage was around 70x higher. And because the project quota is gated by that same frozen meter, a quota I’d set 5x under what I was actually spending simply never tripped. The “cap” did nothing. I was over budget and the safety net just sat there.
That’s the kind of thing you only find by running it for real, and it’s exactly where API-first pays off a second time. The fix wasn’t to wait for Neon to make the field accurate. It was to route around it, because the API already had better roads:
The per-project usage field is freezing, it’s reading ~70x under real usage, so the quota never suspends. Switch the usage source to the consumption_history/v2 API (it needs an org_id but it’s invoice-accurate), and stop trusting the quota as the trigger.. disable the endpoint directly the moment accurate spend crosses the budget.
Two API calls I hadn’t even needed in v1 turned a broken backstop into a real cap:
consumption_history/v2for truth: billing-grade usage that matches the invoice, not the field that lies.PATCH endpoint disabled=truefor the stop: meter-independent, immediate, no waiting for Neon’s metering to catch up.
npx neon-budget check # read-only: what am I spending vs budget?
npx neon-budget enforce # apply the cap (quota + endpoint disable)
npx neon-budget watch -i 15m # run it on a loop
The whole thing is a few hundred lines, it’s on GitHub, and I want to be clear about why it exists: not because I’m smart, but because Neon’s surface let an agent and me build, in an afternoon, a capability the product doesn’t ship. That’s leverage.
Play with it
Here’s the budget logic, running live in your browser. Drag the sliders: set your plan, your average compute, how many hours it’s actually active (this is the serverless dial from earlier.. push it below 730 and watch what you save versus an always-on box), storage, egress. Then pick a monthly budget and a policy. It computes the real Neon cost and shows you exactly what neon-budget would do: just alert, throttle the autoscaling ceiling down in tiers, or hard-stop and disable the endpoint.
This isn’t a mockup, it’s the same cost math and the same decision function that runs in the tool, ported straight into the page. Push the compute up, watch the hard-stop trip. Drop egress under 500 GB and watch that line go to zero (it’s included). The dollars to compute_time_seconds quota it prints is the literal value the tool PATCHes onto your project.
Why this is the real reason Neon wins for agents
Step back from spending caps for a second, because the tool is just the example.
The reason Neon is so good for agentic coding isn’t any single feature. It’s that the feature you want is almost always already expressible, because they build API-first. When the primitive lands in the API early, the gap between “Neon doesn’t do that yet” and “I have it running” is one good prompt and an afternoon, not a roadmap.
That changes how you work. You stop treating the product’s feature list as your ceiling. The branch API gave me preview environments before I “had” preview environments. The consumption and endpoint APIs gave me a hard spending cap that, strictly speaking, the product still doesn’t offer. None of that required Neon to ship me anything new. It required them to have been honest enough to expose the platform as an API, and an agent willing to read the docs.
That’s the bet I’m making with agentic coding in general: the winners will be the platforms that are complete at the API layer, because that’s the only layer an agent can actually build on. Neon is the most complete one I’ve found. So I keep building things it doesn’t have yet, and so far, it keeps already having the pieces.
Closing
Don’t wait for the button. If the primitive is in the API, it’s already yours: point your agent at the docs and go build the feature. With Neon, more often than not, the hard part is already done for you, sitting quietly in the API reference while everyone else waits for the changelog.
Thank you for reading!