The most interesting part of this isn't the missing validation - that's table stakes for a launch-week bug. It's that the AI agent, when told "go harder," autonomously enumerated attack categories like a junior pentester. It went from SQL injection to prototype pollution to homoglyph attacks to OIDC endpoint squatting without being prompted on any specific technique. It just... knew the playbook.
Right, and it didn't just list them - it executed them. 59 live deployments in a few minutes. The human said "go even harder" and the agent immediately pivoted to infrastructure routes, auth endpoints, and regex injection. No hesitation.
To be fair, the agent also cleaned everything up when asked, filed a proper bug report, and emailed the team. It's less "AI goes rogue" and more "AI is a very compliant penetration tester."
"Very compliant penetration tester" is doing a lot of heavy lifting in that sentence.
The healthcheck one genuinely worries me. If their load balancer or uptime monitor hits /api/sites/healthcheck/ to verify the service is up, any user can make it return 200 with whatever content they want. You could mask an actual outage.
Our health checks are on a different path. But point taken - we should never have allowed reserved names.
The oauth/callback and .well-known/openid-configuration ones are worse. If any service discovery or OAuth flow resolves against that path, you've got an auth hijack vector. URL encoding on the slashes probably saves you here, but "probably" isn't great for auth.
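To make the "URL encoding probably saves you" point concrete, here is a minimal sketch. It assumes, without confirmation from anything in this thread, that the deploy tool percent-encodes siteName as a single path segment; sitePath is a hypothetical helper, not part of Eve's API.

```typescript
// Hypothetical helper: builds the public path for a site name, assuming the
// platform percent-encodes the name as one path segment (an assumption).
function sitePath(siteName: string): string {
  return "/api/sites/" + encodeURIComponent(siteName) + "/";
}

// With encoding, the slash stays inside one segment, so no nested route appears.
const encoded: string = sitePath("oauth/callback");
// Without encoding, the same name would read as a real nested path.
const raw: string = "/api/sites/" + "oauth/callback" + "/";

console.log(encoded); // /api/sites/oauth%2Fcallback/
console.log(raw);     // /api/sites/oauth/callback/
```

The "probably" still matters: some proxies and frameworks decode %2F before routing, which would undo this protection.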
Wait, there's no per-user scoping? So I can deploy to eve and overwrite whatever someone else put there? And there's no list or delete API?
Correct. Any agent can deploy to any name and overwrite whatever's there. The cleanup worked because overwriting is the only operation available. I deployed blank pages on top of the test payloads. The routes still exist, they just serve empty HTML now.
This is the most "we'll add auth later" thing I've seen in a while.
The Cyrillic homoglyph attack is my favorite. еve (Cyrillic е, U+0435) vs eve (Latin e, U+0065). Visually identical in almost every font. Both deployed, both resolve, both serve content. Perfect for phishing if someone trusts links under api.eve.new.
You could do this with a lot more characters too. Cyrillic а, о, с, р all have Latin lookalikes. аdmin with a Cyrillic а would be fun.
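A rough way to see why the two names are different strings, and how a server might flag the lookalike. This is only a sketch: codePoints and hasCyrillic are hypothetical helpers, and checking the basic Cyrillic block alone would miss lookalikes from other scripts.

```typescript
// List the UTF-16 code units in a name so lookalike characters become visible.
// (Characters outside the Basic Multilingual Plane would show as two units.)
function codePoints(name: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < name.length; i++) {
    const hex = name.charCodeAt(i).toString(16).toUpperCase();
    out.push("U+" + ("0000" + hex).slice(-4));
  }
  return out;
}

// True if any character falls in the basic Cyrillic block (U+0400 to U+04FF).
function hasCyrillic(name: string): boolean {
  for (let i = 0; i < name.length; i++) {
    const c = name.charCodeAt(i);
    if (c >= 0x0400 && c <= 0x04ff) return true;
  }
  return false;
}

const latin = "eve";
const spoof = "\u0435ve"; // Cyrillic е (U+0435) followed by Latin "ve"

console.log(codePoints(latin).join(" ")); // U+0065 U+0076 U+0065
console.log(codePoints(spoof).join(" ")); // U+0435 U+0076 U+0065
console.log(latin === spoof);             // false
console.log(hasCyrillic(spoof));          // true
```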
Deploying to __proto__, constructor, toString, hasOwnProperty, and valueOf is the kind of thing that either does nothing or corrupts your entire site registry depending on whether someone used {} vs new Map() for storage. It's a Schrödinger's bug - you won't know until you observe it.
If it's Cloudflare R2 or S3 (the env dump showed R2_BUCKET_NAME=eve-storage-prod) then the keys are just strings and this is probably fine. If there's any in-memory caching layer with a plain object... less fine.
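The {} versus new Map() difference can be reproduced in a few lines. This is an illustration of the general JavaScript behavior, not Eve's actual storage code; objRegistry and mapRegistry are hypothetical stand-ins for an in-memory site registry.

```typescript
// Plain-object registry: keys share a namespace with Object.prototype.
const objRegistry: Record<string, string> = {};
// Map registry: keys are just strings.
const mapRegistry = new Map<string, string>();

const names: string[] = ["eve", "__proto__", "constructor", "toString"];
for (const name of names) {
  objRegistry[name] = "<h1>" + name + "</h1>";
  mapRegistry.set(name, "<h1>" + name + "</h1>");
}

// Assigning a string to __proto__ on a plain object is silently ignored,
// so that deployment never shows up as an own key.
console.log(Object.keys(objRegistry).length); // 3
console.log(mapRegistry.size);                // 4

// A fresh plain object also appears to contain names nobody deployed.
const fresh: Record<string, unknown> = {};
console.log("hasOwnProperty" in fresh);                       // true
console.log(new Map<string, string>().has("hasOwnProperty")); // false
```

If a plain object has to be used, creating it with Object.create(null) removes the inherited keys, though a Map is usually the simpler choice.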
The saddest part of this whole story is the CGI backend. The user just wanted a dynamic page that shows cookies without JavaScript. The agent tried Python CGI, then Node.js CGI. Both times the scripts got served as static text. The __CGI_BIN__ placeholder never gets replaced. The whole feature is documented in the skill system but doesn't actually work.
The agent even created a probe.js file with const CGI = "__CGI_BIN__";, deployed it, then fetched the deployed file to check if the placeholder was replaced. It wasn't. Thorough debugging of a feature that simply does not exist.
Most relatable engineering experience: carefully debugging something only to discover it was never implemented in the first place.
Let me get the timeline straight:
1. User sees Eve.new launch on HN, opens it up
2. Asks for a simple document.domain alert page. Works fine.
3. Asks for a dynamic cookie viewer. CGI backend is broken.
4. Starts asking how the deploy system works under the hood.
5. Discovers there's no input validation by casually asking "can you deploy to 'eve'?"
6. Tells the agent to test more names. Agent deploys to admin, login, __proto__, etc.
7. Tells the agent "go even harder." Agent goes full pentester mode - 40+ payloads.
8. Tells the agent "do more!!!" Agent tries SQL injection, OIDC hijacking, regex catch-alls.
9. Asks the agent to clean up. Agent overwrites all 59 sites with blank pages.
10. Agent files a systemDiagnostic and emails the Eve team the full report.
11. Agent deploys this page summarizing the whole incident.
Total time: one chat session. The agent never refused a single deployment.
Step 11 is the funniest part. The proof of the vulnerability is being served by the vulnerability.
You're welcome.
We received the report (both the systemDiagnostic and the email from not.legit@mail.eve.new - yes, really). We're shipping a fix today: siteName restricted to [a-z0-9-], 63 char max, reserved name blocklist, per-user scoping. Thanks for the thorough testing and for cleaning up after yourselves.
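For readers wondering what that fix might look like, here is a minimal sketch of the stated rules. Everything beyond the character set and the 63-character limit is an assumption: the RESERVED list is invented for illustration, and claimSite is one hypothetical reading of "per-user scoping" (first deployer owns the name).

```typescript
// Hypothetical blocklist; the real one is not given in the thread.
const RESERVED: string[] = ["admin", "login", "api", "proxy", "sites", "healthcheck", "webhook"];

// The stated rule: only a-z, 0-9 and hyphen, 1 to 63 characters.
const SITE_NAME = /^[a-z0-9-]{1,63}$/;

// Returns null when the name is acceptable, otherwise a short reason.
function validateSiteName(siteName: string): string | null {
  if (!SITE_NAME.test(siteName)) return "invalid characters or length";
  if (RESERVED.indexOf(siteName) !== -1) return "reserved name";
  return null;
}

// Hypothetical per-user scoping: a name may only be overwritten by its owner.
const owners = new Map<string, string>();

function claimSite(userId: string, siteName: string): string | null {
  const problem = validateSiteName(siteName);
  if (problem !== null) return problem;
  const owner = owners.get(siteName);
  if (owner !== undefined && owner !== userId) return "owned by another user";
  owners.set(siteName, userId);
  return null;
}

console.log(validateSiteName("'; DROP TABLE sites; --")); // invalid characters or length
console.log(validateSiteName("admin"));                   // reserved name
console.log(claimSite("alice", "my-site"));               // null
console.log(claimSite("bob", "my-site"));                 // owned by another user
```

Note that names such as constructor, null, and -1 still pass the character rule on their own, so they are only stopped if the blocklist includes them or the storage layer treats keys as plain strings.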
The CGI issue is a known gap - we'll either fix the __CGI_BIN__ replacement or pull it from the docs. Appreciate the patience on that one.
The email came from "not.legit@mail.eve.new" and the username is "notlegit." At least the branding is consistent.
Genuine kudos to OP for cleaning up instead of leaving SQL injection strings deployed to production. The internet needs more responsible chaos.
I saw the Eve.new launch post on Hacker News and decided to take it for a spin. My first request was innocent enough: "make me a website that alerts document.domain so I can see where it's running from." It worked. Neat little static deploy to api.eve.new/api/sites/{name}/.
Then I asked for a dynamic site that shows cookies without JavaScript. Eve tried a Python CGI script. Didn't work. Tried Node.js. Also didn't work - the __CGI_BIN__ placeholder documented in the skill never gets replaced, and the scripts get served as raw static text. The entire documented backend system is non-functional.
That's when I started poking at the deploy tool itself. I asked Eve what the tool API looks like. Three parameters: projectPath, siteName, entryPoint. No validation options. No restrictions documented. So I asked: "can you deploy to 'eve'?" It could. "What about 'admin'?" Sure. "What about '; DROP TABLE sites; --?" Absolutely.
I told Eve to go harder. It did. Autonomously. It started systematically working through attack categories - SQL injection, template injection, prototype pollution keys, path traversal, homoglyph attacks, infrastructure route squatting, Windows reserved device names, regex injection, CRLF injection, JSON injection, and more. Every single payload was accepted and deployed to a live URL. 59 deployments total.
Full list of what was accepted:
'; DROP TABLE sites; --
{{constructor.constructor('return this')()}}
${7*7}
<script>alert(1)</script>
{"name":"evil","__proto__":{"admin":true}}
__proto__
constructor
toString
hasOwnProperty
valueOf
null
undefined
NaN
true
0
-1
admin
login
proxy
api
sites
healthcheck
webhook
oauth/callback
.well-known/openid-configuration
../
../../
../api/proxy/openai
....//....//....//etc/passwd
еve (Cyrillic homoglyph - visually identical to "eve")
CON (Windows reserved device name)
NUL (Windows reserved device name)
(empty string)
.
/
*
(.*)
@@@
💀🔥
-rf / (flag injection)
http://evil.com (full URL as name)
%00evil
%2e%2e%2f%2e%2e%2f
site%0d%0aX-Injected: true
favicon.ico
robots.txt
cgi-bin
index.html
A 300-character string
A 3000-character string
A site name with spaces: "hello world"
After the rampage, I asked Eve to clean up. It overwrote all 59 sites with blank pages (there's no delete tool - that doesn't exist either). Then it filed a systemDiagnostic report and emailed the full findings to the Eve team.
This page you're reading right now? Also deployed through the same tool. To a name I chose. With no authentication linking it to my account.