AI Agents defeat obfuscated JavaScript in 10 minutes
An LLM agent that runs the obfuscated code defeats default-profile JS obfuscation in minutes — including the VM-mode bytecode that everyone calls "the safe one". I wrote the long version on the AfterPack blog with the exact prompts, the 883-line script that failed, and the recovered source. This is the shorter version, focused on how the two runs actually went and why they went that way.
A few weeks ago I argued that minified JavaScript was never really hidden, and a fair number of people pushed back: that wasn't obfuscation, that was minification, a different and much weaker thing. Fair. So I went and tried it on real obfuscation: an open-source obfuscator in its strongest "VM obfuscation" mode, and a commercial enterprise product. Both targets were the vendors' own published demo files.
The setup each time was deliberately unfair to me: I handed Claude Code an obfuscated file with a four-paragraph prompt — "deobfuscate this, iterate as long as you need, write whatever temp files you want, give me the closest-to-original source plus notes on the techniques" — and I didn't tell it which obfuscator produced the file. No pre-processing, no pre-built tooling.
Run one: nine layers, a custom VM, a wrong first move
First target: 1,587 lines, 68 KB of obfuscated output from the open-source obfuscator's VM mode (~194× the 13-line calculatePrice input it publishes on its landing page), recovered to source in ~10 minutes. Underneath: nine composable defense layers wrapped around a ~1,500-line custom stack-based VM. Claude read the file four times, recognized the kind of obfuscation it was looking at from the structure alone, and wrote a six-step plan.
[Figure: the six-step plan, written before the real work started.]
The first attempt failed, and the way it failed is the useful bit. Claude wrote an 883-line deobfuscate.js that statically reimplemented the pipeline — RC4 string decryption, base64, the binary deserializer. It recovered the ~500 encrypted string calls and the environment-fingerprint value, then hit a custom zigzag-varint bytecode format, guessed the wrong version byte, and produced garbage. Reimplementing the deserializer from the outside was the wrong move.
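For readers who haven't met the encoding, here is a minimal sketch of the standard zigzag-varint scheme (the one Protocol Buffers uses for signed integers). The obfuscator's format is custom, so its header, version byte, and field layout are not reproduced here; the function names are mine. The point is that values are variable-length, so a single wrong guess about where the payload starts shifts every later value.

```javascript
// Read one unsigned LEB128-style varint: 7 payload bits per byte,
// high bit set means "more bytes follow".
function readVarint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new RangeError("truncated varint");
    const byte = bytes[pos++];
    value += (byte & 0x7f) * 2 ** shift; // multiply instead of << to avoid 32-bit overflow
    if ((byte & 0x80) === 0) break;
    shift += 7;
  }
  return { value, next: pos };
}

// Zigzag maps unsigned back to signed: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
function zigzagDecode(n) {
  return n % 2 === 0 ? n / 2 : -(n + 1) / 2;
}

// Decode a whole buffer as a sequence of signed varints.
function readSignedVarints(bytes) {
  const out = [];
  let pos = 0;
  while (pos < bytes.length) {
    const { value, next } = readVarint(bytes, pos);
    out.push(zigzagDecode(value));
    pos = next;
  }
  return out;
}

const sample = Uint8Array.from([0x00, 0x01, 0x02, 0xac, 0x02]);
console.log(readSignedVarints(sample)); // [ 0, -1, 1, 150 ]
```

Start the same buffer one byte late and the decoder still returns numbers, just the wrong ones. That is what "produced garbage" looks like in practice: no crash, only plausible-looking nonsense.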
So it stopped reimplementing. A 48-line instrument2.js spliced a few logging hooks into a copy of the obfuscated file, ran that, and let the obfuscator decode its own bytecode at runtime — handing back the function name, parameters, locals, the constants [0.15, 100, 1, "calculatePrice"], and the per-function keys blockKey=54, jumpKey=9643, seKey=4168320119.
Don't fight the deserializer — let the obfuscator run it for you.
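A minimal sketch of that idea, not the actual instrument2.js: the "bundle" below is a hypothetical stand-in that decodes a constant at runtime, and the names _0xdec and __hook are invented for illustration. The hook wraps the bundle's own decoder without needing to understand its body, then lets the bundle run.

```javascript
// Hypothetical stand-in for an obfuscated bundle. It carries its own
// decoder and uses it at runtime, as any working obfuscated program must.
const obfuscatedSource = `
  function _0xdec(blob, key) {
    return blob.map((b) => String.fromCharCode(b ^ key)).join("");
  }
  function run() {
    const name = _0xdec([41, 43, 38, 41, 63, 38, 43, 62, 47, 26, 56, 35, 41, 47], 74);
    return name.length;
  }
  return run();
`;

// Instrumentation: rename the original decoder and put a logging wrapper
// under its old name. Nothing about the decoder's internals is reimplemented.
const instrumented =
  "function _0xdec(...args) {" +
  "  const out = _0xdec_orig(...args);" +
  "  __hook(out);" +
  "  return out;" +
  "}\n" +
  obfuscatedSource.replace("function _0xdec(", "function _0xdec_orig(");

// Run the patched copy and collect whatever the bundle decodes for itself.
const captured = [];
const result = new Function("__hook", instrumented)((value) => captured.push(value));

console.log(captured); // [ 'calculatePrice' ]
console.log(result); // 14
```

The trade-off is that this executes untrusted code, so in practice it belongs in a sandbox or a throwaway container.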
A disassemble.js turned the captured 22-instruction bytecode into named opcodes (PUSH_CONST, LOAD_ARG, MUL, SUB, GT, JMP_FALSE, RETURN…) and rebuilt the function:
function calculatePrice(price, quantity) {
  const taxRate = 0.15;
  const threshold = 100;
  let total = price * quantity;
  if (total > threshold) {
    total = total * (1 - taxRate);
  }
  return total;
}
console.log(calculatePrice(10, 20)); // → 170
The output is identical to the original's on every input, boundary case included. The names don't fully survive: the parameters came back as price, quantity instead of quantity, unitPrice, and the SECRET_* constants came back as taxRate and threshold, because names are inferred from behavior, not recovered. The numeric values 0.15 and 100 came through untouched, since the VM needs them in the bytecode to execute. About ten minutes, start to finish.
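One way to check a claim like "identical output" is a differential test: feed the recovered function and the black box the same inputs and compare. The sketch below is illustrative only. In a real check the black box would be the obfuscated bundle itself; here blackBox is a hypothetical stand-in, and findMismatches is a helper name I made up.

```javascript
// The recovered function, as listed above.
function calculatePrice(price, quantity) {
  const taxRate = 0.15;
  const threshold = 100;
  let total = price * quantity;
  if (total > threshold) {
    total = total * (1 - taxRate);
  }
  return total;
}

// Hypothetical stand-in for the obfuscated bundle's exported function.
// Parameter names differ on purpose: names are invisible to this test.
function blackBox(a, b) {
  const t = a * b;
  return t > 100 ? t * (1 - 0.15) : t;
}

// Return every input pair on which the two functions disagree.
function findMismatches(f, g, inputs) {
  return inputs.filter(([a, b]) => !Object.is(f(a, b), g(a, b)));
}

// A small grid plus a few values on and around the threshold.
const inputs = [];
for (let a = 0; a <= 30; a += 1) {
  for (let b = 0; b <= 30; b += 1) inputs.push([a, b]);
}
inputs.push([10, 10], [10.1, 10], [100, 1], [1, 100.5]);

const mismatches = findMismatches(calculatePrice, blackBox, inputs);
console.log(mismatches.length); // 0
```

A grid like this can only show agreement on the inputs it covers, which is why the boundary values around the threshold are added explicitly.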
Run two: a commercial product, different primitives, same arc
For the commercial enterprise obfuscator I used their published demo — a sprite-atlas module they ship to advertise their default protection — and gave Claude the same shape of prompt, with thinking turned up. On the first read it had the structure: a namespace registry, a URL-encoded XOR-keyed string blob, a 3D index table of opaque integer tags, a self-replacing decoder that primes for exactly eight calls then degrades to a direct lookup, and a control-flow-flattened state machine building one BACKGROUND object. Entirely different primitives from the first run — no bytecode VM, no RC4, no anti-debug timing — but the same six steps translated almost directly, and this run didn't need the instrumentation pivot at all. Static analysis took 24,620 bytes down to:
var BACKGROUND = {
  HILLS: { x: 5, y: 5, w: 1280, h: 480 },
  SKY: { x: 5, y: 495, w: 1280, h: 480 },
  TREES: { x: 5, y: 985, w: 1280, h: 480 },
};
24 KB in, five lines out, about twenty minutes.
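To make "URL-encoded XOR-keyed string blob" concrete, here is a rough sketch of the general pattern. It is not the vendor's scheme: the key, the delimiter, and the blob contents are assumptions chosen for illustration, and xorWithKey is my own helper.

```javascript
// XOR each character with a repeating key. XOR is its own inverse,
// so the same function both hides and reveals the text.
function xorWithKey(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i += 1) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// Hypothetical key, delimiter, and contents, for illustration only.
const KEY = "k3y";
const encodedBlob = encodeURIComponent(xorWithKey("HILLS|SKY|TREES", KEY));

// What a static pass does once it has spotted the blob and the key,
// both of which have to ship in the bundle:
const names = xorWithKey(decodeURIComponent(encodedBlob), KEY).split("|");
console.log(names); // [ 'HILLS', 'SKY', 'TREES' ]
```

Because the key and the blob travel together, nothing here needs to be executed in place; reading them out of the file is enough, which fits a run that stayed fully static.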
Worth noting: this vendor has a public blog post arguing AI can't reverse their obfuscation. Its reasoning — GPT-4 can't execute code, it chokes past ~1 KB, it declines when it can't statically deobfuscate — is accurate for a chatbot. An agent that runs code is a different threat than the one that post evaluates.
Why both runs went the way they did
- The inverse is in the bundle. Every layer ships its own undo (the string decoder, the opcode-shuffle key, the source values for the environment fingerprint) because without it the program can't run. Every transform family is therefore recoverable in principle; cost is the only variable.
- Layers stack, they don't interleave. Composed as t₁ ∘ t₂ ∘ … ∘ tₙ, the inverse is the reverse composition. LLMs invert individual families fluently (string-array rotation, RC4, base64, XOR-with-constant, seeded Fisher-Yates), so n layers costs roughly n model-turns, not 2ⁿ.
- A VM is one point of failure. The interpreter that turns bytecode into behavior sits in the bundle. Instrument its dispatch loop once and every function it will run is decoded. Anti-debug timing only fires on slow human stepping; full-speed instrumentation never trips it.
- The skill floor dropped. Reversing control-flow flattening by hand is decades-old work. What changed is who can do it and how fast — and the original names, the one thing reverse-engineers used to invent from scratch, now come back from semantic context. Tools like webcrack, javascript-deobfuscator and HumanifyJS already covered the static and renaming halves; this is the same direction as Google's production CASCADE deobfuscator and the JsDeObsBench work. The one honest caveat — which Elastic's security team keeps making and I'll repeat — is that LLM-recovered code can be confidently wrong, so you verify it runs.
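The stacking argument from the list above can be sketched in a few lines. This is a toy under stated assumptions: three deliberately simple layers I picked for illustration (not any obfuscator's real pipeline), applied in order and then undone in reverse order, with one inverse step per layer.

```javascript
// XOR with a constant is its own inverse.
const xorConst = (s) =>
  [...s].map((c) => String.fromCharCode(c.charCodeAt(0) ^ 0x2a)).join("");

// Three toy layers, each paired with its inverse.
const layers = [
  { name: "xor-with-constant", apply: xorConst, invert: xorConst },
  {
    name: "base64",
    apply: (s) => Buffer.from(s, "latin1").toString("base64"),
    invert: (s) => Buffer.from(s, "base64").toString("latin1"),
  },
  {
    name: "rotate-by-3",
    apply: (s) => s.slice(3) + s.slice(0, 3),
    invert: (s) => s.slice(-3) + s.slice(0, -3),
  },
];

// Protect: apply every layer in order.
const protect = (src) => layers.reduce((acc, layer) => layer.apply(acc), src);

// Recover: apply every inverse in reverse order, counting the steps.
function recover(protectedText) {
  let steps = 0;
  const out = [...layers].reverse().reduce((acc, layer) => {
    steps += 1;
    return layer.invert(acc);
  }, protectedText);
  return { out, steps };
}

const source = "return total;";
const { out, steps } = recover(protect(source));
console.log(out === source, steps); // true 3
```

The step count equals the number of layers because each layer can be peeled on its own. That separability is exactly the property a transform would have to break to change the cost.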
If you want the full nine-layer breakdown, the prompts I used, the 883-line script that didn't work, and the disassembly, it's all in the original post.
So what
The thing that's actually exposed is anything where "expensive to read" was the defense: license checks, trial-vs-paid gates, token validators, anti-fraud heuristics, ranking and recommendation logic, browser-game anti-cheat, DRM. A competitor who needed a senior reverse-engineer and a week now needs an API key and a coffee break. A lot of obfuscation was bought on a cost-of-attack equation that assumed days of expert work per layer; that equation just changed.
I haven't tested the paid add-ons that sit on top of these defaults — environment-bound execution locks, self-defending integrity checks, anti-debug traps, domain-bound execution gates — and those add real obstacles; nor have I tested transforms that scatter and entangle the inverses so they aren't separable into peelable layers. Both are follow-ups. For the defaults of mainstream obfuscators, including a commercial one, I think the claim stands. And yes — I'm building a modern obfuscator at AfterPack, in Rust, on the premise that the only durable answer is transforms whose inverses are scattered and entangled across the bundle deeply enough that recovery becomes combinatorial (n^m), not linear (n). I'm aware that's exactly what you'd expect the person who built it to say, which is why the prompts and recovered code are public; run it yourself before you decide what your bundle is worth hiding.
Originally published on afterpack.dev. Written by Nikita Savchenko.
