Code change accuracy improved 55% in preliminary experiment
We recently ran our first controlled experiment to measure how Devramp improves agent accuracy. Using Claude Code (Sonnet-4) on a 100K-line Go codebase, we tested eight representative pull requests (10–40 files each). For each PR we generated a structured summary: a 300-word explanation of the functional and technical changes (with no file or symbol references), distilled into a 30-word prompt. With Devramp, the agent's output diffs matched the human reference diffs far more closely: accuracy improved by 55% on average, and variability dropped by 18 percentage points, making results both more reliable and more consistent.
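To make the aggregate numbers concrete, here is a minimal sketch of how per-PR scores like these can be computed and summarized. It is not Devramp's actual scoring pipeline; the similarity metric (line-based `difflib` ratio) and all sample values are hypothetical, stand-in choices for illustration.

```python
import difflib
import statistics

def diff_similarity(agent_diff: str, reference_diff: str) -> float:
    """Score in [0, 1]: how closely the agent's diff matches the human reference diff.
    Uses a line-level sequence match as a stand-in metric."""
    return difflib.SequenceMatcher(
        None, agent_diff.splitlines(), reference_diff.splitlines()
    ).ratio()

def summarize(scores: list[float]) -> tuple[float, float]:
    """Aggregate per-PR scores into (mean accuracy, variability as population std dev)."""
    return statistics.mean(scores), statistics.pstdev(scores)

# Hypothetical per-PR scores, one per pull request (not the experiment's real data)
baseline = [0.40, 0.35, 0.50, 0.45]
with_devramp_context = [0.70, 0.65, 0.72, 0.68]

base_mean, base_std = summarize(baseline)
ctx_mean, ctx_std = summarize(with_devramp_context)
print(f"mean accuracy:  {base_mean:.2f} -> {ctx_mean:.2f}")
print(f"variability:    {base_std:.3f} -> {ctx_std:.3f}")
```

A relative accuracy gain is then `(ctx_mean - base_mean) / base_mean`, and a variability drop is the difference of the two spread figures.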
Result

- Date: 23rd Sept 2025
- Codebase: 100k LOC, Go
- Agent/Model: Claude Code (Sonnet-4)
- Accuracy: +55%
- Variability: -18pp