Cursor just did something most developer tool companies never attempt: it built its own frontier-level coding model. Composer 2 is now available in Cursor, and the value proposition is striking: frontier intelligence at $0.50/M input and $2.50/M output tokens, which Cursor calls "a new, optimal combination of intelligence and cost."
The Benchmarks
Composer 2 posts significant jumps across every benchmark Cursor tracks:
- CursorBench: 61.3 (up from 44.2 for Composer 1.5 and 38.0 for Composer 1)
- Terminal-Bench 2.0: 61.7 (up from 47.9)
- SWE-bench Multilingual: 73.7 (up from 65.9)
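To put the three jumps on a common footing, the short sketch below computes the absolute and relative gain for each benchmark. It uses only the scores quoted in the list above. The `gains` helper and the dictionary layout are illustrative names, not anything Cursor publishes.

```python
# Scores as quoted in the list above: (earlier score, Composer 2 score).
# For CursorBench the earlier score is the Composer 1.5 figure.
scores = {
    "CursorBench": (44.2, 61.3),
    "Terminal-Bench 2.0": (47.9, 61.7),
    "SWE-bench Multilingual": (65.9, 73.7),
}


def gains(previous, current):
    """Return (absolute gain in points, relative gain in percent), rounded to 0.1."""
    absolute = current - previous
    relative = 100.0 * absolute / previous
    return round(absolute, 1), round(relative, 1)


for name, (previous, current) in scores.items():
    points, percent = gains(previous, current)
    print(f"{name}: +{points} points ({percent}% relative)")
```

Read this way, the CursorBench gain is the largest in both absolute and relative terms, while SWE-bench Multilingual starts from the highest baseline and moves the least.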
These improvements come from Cursor's first continued pretraining run, which provides a stronger base for scaling their reinforcement learning. From that foundation, the model is trained on long-horizon coding tasks through RL, enabling it to solve complex challenges requiring hundreds of sequential actions.
Two Speed Tiers
Composer 2 ships in two variants at the same intelligence level. The standard tier runs at $0.50/M input and $2.50/M output. A faster variant — now the default — costs $1.50/M input and $7.50/M output, which Cursor notes is still cheaper than other fast models at comparable quality. On individual plans, Composer usage draws from a standalone pool with generous included usage.
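As a rough illustration of what the two price points mean in practice, here is a minimal cost estimator built from the per-million-token rates quoted above. The workload figures (2M input tokens, 400k output tokens) are a made-up example, and `estimate_cost` is a hypothetical helper, not part of any Cursor API. It also ignores the included usage pool on individual plans.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 1.50, "output": 7.50},
}


def estimate_cost(tier, input_tokens, output_tokens):
    """Estimate the USD cost of a workload on the given Composer 2 tier."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 400k output tokens.
for tier in PRICES:
    cost = estimate_cost(tier, 2_000_000, 400_000)
    print(f"{tier}: ${cost:.2f}")
```

Because both the input and output rates on the fast tier are three times the standard rates, any workload costs three times as much there, regardless of its input/output mix.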
Why It Matters
The real story isn't any single benchmark number. It's that a developer tools company is now building and shipping its own frontier models. Cursor controls the entire stack from model training to IDE experience, which means it can optimize for exactly how developers work rather than adapting to what third-party model providers offer. With each Composer iteration improving steeply (38.0 → 44.2 → 61.3 on CursorBench), the trajectory suggests that advantage will only widen.