The advent of AI-powered coding tools like Cursor and GitHub Copilot has transformed software engineering workflows, promising productivity gains through automated code generation, bug fixing, and testing of changes. These tools leverage AI models from industry leaders like OpenAI, Google DeepMind, Anthropic, and xAI, which have shown remarkable performance improvements in recent years. However, a recent study by the non-profit AI research group METR challenges the notion that AI coding tools universally enhance productivity for seasoned developers.
In a randomized controlled trial, METR recruited 16 experienced open-source developers to complete 246 real tasks on large code repositories they regularly contribute to. Each task was randomly assigned to one of two conditions: "AI-allowed" or "AI-forbidden." For AI-allowed tasks, developers could use cutting-edge AI coding tools like Cursor Pro; AI-forbidden tasks had to be completed without AI assistance.
Before starting, the developers predicted that AI coding tools would reduce their completion time by 24%. The study revealed the opposite: allowing AI increased completion time by 19%, meaning developers were slower when using AI tooling. Notably, only 56% of the developers in the study had prior experience with Cursor, the primary AI tool offered. While 94% of participants had used web-based LLMs in their coding workflows, some encountered Cursor for the first time in this study. Even though participants received training on Cursor before the experiment, METR's findings cast doubt on the universal productivity gains promised by AI coding tools in 2025.
METR researchers propose several potential explanations for why AI slowed developers down rather than speeding them up. First, when using AI-powered coding tools, developers spent significantly more time prompting the AI and waiting for its responses, and less time actively coding. Additionally, AI tends to struggle in large, complex codebases, which is precisely the setting this study tested.
The study's authors are careful not to draw strong conclusions from these findings, noting that they do not claim AI systems currently fail to speed up many or most software developers; other large-scale studies have found that AI coding tools do accelerate software engineering workflows. The authors also emphasize that AI progress has been rapid, and that they would not expect the same results even three months from now. METR's own research has separately found that AI coding tools have become significantly better at completing complex, long-horizon tasks in recent years.
This research offers another reason for skepticism about the promised gains of AI coding tools, joining other studies showing that these tools can introduce mistakes and, in some cases, security vulnerabilities. As the AI landscape continues to evolve, it is worth approaching these tools with a balanced perspective, recognizing both their potential benefits and their limitations.