Welcome to 2027.dev
2027.dev evaluates how well AI agents can use your developer tool. We run automated agents against your docs and product, then generate detailed reports with scores, friction points, and recommendations.
Get Started
Run your first eval in minutes
MCP Server
Connect to Claude Code, Cursor, or any MCP client
GitHub Integration
Auto-run evals on every PR
Understanding Reports
Scores, dimensions, and what they mean
How it works
- Scan your domain — our agent browses your docs and product to understand what you offer
- Generate eval prompt — we create a task that tests the core workflow of your tool
- Run the eval — an AI agent (Claude Code + browser) attempts the task from scratch
- Get your report — detailed scoring across multiple dimensions with actionable recommendations
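The four steps above can be read as a linear pipeline in which each stage enriches a shared run record. The sketch below is only an illustration of that flow: `EvalRun`, the stage functions, and `run_pipeline` are hypothetical names, and each stage body is a stub standing in for the real work (browsing, prompt generation, the agent run, scoring). It is not the 2027.dev implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class EvalRun:
    """State carried through the four stages (hypothetical shape)."""
    domain: str
    findings: list = field(default_factory=list)
    prompt: str = ""
    transcript: list = field(default_factory=list)
    report: dict = field(default_factory=dict)


def scan_domain(run):
    # Stub for step 1: browsing docs and product pages.
    run.findings.append(f"docs found at https://{run.domain}/docs")
    return run


def generate_prompt(run):
    # Stub for step 2: turning findings into a task on the core workflow.
    run.prompt = f"Using only public docs, complete the core workflow of {run.domain}."
    return run


def run_eval(run):
    # Stub for step 3: an agent attempting the task from scratch.
    run.transcript.append(f"agent received: {run.prompt}")
    return run


def build_report(run):
    # Stub for step 4: a real report would carry scores and recommendations.
    run.report = {"domain": run.domain, "steps_recorded": len(run.transcript)}
    return run


# Order matters: each stage reads what the previous one wrote.
STAGES = [scan_domain, generate_prompt, run_eval, build_report]


def run_pipeline(domain):
    run = EvalRun(domain=domain)
    for stage in STAGES:
        run = stage(run)
    return run
```

Calling `run_pipeline("example.com")` returns a record whose `findings`, `prompt`, `transcript`, and `report` fields were filled in that order, which mirrors why the eval prompt can only be generated after the scan.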
What we measure
- Documentation quality — can the agent find what it needs?
- API usability — are the APIs intuitive and well-designed?
- Error handling — does the product give useful error messages?
- Integration flow — how smooth is the setup-to-working experience?
- Tool discoverability — can AI agents discover and use your tools?
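To make the five dimensions concrete, here is a minimal sketch of how per-dimension scores might be held and summarized on the consumer side. Everything in it is an assumption for illustration: the field names, the 0-100 scale, the unweighted average, and the sample numbers are invented, and the actual report format may differ.

```python
from dataclasses import dataclass

# Dimension names taken from the list above; the identifiers are hypothetical.
DIMENSIONS = (
    "documentation_quality",
    "api_usability",
    "error_handling",
    "integration_flow",
    "tool_discoverability",
)


@dataclass(frozen=True)
class DimensionScore:
    name: str
    score: float  # assumed 0-100 scale, not confirmed by the source


def overall(scores):
    """Unweighted mean across dimensions (an assumed aggregation)."""
    return sum(s.score for s in scores) / len(scores)


def weakest(scores, limit=2):
    """Lowest-scoring dimensions first, as candidates for attention."""
    return sorted(scores, key=lambda s: s.score)[:limit]


# Made-up sample values, purely for demonstration.
sample = [
    DimensionScore(name, value)
    for name, value in zip(DIMENSIONS, (80, 60, 40, 90, 80))
]
```

With the sample values, `weakest(sample)` surfaces `error_handling` and `api_usability`, which is the kind of ordering that helps decide which recommendations to act on first.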
Quick MCP setup
No npm install needed. One command:evals_.
Manual config file setup (Claude Code, Cursor, other clients)
Add to your Claude Code config (~/.claude/claude_desktop_config.json) or Cursor config (.cursor/mcp.json):