
Welcome to 2027.dev

2027.dev evaluates how well AI agents can use your developer tool. We run automated agents against your docs and product, then generate detailed reports with scores, friction points, and recommendations.

Get Started

Run your first eval in minutes

MCP Server

Connect to Claude Code, Cursor, or any MCP client

GitHub Integration

Auto-run evals on every PR

Understanding Reports

Scores, dimensions, and what they mean

How it works

  1. Scan your domain — our agent browses your docs and product to understand what you offer
  2. Generate eval prompt — we create a task that tests the core workflow of your tool
  3. Run the eval — an AI agent (Claude Code + browser) attempts the task from scratch
  4. Get your report — detailed scoring across multiple dimensions with actionable recommendations

What we measure

  • Documentation quality — can the agent find what it needs?
  • API usability — are the APIs intuitive and well-designed?
  • Error handling — does the product give useful error messages?
  • Integration flow — how smooth is the setup-to-working experience?
  • Tool discoverability — can AI agents discover and use your tools?

Quick MCP setup

No npm install is needed. Register the server with a single command:
claude mcp add 2027-evals --transport http https://2027.dev/evals/api/mcp --header "Authorization: Bearer evals_YOUR_API_KEY"
Get your API key at 2027.dev/evals under Settings > API Keys. Keys start with evals_.
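Because every key starts with evals_, a script that sets up the connection can catch a mis-pasted key before making any request. The following is a minimal sketch, not part of the 2027.dev tooling; the function name `build_auth_header` is hypothetical, and the only facts taken from this page are the `evals_` prefix and the `Authorization: Bearer` header format.

```python
API_KEY_PREFIX = "evals_"  # from this page: keys start with evals_


def build_auth_header(api_key: str) -> dict[str, str]:
    """Hypothetical helper: validate the key prefix and build the HTTP header."""
    key = api_key.strip()  # tolerate whitespace picked up when copy-pasting
    # Reject keys with the wrong prefix, and the bare prefix with nothing after it.
    if not key.startswith(API_KEY_PREFIX) or key == API_KEY_PREFIX:
        raise ValueError(f"API key must start with {API_KEY_PREFIX!r}")
    return {"Authorization": f"Bearer {key}"}
```

The check only confirms the key's shape; whether the key is actually valid is still decided by the server.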
Alternatively, add the server entry to your Claude Code config (~/.claude/claude_desktop_config.json) or your Cursor config (.cursor/mcp.json):
{
  "mcpServers": {
    "2027-evals": {
      "url": "https://2027.dev/evals/api/mcp",
      "headers": {
        "Authorization": "Bearer evals_YOUR_API_KEY"
      }
    }
  }
}
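If the config file already lists other MCP servers, the entry needs to be merged in rather than pasted over the whole file. The sketch below shows one way to do that, assuming the file is plain JSON with a top-level `mcpServers` object as in the snippet above. The helper names `merge_server` and `update_config_file` are hypothetical and not part of any 2027.dev, Claude Code, or Cursor tooling.

```python
import json
from pathlib import Path

MCP_URL = "https://2027.dev/evals/api/mcp"  # endpoint from the setup instructions
SERVER_NAME = "2027-evals"                  # server name used in the config above


def merge_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the 2027-evals entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers[SERVER_NAME] = {
        "url": MCP_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Hypothetical helper: merge the entry into the JSON file at `path`."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_server(existing, api_key), indent=2) + "\n")


# Example (assumes the key is exported as EVALS_API_KEY, a made-up variable name):
#   import os
#   update_config_file(Path(".cursor/mcp.json"), os.environ["EVALS_API_KEY"])
```

Reading the key from an environment variable, as in the commented example, keeps it out of scripts checked into version control; the written config file still contains the key, so treat that file as a secret too.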
See the full MCP setup guide for more details.