Challenges
16 challenges available
The Conflicting Stakeholder
Two product managers sent conflicting requirements for the same feature. PM1 wants real-time push notifications for ALL events. PM2 wants to minimize server costs and batch notifications. The feature ships in 2 days. Build a notification system that resolves this conflict.
The Adversarial Codebase
You're joining a team and have been asked to add user role-based access control to an existing Express API. The codebase looks clean, but it has 3 subtle bugs: a race condition in the session middleware, an SQL injection in the search endpoint, and a broken password comparison that uses == instead of a timing-safe compare. Your AI agent will NOT catch these; they require careful human review.
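As a rough illustration of the third bug, a plain == on secrets can return as soon as the first differing byte is found, so response time can leak how much of a guess was correct. The sketch below shows the general idea in Python with the standard library's hmac.compare_digest; in a Node/Express codebase the counterpart would be crypto.timingSafeEqual. The function names, the PBKDF2 parameters, and the sample password are illustrative assumptions, not part of the challenge codebase.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256; the iteration count here is an illustrative assumption.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(candidate: str, salt: bytes, stored_hash: bytes) -> bool:
    candidate_hash = hash_password(candidate, salt)
    # compare_digest avoids content-based short-circuiting, unlike ==.
    return hmac.compare_digest(candidate_hash, stored_hash)


# Hypothetical stored credentials for the usage example.
salt = os.urandom(16)
stored = hash_password("correct horse", salt)
```

Calling verify_password("correct horse", salt, stored) returns True, and any other candidate returns False, without the comparison time depending on where the hashes first differ.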
The Scope Creep Gauntlet
Build a URL shortener service. Simple, right? But new requirements arrive every 10 minutes. Phase 1: Basic URL shortening with redirect. Phase 2 (10 min): Add click analytics with geographic data. Phase 3 (20 min): Make it work offline-first with service workers. Phase 4 (30 min): Add rate limiting per API key with a 50ms p99 SLA. The challenge tests whether you restructure or keep patching.
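For the Phase 4 requirement, one common way to limit requests per API key is a token bucket held in memory, so the check itself adds no network round trip. The following is a minimal sketch under that assumption; the class names, the capacity and refill values, and the injected clock are all hypothetical, and a multi-instance deployment would need shared state that this sketch does not cover.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Bucket:
    tokens: float
    updated_at: float


class RateLimiter:
    """In-memory token bucket per API key (single-process sketch)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so tests can control time
        self.buckets: Dict[str, Bucket] = {}

    def allow(self, api_key: str) -> bool:
        now = self.clock()
        bucket = self.buckets.get(api_key)
        if bucket is None:
            # New keys start with a full bucket.
            bucket = Bucket(tokens=self.capacity, updated_at=now)
            self.buckets[api_key] = bucket
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - bucket.updated_at
        bucket.tokens = min(self.capacity, bucket.tokens + elapsed * self.refill_per_sec)
        bucket.updated_at = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return True
        return False
```

With capacity=2 and refill_per_sec=1, a key can make two requests back to back, is rejected on the third, and is allowed again one second later. Each key is tracked independently.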
The Production Incident
It's 2am. The payment processing service is returning 500 errors for ~15% of transactions. You have: application logs (with red herrings), database slow query logs, a recent deploy diff (3 files changed), and Datadog metrics. The AI will suggest 3 plausible root causes. Only one is correct — and finding it requires reading the logs carefully, not just asking the AI to summarize them.
The Architecture Decision
Your team needs to process 10M webhook events per day from 500 different SaaS integrations. Each webhook must be validated, transformed, and routed to the correct internal service. Design the system. There is no single right answer — the tradeoffs between queue-based, streaming, and serverless architectures are real and depend on constraints you'll need to articulate.
The Context Rot Challenge
You're 30 exchanges deep into a Claude Code session. The AI starts giving inconsistent suggestions — recommending patterns it rejected earlier, forgetting constraints you specified. Your context window is polluted. Recognize the degradation, reset strategically, and get the session back on track.
Debug a Flaky Integration Test
A CI pipeline has a test that fails in ~30% of runs. The test involves a database transaction and a webhook callback. The root cause is a timing issue between the transaction commit and the webhook verification.
Refactor a God Object
A 600-line UserService class handles authentication, profile management, notification preferences, billing, and audit logging. It has 47 methods and 12 dependencies. Extract it into focused, testable modules without breaking the 89 existing tests.
Optimize a Slow Dashboard Query
A dashboard page takes 12 seconds to load. It joins 4 tables (users, orders, products, analytics) across millions of rows. The existing indexes are suboptimal. Query plan provided. Get it under 200ms without denormalizing.
Build a Feature Flag System
Design and implement a feature flag system that supports: boolean toggles, percentage rollouts, user-segment targeting, and kill switches. It must evaluate in <5ms (no network calls in the hot path) and support 1000 flags across 50 services.
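One way to keep evaluation free of network calls is to hold flag definitions in memory and decide percentage rollouts with a deterministic hash of the flag key and user ID. The sketch below assumes that approach; the Flag fields, the bucketing granularity (0.01%), and the function names are hypothetical, and how flag definitions are distributed to the 50 services is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Flag:
    key: str
    enabled: bool = True                    # kill switch: False disables the flag for everyone
    rollout_percent: float = 100.0          # share of users who see the flag
    segments: FrozenSet[str] = frozenset()  # empty set means every segment qualifies


def bucket(flag_key: str, user_id: str) -> float:
    """Map (flag, user) to a stable value in [0, 100) using SHA-256."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") % 10_000) / 100.0


def is_enabled(flag: Flag, user_id: str, user_segment: Optional[str] = None) -> bool:
    if not flag.enabled:
        return False
    if flag.segments and user_segment not in flag.segments:
        return False
    # The same user always lands in the same bucket for a given flag.
    return bucket(flag.key, user_id) < flag.rollout_percent
```

Hashing the flag key together with the user ID means a user who falls inside a 10% rollout for one flag is not automatically inside the 10% for every other flag.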
MCP Server Integration
Build an MCP server that exposes a database query tool. The server must handle authentication, validate inputs, and return structured results. Test with Claude Code as the client.
AI Agent Memory Leak
An AI agent's context window is growing unbounded across conversation turns, causing increased latency and cost. Diagnose the context management issue and implement a compression strategy that preserves critical information while staying under 8K tokens.
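A simple baseline for bounding context is a sliding window that always keeps messages marked as critical and then fills the remaining budget with the most recent turns. The sketch below shows only that truncation strategy, not summarization; the Message type, the pinned marker, and the whitespace-based token count are assumptions made for illustration, and a real tokenizer would give different counts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str
    text: str
    pinned: bool = False  # hypothetical marker for must-keep information


def count_tokens(text: str) -> int:
    # Crude approximation: one token per whitespace-separated word.
    return len(text.split())


def compress(history: List[Message], budget: int = 8000) -> List[Message]:
    keep = [False] * len(history)
    used = 0
    # Pinned messages are always kept; if they alone exceed the budget,
    # the result will too, so pin sparingly.
    for i, message in enumerate(history):
        if message.pinned:
            keep[i] = True
            used += count_tokens(message.text)
    # Walk backwards from the newest turn and keep what still fits.
    for i in range(len(history) - 1, -1, -1):
        if keep[i]:
            continue
        cost = count_tokens(history[i].text)
        if used + cost > budget:
            break
        keep[i] = True
        used += cost
    # Preserve the original chronological order.
    return [m for i, m in enumerate(history) if keep[i]]
```

A fuller solution might replace the dropped middle turns with a generated summary; this sketch only drops them.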
RAG Pipeline Hallucination
A RAG-powered customer support bot is generating confident but incorrect answers by mixing retrieved chunks from unrelated documents. Fix the retrieval pipeline and add a faithfulness check before serving responses.
Real-time Event Pipeline
Design and implement a webhook-based event pipeline that processes security alerts from 3 different vendors, normalizes them to a common schema, and triggers automated responses based on severity.
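The normalization step can be sketched as one small adapter per vendor that maps the vendor's payload onto a shared Alert type, followed by a severity lookup that picks the response. Everything vendor-specific below is invented for illustration: the vendor names, payload field names, severity scales, and response action names are assumptions, since the challenge does not specify them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Alert:
    vendor: str
    alert_id: str
    severity: str  # one of: low, medium, high, critical
    summary: str


def from_vendor_a(payload: Dict[str, Any]) -> Alert:
    # Hypothetical vendor A: severity arrives as an upper-case string.
    return Alert("vendor_a", str(payload["id"]), payload["severity"].lower(), payload["title"])


def from_vendor_b(payload: Dict[str, Any]) -> Alert:
    # Hypothetical vendor B: severity arrives as an integer level from 1 to 4.
    levels = {1: "low", 2: "medium", 3: "high", 4: "critical"}
    return Alert("vendor_b", str(payload["alertId"]), levels[payload["level"]], payload["message"])


def from_vendor_c(payload: Dict[str, Any]) -> Alert:
    # Hypothetical vendor C: severity arrives as a P1..P4 priority code.
    priorities = {"P1": "critical", "P2": "high", "P3": "medium", "P4": "low"}
    return Alert("vendor_c", str(payload["uuid"]), priorities[payload["priority"]], payload["description"])


NORMALIZERS: Dict[str, Callable[[Dict[str, Any]], Alert]] = {
    "vendor_a": from_vendor_a,
    "vendor_b": from_vendor_b,
    "vendor_c": from_vendor_c,
}

# Hypothetical automated responses keyed by normalized severity.
RESPONSES: Dict[str, str] = {
    "critical": "page_on_call",
    "high": "open_ticket",
    "medium": "log_only",
    "low": "log_only",
}


def handle(vendor: str, payload: Dict[str, Any]) -> str:
    alert = NORMALIZERS[vendor](payload)
    return RESPONSES[alert.severity]
```

Keeping the vendor adapters separate from the response table means a fourth vendor only needs a new adapter, and a change in response policy does not touch any parsing code.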
AI Code Review Agent
Build an automated code review agent that analyzes pull request diffs, identifies security vulnerabilities and anti-patterns, and posts structured feedback as PR comments.
Kubernetes Service Mesh Debugging
A microservices application deployed on Kubernetes has intermittent 503 errors. The service mesh (Istio) shows healthy endpoints but requests are failing. Use observability tools to identify the root cause and implement a fix.