DevOps, Agile & Scrum Metrics Interview Questions: DORA, Velocity, and EVM (2026)
Project management metrics are essential for measuring team performance, identifying bottlenecks, and driving continuous improvement. This guide covers key metrics across DevOps, Agile, Scrum, and traditional project management methodologies, along with common interview questions you'll encounter for senior engineering and management roles.
DevOps Metrics (DORA Metrics)
The DORA (DevOps Research and Assessment) metrics are the gold standard for measuring DevOps performance:
1. Deployment Frequency
What it measures: How often your team deploys code to production.
| Performance Level | Frequency |
|---|---|
| Elite | On-demand (multiple times per day) |
| High | Weekly to monthly |
| Medium | Monthly to every 6 months |
| Low | Less than once per 6 months |
Interview Question: "Your team deploys once a month. What steps would you take to increase deployment frequency?"
Strong Answer: "I'd start by analyzing what's blocking more frequent deployments. Common issues include: manual testing bottlenecks (implement automated testing), large batch sizes (break work into smaller increments), lack of CI/CD automation (invest in pipeline infrastructure), and fear of deployments (implement feature flags and automated rollbacks). I'd measure current cycle time, identify the biggest constraint, and focus improvement efforts there."
2. Lead Time for Changes
What it measures: Time from code commit to production deployment.
| Performance Level | Lead Time |
|---|---|
| Elite | Less than 1 hour |
| High | 1 day to 1 week |
| Medium | 1 week to 1 month |
| Low | More than 1 month |
Interview Question: "How would you reduce lead time from 2 weeks to 2 days?"
Strong Answer: "I'd map the value stream to identify wait times vs. work times. Typically, most lead time is waiting—for code review, QA sign-off, deployment windows. Solutions include: implementing trunk-based development, automating code review gates, parallel testing, and enabling self-service deployments. I'd set up metrics dashboards to track each stage and focus on eliminating the longest wait times first."
3. Mean Time to Recovery (MTTR)
What it measures: Average time to restore service after an incident.
| Performance Level | MTTR |
|---|---|
| Elite | Less than 1 hour |
| High | Less than 1 day |
| Medium | 1 day to 1 week |
| Low | More than 1 week |
Interview Question: "Your team's MTTR is 8 hours. How would you bring it under 1 hour?"
Strong Answer: "Fast recovery requires: quick detection (comprehensive monitoring and alerting), rapid diagnosis (centralized logging, distributed tracing, runbooks), and fast remediation (automated rollbacks, feature flags, infrastructure as code). I'd implement on-call rotations with clear escalation paths, conduct regular incident response drills, and ensure every incident has a blameless post-mortem to prevent recurrence."
4. Change Failure Rate
What it measures: Percentage of deployments causing production failures.
| Performance Level | Failure Rate |
|---|---|
| Elite | 0-15% |
| High | 16-30% |
| Medium | 31-45% |
| Low | 46-60% |
Interview Question: "30% of your deployments fail. What's your improvement plan?"
Strong Answer: "High failure rates indicate quality issues earlier in the pipeline. I'd implement: comprehensive automated testing (unit, integration, e2e), staging environments that mirror production, canary deployments to catch issues early, and enhanced code review practices. I'd also analyze failed deployments to identify patterns—are failures from specific services, teams, or types of changes? Then target the root causes."
Agile Metrics
Velocity
What it measures: Story points completed per sprint.
Velocity = Sum of story points completed in sprint
Example:
Sprint 1: 21 points
Sprint 2: 24 points
Sprint 3: 18 points
Average Velocity: 21 points
Interview Question: "A stakeholder asks you to commit to 40 story points next sprint when your average velocity is 25. How do you respond?"
Strong Answer: "I'd explain that velocity is a planning tool, not a performance target. Committing to 40 points would likely result in either incomplete work or quality shortcuts. Instead, I'd discuss: prioritizing the most valuable 25 points of work, identifying if there are blockers we could remove to sustainably increase velocity, or whether we need more team capacity. Inflating commitments destroys trust and predictability."
Sprint Burndown
What it measures: Remaining work throughout the sprint.
Interview Question: "Your burndown chart shows work increasing mid-sprint. What does this indicate and how do you address it?"
Strong Answer: "Increasing work mid-sprint signals scope creep or poor estimation. I'd investigate: Are requirements being added after sprint planning? Are stories poorly defined, requiring more work than estimated? Is technical debt being discovered? Solutions include: stricter sprint commitment practices, better story refinement, and breaking down large stories. I'd also facilitate a team discussion to understand the pattern."
Cumulative Flow Diagram
What it measures: Work items in each stage over time.
Interview Question: "Your cumulative flow diagram shows a widening 'In Progress' band. What's happening?"
Strong Answer: "A widening band indicates a bottleneck—work is entering that stage faster than it's leaving. For 'In Progress,' this typically means too much WIP (work in progress). I'd implement WIP limits to force completion before starting new work, identify why items are stuck (waiting for review, blocked dependencies), and potentially swarm on items to clear the backlog. The goal is smooth flow, not maximizing starts."
Cycle Time
What it measures: Time from work started to work completed.
Cycle Time = Completion Date - Start Date
Example:
Story started: Monday 9am
Story completed: Wednesday 3pm
Cycle Time: 2.25 days
Interview Question: "Your team's average cycle time is 8 days for stories. How would you reduce it?"
Strong Answer: "I'd analyze where time is spent. Often it's: waiting for code review (implement async reviews, pair programming), waiting for QA (shift-left testing, developer-owned quality), context switching (limit WIP), and large story size (break stories smaller). I'd visualize the workflow, measure time in each stage, and systematically attack the biggest delays. Smaller stories almost always reduce cycle time."
Scrum-Specific Metrics
Sprint Goal Success Rate
What it measures: Percentage of sprints where the sprint goal is achieved.
Interview Question: "Your team achieves sprint goals only 40% of the time. How do you improve this?"
Strong Answer: "Low goal achievement suggests problems with planning, scope, or execution. I'd examine: Are goals too ambitious? Is scope creeping mid-sprint? Are there external dependencies causing delays? Solutions include: creating focused, achievable sprint goals; protecting the sprint from scope changes; improving estimation with planning poker; and ensuring stories are truly 'ready' before sprint planning. The team should also reflect on this in retrospectives."
Sprint Predictability
Predictability = (Completed Points / Committed Points) × 100%
Target: 80-100%
Example:
Committed: 30 points
Completed: 24 points
Predictability: 80%
Interview Question: "How do you balance predictability with ambition in sprint planning?"
Strong Answer: "I'd use yesterday's weather—commit to what we actually delivered in recent sprints, not what we hope to deliver. This builds stakeholder trust through reliability. For ambition, I'd identify stretch goals that are valuable if completed but not committed. Over time, as the team improves processes and removes impediments, sustainable velocity naturally increases. Predictability enables better business planning."
Defect Escape Rate
What it measures: Bugs found in production vs. found during development.
Defect Escape Rate = (Production Bugs / Total Bugs Found) × 100%
Target: < 10%
Interview Question: "40% of bugs are being found in production. What's your quality strategy?"
Strong Answer: "High escape rates indicate quality issues in our development process. I'd implement: definition of done that includes testing requirements, automated testing at multiple levels (unit, integration, e2e), code review checklists that include test coverage, and QA involvement earlier in the process. I'd also analyze escaped defects to find patterns—are they in specific areas, from certain types of changes, or missing test scenarios?"
Traditional Project Management Metrics
Earned Value Management (EVM)
Key EVM Metrics:
- Planned Value (PV) = budgeted cost of scheduled work
- Earned Value (EV) = budgeted cost of completed work
- Actual Cost (AC) = actual cost of completed work
- Schedule Variance (SV) = EV - PV (positive = ahead of schedule, negative = behind schedule)
- Cost Variance (CV) = EV - AC (positive = under budget, negative = over budget)
- Schedule Performance Index (SPI) = EV / PV (above 1.0 = ahead of schedule, below 1.0 = behind schedule)
- Cost Performance Index (CPI) = EV / AC (above 1.0 = under budget, below 1.0 = over budget)
Interview Question: "Your project has SPI of 0.8 and CPI of 0.9. Interpret this and recommend actions."
Strong Answer: "SPI 0.8 means we're delivering 80% of planned work—we're behind schedule. CPI 0.9 means we're spending more than budgeted for work completed—we're over budget. The project is in trouble on both dimensions. I'd: reassess remaining scope to identify what can be cut or deferred, investigate why productivity is below plan (resource issues, requirements churn, technical problems), reset stakeholder expectations with revised forecasts, and implement tighter controls on remaining work."
Resource Utilization
Utilization Rate = (Billable Hours / Available Hours) × 100%
Target varies by role:
- Developers: 70-80% (need slack for learning, meetings)
- Consultants: 80-90% (billable target)
- Managers: 50-60% (coordination overhead)
Interview Question: "Your team's utilization is at 95%. Is this good or bad?"
Strong Answer: "95% utilization is actually concerning. It means the team has almost no slack for: handling unexpected issues, technical debt reduction, learning and improvement, helping teammates, or innovation. High utilization leads to burnout, quality issues, and inability to absorb any variation. I'd target 70-80% for sustainable performance, using the remaining capacity for improvement work and handling normal variability."
On-Time Delivery
On-Time Delivery Rate = (Projects Delivered On Time / Total Projects) × 100%
Interview Question: "Only 50% of your projects deliver on time. How do you improve this?"
Strong Answer: "I'd analyze why projects are late: scope creep, poor estimation, resource constraints, or external dependencies. Solutions include: better upfront requirements gathering, estimation techniques that account for uncertainty (PERT, Monte Carlo), buffer management, clear change control processes, and earlier identification of risks. I'd also examine if deadlines are being set realistically based on scope and resources, or if they're arbitrary targets."
Team Health Metrics
Team Morale / eNPS
What it measures: Employee satisfaction and likelihood to recommend the workplace.
Interview Question: "Your team's eNPS dropped from +30 to -10 over two quarters. How do you investigate and address this?"
Strong Answer: "This is a significant decline requiring immediate attention. I'd: conduct 1:1s to understand individual concerns, run an anonymous survey to identify specific issues, examine what changed (new leadership, project pressures, process changes), and look at adjacent metrics (turnover, sick days). Once root causes are identified, I'd create an action plan, communicate transparently about what we're doing to improve, and follow up regularly. Morale issues left unaddressed compound quickly."
Code Review Turnaround
Average Review Turnaround = Sum of (First Review - PR Opened) / Number of PRs
Target: < 24 hours to first review
Interview Question: "Code reviews are taking 3 days on average. How does this impact the team and how would you fix it?"
Strong Answer: "Slow reviews cause context switching (developers move to new work then must return), longer cycle times, larger review batches (making reviews harder), and frustration. I'd implement: SLAs for review response times, smaller PR sizes, pairing to reduce review need, rotating review duties, dedicated review time blocks, and tooling that surfaces stale PRs. The goal is fast feedback loops."
Metrics Anti-Patterns
Common Mistakes
Interview Question: "What are the dangers of measuring individual developer productivity?"
Strong Answer: "Measuring individuals creates perverse incentives: gaming metrics (inflating story points, padding commits), reduced collaboration (helping others hurts your numbers), and cherry-picking easy work. It ignores that software development is a team sport—the best developers often make others more productive through mentoring, architecture decisions, and unblocking work. I'd measure team outcomes and flow metrics instead of individual output."
Goodhart's Law
"When a measure becomes a target, it ceases to be a good measure."
Interview Question: "How do you prevent metrics from being gamed?"
Strong Answer: "I'd use multiple complementary metrics that are hard to game simultaneously—for example, both velocity AND quality metrics. I'd focus on outcome metrics (customer satisfaction, business results) rather than activity metrics (lines of code, PRs merged). I'd treat metrics as diagnostic tools for improvement, not performance targets. And I'd regularly rotate which metrics we emphasize to prevent gaming any single measure."
Building a Metrics Dashboard
Recommended Metrics by Role:
Engineering Manager:
- DORA metrics (deployment frequency, lead time, MTTR, change failure rate)
- Sprint predictability
- Team capacity utilization
- Code review turnaround
- Defect escape rate
Product Manager:
- Velocity/throughput
- Cycle time
- Sprint goal success rate
- Customer-reported issues
- Feature adoption rates
Executive/Stakeholder:
- On-time delivery rate
- Budget variance
- Team satisfaction (eNPS)
- Customer satisfaction (NPS)
- Business outcome metrics
Conclusion
Effective metrics drive improvement when used thoughtfully. Focus on balanced scorecards that measure multiple dimensions, emphasize team outcomes over individual output, and remember that metrics are tools for learning and improvement—not weapons for blame. In interviews, demonstrate that you understand both what to measure and the human dynamics around measurement.