How this page works. Edit a field and the formula, the diagram and the live preview update immediately with the draft (the preview is recomputed by the real engine on the real data; nothing is saved yet). Press Save to apply the settings everywhere: there is exactly one configuration, shared with the Overview. Defaults are exactly what the metrics document defines; any deviation from a document value is flagged “deviates from the standard document”. Source: “[WIP]: Core Engineering metrics” in Confluence ↗
Standardization rules — Definition of Done & Epics (applies to every metric)
Jira and ADO will be the main systems we need to rely on in order to track the metrics. Before that is described, we need to define a certain set of rules for how we approach work classification, specifically on Jira.
Definition of Done — Several metrics depend on the way we interpret work to be done. Each team should have a clear DoD (Definition of Done) and that should translate naturally to Jira. Usually a good DoD should be "live and enabled", meaning the development reached the production environment and is enabled (feature toggle or AB test is switched on). Typical states on Jira that translate to work done: Done; Resolved; Closed.
⚠ "Release on Stage" shouldn't be treated as work done. "Release on Live" shouldn't be treated as work done if the flow itself already includes the states defined above.
Epics usage on Jira — Epics are usually used to track teams delivery, however some Product Owners are treating Epics as initiative aggregators to join all work done pre and post initiative delivery. This means that an Epic can be in progress while all work items below that Epic are done. This kind of approach can destroy how an engineering team tracks delivery oriented metrics. For such reasons, the tracking tool should have an option to exclude Epics from its metric compute logic (per project).
ADS workflow — real statuses & how the current rules treat them
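[Workflow diagram. Statuses in diagram order, with the diagram's duration labels kept in their original positions: Backlog · Approved for Sprint · Approved · To Do · New · Commited · Open · Selected for Development · (1.5d) · In Progress · (72.2d) · Failed On Testing · Pending Review · Ready To Test · In Testing · On Hold · Ready For Staging · Ready For Review · (35.7d) · CODE REVIEW · (3.6d) · Waiting for Client · Ready For Canary · Ready For Live · Ready To Merge · (5d) · In Deployment · (32.3d) · Removed · ✓ Done · Ready]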
Every team has its own status names; the document's standard names are only the defaults. Legend: green = time counts, grey = waiting (passive list), light = intake/new, ✓ = done.
Data quality — real gaps in the data (not hidden) (6)
• 5 of 7 bugs have no severity value (fields: customfield_10050, customfield_10511, customfield_10093)
• 0 of 7 bugs carry a prod/pre-prod environment label, so the Prod vs Pre-Prod split is unavailable
• 29 completed tickets have no associated PR (key in PR title/branch)
• 4 tickets only have PRs opened before work start (TTFP anomaly bucket)
• IMC: 1 incident mapped to ADS in window (0 linked to ADS issues) · 5 routed to other teams · 0 test
• IMC environment / remediation / deployment-link are not ingested, so the canonical CFR (C5) inputs are unavailable; ask the IMC process to fill them in Jira first
Delivery & Velocity
Lead Time
“The Lead Time should measure the time between when the ticket was created until it is done. The purpose is to measure the full end-to-end customer experience.”
Full text from the document
The Lead Time should measure the time between when the ticket was created until it is done. The purpose is to measure the full end-to-end customer experience.
Rules:
• Done aligned with Team's DoD
• Waiting time should not be excluded. Waiting time can be:
– Non working days (Weekends, bank holidays and vacations)
– On Hold/Blocked states
• Present time in days
Open the source page in Confluence ↗
Formula — follows the fields below
lead = done_at − created_at // calendar days — weekends & holidays counted
done_at = moment of ENTERING a done status [Done, Resolved, Closed] (last such entry in window)
// time spent INSIDE the done status is NOT counted
cohort − issue types [Epic]
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: median 2 days over 43 tickets
Real example: ADS-1324 — done 2026-06-03 = 2.0 days (median-nearest real ticket)
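The Lead Time formula above can be sketched in Python as follows. This is a minimal illustration, not the engine's implementation: the `Ticket` shape, the `(entered_at, status)` transition list and the function names are assumptions made for the example, and the sample ticket is synthetic.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

DONE_STATUSES = {"Done", "Resolved", "Closed"}  # defaults named in the document
EXCLUDED_TYPES = {"Epic"}                       # cohort minus issue types [Epic]

@dataclass
class Ticket:  # hypothetical shape, for illustration only
    key: str
    issue_type: str
    created_at: datetime
    # Chronological list of (entered_at, status) pairs.
    transitions: list = field(default_factory=list)

def done_at(ticket: Ticket, start: datetime, end: datetime) -> Optional[datetime]:
    """Last moment the ticket ENTERED a done status inside the window."""
    entries = [ts for ts, status in ticket.transitions
               if status in DONE_STATUSES and start <= ts <= end]
    return max(entries) if entries else None

def lead_time_days(ticket: Ticket, start: datetime, end: datetime) -> Optional[float]:
    if ticket.issue_type in EXCLUDED_TYPES:
        return None
    finished = done_at(ticket, start, end)
    if finished is None:
        return None
    # Calendar time: weekends, holidays and on-hold periods are all counted.
    return (finished - ticket.created_at).total_seconds() / 86400

window = (datetime(2026, 5, 10), datetime(2026, 6, 9))
demo = Ticket("DEMO-1", "Task", datetime(2026, 6, 1, 9, 0), [
    (datetime(2026, 6, 1, 12, 0), "In Progress"),
    (datetime(2026, 6, 3, 9, 0), "Done"),
])
print(lead_time_days(demo, *window))  # 2.0
```

Time spent inside the done status never enters the result because only the entry timestamp is used.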
Cycle Time
“The Cycle Time should measure the time from when the ticket is put in progress until it is done. The purpose is to measure the actual work that was done.”
Full text from the document
The Cycle Time should measure the time from when the ticket is put in progress until it is done. The purpose is to measure the actual work that was done. Active work is usually the term used for cycle time; however, the word active gets easily misinterpreted. Cycle time is supposed to include waiting time that many times appears after it started. Effective hours/days spent on concrete engineering work without any waiting time is typically something else that's not traditional cycle time (please check Active Work Time).
Rules:
• Same rules as Lead Time
• In Jira, active work usually starts when setting a ticket to the next states: In Progress / Any other?
• If the ticket goes back to New, the time it remains in New should not be included when the ticket goes past In Progress again.
Open the source page in Confluence ↗
Formula — follows the fields below
cycle = done_at − first(status of Jira category “In Progress”) − Σ(time in “new”-category statuses after start)
// done_at = ENTRY into a done status; time inside it not counted
// calendar days — weekends & holidays counted
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: median 1.7 days
Real example: ADS-1311 — lead 16.7 d, cycle 7.7 d
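A minimal sketch of the Cycle Time formula, assuming the same hypothetical `(entered_at, status)` transition list. The membership of the “In Progress” category set is an assumption; the “new”-category set reuses the intake statuses listed for ADS on this page. The sample transitions are synthetic.

```python
from datetime import datetime

DONE_STATUSES = {"Done", "Resolved", "Closed"}
IN_PROGRESS_CATEGORY = {"In Progress"}  # assumption: real category may hold more statuses
NEW_CATEGORY = {"Approved", "Approved for Sprint", "Backlog", "Commited",
                "New", "Open", "Selected for Development", "To Do"}

def cycle_time_days(transitions):
    """transitions: chronological list of (entered_at, status)."""
    start = next((ts for ts, s in transitions if s in IN_PROGRESS_CATEGORY), None)
    done_entries = [ts for ts, s in transitions if s in DONE_STATUSES]
    if start is None or not done_entries:
        return None
    end = max(done_entries)
    if end < start:
        return None
    # Subtract time spent back in "new"-category statuses after work started.
    bounced = 0.0
    for (ts, status), (next_ts, _) in zip(transitions, transitions[1:]):
        if status in NEW_CATEGORY and ts >= start and next_ts <= end:
            bounced += (next_ts - ts).total_seconds()
    return ((end - start).total_seconds() - bounced) / 86400

demo_transitions = [
    (datetime(2026, 6, 1, 9, 0), "In Progress"),
    (datetime(2026, 6, 2, 9, 0), "New"),          # bounced back for one day
    (datetime(2026, 6, 3, 9, 0), "In Progress"),
    (datetime(2026, 6, 5, 9, 0), "Done"),
]
print(cycle_time_days(demo_transitions))  # 3.0
```

Four calendar days elapse between the first “In Progress” and “Done”; the one day spent back in “New” is removed, leaving three.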
Active Work Time
“Active Work Time tracks only when the team is actively working, removing all waiting time from the cycle time. It accurately reflects the work required by the Engineering teams.”
Full text from the document
Active Work Time tracks only when the team is actively working, removing all waiting time from the cycle time. It accurately reflects the work required by the Engineering teams. Besides the waiting time already covered on the Cycle Time and Lead Time there are 2 other layers of waiting time that happen every day.
State transitions — All Jira statuses and how they are meant to be used either represent active time, passive time (wait time), or a mix of both. Examples:
• In Progress — mostly active
• Code Review — a mix of active and passive. Most of the time, moving a ticket for review means waiting on someone else to review a PR. The active work on that state is the actual time each engineer spent reviewing the PR.
• On Hold/Blocked — passive time as the team can't proceed
• Ready To Test — passive time; there is no active time here
• In Testing — similar to In Progress and is mostly active time
• Ready to merge — mostly passive; normally a residual active time
• Release on Stage / Release on Live — mix of active and passive
Working hours — The Engineering teams don't work 24/7 and per definition this metric already excludes non-working days and should also exclude non-working hours on working days, typically 8 or 9 hours of work. We are currently not tracking the time spent on each ticket in each working day. It's possible that someone is actually working on multiple tickets or even involved in meetings and other activities that aren't trackable or not worth it. So this is an attempt of standardizing 8/9 hours of work per day.
Rules:
• Done following the same rules as Lead Time and Cycle Time
• All waiting time defined in Lead Time and Cycle Time should be excluded
– Vacations can potentially benefit from an integration with Bamboo HR
– On Hold/Block assume such states need to be mapped
• Exclude the following states defined in the State Transitions section: On Hold/Blocked; Failed on QA; Ready To Test; Ready To Merge; Any other?
• Implement a logic to comply with the Working Hours section so that non-working hours are excluded. This doesn't need to be perfect, but at least on cycle times of multiple days, it will allow removing most of the time when people are not actually working.
• Present time in days
Open the source page in Confluence ↗
Formula — follows the fields below
active_hours = Σ_workdays min(hours in non-passive statuses, 8h)
shown = active_hours ÷ 24 // unit: calendar days — same scale as Lead/Cycle, active ≤ cycle holds
passive = [On Hold, Blocked, Failed on QA, Ready To Test, Ready To Merge]
workday = weekday ∧ ¬assignee-vacation (BambooHR) ∧ ¬country-holiday
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: median 0.7 days (chart scale: hours ÷ 24) over 43 tickets
Real example: ADS-1311 — active 2.3 d chart-scale = 56h (Code Review stays active: the document calls it a mix)
• ADS's real workflow statuses under the current settings. Time COUNTS in: CODE REVIEW, Failed On Testing, In Deployment, In Progress, In Testing, Pending Review, Ready, Ready For Canary, Ready For Live, Ready For Review, Ready For Staging, Removed, Waiting for Client
• excluded as WAITING (passive list): On Hold, Ready To Merge, Ready To Test
• excluded as intake/bounce ("new"-category): Approved, Approved for Sprint, Backlog, Commited, New, Open, Selected for Development, To Do
• BambooHR live data feeding the vacation/holiday rule: 91 employees (90 linked to Jira by work email) · 1152 approved vacation requests · 56 public-holiday days (?? 1, IE 4, PT 29, US 22)
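The Active Work Time formula (hours in non-passive statuses, capped at 8h per workday, then divided by 24) can be sketched as below. The interval shape `(start, end, status)`, the `days_off` set standing in for the BambooHR vacation and holiday data, and the function names are assumptions; the sketch also assumes the caller passes only intervals between work start and done, with “new”-category time already removed.

```python
from collections import defaultdict
from datetime import date, datetime, time, timedelta

PASSIVE = {"On Hold", "Blocked", "Failed on QA", "Ready To Test", "Ready To Merge"}
DAILY_CAP_HOURS = 8.0

def is_workday(day: date, days_off) -> bool:
    # Weekday, and neither assignee vacation nor public holiday.
    return day.weekday() < 5 and day not in days_off

def active_hours(intervals, days_off=frozenset()) -> float:
    per_day = defaultdict(float)
    for start, end, status in intervals:
        if status in PASSIVE:
            continue
        cursor = start
        while cursor < end:  # split the interval at each midnight
            midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
            chunk_end = min(end, midnight)
            per_day[cursor.date()] += (chunk_end - cursor).total_seconds() / 3600
            cursor = chunk_end
    return sum(min(hours, DAILY_CAP_HOURS)
               for day, hours in per_day.items() if is_workday(day, days_off))

demo_intervals = [  # synthetic; 2026-06-01 is a Monday
    (datetime(2026, 6, 1, 9, 0), datetime(2026, 6, 2, 12, 0), "In Progress"),
    (datetime(2026, 6, 2, 12, 0), datetime(2026, 6, 3, 10, 0), "Ready To Test"),
    (datetime(2026, 6, 3, 10, 0), datetime(2026, 6, 3, 15, 0), "In Testing"),
]
hours = active_hours(demo_intervals)
print(hours, hours / 24)  # 21.0 0.875
```

Monday and Tuesday each hit the 8h cap, Wednesday contributes 5h, and the “Ready To Test” interval is skipped as passive, so the chart-scale value is 21 ÷ 24.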
Time to First PR
“The Time to First PR measures the time between when a ticket was put in progress until an associated PR is opened.”
Full text from the document
The Time to First PR measures the time between when a ticket was put in progress until an associated PR is opened. The intention is to measure how quickly engineers can turn a ticket into a reviewable implementation.
Rules:
• Same rules as Cycle Time
Open the source page in Confluence ↗
Formula — follows the fields below
ttfp = first(associated PR.created_at ≥ first(status of Jira category “In Progress”)) − first(status of Jira category “In Progress”) − Σ(time back in “new”-category statuses) // calendar days — weekends & holidays counted; same rules as Cycle Time
associated ⇔ PR title or source branch mentions the issue key
// only-earlier-PR tickets = anomaly bucket (excluded, disclosed) — spec M5 / T007
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: median 0 days · 10 measured, 29 without a PR, 4 PR-before-start anomalies excluded
Real example: ADS-1311 — started 2026-06-04, first PR 2026-06-04 = 0.0 d
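A hedged sketch of the Time to First PR rule, including the three buckets reported in the preview (measured, no PR, PR-before-start anomaly). The PR dictionary keys, the bucket names and the exact key-matching pattern are assumptions for illustration; the page only states that the title or source branch must mention the issue key.

```python
import re
from datetime import datetime

def is_associated(issue_key: str, pr: dict) -> bool:
    # Match the key when it is not embedded in a longer key (e.g. DEMO-7 vs DEMO-70).
    pattern = re.compile(rf"(?<![A-Za-z0-9]){re.escape(issue_key)}(?!\d)", re.IGNORECASE)
    return bool(pattern.search(pr["title"]) or pattern.search(pr["source_branch"]))

def time_to_first_pr(issue_key, started_at, prs, bounce_seconds=0.0):
    """Return (bucket, days). bounce_seconds = time back in "new"-category statuses."""
    linked = [pr for pr in prs if is_associated(issue_key, pr)]
    if not linked:
        return ("no_pr", None)
    after_start = [pr["created_at"] for pr in linked if pr["created_at"] >= started_at]
    if not after_start:
        return ("anomaly", None)  # only PRs opened before work start: excluded, disclosed
    seconds = (min(after_start) - started_at).total_seconds() - bounce_seconds
    return ("measured", seconds / 86400)

demo_prs = [  # synthetic pull requests
    {"title": "DEMO-7 add cache", "source_branch": "feature/cache",
     "created_at": datetime(2026, 6, 4, 15, 0)},
    {"title": "tidy up", "source_branch": "chore/DEMO-70_tidy",
     "created_at": datetime(2026, 6, 4, 10, 0)},
]
print(time_to_first_pr("DEMO-7", datetime(2026, 6, 4, 9, 0), demo_prs))  # ('measured', 0.25)
```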
Lead Time for Changes
“Lead Time for Changes measures the time it takes for a committed code change to successfully run in production.”
Full text from the document
Lead Time for Changes is a core DevOps Research and Assessment metric that measures the time it takes for a committed code change to successfully run in production. The intent is to measure the actual responsiveness of our delivery system, not just the active engineering effort.
Essentially the Lead Time for Changes will behave like the Cycle time, the only difference is that it starts to count when an associated commit is pushed instead of the ticket being set to In Progress.
Rules:
• Same rules as Cycle Time
(From the meeting: "the definition was not… the hours from pull request merge to deployment to production… it's the time of the first commit until that commit lands in production. It's different from the merge.")
Open the source page in Confluence ↗
Formula — follows the fields below
ltc = first(prod deploy after merge, result ∈ [succeeded]) − first_commit
prod ⇔ env name matches [production, prod, prd, live, stable] (lower tokens win first) // calendar days — weekends & holidays counted
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: median 3.2 days over 208 changes
Real example: PR #32888 (jarvis-admin) — first commit 2026-05-22 → production 2026-06-08 = 16.8 d
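The Lead Time for Changes formula can be sketched as below, under stated assumptions: environment names are split into lowercase tokens before matching, and a guard list of lower-environment tokens is checked first so that a name like "pre-prod" is not treated as production. That guard is only one possible reading of “lower tokens win first” and is not confirmed by the page; the deploy dictionary keys are likewise hypothetical.

```python
import re
from datetime import datetime

PROD_TOKENS = {"production", "prod", "prd", "live", "stable"}
# Assumption: lower-environment tokens, checked before the production tokens.
LOWER_ENV_TOKENS = {"pre", "preprod", "staging", "stage", "qa", "dev", "test"}
SUCCESS_RESULTS = {"succeeded"}

def is_prod(env_name: str) -> bool:
    tokens = set(re.split(r"[^a-z0-9]+", env_name.lower()))
    if tokens & LOWER_ENV_TOKENS:
        return False
    return bool(tokens & PROD_TOKENS)

def lead_time_for_changes_days(first_commit_at, merged_at, deploys):
    """First successful production deploy after the merge, minus the first commit."""
    candidates = [d["finished_at"] for d in deploys
                  if is_prod(d["environment"])
                  and d["result"] in SUCCESS_RESULTS
                  and d["finished_at"] >= merged_at]
    if not candidates:
        return None
    return (min(candidates) - first_commit_at).total_seconds() / 86400

demo_deploys = [  # synthetic deployments
    {"environment": "staging", "result": "succeeded", "finished_at": datetime(2026, 6, 5, 12, 0)},
    {"environment": "prod-eu", "result": "failed", "finished_at": datetime(2026, 6, 6, 9, 0)},
    {"environment": "prod-eu", "result": "succeeded", "finished_at": datetime(2026, 6, 8, 8, 0)},
]
print(lead_time_for_changes_days(datetime(2026, 5, 22, 8, 0),
                                 datetime(2026, 6, 5, 10, 0), demo_deploys))  # 17.0
```

The clock starts at the first commit, not at the merge, which matches the clarification quoted from the meeting.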
Quality & Reliability
Prod / Pre-Prod Bug Rate
Prod Bug Rate (MUST HAVE) + Pre-Prod Bug Rate (NICE TO HAVE), p.6 ↗
“The Prod Bug Rate measures the number of bugs that were reported in the production environment, broken down per severity; Pre-Prod is the same for non-productive environments.”
Full text from the document
Prod Bug Rate — The Prod Bug Rate measures the number of bugs that were reported in the production environment. On top of it, it should also break down this number per severity. These bugs can be created by the Incident Management process or any team member. The important part is that it should account for production bugs.
This metric should also offer a way to look at the number of bugs in a rate fashion, like the average number of bugs per week.
Pre-Prod Bug Rate — This is essentially the same as the Prod Bug Rate but for all bugs found on non-productive environments.
Open the source page in Confluence ↗
Formula — follows the fields below
bugs = count(type ∈ [Bug, Incident], created in window); rate = bugs ÷ weeks
severity = first present of [customfield_10050, customfield_10511, customfield_10093]
env: prod labels [production, prod] · pre-prod [pre-prod, preprod, pre-production, staging, qa] · none → unknown
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 7 bugs · 1.6/week · environment unknown (no env labels)
Real example: ADS-1318 — severity 1 - Critical, reported 2026-06-01 · severity split: 1 - Critical: 1 · 2 - High: 1 · Unspecified: 5
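A minimal sketch of the bug-rate computation. The issue dictionary shape is an assumption, the records are synthetic and merely shaped to match the counts shown in the preview, and the 30-day window is an assumption that is consistent with the displayed rate but not stated on the page.

```python
from collections import Counter

BUG_TYPES = {"Bug", "Incident"}
SEVERITY_FIELDS = ["customfield_10050", "customfield_10511", "customfield_10093"]
PROD_LABELS = {"production", "prod"}
PRE_PROD_LABELS = {"pre-prod", "preprod", "pre-production", "staging", "qa"}

def severity_of(fields: dict) -> str:
    # First field in the configured order that carries a value.
    for name in SEVERITY_FIELDS:
        value = fields.get(name)
        if value not in (None, ""):
            return value
    return "Unspecified"

def environment_of(labels) -> str:
    lowered = {label.lower() for label in labels}
    if lowered & PROD_LABELS:
        return "prod"
    if lowered & PRE_PROD_LABELS:
        return "pre-prod"
    return "unknown"

def bug_rate(issues, window_days: int) -> dict:
    """issues: records already filtered to those created in the window."""
    bugs = [i for i in issues if i["type"] in BUG_TYPES]
    weeks = window_days / 7
    return {
        "count": len(bugs),
        "per_week": round(len(bugs) / weeks, 1),
        "severity": Counter(severity_of(b["fields"]) for b in bugs),
        "environment": Counter(environment_of(b["labels"]) for b in bugs),
    }

demo_issues = (
    [{"type": "Bug", "fields": {"customfield_10050": "1 - Critical"}, "labels": []},
     {"type": "Bug", "fields": {"customfield_10511": "2 - High"}, "labels": []}]
    + [{"type": "Bug", "fields": {}, "labels": []} for _ in range(5)]
    + [{"type": "Task", "fields": {}, "labels": []}]
)
result = bug_rate(demo_issues, window_days=30)
print(result["count"], result["per_week"])  # 7 1.6
```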
Change Failure Rate (CFR)
“CFR measures the percentage of software deployments to production that result in a degraded service and require immediate remediation.”
Full text from the document
Change Failure Rate (CFR) is a core DORA metric that measures the percentage of software deployments to production that result in a degraded service, impairment, or outage and subsequently require immediate remediation. It evaluates the quality and stability of your delivery pipeline, ensuring that speed (measured by Lead Time) does not come at the expense of stability.
A deployment is a failure if it requires:
• An emergency rollback to a previous version.
• A hotfix or urgent "fix-forward" patch shipped outside the normal cadence, for example, a major incident (P0 or P1) triggered by the Incident Management process but only if the cause was effectively from the code that was shipped.
Rules:
• Change Failure Rate is a percentage metric calculated as the (number of failed deployments / total number of production deployments) x 100
• Failed deployments obtained as mentioned before
Open the source page in Confluence ↗
Formula — follows the fields below
cfr = failed ÷ eligible_production_deploys × 100 // eligible = result ∈ success ∪ failure; canceled/notDeployed excluded
failed ⇔ result ∈ [failed] ∨ manual flag // bridge until Incident Management
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 0 failed / 13 prod deploys = 0%
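The CFR formula above can be sketched as follows. The result strings and the `manual_failure` flag name are assumptions for illustration; the deployments are synthetic and shaped to match the preview counts.

```python
ELIGIBLE_RESULTS = {"succeeded", "failed"}  # canceled / notDeployed are excluded

def change_failure_rate(prod_deploys):
    """Percentage of eligible production deploys that count as failed."""
    eligible = [d for d in prod_deploys if d["result"] in ELIGIBLE_RESULTS]
    if not eligible:
        return None  # nothing to divide by
    failed = [d for d in eligible
              if d["result"] == "failed" or d.get("manual_failure", False)]
    return len(failed) / len(eligible) * 100

demo_deploys = [{"result": "succeeded"} for _ in range(13)] + [{"result": "canceled"}] * 2
print(change_failure_rate(demo_deploys))  # 0.0

# A deploy that succeeded technically but was manually flagged (e.g. after a rollback).
flagged = demo_deploys[:12] + [{"result": "succeeded", "manual_failure": True}]
print(round(change_failure_rate(flagged), 1))  # 7.7
```

The manual flag is the bridge mentioned in the formula: it lets a technically successful deploy count as failed until Incident Management data is ingested.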
Flow & Predictability
Throughput
“The Throughput should measure the number of completed tickets within a timeframe as well as the average number of completed tickets per week.”
Full text from the document
The Throughput should measure the number of completed tickets within a timeframe as well as the average number of completed tickets per week. Usually the latter is more useful to explain what's the usual flow of the team on a week basis.
Rules:
• If Epics are excluded from delivery, they should not count for the throughput.
(From the meeting: "you split this by agents and humans, which is pretty cool"; "having an average of how many tickets per week… we think more of weekly basis".)
Open the source page in Confluence ↗
Formula — follows the fields below
throughput = count(done in window) ; per_week = count ÷ weeks
split: assignee ∈ AI agent roster → agent, else human
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 43 tickets · 10/week — agents 8 / humans 35
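A short sketch of the throughput count with the agent/human split and the Epic exclusion rule. The ticket shape, the roster contents and the 30-day window are assumptions (the window is consistent with the displayed rate but not stated); the tickets are synthetic.

```python
AI_AGENT_ROSTER = {"agent-bot-1", "agent-bot-2"}  # hypothetical roster entries

def throughput(done_tickets, window_days, exclude_epics=True):
    tickets = [t for t in done_tickets
               if not (exclude_epics and t["type"] == "Epic")]
    total = len(tickets)
    agents = sum(1 for t in tickets if t.get("assignee") in AI_AGENT_ROSTER)
    return {
        "total": total,
        "per_week": round(total / (window_days / 7), 1),
        "agents": agents,
        "humans": total - agents,  # anything not on the roster counts as human
    }

demo_done = (
    [{"type": "Task", "assignee": "agent-bot-1"} for _ in range(8)]
    + [{"type": "Story", "assignee": "someone"} for _ in range(35)]
    + [{"type": "Epic", "assignee": "someone"}]  # dropped by the Epic rule
)
print(throughput(demo_done, window_days=30))
# {'total': 43, 'per_week': 10.0, 'agents': 8, 'humans': 35}
```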
Deployment Frequency
“The Deployment Frequency measures how often a team/project successfully releases code to the production environment — total plus Daily / Weekly / Monthly rates.”
Full text from the document
The Deployment Frequency measures how often a team/project successfully releases code to the production environment. It serves as a primary indicator of the team's overall throughput, agility and batch size. Like Throughput it should show the total number of successful production deployments for a given timeframe as well as a rate basis. There should be 3 types of rate:
• Daily
• Weekly
• Monthly
(From the meeting: "even if it said 50 deployments in those 30 days, it doesn't tell me what usually I think about, which is number of deploys per week or per day… per day, per week, per month — it's really interesting to have those numbers right there.")
Open the source page in Confluence ↗
Formula — follows the fields below
df = count(prod deploys, result ∈ [succeeded])
rates: ÷ days · ÷ weeks · ÷ (30-day months)
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 13 prod deploys · 0.4/day · 3/week · 13/month
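The three rates follow directly from the count and the window length. A minimal sketch, assuming a 30-day window (consistent with the preview figures, though not stated on the page) and a hypothetical function name:

```python
def deployment_frequency(successful_prod_deploys: int, window_days: int) -> dict:
    """Total plus daily, weekly and monthly (30-day) rates."""
    return {
        "total": successful_prod_deploys,
        "per_day": successful_prod_deploys / window_days,
        "per_week": successful_prod_deploys / (window_days / 7),
        "per_month": successful_prod_deploys / (window_days / 30),
    }

rates = deployment_frequency(13, window_days=30)
print(round(rates["per_day"], 1), round(rates["per_week"], 1), rates["per_month"])
# 0.4 3.0 13.0
```

With a window of exactly 30 days, the monthly rate equals the total, which is why the preview shows 13 for both.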
Planned vs Unplanned Work
“Measures how the flow of a team can be impacted by unexpected work; the classification depends on whether the team reserves maintenance capacity.”
Full text from the document
The Planned vs Unplanned Work measures how the flow of a team can be impacted by unexpected work. The distinction between planned and unplanned work separates value-generating progress from unpredictable operational friction. Unplanned work is any unexpected task, disruption, or emergency that breaks the current iteration or sprint focus. It represents the operational tax paid for system instability, technical debt, or shifting priorities.
Maintenance vs Unplanned Work — Usually, teams should reserve capacity for maintenance work (per sprint, per week), which means that if there is unplanned work, like an urgent issue to fix, this should be treated as planned work if the effort falls under the reserved capacity. If more capacity is needed for the same or other tasks, then it becomes Unplanned work.
Rules:
• If the team does not make any capacity planning for maintenance, any kind of deviation, anything that wasn't planned, should be interpreted as unplanned work (bugs, support requests, stakeholders requests, …). For this scenario, capture unplanned work by: Status Bug, Incident; Label: unplanned_work
• If the team reserves capacity for maintenance, then unplanned work should only be accountable after the reserved capacity has been maxed out. For this scenario, capture unplanned work by: Label: unplanned_work
Open the source page in Confluence ↗
Formula — follows the fields below
unplanned ⇔ type ∈ [Bug, Incident] ∨ label ∈ [unplanned_work]
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 40 planned / 3 unplanned = 7% (no-reserved-capacity)
Real example: ADS-1318 (Bug) counted unplanned under the current mode
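The two classification modes described in the rules can be sketched as below. The `reserves_capacity` switch and the ticket shape are assumptions for illustration; the tickets are synthetic and shaped to match the preview counts.

```python
UNPLANNED_TYPES = {"Bug", "Incident"}
UNPLANNED_LABEL = "unplanned_work"

def is_unplanned(ticket: dict, reserves_capacity: bool) -> bool:
    labelled = UNPLANNED_LABEL in ticket.get("labels", [])
    if reserves_capacity:
        # With reserved maintenance capacity, only the explicit label counts.
        return labelled
    # Without capacity planning, bugs and incidents count as unplanned too.
    return labelled or ticket["type"] in UNPLANNED_TYPES

def unplanned_share(tickets, reserves_capacity: bool) -> float:
    unplanned = sum(1 for t in tickets if is_unplanned(t, reserves_capacity))
    return unplanned / len(tickets) * 100 if tickets else 0.0

demo_tickets = ([{"type": "Task", "labels": []} for _ in range(40)]
                + [{"type": "Bug", "labels": []} for _ in range(3)])
print(round(unplanned_share(demo_tickets, reserves_capacity=False)))  # 7
print(round(unplanned_share(demo_tickets, reserves_capacity=True)))   # 0
```

The same unlabelled bugs move from unplanned to planned when the team switches to the reserved-capacity mode, so the mode should be shown next to the percentage, as the preview does.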
PR Throughput + PR Acceptance Rate
PR Throughput + PR Acceptance Rate (NICE TO HAVE, AI DRIVEN), p.8 ↗
“PR Throughput = merged PRs within a timeframe plus the average merged per week; PR Acceptance Rate = merged vs abandoned.”
Full text from the document
PR Throughput — The PR Throughput should measure the number of merged PRs within a timeframe as well as the average number of merged PR per week.
PR Acceptance Rate — The PR Acceptance rate should measure the percentages of merged PRs vs the ones that were abandoned.
Open the source page in Confluence ↗
Formula — follows the fields below
pr_throughput = count(merged in window) ; per_week = merged ÷ weeks
acceptance = merged ÷ (merged + abandoned) × 100
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 213 merged · 49.7/week · 12 abandoned → acceptance 94.7%
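A small sketch of both PR metrics. The status strings `"merged"` and `"abandoned"` are placeholders (the actual values depend on the source system), and the 30-day window is an assumption consistent with the preview; the PR list is synthetic.

```python
def pr_metrics(prs, window_days: int) -> dict:
    merged = sum(1 for pr in prs if pr["status"] == "merged")
    abandoned = sum(1 for pr in prs if pr["status"] == "abandoned")
    closed = merged + abandoned  # open PRs are neither accepted nor rejected yet
    return {
        "merged": merged,
        "per_week": round(merged / (window_days / 7), 1),
        "acceptance_pct": round(merged / closed * 100, 1) if closed else None,
    }

demo_prs = ([{"status": "merged"}] * 213
            + [{"status": "abandoned"}] * 12
            + [{"status": "open"}] * 4)
print(pr_metrics(demo_prs, window_days=30))
# {'merged': 213, 'per_week': 49.7, 'acceptance_pct': 94.7}
```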
AI Participation Rate
“The AI Participation Rate should measure the percentages of tickets done that had an AI Agent Assigned.”
Full text from the document
The AI Participation Rate should measure the percentages of tickets done that had an AI Agent Assigned.
Open the source page in Confluence ↗
Formula — follows the fields below
participation = count(done ∧ assignee ∈ AI agents) ÷ count(done) × 100
How it is measured — updates as you type
Live preview — real data, saved settings
Result with these settings: 8 of 43 done tickets = 18.6%
Real example: ADS-1336 (Task) — done by an AI agent
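The participation formula reduces to a guarded percentage. A minimal sketch, assuming a hypothetical roster and that unassigned tickets count as not agent-assigned; the assignee list is synthetic.

```python
def ai_participation_pct(done_assignees, agent_roster):
    """done_assignees: one assignee (or None) per done ticket."""
    if not done_assignees:
        return None  # no done tickets, so the rate is undefined
    by_agent = sum(1 for a in done_assignees if a is not None and a in agent_roster)
    return by_agent / len(done_assignees) * 100

roster = {"agent-bot-1"}  # hypothetical roster entry
assignees = ["agent-bot-1"] * 8 + ["someone"] * 34 + [None]
print(round(ai_participation_pct(assignees, roster), 1))  # 18.6
```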
Source of every rule on this page: “[WIP]: Core Engineering metrics” — Confluence ↗ + the metrics meeting of 2026-06-08. Quotes are verbatim.