Core Engineering Metrics
How this page works. Edit a field and the formula, the diagram and the live preview update immediately with the draft (the preview is recomputed by the real engine on the real data; nothing is saved yet). Press Save to apply the settings everywhere: there is exactly one configuration, shared with the Overview. Defaults are exactly what the metrics document defines; any deviation from a document value is flagged as "deviates from the standard document". Source: “[WIP]: Core Engineering metrics” in Confluence ↗
Standardization rules — Definition of Done & Epics (applies to every metric)
Jira and ADO are the main systems we rely on to track the metrics. Before describing the metrics, we need to define a set of rules for how we classify work, specifically in Jira.

Definition of Done: Several metrics depend on how we interpret work as done. Each team should have a clear DoD (Definition of Done), and it should translate naturally to Jira. A good DoD is usually "live and enabled", meaning the development reached the production environment and is enabled (the feature toggle or AB test is switched on). Typical Jira states that translate to work done: Done; Resolved; Closed.

⚠ "Release on Stage" shouldn't be treated as work done. "Release on Live" shouldn't be treated as work done if the flow itself already includes the states defined above.

Epics usage on Jira: Epics are usually used to track a team's delivery; however, some Product Owners treat Epics as initiative aggregators that join all work done before and after the initiative's delivery. This means that an Epic can be in progress while all work items below it are done. This approach can distort how an engineering team tracks delivery-oriented metrics. For this reason, the tracking tool should have an option to exclude Epics from its metric computation logic (per project).

ADS workflow — real statuses & how the current rules treat them

[Workflow diagram. Statuses shown: Backlog, Approved for Sprint, Approved, To Do, New, Commited, Open, Selected for Development, In Progress, Failed On Testing, Pending Review, Ready To Test, In Testing, On Hold, Ready For Staging, Ready For Review, CODE REVIEW, Waiting for Client, Ready For Canary, Ready For Live, Ready To Merge, In Deployment, Removed, Done, Ready. Duration labels shown in the diagram: 1.5d, 72.2d, 35.7d, 3.6d, 5d, 32.3d.]

Every team has its own status names; the document's standard names are only the defaults. Legend: green = time counts, grey = waiting (passive list), light = intake/new, ✓ = done.

Data quality — real gaps in the data (not hidden) (6)
  • 5 of 7 bugs have no severity value (fields: customfield_10050, customfield_10511, customfield_10093)
  • 0 of 7 bugs carry a prod/pre-prod environment label — the Prod vs Pre-Prod split is unavailable
  • 29 completed tickets have no associated PR (key in PR title/branch)
  • 4 tickets only have PRs opened before work start — TTFP anomaly bucket
  • IMC: 1 incident mapped to ADS in window (0 linked to ADS issues) · 5 routed to other teams · 0 test
  • IMC environment / remediation / deployment-link are not ingested — the canonical CFR (C5) inputs are unavailable; ask the IMC process to fill them in Jira first

Delivery & Velocity

Lead Time

The Lead Time should measure the time from when the ticket was created until it is done. The purpose is to measure the full end-to-end customer experience.
Full text from the document
The Lead Time should measure the time from when the ticket was created until it is done. The purpose is to measure the full end-to-end customer experience.
Rules:
  • Done aligned with the team's DoD
  • Waiting time should not be excluded. Waiting time can be:
      – Non-working days (weekends, bank holidays and vacations)
      – On Hold/Blocked states
  • Present time in days
Open the source page in Confluence ↗
Formula — follows the fields below
lead = done_at − created_at  // calendar days — weekends & holidays counted
done_at = moment of ENTERING a done status [Done, Resolved, Closed] (last such entry in window)
// time spent INSIDE the done status is NOT counted
cohort − issue types [Epic]
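The lead-time formula above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the `(timestamp, to_status)` shape of `transitions` and the function name are assumptions, and the "in window" filter is left out.

```python
from datetime import datetime

DONE_STATUSES = {"Done", "Resolved", "Closed"}  # default done statuses listed on this page
EXCLUDED_TYPES = {"Epic"}                        # cohort exclusion from the formula


def lead_time_days(issue_type, created_at, transitions):
    """Calendar days from creation to the last ENTRY into a done status.

    transitions: (timestamp, to_status) tuples, oldest first (hypothetical shape).
    Returns None for excluded types or tickets that never reached done.
    """
    if issue_type in EXCLUDED_TYPES:
        return None
    done_at = None
    previous = None
    for timestamp, status in transitions:
        # Only a move from a non-done status into a done status is an entry,
        # so Done -> Closed does not push done_at later.
        if status in DONE_STATUSES and previous not in DONE_STATUSES:
            done_at = timestamp
        previous = status
    if done_at is None:
        return None
    # Calendar time: weekends and holidays are deliberately not removed.
    return (done_at - created_at).total_seconds() / 86400
```

Because the clock stops at the entry into a done status, a later Done → Closed move does not lengthen the result.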
How it is measured — updates as you type
[Diagram: a ticket's Jira lifecycle with three brackets. LEAD: created → Done. CYCLE: “In Progress” → Done, minus time in New. ACTIVE: only working time in active statuses. The clock stops at ENTRY into the done status; time inside it is not counted. Example strip: Backlog, In Progress (active), Backlog (subtracted), In Progress (active), Failed On … (active), Sat–Sun (counted), Pending Re… (active), Ready To T… (waiting), In Testing (active), Done, Removed (after done, not measured).]
Bottom strip = the ticket’s Jira lifecycle (statuses never disappear). A bracket = time INCLUDED in the metric; a gap between brackets = excluded by the current settings. Lead starts at creation; Cycle and Active start at the first “In Progress”. Passive now: On Hold, Blocked, Failed on QA, Ready To Test, Ready To Merge.
Live preview — real data, saved settings
Result with these settings: median 2 days over 43 tickets
Real example: ADS-1324 — done 2026-06-03 = 2.0 days (median-nearest real ticket)
Settings used by this metric — the diagram, formula and preview above follow them live; Save applies them everywhere

Cycle Time

The Cycle Time should measure the time from when the ticket is put in progress until it is done. The purpose is to measure the actual work that was done.
Full text from the document
The Cycle Time should measure the time from when the ticket is put in progress until it is done. The purpose is to measure the actual work that was done. Active work is usually the term used for cycle time; however, the word active is easily misinterpreted. Cycle time is supposed to include the waiting time that often appears after work has started. Effective hours/days spent on concrete engineering work without any waiting time is a different measure from traditional cycle time (please check Active Work Time).
Rules:
  • Same rules as Lead Time
  • In Jira, active work usually starts when a ticket is set to one of the following states: In Progress / Any other?
  • If the ticket goes back to New, the time it remains in New should not be included when the ticket goes past In Progress again.
Open the source page in Confluence ↗
Formula — follows the fields below
cycle = done_at − first(status of Jira category “In Progress”) − Σ(time in “new”-category statuses after start)
// done_at = ENTRY into a done status; time inside it not counted
// calendar days — weekends & holidays counted
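A minimal sketch of the cycle-time formula, under stated assumptions: `transitions` is a hypothetical list of `(timestamp, to_status)` tuples, `NEW_CATEGORY` is an illustrative subset of the "new"-category statuses, and any status outside the new and done sets is treated as belonging to the Jira "In Progress" category.

```python
from datetime import datetime

DONE = {"Done", "Resolved", "Closed"}
NEW_CATEGORY = {"New", "Backlog", "To Do", "Open"}  # illustrative subset, not the full list


def cycle_time_days(transitions):
    """Calendar days from the first in-progress status to the last done entry,
    minus time spent back in "new"-category statuses after the start.

    transitions: (timestamp, to_status) tuples, oldest first (hypothetical shape).
    """
    start = None
    done_at = None
    previous = None
    for timestamp, status in transitions:
        in_progress = status not in NEW_CATEGORY and status not in DONE
        if start is None and in_progress:
            start = timestamp
        if start is not None and status in DONE and previous not in DONE:
            done_at = timestamp  # keeps the last entry into a done status
        previous = status
    if start is None or done_at is None:
        return None
    # Sum the intervals spent in "new" statuses between start and done_at.
    new_seconds = 0.0
    for (ts, status), (next_ts, _) in zip(transitions, transitions[1:]):
        if status in NEW_CATEGORY and ts >= start and next_ts <= done_at:
            new_seconds += (next_ts - ts).total_seconds()
    return ((done_at - start).total_seconds() - new_seconds) / 86400
```

Time in Backlog before the first start is ignored by construction, because the clock only begins at the first in-progress status.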
How it is measured — updates as you type
[Diagram: the same ticket lifecycle as for Lead Time. The CYCLE bracket runs from the first “In Progress” to the ENTRY into the done status; the return to Backlog is subtracted, Sat–Sun is counted, and time after Done (for example Removed) is not measured.]
A bracket = time INCLUDED in the metric; a gap between brackets = excluded by the current settings. Lead starts at creation; Cycle and Active start at the first “In Progress”. Passive now: On Hold, Blocked, Failed on QA, Ready To Test, Ready To Merge.
Live preview — real data, saved settings
Result with these settings: median 1.7 days
Real example: ADS-1311 — lead 16.7 d, cycle 7.7 d

Active Work Time

Active Work Time tracks only when the team is actively working, removing all waiting time from the cycle time. It accurately reflects the work required by the Engineering teams.
Full text from the document
Active Work Time tracks only when the team is actively working, removing all waiting time from the cycle time. It accurately reflects the work required by the Engineering teams. Besides the waiting time already covered in Cycle Time and Lead Time, there are 2 other layers of waiting time that happen every day.

State transitions: Every Jira status, depending on how it is meant to be used, represents active time, passive time (wait time), or a mix of both. Examples:
  • In Progress: mostly active
  • Code Review: a mix of active and passive. Most of the time, moving a ticket to review means waiting for someone else to review a PR. The active work in that state is the actual time each engineer spent reviewing the PR.
  • On Hold/Blocked: passive time, as the team can't proceed
  • Ready To Test: passive time; there is no active time here
  • In Testing: similar to In Progress, mostly active time
  • Ready to merge: mostly passive; normally a residual active time
  • Release on Stage / Release on Live: a mix of active and passive

Working hours: The Engineering teams don't work 24/7. By definition this metric already excludes non-working days, and it should also exclude non-working hours on working days, leaving typically 8 or 9 hours of work. We are currently not tracking the time spent on each ticket on each working day. Someone may be working on multiple tickets, or be involved in meetings and other activities that aren't trackable or aren't worth tracking. So this is an attempt at standardizing on 8/9 hours of work per day.

Rules:
  • Done follows the same rules as Lead Time and Cycle Time
  • All waiting time defined in Lead Time and Cycle Time should be excluded
      – Vacations can potentially benefit from an integration with Bamboo HR
      – On Hold/Blocked: assume such states need to be mapped
  • Exclude the following states defined in the State Transitions section: On Hold/Blocked; Failed on QA; Ready To Test; Ready To Merge; Any other?
  • Implement logic to comply with the Working Hours section so that non-working hours are excluded. This doesn't need to be perfect, but at least for cycle times of multiple days it will remove most of the time when people are not actually working.
  • Present time in days
Open the source page in Confluence ↗
Formula — follows the fields below
active_hours = Σ_workdays min(hours in non-passive statuses, 8h)
shown = active_hours ÷ 24  // unit: calendar days — same scale as Lead/Cycle, active ≤ cycle holds
passive = [On Hold, Blocked, Failed on QA, Ready To Test, Ready To Merge]
workday = weekday ∧ ¬assignee-vacation (BambooHR) ∧ ¬country-holiday
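The per-day cap in the formula above can be sketched as follows. The `(start, end, status)` interval shape, the `days_off` set (standing in for BambooHR vacations and country holidays) and the function name are assumptions; the intervals are assumed to be already limited to the span between work start and the done entry, with "new"-category time removed.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

PASSIVE = {"On Hold", "Blocked", "Failed on QA", "Ready To Test", "Ready To Merge"}
DAILY_CAP_HOURS = 8.0


def active_hours(intervals, days_off=frozenset()):
    """Sum hours in non-passive statuses per workday, capped at 8h per day.

    intervals: (start, end, status) tuples with naive datetimes (hypothetical shape).
    days_off: set of date objects for the assignee's vacations and holidays.
    """
    per_day = defaultdict(float)
    for start, end, status in intervals:
        if status in PASSIVE:
            continue
        cursor = start
        while cursor < end:
            # Split the interval at midnight so each piece lands on one calendar day.
            next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
            chunk_end = min(end, next_midnight)
            per_day[cursor.date()] += (chunk_end - cursor).total_seconds() / 3600
            cursor = chunk_end
    return sum(
        min(hours, DAILY_CAP_HOURS)
        for day, hours in per_day.items()
        if day.weekday() < 5 and day not in days_off  # Mon-Fri only
    )


def active_days_chart_scale(intervals, days_off=frozenset()):
    """Chart scale used on this page: hours / 24."""
    return active_hours(intervals, days_off) / 24
```

Dividing by 24 rather than 8 keeps the result on the same calendar-day scale as Lead and Cycle, which is why active ≤ cycle holds.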
How it is measured — updates as you type
[Diagram: the same ticket lifecycle as for Lead Time and Cycle Time, with the ACTIVE bracket covering only working time in active statuses, plus a week strip with a cap of 8h per day. Mon: 8h counted, 2h over cap. Tue: 6h counted. Wed: vacation (BambooHR). Thu: 8h counted, 1h over cap. Fri: 3h counted. Sat, Sun: weekend.]
Lead starts at creation; Cycle and Active start at the first “In Progress”. Passive now: On Hold, Blocked, Failed on QA, Ready To Test, Ready To Merge.
Only hours in non-passive statuses count, per day, capped at 8h; weekends, the assignee’s vacations and country holidays drop out. This diagram explains the CHART scale: hours ÷ 24 → days comparable with Lead/Cycle. The unit switch re-expresses the same hour total elsewhere (÷8 = working days, or raw hours); the diagram’s logic doesn’t change.
Live preview — real data, saved settings
Result with these settings: median 0.7 days (chart scale: hours ÷ 24) over 43 tickets
Real example: ADS-1311 — active 2.3 d chart-scale = 56h (Code Review stays active: the document calls it a mix)
  • ADS's real workflow statuses under the current settings — time COUNTS in: CODE REVIEW, Failed On Testing, In Deployment, In Progress, In Testing, Pending Review, Ready, Ready For Canary, Ready For Live, Ready For Review, Ready For Staging, Removed, Waiting for Client
  • excluded as WAITING (passive list): On Hold, Ready To Merge, Ready To Test
  • excluded as intake/bounce ("new"-category): Approved, Approved for Sprint, Backlog, Commited, New, Open, Selected for Development, To Do
  • BambooHR live data feeding the vacation/holiday rule: 91 employees (90 linked to Jira by work email) · 1152 approved vacation requests · 56 public-holiday days (?? 1, IE 4, PT 29, US 22)

Time To First PR

Time To First PR (NICE TO HAVE, AI DRIVEN), p.5
The Time to First PR measures the time between when a ticket was put in progress until an associated PR is opened.
Full text from the document
The Time to First PR measures the time from when a ticket was put in progress until an associated PR is opened. The intention is to measure how quickly engineers can turn a ticket into a reviewable implementation.
Rules:
  • Same rules as Cycle Time
Open the source page in Confluence ↗
Formula — follows the fields below
ttfp = first(associated PR.created_at ≥ first(status of Jira category “In Progress”)) − first(status of Jira category “In Progress”) − Σ(time back in “new”-category statuses) // calendar days — weekends & holidays counted; same rules as Cycle Time
associated ⇔ PR title or source branch mentions the issue key
// only-earlier-PR tickets = anomaly bucket (excluded, disclosed) — spec M5 / T007
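The association rule and the anomaly bucket can be illustrated with a short sketch. The PR dictionary shape, the function names and the exact matching regex are assumptions; the subtraction of time spent back in "new"-category statuses is omitted to keep the example small.

```python
import re
from datetime import datetime


def mentions_key(text, key):
    """True if text mentions the issue key, so that ADS-13 does not match ADS-131."""
    pattern = rf"(?<![A-Za-z0-9]){re.escape(key)}(?!\d)"
    return re.search(pattern, text, re.IGNORECASE) is not None


def time_to_first_pr(key, started_at, prs):
    """Classify a ticket and measure days from work start to the first associated PR.

    prs: dicts with 'title', 'branch', 'created_at' (hypothetical shape).
    Returns ("measured", days), ("no_pr", None) or ("anomaly", None).
    """
    linked = [
        pr for pr in prs
        if mentions_key(pr["title"], key) or mentions_key(pr["branch"], key)
    ]
    if not linked:
        return ("no_pr", None)
    after_start = [pr["created_at"] for pr in linked if pr["created_at"] >= started_at]
    if not after_start:
        # Every associated PR was opened before work started: anomaly bucket.
        return ("anomaly", None)
    return ("measured", (min(after_start) - started_at).total_seconds() / 86400)
```

Keeping the three outcomes separate mirrors the preview line, which reports measured tickets, tickets without a PR and excluded anomalies as distinct counts.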
How it is measured — updates as you type
[Diagram: TIME TO FIRST PR. New / Backlog → In Progress (ticket put in progress) → PR opened (key in PR title / branch).]
Stops at the FIRST pull request that references the ticket key. Same day rules as Cycle Time.
Live preview — real data, saved settings
Result with these settings: median 0 days · 10 measured, 29 without a PR, 4 PR-before-start anomalies excluded
Real example: ADS-1311 — started 2026-06-04, first PR 2026-06-04 = 0.0 d

Lead Time for Changes

Lead Time for Changes (DORA, NICE TO HAVE), p.5
Lead Time for Changes measures the time it takes for a committed code change to successfully run in production.
Full text from the document
Lead Time for Changes is a core DevOps Research and Assessment (DORA) metric that measures the time it takes for a committed code change to successfully run in production. The intent is to measure the actual responsiveness of our delivery system, not just the active engineering effort. Essentially, Lead Time for Changes behaves like Cycle Time; the only difference is that it starts counting when an associated commit is pushed instead of when the ticket is set to In Progress.
Rules:
  • Same rules as Cycle Time
(From the meeting: "the definition was not… the hours from pull request merge to deployment to production… it's the time of the first commit until that commit lands in production. It's different from the merge.")
Open the source page in Confluence ↗
Formula — follows the fields below
ltc = first(prod deploy after merge, result ∈ [succeeded]) − first_commit
prod ⇔ env name matches [production, prod, prd, live, stable] (lower-tier tokens such as pre-prod are matched first)  // calendar days: weekends & holidays counted
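A hedged sketch of the pairing rule: classify the environment by name, checking lower-tier tokens first, then take the first successful production deploy after the merge. The lower-tier token list is taken from the environment diagram on this page; the matching strategy (substring for lower-tier names, whole-token for production names), the deploy dictionary shape and the function names are assumptions.

```python
import re
from datetime import datetime

PROD_TOKENS = {"production", "prod", "prd", "live", "stable"}
LOWER_TIER_TOKENS = ("pre-production", "pre-prod", "preprod")


def environment_tier(env_name):
    """Return "lower", "prod" or "other"; lower-tier names are checked first."""
    name = env_name.lower()
    if any(token in name for token in LOWER_TIER_TOKENS):
        return "lower"
    # Whole-token match so that a name like "delivery" does not match "live".
    if set(re.split(r"[^a-z0-9]+", name)) & PROD_TOKENS:
        return "prod"
    return "other"


def lead_time_for_changes_days(first_commit_at, merged_at, deploys):
    """Days from the first commit to the first successful prod deploy after the merge.

    deploys: dicts with 'env', 'result', 'finished_at' (hypothetical shape).
    """
    candidates = [
        d["finished_at"] for d in deploys
        if environment_tier(d["env"]) == "prod"
        and d["result"] == "succeeded"
        and d["finished_at"] >= merged_at
    ]
    if not candidates:
        return None
    # The clock starts at the first commit, not at the merge.
    return (min(candidates) - first_commit_at).total_seconds() / 86400
```

Checking the lower tier first is what prevents "pre-production" from being counted as production.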
How it is measured — updates as you type
[Diagram: LEAD TIME FOR CHANGES (not from the merge!). first commit (work starts counting) → PR merged (not the start) → waiting for a release → deploy → production / pr… with result ∈ [succeeded].]
Pairs team-wide: the change lands with the team’s FIRST successful production deploy after the merge.
Live preview — real data, saved settings
Result with these settings: median 3.2 days over 208 changes
Real example: PR #32888 (jarvis-admin) — first commit 2026-05-22 → production 2026-06-08 = 16.8 d

Quality & Reliability

Prod Bug Rate + Pre-Prod Bug Rate

The Prod Bug Rate measures the number of bugs that were reported in the production environment, broken down per severity; Pre-Prod is the same for non-production environments.
Full text from the document
Prod Bug Rate: The Prod Bug Rate measures the number of bugs that were reported in the production environment. It should also break this number down per severity. These bugs can be created by the Incident Management process or by any team member. The important part is that it accounts for production bugs. This metric should also offer a way to look at the number of bugs as a rate, such as the average number of bugs per week.
Pre-Prod Bug Rate: This is essentially the same as the Prod Bug Rate, but for all bugs found in non-production environments.
Open the source page in Confluence ↗
Formula — follows the fields below
bugs = count(type ∈ [Bug, Incident], created in window); rate = bugs ÷ weeks
severity = first present of [customfield_10050, customfield_10511, customfield_10093]
env: prod labels [production, prod] · pre-prod [pre-prod, preprod, pre-production, staging, qa] · none → unknown
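The three formula lines can be combined into one small sketch. The issue dictionary shape and the function names are assumptions; the issues are assumed to be already filtered to those created in the window, and checking production labels before pre-production labels is a choice made for this example.

```python
from collections import Counter

BUG_TYPES = {"Bug", "Incident"}
SEVERITY_FIELDS = ["customfield_10050", "customfield_10511", "customfield_10093"]
PROD_LABELS = {"production", "prod"}
PRE_PROD_LABELS = {"pre-prod", "preprod", "pre-production", "staging", "qa"}


def severity_of(fields):
    """First present severity field, in the configured order."""
    for name in SEVERITY_FIELDS:
        value = fields.get(name)
        if value:
            return value
    return "Unspecified"


def environment_of(labels):
    """Classify by label; no matching label means the environment is unknown."""
    lowered = {label.lower() for label in labels}
    if lowered & PROD_LABELS:
        return "prod"
    if lowered & PRE_PROD_LABELS:
        return "pre-prod"
    return "unknown"


def bug_summary(issues, window_days):
    """issues: dicts with 'type', 'fields', 'labels' (hypothetical shape)."""
    bugs = [issue for issue in issues if issue["type"] in BUG_TYPES]
    return {
        "count": len(bugs),
        "per_week": len(bugs) / (window_days / 7),
        "by_severity": Counter(severity_of(b["fields"]) for b in bugs),
        "by_environment": Counter(environment_of(b["labels"]) for b in bugs),
    }
```

Returning "unknown" instead of guessing an environment is what lets the page report that the Prod vs Pre-Prod split is unavailable when no labels exist.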
How it is measured — updates as you type
[Diagram: type ∈ Bug/Incident, created in window → severity (fields: customfield_10050, customfield_1051…) → environment. prod: labels production, prod. pre-prod: pre-prod, preprod, pre-… No label → environment unknown; the split is honestly unavailable.]
Live preview — real data, saved settings
Result with these settings: 7 bugs · 1.6/week · environment unknown (no env labels)
Real example: ADS-1318 — severity 1 - Critical, reported 2026-06-01 · severity split: 1 - Critical: 1 · 2 - High: 1 · Unspecified: 5

Change Failure Rate

CFR measures the percentage of software deployments to production that result in a degraded service and require immediate remediation.
Full text from the document
Change Failure Rate (CFR) is a core DORA metric that measures the percentage of software deployments to production that result in a degraded service, impairment, or outage and subsequently require immediate remediation. It evaluates the quality and stability of your delivery pipeline, ensuring that speed (measured by Lead Time) does not come at the expense of stability.
A deployment is a failure if it requires:
  • An emergency rollback to a previous version.
  • A hotfix or urgent "fix-forward" patch shipped outside the normal cadence, for example, a major incident (P0 or P1) triggered by the Incident Management process, but only if the cause was effectively the code that was shipped.
Rules:
  • Change Failure Rate is a percentage metric calculated as (number of failed deployments / total number of production deployments) x 100
  • Failed deployments are obtained as described above
Open the source page in Confluence ↗
Formula — follows the fields below
cfr = failed ÷ eligible_production_deploys × 100  // eligible = result ∈ success ∪ failure; canceled/notDeployed excluded
failed ⇔ result ∈ [failed] ∨ manual flag  // bridge until Incident Management
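As a rough illustration of the two formula lines, assuming the deploys passed in are already limited to the production tier. The dictionary shape, the `manually_flagged` set of deploy ids and the function name are hypothetical.

```python
ELIGIBLE_RESULTS = {"succeeded", "failed"}  # canceled / notDeployed are excluded


def change_failure_rate(prod_deploys, manually_flagged=frozenset()):
    """Percentage of eligible production deploys that failed.

    prod_deploys: dicts with 'id' and 'result' (hypothetical shape).
    manually_flagged: ids marked as failures by hand, the bridge until
    Incident Management data is available.
    """
    eligible = [d for d in prod_deploys if d["result"] in ELIGIBLE_RESULTS]
    if not eligible:
        return None  # no denominator, so no rate
    failed = [
        d for d in eligible
        if d["result"] == "failed" or d["id"] in manually_flagged
    ]
    return 100.0 * len(failed) / len(eligible)
```

With 13 succeeded production deploys and no failures or flags this returns 0.0, which matches the shape of the live preview figure.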
How it is measured — updates as you type
[Diagram: environment tiers. pre-prod, preprod and pre-producti… are lower tier and ignored. production / prod / prd is where DF + CFR count, with success ∈ [succeeded] and failure ∈ [failed] or manual flag.]
The tier comes from the environment name; lower-tier tokens match FIRST, so “pre-production” can never count as “prod”. Monthly rate uses 30 days.
Live preview — real data, saved settings
Result with these settings: 0 failed / 13 prod deploys = 0%

Flow & Predictability

Throughput

Throughput (MUST HAVE, AI DRIVEN), p.7
The Throughput should measure the number of completed tickets within a timeframe as well as the average number of completed tickets per week.
Full text from the document
The Throughput should measure the number of completed tickets within a timeframe as well as the average number of completed tickets per week. The latter is usually more useful for explaining the team's typical weekly flow.
Rules:
  • If Epics are excluded from delivery, they should not count for the throughput.
(From the meeting: "you split this by agents and humans, which is pretty cool"; "having an average of how many tickets per week… we think more of weekly basis".)
Open the source page in Confluence ↗
Formula — follows the fields below
throughput = count(done in window) ; per_week = count ÷ weeks
split: assignee ∈ AI agent roster → agent, else human
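A minimal sketch of the count, the weekly rate and the agent/human split. The ticket dictionary shape, the roster account names used in any call and the function name are assumptions; the tickets are assumed to be already limited to those done in the window.

```python
def throughput(done_tickets, agent_accounts, window_days,
               excluded_types=frozenset({"Epic"})):
    """Count completed tickets, the weekly average and the agent/human split.

    done_tickets: dicts with 'type' and 'assignee' (hypothetical shape).
    agent_accounts: set of assignee ids on the AI agent roster.
    """
    cohort = [t for t in done_tickets if t["type"] not in excluded_types]
    agents = sum(1 for t in cohort if t["assignee"] in agent_accounts)
    return {
        "count": len(cohort),
        "per_week": len(cohort) / (window_days / 7),
        "agents": agents,
        "humans": len(cohort) - agents,
    }
```

Excluding Epics before counting keeps the same cohort definition for the total and for both sides of the split.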
How it is measured — updates as you type
[Diagram: done in window (status ∈ [Done, Resolved, Closed]) → minus Epic (excluded types drop out) → split into AI agents (assignee ∈ agent accounts) and humans → count + per week (count ÷ weeks in window).]
Same cohort feeds Throughput, Planned/Unplanned and AI Participation: one definition of “done in window”.
Live preview — real data, saved settings
Result with these settings: 43 tickets · 10/week — agents 8 / humans 35

Deployment Frequency

Deployment Frequency (MUST HAVE, AI DRIVEN), p.7
The Deployment Frequency measures how often a team/project successfully releases code to the production environment — total plus Daily / Weekly / Monthly rates.
Full text from the document
The Deployment Frequency measures how often a team/project successfully releases code to the production environment. It serves as a primary indicator of the team's overall throughput, agility and batch size. Like Throughput, it should show the total number of successful production deployments for a given timeframe as well as a rate. There should be 3 types of rate:
  • Daily
  • Weekly
  • Monthly
(From the meeting: "even if it said 50 deployments in those 30 days, it doesn't tell me what usually I think about, which is number of deploys per week or per day… per day, per week, per month — it's really interesting to have those numbers right there.")
Open the source page in Confluence ↗
Formula — follows the fields below
df = count(prod deploys, result ∈ [succeeded])
rates: ÷ days · ÷ weeks · ÷ (30-day months)
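The three rates follow directly from the total and the window length. In this sketch the deploy dictionary shape, the precomputed `tier` key and the function name are assumptions; the 30-day month comes from the formula above.

```python
DAYS_PER_MONTH = 30  # the page states that the monthly rate uses 30 days


def deployment_frequency(deploys, window_days):
    """Total successful production deploys plus daily, weekly and monthly rates.

    deploys: dicts with 'tier' and 'result' (hypothetical shape).
    """
    total = sum(
        1 for d in deploys
        if d["tier"] == "prod" and d["result"] == "succeeded"
    )
    return {
        "total": total,
        "per_day": total / window_days,
        "per_week": total / (window_days / 7),
        "per_month": total / (window_days / DAYS_PER_MONTH),
    }
```

For 13 successful production deploys in a 30-day window this gives roughly 0.4 per day, 3 per week and 13 per month. The 30-day window is an assumption here; it is consistent with the preview figures but is not stated on the page.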
How it is measured — updates as you type
[Diagram: environment tiers, identical to the one for Change Failure Rate. pre-prod, preprod and pre-producti… are lower tier and ignored. production / prod / prd is where DF + CFR count, with success ∈ [succeeded] and failure ∈ [failed] or manual flag.]
The tier comes from the environment name; lower-tier tokens match FIRST, so “pre-production” can never count as “prod”. Monthly rate uses 30 days.
Live preview — real data, saved settings
Result with these settings: 13 prod deploys · 0.4/day · 3/week · 13/month

Planned vs Unplanned Work

Planned vs Unplanned Work (MUST HAVE), p.8
Measures how the flow of a team can be impacted by unexpected work; the classification depends on whether the team reserves maintenance capacity.
Full text from the document
The Planned vs Unplanned Work measures how the flow of a team can be impacted by unexpected work. The distinction between planned and unplanned work separates value-generating progress from unpredictable operational friction. Unplanned work is any unexpected task, disruption, or emergency that breaks the current iteration or sprint focus. It represents the operational tax paid for system instability, technical debt, or shifting priorities.
Maintenance vs Unplanned Work: Usually, teams should reserve capacity for maintenance work (per sprint, per week). This means that unexpected work, like an urgent issue to fix, should be treated as planned work if the effort falls within the reserved capacity. If more capacity is needed for the same or other tasks, then it becomes Unplanned work.
Rules:
  • If the team does not do any capacity planning for maintenance, any kind of deviation, anything that wasn't planned, should be interpreted as unplanned work (bugs, support requests, stakeholder requests, …). For this scenario, capture unplanned work by: Issue type: Bug, Incident; Label: unplanned_work
  • If the team reserves capacity for maintenance, then unplanned work should only be counted after the reserved capacity has been maxed out. For this scenario, capture unplanned work by: Label: unplanned_work
Open the source page in Confluence ↗
Formula — follows the fields below
unplanned ⇔ type ∈ [Bug, Incident] ∨ label ∈ [unplanned_work]
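The two classification modes described in the rules can be sketched as one function with a switch. The ticket dictionary shape and the function names are assumptions; the formula line above corresponds to `reserves_capacity=False`.

```python
UNPLANNED_TYPES = {"Bug", "Incident"}
UNPLANNED_LABEL = "unplanned_work"


def is_unplanned(ticket, reserves_capacity):
    """Classify one completed ticket.

    ticket: dict with 'type' and 'labels' (hypothetical shape).
    reserves_capacity: True if the team reserves maintenance capacity,
    in which case only the explicit label marks work as unplanned.
    """
    labelled = UNPLANNED_LABEL in ticket["labels"]
    if reserves_capacity:
        return labelled
    return labelled or ticket["type"] in UNPLANNED_TYPES


def unplanned_share(tickets, reserves_capacity):
    """Percentage of completed tickets classified as unplanned."""
    if not tickets:
        return None
    unplanned = sum(1 for t in tickets if is_unplanned(t, reserves_capacity))
    return 100.0 * unplanned / len(tickets)
```

Flipping the switch changes only which tickets qualify; the denominator stays the full set of completed tickets.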
How it is measured — updates as you type
[Diagram: completed ticket → reserves capacity: NO (setting below; flip it and this flow changes) → unplanned if type Bug / Incident OR label unplanned_work.]
Live preview — real data, saved settings
Result with these settings: 40 planned / 3 unplanned = 7% (no-reserved-capacity)
Real example: ADS-1318 (Bug) counted unplanned under the current mode

PR Throughput + PR Acceptance Rate

PR Throughput + PR Acceptance Rate (NICE TO HAVE, AI DRIVEN), p.8
PR Throughput = merged PRs within a timeframe plus the average merged per week; PR Acceptance Rate = merged vs abandoned.
Full text from the document
PR Throughput: The PR Throughput should measure the number of merged PRs within a timeframe as well as the average number of merged PRs per week.
PR Acceptance Rate: The PR Acceptance Rate should measure the percentage of merged PRs versus the ones that were abandoned.
Open the source page in Confluence ↗
Formula — follows the fields below
pr_throughput = count(merged in window) ; per_week = merged ÷ weeks
acceptance = merged ÷ (merged + abandoned) × 100
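Both formulas fit in one short sketch. The PR dictionary shape, the `status` values and the function name are assumptions; open PRs are ignored because they could still go either way.

```python
def pr_metrics(closed_prs, window_days):
    """Merged count, weekly average and acceptance rate.

    closed_prs: dicts with 'status' of "merged" or "abandoned" (hypothetical shape).
    """
    merged = sum(1 for pr in closed_prs if pr["status"] == "merged")
    abandoned = sum(1 for pr in closed_prs if pr["status"] == "abandoned")
    decided = merged + abandoned
    return {
        "merged": merged,
        "per_week": merged / (window_days / 7),
        "abandoned": abandoned,
        # No decided PRs means there is no rate to report.
        "acceptance": 100.0 * merged / decided if decided else None,
    }
```

With the preview's 213 merged and 12 abandoned PRs, the acceptance formula gives 213 ÷ 225 × 100 ≈ 94.7%.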
How it is measured — updates as you type
[Diagram: PR closed in window → completed = merged (throughput + per week) or abandoned → acceptance = merged ÷ (merged + abandoned).]
Open PRs are not counted: they could still go either way.
Live preview — real data, saved settings
Result with these settings: 213 merged · 49.7/week · 12 abandoned → acceptance 94.7%

AI Participation Rate

AI Participation Rate (NICE TO HAVE, AI DRIVEN), p.8
The AI Participation Rate should measure the percentage of done tickets that had an AI agent assigned.
Full text from the document
The AI Participation Rate should measure the percentage of done tickets that had an AI agent assigned. Open the source page in Confluence ↗
Formula — follows the fields below
participation = count(done ∧ assignee ∈ AI agents) ÷ count(done) × 100
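For completeness, a small sketch of the participation formula. The assignee ids and the function name are hypothetical; the input is assumed to hold one assignee per done ticket in the window.

```python
def ai_participation_rate(done_assignees, agent_accounts):
    """Percentage of done tickets whose assignee is on the AI agent roster.

    done_assignees: one assignee id per done ticket (hypothetical shape).
    agent_accounts: set of assignee ids on the managed agent roster.
    """
    if not done_assignees:
        return None
    by_agents = sum(1 for assignee in done_assignees if assignee in agent_accounts)
    return 100.0 * by_agents / len(done_assignees)
```

With 8 agent-assigned tickets out of 43 done tickets this evaluates to about 18.6%, the figure shown in the live preview.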
How it is measured — updates as you type
[Diagram: ALL done tickets in window → assignee ∈ AI agent accounts (managed roster, 7 real agents) or human assignee → participation = agents ÷ all × 100.]
Live preview — real data, saved settings
Result with these settings: 8 of 43 done tickets = 18.6%
Real example: ADS-1336 (Task) — done by an AI agent

Source of every rule on this page: “[WIP]: Core Engineering metrics” — Confluence ↗ + the metrics meeting of 2026-06-08. Quotes are verbatim.