The document's stated goal: prove the impact of AI agents. Same team, same window, same rules — tickets assigned to AI agents vs humans, side by side. No judgement coloring; medians, not sums, so a few big tickets can't skew it.
AI impact — by the manager's own formula
Verbatim from the strategy note: AI creates positive impact when SIMULTANEOUSLY (1) AI Participation, PR Acceptance, Throughput and Deployment Frequency GROW, and (2) CFR and Prod Bug Rate do NOT grow (ideally fall), i.e. 'faster AND fewer rollbacks/incidents'. Comparison = this window vs the previous window of the same length, whole team.
Honest caveats: bug environment labels are absent, so ALL reported bugs stand in for Prod Bug Rate; short windows (30d) are noisy, so judge on 90d+; the manager himself said 'this is hard to prove'.
Verdict: not enough data in this window
AI Participation (must grow)
Share of completed tickets assigned to AI agents: the document's direct adoption indicator. For the verdict it must GROW: ✓ = this window's % is higher than the previous one's.
Reading the card: bold = THIS window, gray after ← = the PREVIOUS window of the same length. The check answers one question only: «AI Participation ↑?»
— ← — · no data
PR Acceptance (must grow)
Merged ÷ (merged + abandoned) PRs, whole team: the quality-of-output signal among the adoption indicators. ✓ = the share of PRs that actually land is higher than last window. The check answers one question only: «PR Acceptance ↑?»
— ← — · no data
Throughput (must grow)
Completed tickets in the window (the cohort). ✓ = more tickets done than in the previous window (the team got faster). The check answers one question only: «Throughput ↑?»
0 ← 0 · did not grow ✗
Deployment Frequency (must grow)
Successful production deployments in the window. ✓ = shipped to prod more often than last window. The check answers one question only: «Deployment Frequency ↑?»
— ← — · no data
CFR (must not grow)
GUARDRAIL: % of prod deploys that failed (incl. manual flags). Speed must not cost stability: ✓ = CFR did NOT rise vs the previous window (flat or lower is good); ✗ = it rose. The check answers one question only: «CFR not ↑?»
— ← — · no data
Bug Rate (all reported) (must not grow)
GUARDRAIL: bugs reported in the window (all reported; env labels are absent, so this stands in for Prod Bug Rate). ✓ = not more bugs than the previous window; ✗ = more. The check answers one question only: «Bug Rate (all reported) not ↑?»
0 ← 0 · did not worsen ✓
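
The card checks and the combined verdict described above can be sketched in a few lines. This is a minimal illustration, not the dashboard's actual code: the `Card` type, the function names and the verdict labels other than "not enough data" are hypothetical, and the rule that a single card without data withholds the verdict is an assumption inferred from the cards shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Card:
    name: str
    current: Optional[float]   # this window; None = no data
    previous: Optional[float]  # previous window of the same length
    must_grow: bool            # True = adoption indicator, False = guardrail


def check(card: Card) -> Optional[bool]:
    """Return True (✓), False (✗) or None (no data) for one card."""
    if card.current is None or card.previous is None:
        return None
    if card.must_grow:
        return card.current > card.previous   # must be strictly higher
    return card.current <= card.previous      # flat or lower is good


def verdict(cards: list[Card]) -> str:
    results = [check(c) for c in cards]
    if any(r is None for r in results):
        # Assumption: one card without data is enough to withhold the verdict.
        return "not enough data"
    # Hypothetical labels for the two decided outcomes.
    return "positive impact" if all(results) else "not proven"


# The six cards with the values shown in this window.
cards = [
    Card("AI Participation", None, None, must_grow=True),
    Card("PR Acceptance", None, None, must_grow=True),
    Card("Throughput", 0, 0, must_grow=True),
    Card("Deployment Frequency", None, None, must_grow=True),
    Card("CFR", None, None, must_grow=False),
    Card("Bug Rate (all reported)", 0, 0, must_grow=False),
]

for c in cards:
    print(c.name, check(c))
print(verdict(cards))
```

Note the asymmetry: an unchanged value (0 ← 0) fails a "must grow" check but passes a "must not grow" guardrail, which matches the Throughput and Bug Rate cards.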
| Metric | agents | humans |
|---|---|---|
| Throughput | 0 · 0/week | 0 · 0/week |
| Lead Time (median) | — | — |
| Cycle Time (median) | — | — |
| Active Work Time (median) | — | — |
| Time To First PR (median) | — | — |
| Planned vs Unplanned | — · 0 / 0 | — · 0 / 0 |

Throughput
What it means: The Throughput should measure the number of completed tickets within a timeframe as well as the average number of completed tickets per week. (Throughput (MUST HAVE, AI DRIVEN), p.7)
How it's computed: throughput = count(done in window); per_week = count ÷ weeks
The numbers shown: Tickets that ENTERED a done status inside the window (last such entry; Epics excluded) · per-week rate · split into AI agents / humans by the assignee roster.
Example: 30 tickets done in the window, 4 assigned to AI agents → 30 total · agents 4 / humans 26.

Lead Time (median)
What it means: The Lead Time should measure the time between when the ticket was created until it is done. The purpose is to measure the full end-to-end customer experience. (Lead Time (MUST HAVE), p.3)
How it's computed: lead = done_at − created_at (calendar days; weekends & holidays counted)
The numbers shown: Headline = MEDIAN days over every ticket completed in the window; “View tickets” lists each ticket with its own lead time.
Example: Created Mon 09:00 → entered Done Thu 09:00 = 3.0 d. Calendar days: a weekend in between would count too.

Cycle Time (median)
What it means: The Cycle Time should measure the time from when the ticket is put in progress until it is done. The purpose is to measure the actual work that was done. (Cycle Time (MUST HAVE), p.3)
How it's computed: cycle = done_at − first(status of Jira category “In Progress”) − Σ(time in “new”-category statuses after start), where done_at = ENTRY into a done status; time inside it is not counted.
The numbers shown: Median days from the FIRST entry into an In-Progress-category status to done. Tickets that were never started have no cycle (they still count in Lead).
Example: In Progress Tue 09:00 → Done Fri 09:00 = 3.0 d. If it bounced back to To Do for 1.0 d in between → 2.0 d.

Active Work Time (median)
What it means: Active Work Time tracks only when the team is actively working, removing all waiting time from the cycle time. It accurately reflects the work required by the Engineering teams. (Active Work Time (NICE TO HAVE), p.4–5)
How it's computed: active_hours = Σ_workdays min(hours in non-passive statuses, 8h)
The numbers shown: Median of REAL hands-on time, shown in CALENDAR days: per workday, hours in non-passive statuses (capped at 8h) are summed, then ÷ 24. Weekends, the assignee's BambooHR vacations and country holidays contribute nothing. Same unit as Lead/Cycle, which is why active ≤ cycle always holds and the gap to Cycle IS the waiting. A full 8h work day shows as 0.3 d, which is small on purpose. Other units (working days, hours) are available via the unit switch/admin default.
Example: 5h Tue + 9h Wed (capped to 8h) + 3h Thu = 16h ÷ 24 ≈ 0.7 d of active work inside a 3-day cycle. Same ticket in other units: 2 wd · 16h.

Time To First PR (median)
What it means: The Time to First PR measures the time between when a ticket was put in progress until an associated PR is opened. (Time To First PR (NICE TO HAVE, AI DRIVEN), p.5)
How it's computed: ttfp = first(associated PR.created_at ≥ first(status of Jira category “In Progress”)) − first(status of Jira category “In Progress”) − Σ(time back in “new”-category statuses) (calendar days; weekends & holidays counted; same rules as Cycle Time)
The numbers shown: Median days from work start to the FIRST linked PR opened at/after the start. “N PR” = measured tickets; “no PR” = done without any linked PR; “PR-before-start” = tickets whose only PRs predate the start, an anomaly that is excluded from the median and disclosed.
Example: Started Tue 09:00, PR mentioning the key opened Wed 15:00 → 1.2 d. If the only PR was opened on Monday (before the start) → anomaly bucket, not 0 days.

Planned vs Unplanned
What it means: Measures how the flow of a team can be impacted by unexpected work; the classification depends on whether the team reserves maintenance capacity. (Planned vs Unplanned Work (MUST HAVE), p.8)
How it's computed: unplanned ⇔ type ∈ [Bug, Incident] ∨ label ∈ [unplanned_work]
The numbers shown: % = unplanned ÷ all completed × 100; “P / U” = planned and unplanned counts. Unplanned = Bug/Incident type OR the unplanned_work label (when the team reserves capacity for bugs, only the label counts).
Example: 30 done tickets, 6 are Bug/Incident or labeled unplanned_work → unplanned 20%.
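
The per-ticket formulas above can be reproduced with a short sketch that replays the worked examples. Everything named here is hypothetical (function names, the ticket dict shape, the concrete January 2024 dates chosen only so that the weekdays line up), and the time spent back in "new" statuses is passed in as a ready-made duration rather than derived from a status history.

```python
from datetime import datetime, timedelta
from statistics import median

DAY = timedelta(days=1)


def lead_days(created_at, done_at):
    """Lead = done_at - created_at, in calendar days."""
    return (done_at - created_at) / DAY


def cycle_days(started_at, done_at, back_in_new=timedelta(0)):
    """Cycle = done_at - first In Progress entry - time back in "new" statuses."""
    return (done_at - started_at - back_in_new) / DAY


def active_days(hours_per_workday, cap=8.0):
    """Sum per-workday hours, each capped at 8h, then convert to calendar days."""
    return sum(min(h, cap) for h in hours_per_workday) / 24


def ttfp_days(started_at, pr_created, back_in_new=timedelta(0)):
    """Days from work start to the first PR opened at or after the start.

    Returns None when no PR qualifies (no PR at all, or every PR predates
    the start), so the ticket can be kept out of the median.
    """
    eligible = [t for t in pr_created if t >= started_at]
    if not eligible:
        return None
    return (min(eligible) - started_at - back_in_new) / DAY


def unplanned_pct(tickets, reserves_capacity=False):
    """Percentage of completed tickets classified as unplanned."""
    if not tickets:
        return None

    def is_unplanned(t):
        labeled = "unplanned_work" in t["labels"]
        if reserves_capacity:
            return labeled  # only the label counts
        return labeled or t["type"] in ("Bug", "Incident")

    return 100 * sum(is_unplanned(t) for t in tickets) / len(tickets)


# Arbitrary dates: 2024-01-01 is a Monday.
mon = datetime(2024, 1, 1, 9, 0)
tue = mon + DAY

lead = lead_days(mon, mon + 3 * DAY)                          # Mon 09:00 -> Thu 09:00
cycle = cycle_days(tue, tue + 3 * DAY)                        # Tue 09:00 -> Fri 09:00
cycle_bounced = cycle_days(tue, tue + 3 * DAY, back_in_new=DAY)
active = active_days([5, 9, 3])                               # 9h is capped to 8h
ttfp = ttfp_days(tue, [tue + timedelta(hours=30)])            # PR opened Wed 15:00
ttfp_anomaly = ttfp_days(tue, [mon])                          # only PR predates start

tickets = [{"type": "Bug", "labels": []}] * 6 + [{"type": "Story", "labels": []}] * 24
unplanned = unplanned_pct(tickets)

# Illustrative lead times: the median is not pulled up by the one big ticket.
headline = median([1.0, 3.0, 40.0])

print(lead, cycle, cycle_bounced, round(active, 1), ttfp, ttfp_anomaly, unplanned, headline)
```

The unrounded Time To First PR value is 1.25 d, which the report displays as 1.2 d.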
PR authorship — agents vs humans
A PR counts as an agent's when its ADO author matches an active entry of the AI-agent roster (the same roster that splits tickets). Only terminal PRs (merged or abandoned) closed in the window are counted, under the same half-open window rule as everywhere. Acceptance = merged ÷ (merged + abandoned), a quality signal: how often the author's PRs actually land.
| | agents | humans |
|---|---|---|
| Merged | 0 0/week | 0 0/week |
| Abandoned | 0 | 0 |
| Acceptance | — | — |
Bugs and deploys are not shown: they are not attributable to an assignee.
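
As an illustration of the PR authorship split, the sketch below filters terminal PRs by window, attributes them via a roster and computes acceptance per side. The roster entries, the PR dict shape and the function names are hypothetical, and reading "half-open" as start-inclusive, end-exclusive is an assumption, since the report does not say which end is open.

```python
from datetime import datetime
from typing import Optional

# Hypothetical roster entries; the real roster is not shown in the report.
AI_AGENT_ROSTER = {"agent-bot-1", "agent-bot-2"}


def in_window(ts: datetime, start: datetime, end: datetime) -> bool:
    # Assumption: half-open means start inclusive, end exclusive.
    return start <= ts < end


def acceptance(merged: int, abandoned: int) -> Optional[float]:
    """merged / (merged + abandoned); None when there are no terminal PRs."""
    terminal = merged + abandoned
    return merged / terminal if terminal else None


def split_acceptance(prs, start, end):
    counts = {
        "agents": {"merged": 0, "abandoned": 0},
        "humans": {"merged": 0, "abandoned": 0},
    }
    for pr in prs:
        if pr["status"] not in ("merged", "abandoned"):
            continue  # still open: not terminal
        if not in_window(pr["closed_at"], start, end):
            continue
        side = "agents" if pr["author"] in AI_AGENT_ROSTER else "humans"
        counts[side][pr["status"]] += 1
    return {side: acceptance(c["merged"], c["abandoned"]) for side, c in counts.items()}


start, end = datetime(2024, 1, 1), datetime(2024, 1, 31)
prs = [
    {"author": "agent-bot-1", "status": "merged", "closed_at": datetime(2024, 1, 5)},
    {"author": "agent-bot-1", "status": "abandoned", "closed_at": datetime(2024, 1, 9)},
    {"author": "alice", "status": "merged", "closed_at": datetime(2024, 1, 12)},
    {"author": "alice", "status": "merged", "closed_at": datetime(2024, 1, 31)},  # at end: excluded
    {"author": "bob", "status": "active", "closed_at": None},
]
result = split_acceptance(prs, start, end)
print(result)
```

Returning `None` for an empty side keeps "no terminal PRs" distinct from an acceptance of 0, which corresponds to the "—" cells in the table above.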