Shift-Left Quality
Quality is part of done, not part of QA.
Accessibility, security, performance, observability: all in the gate, all run while the change is fresh, none deferred to a later sprint or a later team.
Opinion
The team that defers accessibility to a separate sprint is the team that ships an inaccessible product. Security has the same shape: defer to a pen-test next quarter and the vulnerability ships. So does performance: defer to a perf review and the regression ships. The shift-left tradition has forty years of data behind it, starting with Boehm's cost-of-defect curve.[2] The operational pattern in this repo is the same shape: a11y violations fail the build in CI, security checks run via Dependabot and CodeQL, and perf budgets are enforced as Lighthouse thresholds.
The bit I want to plant a flag on is that shift-left is the strategy and P4 Continuous Quality Feedback is the tactic. P5 says move the check earlier; P4 says give the engineer the loop to run the check at the keystroke. Neither row works alone: shift-left without the editor loop is a CI complaint waiting to happen; the editor loop without shift-left is a fast loop on the wrong thing.
For the agent, shift-left is what stops the model from generating accessible-looking but inaccessible markup, secure-looking but insecure code, performant-looking but slow code. The a11y MCP, the security MCP, the perf-budget script: the agent runs them as it writes, not after. CRITIC and the evaluator-optimizer pattern again: the model corrects against the tool signal, not against introspection.[3] Shift-left is the principle that decides which signals belong in the loop.
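The correct-against-the-tool-signal loop can be sketched roughly as below. This is a minimal illustration, not the repo's agent wiring: `refineWithToolSignals`, `toyGenerate`, and `toyA11yCheck` are hypothetical names, and the toy check is a regex standing in for a real a11y tool.

```javascript
// Hypothetical sketch: `generate` and each entry in `checks` are stand-ins,
// not real MCP or scanner APIs.
function refineWithToolSignals(generate, checks, maxRounds = 3) {
  let feedback = [];
  let draft = '';
  for (let round = 1; round <= maxRounds; round += 1) {
    draft = generate(feedback);
    // Findings come from external tools, never from the generator itself.
    feedback = checks.flatMap((check) => check(draft));
    if (feedback.length === 0) {
      return { draft, rounds: round, passed: true };
    }
  }
  return { draft, rounds: maxRounds, passed: false, feedback };
}

// Toy generator: adds alt text only after a tool reports it missing.
const toyGenerate = (feedback) =>
  feedback.includes('img-missing-alt')
    ? '<img src="logo.png" alt="Company logo">'
    : '<img src="logo.png">';

// Toy a11y check: flags <img> tags that have no alt attribute.
const toyA11yCheck = (markup) =>
  /<img(?![^>]*\balt=)[^>]*>/.test(markup) ? ['img-missing-alt'] : [];

const result = refineWithToolSignals(toyGenerate, [toyA11yCheck]);
console.log(result.passed, result.rounds);
```

The design choice worth noting is that the loop terminates on an empty findings list from the tools, so the generator never gets to declare its own output acceptable.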
Copy a note and link
Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.
Quality is part of done, not part of QA. Move every check earlier — accessibility, security, performance, observability. From QA into PR, from PR into CI, from CI into the editor. The earlier a defect is found, the cheaper it is to fix; Boehm's cost-of-defect curve is the load-bearing source. /tenet/shift-left-quality/P5
AI eyes only
Rule: quality is part of done, not part of QA. Run quality checks at edit time.
Reject: deferring accessibility, security, performance checks to a later stage. Reject: shipping a PR that fails an a11y, security, or perf-budget check. Reject: relying on QA to catch what tooling can.
Generate: invoke the a11y MCP, security scanner, and perf-budget script at edit time, not pre-merge. Treat each as a build gate, not a PR comment. Fix on first signal.
Diagnostic: at every edit boundary, run the closest available quality check. If a check can run in seconds and is not in the loop, the loop is wrong.
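One way to apply the diagnostic is to audit an inventory of checks for anything fast that still runs outside the editor loop. The sketch below assumes a hand-maintained inventory; the names, timings, stage labels, and the ten-second threshold are all illustrative assumptions, not measurements from this repo.

```javascript
// Hypothetical inventory; names and timings are illustrative, not measured.
const checks = [
  { name: 'eslint-plugin-jsx-a11y', seconds: 2, stage: 'editor' },
  { name: 'eslint-plugin-security', seconds: 2, stage: 'editor' },
  { name: 'axe-core component tests', seconds: 8, stage: 'ci' },
  { name: 'Lighthouse CI', seconds: 90, stage: 'pr' },
];

// Returns the names of checks that are fast enough for the edit loop
// but currently run at a later stage.
function fastChecksOutsideLoop(inventory, maxSeconds = 10) {
  return inventory
    .filter((c) => c.seconds <= maxSeconds && c.stage !== 'editor')
    .map((c) => c.name);
}

const misplaced = fastChecksOutsideLoop(checks);
console.log(misplaced);
```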
Why?
- Boehm's curve drives the case: every shift-left check is one stage earlier, one order of magnitude cheaper. The curve has not aged; the principle holds across accessibility, security, performance, and observability.
- Quality ships by default, not by retrofit. The team that runs the a11y check at PR time ships accessible products; the team that defers ships an inaccessible one until the next remediation sprint.
- Pair with P4 Continuous Quality Feedback: P5 is the strategy (move every check earlier), P4 is the tactic (run every check at the keystroke). Neither row works alone.
- The agent grounds itself in tool signals, not introspection. Shift-left is the principle that decides which signals belong in the agent's loop — the a11y MCP, the security MCP, the perf-budget script.
- QA becomes a coach, not a gate. The Crispin / Gregory framing keeps the QA discipline intact while moving the per-change check into the developer's loop.
- No remediation sprints. The accessibility / security / performance backlog never grows because the regressions never landed; the cost of the principle is paid in continuous small effort, not in occasional large remediation.
- Compliance is a by-product. WCAG, SOC 2, performance budgets — the audit at the end finds nothing to fix because the gate at the front caught it.
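The "one stage earlier, one order of magnitude cheaper" framing can be made concrete with a little arithmetic. The sketch below takes the tenfold factor from that framing as an assumption, not as measured data, and uses the four stages named in this tenet; real ratios vary by team and defect type.

```javascript
// Stages ordered from earliest to latest, as named in this tenet.
const STAGES = ['editor', 'ci', 'pr', 'qa'];

// Relative cost of fixing a defect at a stage, with the editor as 1.
// Assumption: a flat tenfold multiplier per stage.
function relativeFixCost(stage, factorPerStage = 10) {
  const index = STAGES.indexOf(stage);
  if (index === -1) throw new RangeError(`Unknown stage: ${stage}`);
  return factorPerStage ** index;
}

// How many times cheaper a fix becomes when a check moves earlier.
function savingsFromShift(fromStage, toStage, factorPerStage = 10) {
  return (
    relativeFixCost(fromStage, factorPerStage) /
    relativeFixCost(toStage, factorPerStage)
  );
}

console.log(savingsFromShift('qa', 'editor'));
```

Under that assumption, moving a check from QA to the editor crosses three stage boundaries, so the modelled saving is a factor of one thousand.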
Origins
The term comes from Larry Smith. “Shift-Left Testing” (Dr. Dobb's Journal, September 2001) named the practice of moving testing left in the SDLC.[1] The intellectual root is Barry Boehm's 1981 cost-of-defect curve, which Smith cites and which every modern shift-left argument descends from.[2] The principle is the same in both: the earlier you find a defect, the less it costs to fix.
The DevSecOps tradition is the security branch of the same idea. Gartner's 2017 report “DevSecOps Should Be the Default for All Application Development” (Gartner Research G00318180) coined the term and made the case that security checks belong in the developer loop.[6] Snyk, Dependabot, and CodeQL are the operational instances; the principle is the same one Smith named in 2001, with security as the headline example.
The accessibility branch belongs to the W3C and Deque. The W3C's WCAG (1999–present; version 2.2 published in 2023) is the standard and the canonical reference for what an accessibility check should verify;[7] Deque's axe-core (github.com/dequelabs/axe-core, 2015–) is the dominant developer-loop tool that brings WCAG checks into the editor, the test suite, and the agent loop.[8] Bruce Lawson and Remy Sharp's Introducing HTML5 (New Riders, 2011) is the early bridge book between web development and a11y discipline, the source that connected the WCAG standard to working developer practice.[9]
The performance branch belongs to Lara Hogan's Designing for Performance: Weighing Aesthetics and Speed (O'Reilly, 2014).[10] Performance budgets, meaning explicit upper bounds on page weight, render time, or time to interactive, are the shift-left analogue for performance. Lighthouse and Web Vitals (Google, 2018–) are the operational instances; the principle is again the one Smith named.
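A performance budget in this sense is just a set of upper bounds plus a comparison that fails loudly. The sketch below is a minimal illustration; the metric names and limits are hypothetical examples, not Lighthouse's own configuration keys or recommended values.

```javascript
// Hypothetical budget: each entry is an upper bound for one metric.
const budget = {
  pageWeightKb: 500,
  largestContentfulPaintMs: 2500,
  timeToInteractiveMs: 3800,
};

// Returns one entry per breached metric. A metric missing from the
// measurement counts as a breach so that gaps cannot pass silently.
function findBudgetBreaches(limits, measured) {
  return Object.entries(limits)
    .filter(([metric, limit]) => {
      const actual = measured[metric];
      return typeof actual !== 'number' || actual > limit;
    })
    .map(([metric, limit]) => {
      const actual = typeof measured[metric] === 'number' ? measured[metric] : null;
      return { metric, limit, actual, over: actual === null ? null : actual - limit };
    });
}

// Illustrative measurement: only page weight is over its limit.
const measured = {
  pageWeightKb: 620,
  largestContentfulPaintMs: 2100,
  timeToInteractiveMs: 3800,
};

const breaches = findBudgetBreaches(budget, measured);
console.log(breaches);
```

A build gate would exit non-zero whenever the returned list is non-empty; a value exactly at the limit passes, because the budget is an upper bound rather than a target.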
DORA / Forsgren is the empirical backstop across all four branches. Accelerate: The Science of Lean Software and DevOps (IT Revolution, 2018) names shift-left security as one of the practices that distinguish high-performing teams.[11] The 2024 DORA report continues to list these as drivers; the data has not moved against the principle.
Quotes
Test sooner, test more often, test in many places. The cost of finding a defect is proportional to how long it has been in the system; the corrective is to find it earlier.
It is much more expensive to make major changes to a product after it has been built than to make those changes during the design phase.
A performance budget is a clearly defined limit on a particular performance metric. The budget is a constraint that drives design and engineering decisions before they are made.
Accessibility is a feature, not an audit. The earlier in the lifecycle you check, the cheaper the fix.
Evidence
Twenty external sources, ranked by author authority. The first five are the canon; the rest include the qualifiers and the named opposers. Each links out to its primary source.
- 01 · “Shift-Left Testing” · Supports · Dr. Dobb’s Journal, September 2001. The article that named the practice; cites Boehm’s cost-of-defect curve as the load-bearing source.
- 02 · Software Engineering Economics · Supports · The cost-of-defect curve. Forty years on the data has not aged; every shift-left argument descends from this source.
- 03 · Coined the DevSecOps term as the security branch of shift-left. Industry-influential; backs the case at the executive scale.
- 04 · The accessibility standard. The shift-left a11y branch aims at WCAG; axe-core, Pa11y, and Lighthouse are the developer-loop tools that enforce it.
- 05 · axe-core · Supports · The dominant developer-loop tool that brings WCAG checks into the editor, the test suite, and the agent loop.
Twenty sources, three stances. The supporters are Smith, Boehm, Gartner, W3C, and Deque: the shift-left / DevSecOps / a11y canon. The qualifiers further down push the line that shift-left without team capability is just blame transfer. The opposers argue that shift-left collapses the QA function into development; that is the steelman the case has to address.
Enforcement
Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.
| Rule | Tool | Catches |
|---|---|---|
| eslint-plugin-jsx-a11y | eslint-plugin-jsx-a11y | static a11y violations in JSX — missing alt text, invalid ARIA attributes, role mismatches. |
| axe-core (vitest integration) | axe-core | runtime a11y violations against rendered components. Pair with @testing-library/react and toHaveNoViolations. |
| Lighthouse CI | Lighthouse CI | performance, accessibility, SEO, and best-practice budgets at PR time. Fail the build when a budget is breached. |
| eslint-plugin-security | eslint-plugin-security | the OWASP-flavoured static rules — eval, unsafe regex, object injection — at the keystroke. |
| CodeQL | CodeQL | the deeper security scans at PR time. Ships with GitHub Advanced Security. |
| Snyk Open Source | Snyk | dependency vulnerabilities at install time and at PR time. The DevSecOps loop in the editor. |
| Dependabot | Dependabot | automated dependency-update PRs. Closes the patch loop without remediation sprints. |
| Web Vitals SDK | Web Vitals | Core Web Vitals in production. The shift-right complement; closes the loop the perf budget at PR time can’t cover. |
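eslint.config.mjs configuration snippet

    import tseslint from 'typescript-eslint';
    import jsxA11y from 'eslint-plugin-jsx-a11y';
    import security from 'eslint-plugin-security';

    export default tseslint.config({
      files: ['**/*.{ts,tsx,jsx}'],
      // Assumption: no other config object sets a parser for these files.
      languageOptions: { parser: tseslint.parser },
      plugins: { 'jsx-a11y': jsxA11y, security },
      rules: {
        // Static a11y rules for JSX: alt text, ARIA attributes, roles.
        ...jsxA11y.configs.recommended.rules,
        // Security rules promoted to errors so they gate the build.
        'security/detect-object-injection': 'error',
        'security/detect-eval-with-expression': 'error',
        'security/detect-unsafe-regex': 'error',
      },
    });

AI rules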
.cursor/rules/p5-shift-left-quality.mdc

    ---
    description: Prickles P5 — Shift-Left Quality
    globs: "**/*.{ts,tsx,js,jsx,py,java,php,html}"
    alwaysApply: false
    ---
    ## Prickles P5 — Shift-Left Quality
    Move every check earlier in the SDLC. From QA into PR; from PR into CI; from CI into the editor.
    Accessibility, security, performance, observability — all in the gate, all run while the change is fresh.
    Shift-left without team capability is blame transfer; pair the move with training, tooling, and an actionable signal the engineer can fix in the editor.
    For agents: configure the a11y MCP, the security MCP, and the perf-budget script so the model grounds every change in the same shift-left signals the human gets.

Repo layout, CI, and ESLint wiring for these paths live on /implementation and are not repeated on every tenet.
Counter-argument
The strongest steelman comes from Rebecca Wirfs-Brock's “Principles of Testing” (consulting deck, wirfs-brock.com) and the modern QA-as-craft tradition: QA is a discipline distinct from development, and shift-left, taken to its limit, hands a specialist's job to a generalist who lacks the training.[4] The Crispin / Gregory qualification, from Agile Testing: A Practical Guide for Testers and Agile Teams (Addison-Wesley, 2009), is sharper: shift-left without team capability is blame transfer; the engineer who has not been trained to test for a11y will not catch the failure the QA specialist would have.[5] The opposing position deserves a fair hearing: shift-left can become a euphemism for “we let the QA team go.”
Counter-argument retort
Wirfs-Brock's point is real for the team that does shift-left without investing in the capability. Shift-left isn't a cost-cutting move; it is a capability-building move. The reply is to keep the QA discipline and put it in the developer's hand, not to eliminate it. The Crispin / Gregory Agile Testing tradition (Addison-Wesley, 2009) is the corrective: QA becomes a coach, not a gate.[5]
The deeper qualification is that shift-left works only when the check is actionable for the developer. An a11y violation the developer can't interpret is a CI failure they will route around; an axe-core message that names the rule, the element, and the fix is a check they will act on. The fix is to invest in the tooling, not to retreat from the principle. Pair this with P4 Continuous Quality Feedback and the loop gives the developer the actionable signal at the keystroke.
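To illustrate what "names the rule, the element, and the fix" can look like, the sketch below turns one axe-core violation into a message a developer can act on. It relies on the `id`, `impact`, `help`, `helpUrl`, and `nodes[].target` fields of an axe-core result; the function name `describeViolation` and the sample object (including its URL) are hypothetical, and selectors for frames or shadow DOM may be nested arrays that this simple join does not format specially.

```javascript
// Builds one actionable message per affected element of a violation.
function describeViolation(violation) {
  return violation.nodes.map((node) =>
    [
      `Rule: ${violation.id} (${violation.impact})`,
      `Element: ${node.target.join(' ')}`,
      `Fix: ${violation.help}`,
      `Docs: ${violation.helpUrl}`,
    ].join('\n')
  );
}

// Hand-written sample shaped like an axe-core violation; values are illustrative.
const sampleViolation = {
  id: 'image-alt',
  impact: 'critical',
  help: 'Images must have alternative text',
  helpUrl: 'https://example.com/rules/image-alt',
  nodes: [{ target: ['img.logo'], html: '<img class="logo" src="logo.png">' }],
};

const messages = describeViolation(sampleViolation);
console.log(messages.join('\n\n'));
```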
For the agent, shift-left is the principle that decides which signals belong in the loop. The model that has run the a11y MCP at edit time produces accessible code; the model that hasn't produces accessible-looking code that fails axe-core. The same shape applies across security and performance. Without shift-left, the agent grades itself against its priors and ships the regression at machine speed.
In production work, the row that fires the a11y check, the security scan, and the perf budget at PR time is the row whose product ships accessible, secure, and performant by default rather than by retrofit. The compounding effect is large: a year of shift-left reduces the year-end regression backlog to near zero; a year of shift-right produces the backlog that becomes a quarter of remediation work.
Notes
- [1]Larry Smith — “Shift-Left Testing”, Dr. Dobb’s Journal, September 2001. The article that named the practice. Cites Boehm’s cost-of-defect curve as the load-bearing source.
- [2]Barry W. Boehm — Software Engineering Economics (Prentice Hall, 1981). The cost-of-defect curve. Forty years on the data has not aged; every shift-left argument descends from this source.
- [3]Noah Shinn et al. — “Reflexion: Language Agents with Verbal Reinforcement Learning”, NeurIPS 2023. The agent grounded in machine-checkable feedback improves; the agent grading itself does not. Generalised by Madaan (Self-Refine) and Gou (CRITIC).