P5 Shift-Left Quality

§0b

Opinion

The team that defers accessibility to a separate sprint is the team that ships an inaccessible product. Same shape with security: defer to a pen-test next quarter and the vulnerability ships. Same shape with performance: defer to a perf review and the regression ships. The shift-left tradition has forty years of data behind it.[2] The operational pattern in this repo has the same shape: a11y violations fail the build in CI, security checks run through Dependabot and CodeQL, and perf budgets are enforced as Lighthouse thresholds.

The bit I want to plant a flag on is that shift-left is the strategy and P4 Continuous Quality Feedback is the tactic. P5 says move the check earlier; P4 says give the engineer the loop to run the check at the keystroke. Neither row works alone: shift-left without the editor loop is a CI complaint waiting to happen; the editor loop without shift-left is a fast loop on the wrong thing.

For the agent, shift-left is what stops the model from generating accessible-looking but inaccessible markup, secure-looking but insecure code, performant-looking but slow code. The a11y MCP, the security MCP, the perf-budget script: the agent runs them as it writes, not after. CRITIC and the evaluator-optimizer pattern again: the model corrects against the tool signal, not against introspection.[3] Shift-left is the principle that decides which signals belong in the loop.
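The loop described above can be sketched roughly as follows. This is a minimal illustration, not this repo's agent wiring: `Signal`, `Generate`, `Check`, and `generateWithChecks` are hypothetical names, the generator and the a11y check are stand-in stubs, and the calls are synchronous for brevity where real MCP or scanner calls would be asynchronous.

```typescript
// One finding reported by a tool (hypothetical shape).
interface Signal {
  tool: string;
  message: string;
}

// Stand-ins for the model call and for each quality tool.
type Generate = (task: string, feedback: Signal[]) => string;
type Check = (candidate: string) => Signal[];

interface LoopResult {
  output: string;
  signals: Signal[]; // findings still open when the loop stopped
  rounds: number;
}

// Evaluator-optimizer loop: generate, run every check, feed the tool
// signals back to the generator, and stop when the checks come back clean.
function generateWithChecks(
  task: string,
  generate: Generate,
  checks: Check[],
  maxRounds = 3
): LoopResult {
  let feedback: Signal[] = [];
  let output = "";
  for (let round = 1; round <= maxRounds; round++) {
    output = generate(task, feedback);
    feedback = [];
    for (const check of checks) {
      feedback = feedback.concat(check(output));
    }
    if (feedback.length === 0) {
      return { output, signals: [], rounds: round };
    }
  }
  return { output, signals: feedback, rounds: maxRounds };
}

// Stub generator: only adds alt text once a tool has complained.
const fakeGenerate: Generate = (_task, feedback) =>
  feedback.length === 0
    ? '<img src="logo.png">'
    : '<img src="logo.png" alt="Company logo">';

// Stub a11y check: flags an img element without an alt attribute.
const fakeA11yCheck: Check = (candidate) =>
  /\balt=/.test(candidate)
    ? []
    : [{ tool: "a11y", message: "img is missing alt text" }];

const loopResult = generateWithChecks("render the logo", fakeGenerate, [fakeA11yCheck]);
console.log(loopResult.rounds, loopResult.output);
```

The correction in round two comes from the check's output, not from the generator re-reading its own work, which is the distinction the paragraph draws between tool signal and introspection.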

Copy a note and link

Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.

Quality is part of done, not part of QA. Move every check earlier — accessibility, security, performance, observability. From QA into PR, from PR into CI, from CI into the editor. The earlier a defect is found, the cheaper it is to fix; Boehm's cost-of-defect curve is the load-bearing source.

/tenet/shift-left-quality/P5

§0c

AI eyes only

Rule: quality is part of done, not part of QA. Run quality checks at edit time.

Reject: deferring accessibility, security, performance checks to a later stage. Reject: shipping a PR that fails an a11y, security, or perf-budget check. Reject: relying on QA to catch what tooling can.

Generate: invoke the a11y MCP, security scanner, and perf-budget script at edit time, not pre-merge. Treat each as a build gate, not a PR comment. Fix on first signal.

Diagnostic: at every edit boundary, run the closest available quality check. If a check can run in seconds and is not in the loop, the loop is wrong.
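
One way to apply the diagnostic is to keep an inventory of checks and flag any that are fast enough for the edit loop but currently run later. The sketch below assumes a hypothetical `QualityCheck` shape and a ten-second threshold; the run times in the example are illustrative, not measurements from this repo.

```typescript
type Stage = "editor" | "ci" | "pr" | "qa";

interface QualityCheck {
  name: string;           // label for the check
  typicalSeconds: number; // measured or estimated run time
  stage: Stage;           // where the check currently runs
}

// Returns the names of checks that could run in the edit loop but do not.
function misplacedChecks(checks: QualityCheck[], maxSeconds = 10): string[] {
  return checks
    .filter((c) => c.typicalSeconds <= maxSeconds && c.stage !== "editor")
    .map((c) => c.name);
}

// Illustrative inventory; the durations are assumptions.
const inventory: QualityCheck[] = [
  { name: "eslint-plugin-jsx-a11y", typicalSeconds: 2, stage: "ci" },
  { name: "Lighthouse CI", typicalSeconds: 90, stage: "pr" },
  { name: "eslint-plugin-security", typicalSeconds: 3, stage: "editor" },
];

// Flags only eslint-plugin-jsx-a11y: fast, but not yet in the editor.
console.log(misplacedChecks(inventory));
```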

§0d

Why?

Boehm's curve drives the case: every shift-left check is one stage earlier, one order of magnitude cheaper. The curve has not aged; the principle holds across accessibility, security, performance, and observability.
Quality ships by default, not by retrofit. The team that runs the a11y check at PR time ships an accessible product; the team that defers ships an inaccessible one until the next remediation sprint.
Pair with P4 Continuous Quality Feedback: P5 is the strategy (move every check earlier), P4 is the tactic (run every check at the keystroke). Neither row works alone.
The agent grounds itself in tool signals, not introspection. Shift-left is the principle that decides which signals belong in the agent's loop — the a11y MCP, the security MCP, the perf-budget script.
QA becomes a coach, not a gate. The Crispin / Gregory framing keeps the QA discipline intact while moving the per-change check into the developer's loop.
No remediation sprints. The accessibility / security / performance backlog never grows because the regressions never landed; the cost of the principle is paid in continuous small effort, not in occasional large remediation.
Compliance is a by-product. WCAG, SOC 2, performance budgets — the audit at the end finds nothing to fix because the gate at the front caught it.

The receipts

Origins, quoted passages, evidence, the strongest counter-argument and the reply.

§1

Origins

The term comes from Larry Smith. “Shift-Left Testing” (Dr. Dobb's Journal, September 2001) named the practice of moving testing left in the SDLC.[1] The intellectual root is Barry Boehm's 1981 cost-of-defect curve, which Smith cites and which every modern shift-left argument descends from.[2] The principle is the same in both: every stage a defect survives multiplies the cost of fixing it.

The DevSecOps tradition is the security branch of the same idea. Gartner's 2017 “DevSecOps Should Be the Default for All Application Development” coined the term and made the case that security checks belong in the developer loop.[6] Snyk, Dependabot, and CodeQL are the operational instances; the principle is the same Smith named in 2001 with security as the headline example.

The accessibility branch belongs to the W3C and Deque. The W3C's WCAG (1999–present) is the standard;[7] Deque's axe-core (2015–) is the dominant developer-loop tool that brings WCAG checks into the editor and the test suite.[8] Bruce Lawson and Remy Sharp's Introducing HTML5 (New Riders, 2011) is the early bridge book between web development and a11y discipline.[9]

The performance branch belongs to Lara Callender Hogan's Designing for Performance (O'Reilly, 2014).[10] Performance budgets, each an explicit upper bound on page weight, render time, or interactive time, are the shift-left analogue for performance. Lighthouse and Web Vitals (Google, 2018–) are the operational instances; the principle is again the one Smith named.
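
A performance budget reduces to a comparison between measured values and stated limits. The following is a minimal sketch of that comparison, assuming hypothetical metric names and limits; it is not how Lighthouse itself evaluates budgets, and a real gate would take its measurements from whatever tool the team runs.

```typescript
// Hypothetical metric names; substitute whatever the measurement tool reports.
interface PerfMetrics {
  pageWeightKb: number;
  largestContentfulPaintMs: number;
  timeToInteractiveMs: number;
}

// A budget sets an upper bound on any subset of the metrics.
type PerfBudget = Partial<PerfMetrics>;

interface BudgetBreach {
  metric: keyof PerfMetrics;
  limit: number;
  actual: number;
}

// Returns one entry per metric whose measured value exceeds its limit.
function checkBudget(actual: PerfMetrics, budget: PerfBudget): BudgetBreach[] {
  const breaches: BudgetBreach[] = [];
  for (const metric of Object.keys(budget) as (keyof PerfMetrics)[]) {
    const limit = budget[metric];
    if (limit !== undefined && actual[metric] > limit) {
      breaches.push({ metric, limit, actual: actual[metric] });
    }
  }
  return breaches;
}

// Illustrative numbers only.
const perfBudget: PerfBudget = { pageWeightKb: 500, largestContentfulPaintMs: 2500 };
const measured: PerfMetrics = {
  pageWeightKb: 640,
  largestContentfulPaintMs: 2100,
  timeToInteractiveMs: 3800,
};

const breaches = checkBudget(measured, perfBudget);
// A CI wrapper would exit non-zero when breaches.length > 0.
console.log(breaches);
```

Here only `pageWeightKb` breaches its limit; `timeToInteractiveMs` has no limit in the budget and is ignored.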

DORA / Forsgren is the empirical backstop across all four branches. Accelerate (2018) names shift-left security as one of the practices that distinguishes high-performing teams.[11] The 2024 DORA report continues to list these as drivers; the data has not moved against the principle.

§2

Quotes

Test sooner, test more often, test in many places. The cost of finding a defect is proportional to how long it has been in the system; the corrective is to find it earlier.

Larry Smith · “Shift-Left Testing” (2001)

It is much more expensive to make major changes to a product after it has been built than to make those changes during the design phase.

Barry W. Boehm · Software Engineering Economics (1981)

A performance budget is a clearly defined limit on a particular performance metric. The budget is a constraint that drives design and engineering decisions before they are made.

Lara Callender Hogan · Designing for Performance (2014)

Accessibility is a feature, not an audit. The earlier in the lifecycle you check, the cheaper the fix.

Deque Systems · axe-core

§3

Evidence

Twenty external sources, ranked by author authority. The first five, listed here, are the canon; the remainder includes the qualifiers and the named opposers. Each links out to its primary source.

01
“Shift-Left Testing” · Supports
Larry Smith · 2001
Dr. Dobb’s Journal, September 2001. The article that named the practice; cites Boehm’s cost-of-defect curve as the load-bearing source.
02
Software Engineering Economics · Supports
Barry W. Boehm · 1981
The cost-of-defect curve. Forty years on, the data has not aged; every shift-left argument descends from this source.
03
“DevSecOps Should Be the Default for All Application Development” · Supports
Gartner · 2017
Coined the DevSecOps term as the security branch of shift-left. Industry-influential; backs the case at the executive scale.
04
Web Content Accessibility Guidelines (WCAG) 2.2 · Supports
W3C · 2023
The accessibility standard. The shift-left a11y branch aims at WCAG; axe-core, Pa11y, and Lighthouse are the developer-loop tools that enforce it.
05
axe-core · Supports
Deque Systems · 2015–
The dominant developer-loop tool that brings WCAG checks into the editor, the test suite, and the agent loop.

Twenty sources, three stances. The supporters are Smith, Boehm, Gartner, W3C, and Deque: the shift-left / DevSecOps / a11y canon. The qualifiers further down push the line that shift-left without team capability is just blame transfer. The opposers argue that shift-left collapses the QA function into development; that is the steelman the case has to address.

§4b

Enforcement


Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.

Rule	Tool	Catches
eslint-plugin-jsx-a11y	eslint-plugin-jsx-a11y	static a11y violations in JSX — missing alt text, invalid ARIA attributes, role mismatches.
axe-core (vitest integration)	axe-core	runtime a11y violations against rendered components. Pair with @testing-library/react and toHaveNoViolations.
Lighthouse CI	Lighthouse CI	performance, accessibility, SEO, and best-practice budgets at PR time. Fail the build when a budget is breached.
eslint-plugin-security	eslint-plugin-security	the OWASP-flavoured static rules — eval, unsafe regex, object injection — at the keystroke.
CodeQL	CodeQL	the deeper security scans at PR time. Ships with GitHub Advanced Security.
Snyk Open Source	Snyk	dependency vulnerabilities at install time and at PR time. The DevSecOps loop in the editor.
Dependabot	Dependabot	automated dependency-update PRs. Closes the patch loop without remediation sprints.
Web Vitals SDK	Web Vitals	Core Web Vitals in production. The shift-right complement; closes the loop the perf budget at PR time can’t cover.

eslint.config.mjs configuration snippet

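import tseslint from 'typescript-eslint';
import jsxA11y from 'eslint-plugin-jsx-a11y';
import security from 'eslint-plugin-security';

export default tseslint.config({
  files: ['**/*.{ts,tsx,jsx}'],
  // Parse TypeScript and JSX; without a TS-aware parser ESLint cannot read .ts/.tsx files.
  languageOptions: {
    parser: tseslint.parser,
    parserOptions: { ecmaFeatures: { jsx: true } },
  },
  plugins: { 'jsx-a11y': jsxA11y, security },
  rules: {
    // Static a11y rules: missing alt text, invalid ARIA attributes, role mismatches.
    ...jsxA11y.configs.recommended.rules,
    // Security rules raised to errors so a violation fails the lint step.
    'security/detect-object-injection': 'error',
    'security/detect-eval-with-expression': 'error',
    'security/detect-unsafe-regex': 'error',
  },
});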

§4c

AI rules

Paste destination

File: .cursor/rules/p5-shift-left-quality.mdc

---
description: Prickles P5 — Shift-Left Quality
globs: "**/*.{ts,tsx,js,jsx,py,java,php,html}"
alwaysApply: false
---

## Prickles P5 — Shift-Left Quality

Move every check earlier in the SDLC. From QA into PR; from PR into CI; from CI into the editor.

Accessibility, security, performance, observability — all in the gate, all run while the change is fresh.

Shift-left without team capability is blame transfer; pair the move with training, tooling, and an actionable signal the engineer can fix in the editor.

For agents: configure the a11y MCP, the security MCP, and the perf-budget script so the model grounds every change in the same shift-left signals the human gets.

Repo layout, CI, and ESLint wiring for these paths live on /implementation — not repeated on every tenet.

§5

Counter-argument


The strongest steelman is Rebecca Wirfs-Brock's and the modern QA-as-craft tradition: QA is a discipline distinct from development, and shift-left, taken to its limit, hands a specialist's job to a generalist who lacks the training.[4] The Crispin / Gregory qualification is sharper: shift-left without team capability is blame transfer; the engineer who has not been trained to test for a11y will not catch the failure the QA specialist would have.[5] The opposing position deserves a fair hearing: shift-left can become a euphemism for “we let the QA team go.”

§6

Counter-argument retort

Wirfs-Brock's point is real for the team that does shift-left without investing in the capability. Shift-left isn't a cost-cutting move; it is a capability-building move. The reply is to keep the QA discipline and put it in the developer's hand, not to eliminate it. The Crispin / Gregory Agile Testing tradition is the corrective: QA becomes a coach, not a gate.[5]

The deeper qualification is that shift-left works only when the check is actionable for the developer. An a11y violation the developer can't interpret is a CI failure they will route around; an axe-core message that names the rule, the element, and the fix is a check they will act on. The fix is to invest in the tooling, not to retreat from the principle. Pair this with P4 Continuous Quality Feedback and the loop gives the developer the actionable signal at the keystroke.
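
To make the "rule, element, fix" point concrete, here is a rough sketch that turns a violation into one line per affected element. The `Violation` and `ViolationNode` interfaces are a simplified local subset of the fields axe-core reports (`id`, `help`, `helpUrl`, and per-node `html`, `target`, `failureSummary`), `formatViolation` is a hypothetical helper, and the sample data, including the URL, is a placeholder.

```typescript
// Simplified subset of an axe-core violation; target is treated as a
// plain list of CSS selectors, which is an assumption for this sketch.
interface ViolationNode {
  html: string;
  target: string[];
  failureSummary?: string;
}

interface Violation {
  id: string;      // the rule that failed
  help: string;    // short statement of the requirement
  helpUrl: string; // link to the rule documentation
  nodes: ViolationNode[];
}

// One line per element: the rule, where it failed, and what to do about it.
function formatViolation(v: Violation): string[] {
  return v.nodes.map((node) => {
    const fix = (node.failureSummary ?? v.help).replace(/\s+/g, " ").trim();
    return `[${v.id}] ${node.target.join(" ")}: ${fix} (see ${v.helpUrl})`;
  });
}

// Placeholder data for illustration.
const sampleViolation: Violation = {
  id: "image-alt",
  help: "Images must have alternative text",
  helpUrl: "https://example.com/rules/image-alt",
  nodes: [
    {
      html: '<img src="logo.png">',
      target: ["header > img"],
      failureSummary: "Fix any of the following:\n  Element does not have an alt attribute",
    },
  ],
};

for (const line of formatViolation(sampleViolation)) {
  console.log(line);
}
```

A message in this form gives the developer, or the agent, something to act on in the editor without opening a separate report.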

For the agent, shift-left is the principle that decides which signals belong in the loop. The model that has run the a11y MCP at edit time produces accessible code; the model that hasn't produces accessible-looking code that fails axe-core. The same shape applies across security and performance. Without shift-left, the agent grades itself against its priors and ships the regression at machine speed.

In production work, the row that fires the a11y check, the security scan, and the perf budget at PR time is the row whose product ships accessible, secure, and performant by default rather than by retrofit. The compounding effect is large: a year of shift-left reduces the year-end regression backlog to near zero; a year of deferred checks produces the backlog that becomes a quarter of remediation work.

§7

Notes

[1] Larry Smith — “Shift-Left Testing”, Dr. Dobb’s Journal, September 2001. The article that named the practice. Cites Boehm’s cost-of-defect curve as the load-bearing source.
[2] Barry W. Boehm — Software Engineering Economics (Prentice Hall, 1981). The cost-of-defect curve. Forty years on, the data has not aged; every shift-left argument descends from this source.
[3] Noah Shinn et al. — “Reflexion: Language Agents with Verbal Reinforcement Learning”, NeurIPS 2023. The agent grounded in machine-checkable feedback improves; the agent grading itself does not. Generalised by Madaan (Self-Refine) and Gou (CRITIC).


Last revised: 2026-04-27