P4 Continuous Quality Feedback

§0b

Opinion

The editor is the cheapest place to fail and the PR is the most expensive. Boehm's curve is forty years old and the data underneath it has not aged: a defect found at design costs one unit; in code, ten; in test, one hundred; in production, a thousand.[1] The team that runs the lint-and-types loop on save is the team that ships cleaner PRs; the team that runs it only on push is the team whose PRs sit in review until Friday because someone has to read past the noise.

The flag I want to plant is that the principle survived the IDE upheavals of the last decade. ESLint started as a one-shot CLI; it now runs in the editor as a language server, in file watchers, in hooks, and through MCPs the agent reads at edit time. Each step closer to the keystroke is the same point made finer: the gate that fires at the keystroke is the gate the engineer never argues with. Pair this with P3 Definition of Done and the loop has both ends: the editor catches the easy failures, the DoD verifies the hard ones at merge time.

For the agent the principle is harder still. An LLM will generate plausible code at machine speed and, left alone, treat PR review as its first feedback. The fix is to make lint, type-check and tests available to the agent as it writes (the eslint MCP, the type-check MCP, the test runner). The model that has run the lint server before declaring the work done is the model that produces a PR a human can read in two minutes; the model that has not is the model whose PR is rejected at the gate.[2]

Copy a note and link

Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.

Fail in the editor, not at PR review. Lint while you write. Type-check on save. Run the test against the file you just edited. Pre-push gates close the loop. Defect cost grows with feedback latency; the corrective is to move every check as close to the keystroke as the toolchain allows.

/tenet/continuous-quality-feedback/P4

§0c

AI eyes only

Rule: the feedback loop is the gate. Run lint and tests after every edit, not at the PR.

Reject: editing multiple files before running the linter. Reject: writing tests at the end of the change. Reject: treating CI as the first signal.

Generate: after every edit, run the lint MCP (or equivalent) and the affected tests. After every batch of edits, run the full quality script. Paste the failures back into the plan and fix.

Diagnostic: time between edit and check should be measured in seconds, not minutes. A batch of edits without checks is pattern-matching, not engineering.
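
The edit-check-fix cycle described above can be sketched as a small loop. This is a minimal illustration, not the eslint MCP or any real agent API: `Check`, `Fixer`, `runFeedbackLoop`, and the two toy checks are hypothetical names, and the "fixer" is a plain function standing in for whatever the agent would do with the failure messages.

```typescript
// A check reads the current source and returns failure messages (empty = pass).
type Check = (source: string) => string[];
// A fixer receives the source plus the failures and returns a revised source.
type Fixer = (source: string, failures: string[]) => string;

interface LoopResult {
  source: string;      // final text after the loop
  rounds: number;      // how many fix rounds were applied
  remaining: string[]; // failures still present when the loop stopped
}

// Run every check against the source and collect all failure messages.
function collectFailures(source: string, checks: Check[]): string[] {
  let failures: string[] = [];
  for (const check of checks) {
    failures = failures.concat(check(source));
  }
  return failures;
}

// Check after every edit; stop as soon as the checks are clean or the budget runs out.
function runFeedbackLoop(draft: string, checks: Check[], fix: Fixer, maxRounds = 5): LoopResult {
  let source = draft;
  for (let round = 0; round < maxRounds; round++) {
    const failures = collectFailures(source, checks);
    if (failures.length === 0) {
      return { source, rounds: round, remaining: [] };
    }
    source = fix(source, failures);
  }
  return { source, rounds: maxRounds, remaining: collectFailures(source, checks) };
}

// Hypothetical stand-ins for a linter and for the agent's correction step.
const noVar: Check = (src) => (/\bvar\b/.test(src) ? ["no-var"] : []);
const noDebugger: Check = (src) => (src.indexOf("debugger") >= 0 ? ["no-debugger"] : []);
const naiveFix: Fixer = (src, failures) => {
  let out = src;
  if (failures.indexOf("no-var") >= 0) out = out.replace(/\bvar\b/g, "const");
  if (failures.indexOf("no-debugger") >= 0) out = out.replace(/debugger;?\s*/g, "");
  return out;
};

const loopResult = runFeedbackLoop("var x = 1; debugger; use(x);", [noVar, noDebugger], naiveFix);
console.log(loopResult); // one fix round, no remaining failures
```

The round budget is a design choice in this sketch: a loop that cannot reach green within a few rounds should surface the remaining failures rather than keep editing.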

§0d

Why?

Defect cost grows with feedback latency. Boehm's 1981 curve has not aged; every minute a defect goes undetected is a multiplier on the eventual cost.
PRs move through review faster when the editor caught the easy failures. The reviewer spends the hour on the design, not the trailing whitespace.
Editor-level MCPs let the agent ground itself in lint, types, and tests as it writes. The agent gets the same loop the human does; CRITIC and the evaluator-optimizer pattern make this load-bearing.
The loop closes at the merge boundary via P3 Definition of Done. The continuous loop catches the small; the DoD verifies the large; the team merges with confidence on Friday afternoons.
The continuous loop is where new lint rules earn their keep. The rule that fires three times this week is the rule that adds value; the rule that never fires is debt to remove.
The continuous loop is the operationalisation of P5 Shift-Left Quality. Every check earlier is a check cheaper; the loop is where shift-left actually happens.
The loop is habit-forming. The engineer who lives in the loop stops noticing the lint warnings as friction; they become the rhythm of the work, not interruptions to it.
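
One way to find the rules that earn their keep is to count how often each fires. The sketch below assumes a report shaped like the output of ESLint's JSON formatter (an array of per-file results, each with a `messages` array carrying a `ruleId`); only the fields used here are typed. `countRuleHits`, `silentRules`, and the sample data are hypothetical.

```typescript
// Minimal subset of the ESLint JSON formatter output used by this sketch.
interface LintMessage {
  ruleId: string | null; // null for parse errors, which belong to no rule
  severity: 1 | 2;       // 1 = warning, 2 = error
  message: string;
}
interface LintResult {
  filePath: string;
  messages: LintMessage[];
}

// Count how many times each rule fired across the whole report.
function countRuleHits(results: LintResult[]): Record<string, number> {
  const hits: Record<string, number> = {};
  for (const result of results) {
    for (const message of result.messages) {
      if (message.ruleId === null) continue;
      hits[message.ruleId] = (hits[message.ruleId] ?? 0) + 1;
    }
  }
  return hits;
}

// Enabled rules that never fired in the report: candidates for review or removal.
function silentRules(enabledRules: string[], hits: Record<string, number>): string[] {
  return enabledRules.filter((rule) => hits[rule] === undefined);
}

// Hypothetical sample report.
const sampleReport: LintResult[] = [
  {
    filePath: "src/a.ts",
    messages: [
      { ruleId: "no-unused-vars", severity: 2, message: "'x' is defined but never used." },
      { ruleId: "@typescript-eslint/no-floating-promises", severity: 2, message: "Promise must be handled." },
    ],
  },
  {
    filePath: "src/b.ts",
    messages: [
      { ruleId: "no-unused-vars", severity: 1, message: "'y' is defined but never used." },
      { ruleId: null, severity: 2, message: "Parsing error" },
    ],
  },
];

const ruleHits = countRuleHits(sampleReport);
const neverFired = silentRules(
  ["no-unused-vars", "no-var", "@typescript-eslint/no-floating-promises"],
  ruleHits,
);
console.log(ruleHits, neverFired);
```

A rule that never fires in one week's report is not proof of debt; the count is a prompt for the team's decision, not a replacement for it.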

The receipts

Origins, quoted passages, evidence, the strongest counter-argument and the reply.

§1

Origins

The intellectual root is Barry Boehm's Software Engineering Economics (1981). Boehm's data set the cost-of-defect curve that every modern feedback-latency argument cites: a defect found at design costs one unit; in code, ten; in test, one hundred; in production, a thousand.[1] The implication is direct: every minute of feedback latency is a multiplier on the eventual cost; the corrective is to compress latency at every stage.

The toolchain root is older than the term. Stephen Johnson's Lint, a C program checker (Bell Labs, 1978) is the original.[5] ESLint (2013) is the modern incarnation, but the lineage is straight: a side-by-side analyser that runs while the engineer writes. The IDE-feedback-loop research from Microsoft (Murphy-Hill, Zimmermann, Bird and Nagappan's 2015 study on how developers read compiler error messages) sharpens the point: feedback the developer can act on at the keystroke is the cheapest correction available.[4]

The continuous-integration tradition is the team-level version. Kent Beck's Extreme Programming Explained (1999) names the ten-minute build as the load-bearing practice;[6] Martin Fowler's “Continuous Integration” (2006) generalises the idea;[7] Humble and Farley's Continuous Delivery (2010) extends it to the whole pipeline.[8] The DORA / Forsgren research programme makes feedback latency one of the measured drivers of delivery performance.[9]

The agent layer is the recent extension. Editor MCPs (eslint MCP, sonar MCP, a11y MCP) let the agent run the linter mid-edit; pre-tool hooks on the Edit tool can block on lint errors; the CLAUDE.md tells the agent to run the loop after every change. Anthropic's evaluator-optimizer pattern formalises the loop; CRITIC and Reflexion provide the empirical evidence that the agent grounded in a tool signal improves where the agent grading itself does not.[2]

§2

Quotes

The longer a defect exists in a software product, the more expensive it becomes to fix. The cost grows roughly tenfold for every phase between introduction and removal.

Barry W. Boehm · Software Engineering Economics (1981)

Continuous Integration is a software development practice where members of a team integrate their work frequently. Each integration is verified by an automated build to detect integration errors as quickly as possible.

Martin Fowler · Continuous Integration (2006)

Keep the build to ten minutes. If it takes longer, fix the build, not the team. The build is the heartbeat of the team; if it stops, the team stops.

Kent Beck · Extreme Programming Explained (1999)

Fast feedback was associated with both better software-delivery performance and higher organisational performance. The shorter the feedback loop, the more the team learns.

Forsgren, Humble & Kim · Accelerate (2018)

§3

Evidence

Twenty external sources, ranked by author authority. The first five are the canon and are listed here; the rest include the qualifiers and the named opposers. Each links out to its primary source.

01
Software Engineering Economics · Supports
Barry W. Boehm · 1981
The cost-of-defect curve. Forty years on, the data has not aged; the principle still drives every modern feedback-latency argument.
02
Lint, a C Program Checker · Supports
Stephen C. Johnson · 1978
The original lint. The lineage every modern editor-loop linter inherits; one of the oldest tools whose principle survives unchanged.
03
Extreme Programming Explained · Supports
Kent Beck · 1999, 2004
The ten-minute build. The team-level version of the editor loop; the practice that proved a fast feedback cycle was operationally possible at scale.
04
“Continuous Integration” · Supports
Martin Fowler · 2006
The canonical generalisation of Beck’s ten-minute build. Frames CI as a continuous-feedback discipline at the team scale.
05
Continuous Delivery · Supports
Jez Humble & David Farley · 2010
Extends CI to the whole pipeline. Each pipeline stage is a feedback loop; the discipline is to compress the latency of each.

Twenty sources, three stances. The supporters are Boehm, Johnson's lint paper, Beck, Fowler on Continuous Integration, and Humble & Farley: the cost-of-defect / continuous-integration canon. The qualifiers further down push the line that feedback quality matters as much as feedback speed. The opposers argue that constant feedback fragments attention; that is the steelman the case has to address.

§4b

Enforcement

Viewing: TypeScript.

Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.

Rule	Tool	Catches
ESLint Language Server	ESLint Language Server	lint failures at the keystroke level. The editor underlines them; the engineer fixes them before the file is saved.
tsserver	TypeScript LSP (tsserver)	type-level contract failures at the keystroke. The canonical instance of the principle for typed JavaScript.
vitest --watch	vitest --watch	test failures on save. Re-runs only the tests affected by the changed file.
Prettier (format on save)	Prettier (format on save)	format violations at save time. Removes the format conversation from PR review entirely.
lint-staged	lint-staged	the per-commit subset. Runs ESLint, Prettier and the affected tests on the staged files; sub-second loop.
Husky pre-push	Husky pre-push	the team-level second loop. Runs npm run quality on every push; the merge button doesn’t accept a red pre-push.
eslint MCP	eslint MCP	the agent-level loop. The model runs the linter mid-edit and corrects before declaring done.
@typescript-eslint/no-floating-promises	typescript-eslint	promises returned without an await. The classic class of bug that would otherwise surface only at runtime; this rule surfaces it in the editor.
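
The lint-staged row can be sketched as a task function. lint-staged accepts a function per glob that receives the staged file paths and returns the commands to run; the sketch below builds those commands for ESLint, Prettier, and `vitest related`. The names `StagedTask`, `typescriptTask`, and `lintStagedConfig` are hypothetical, and the exact flags are an assumption about what a team might choose, not a prescribed setup.

```typescript
// A lint-staged task: staged file paths in, shell commands out.
type StagedTask = (stagedFiles: string[]) => string[];

// Quote each path so file names containing spaces survive the shell.
const quotePath = (file: string): string => `"${file}"`;

// Lint, format-check, and test only what the commit actually touches.
const typescriptTask: StagedTask = (stagedFiles) => {
  const files = stagedFiles.map(quotePath).join(" ");
  return [
    `eslint --max-warnings=0 ${files}`, // warnings fail the commit too
    `prettier --check ${files}`,        // formatting is verified, not argued about
    `vitest related --run ${files}`,    // only tests that import the staged files
  ];
};

// In a lint-staged config file this object would be the default export.
const lintStagedConfig: Record<string, StagedTask> = {
  "*.{ts,tsx}": typescriptTask,
};

console.log(lintStagedConfig["*.{ts,tsx}"](["src/cart.ts", "src/cart.test.ts"]));
```

Scoping every command to the staged files is what keeps the per-commit loop short; the full suite stays in the pre-push gate.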

eslint.config.mjs configuration snippet

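import tseslint from 'typescript-eslint';

export default tseslint.config({
  // Registers the typescript-eslint parser and plugin without enabling any rules.
  extends: [tseslint.configs.base],
  // Apply only to TypeScript sources.
  files: ['**/*.{ts,tsx}'],
  languageOptions: {
    // Both rules below need type information; use the nearest tsconfig.json.
    parserOptions: { project: true },
  },
  rules: {
    // A promise that is created but never awaited, returned, or otherwise handled.
    '@typescript-eslint/no-floating-promises': 'error',
    // A promise passed where one is not handled, such as a condition or a void callback.
    '@typescript-eslint/no-misused-promises': 'error',
  },
});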
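
To make the floating-promise failure concrete, here is a minimal sketch of the pattern `@typescript-eslint/no-floating-promises` reports and one handled alternative. `saveDraft`, `onSaveUnsafe`, and `onSaveSafe` are hypothetical names invented for the illustration.

```typescript
// Hypothetical async operation that can reject.
async function saveDraft(text: string): Promise<void> {
  if (text.length === 0) {
    throw new Error("empty draft");
  }
}

// Reported by no-floating-promises: the promise is dropped, so a rejection
// is never observed here and only shows up at runtime, far from its cause.
function onSaveUnsafe(text: string): void {
  saveDraft(text);
}

// Handled version: the promise is awaited and the failure becomes a value
// the caller can act on.
async function onSaveSafe(text: string): Promise<string> {
  try {
    await saveDraft(text);
    return "saved";
  } catch (error) {
    return `failed: ${(error as Error).message}`;
  }
}

onSaveSafe("").then((status) => console.log(status));
```

The compiler accepts both functions; only the lint rule, running with type information, separates them. That is why this class of bug needs the editor loop rather than the type-checker alone.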

§4c

AI rules

Paste destination

File: .cursor/rules/p4-continuous-feedback.mdc

---
description: Prickles P4 — Continuous Quality Feedback
globs: "**/*.{ts,tsx,js,jsx,py,java,php}"
alwaysApply: false
---

## Prickles P4 — Continuous Quality Feedback

Lint while you write. Type-check on save. Run the test against the file you just edited. Pre-push gates close the loop.

Every check that catches an error in PR review should run in the editor or pre-push first. The cost of catching it later is exponential.

Tune the latency, don't fragment attention: lint server runs on save, not on every keystroke; the test runner watches the file you saved, not the whole suite.

For agents: configure the eslint MCP, the type-check MCP, and the test runner so the agent grounds every change in tool feedback before declaring it done.

Repo layout, CI, and ESLint wiring for these paths live on /implementation — not repeated on every tenet.

§5

Counter-argument

Counter

The strongest steelman is the deep-work reading drawn from Cal Newport: every editor notification is a context switch, every interruption is a cost, and the team optimised for constant feedback is the team that produces shallow PRs.[3] Murphy-Hill's research adds a qualifier: feedback that the developer can't act on produces frustration, not behaviour change.[4] Ousterhout's broader critique is that constant feedback substitutes for design thinking; the engineer who lives in the loop never steps back to ask whether the unit is the right unit.

§6

Counter-argument retort

Newport's deep-work point lands when the editor is configured to fire on every keystroke without batching. The reply is to tune the latency, not abandon the principle: the lint server runs on save, not on every character; the test runner watches the file you saved, not the whole suite. The corrective is configuration, not retreat from continuous feedback.
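
One form of batching is a quiet-window debounce: the check fires only once the engineer has paused. The sketch below is a pure simulation of that idea, not the configuration of any particular editor or lint server; `checkTimes` and the 300 ms window are assumptions chosen for illustration.

```typescript
// Given the timestamps (ms) of edits and a quiet window, return the times at
// which a trailing-edge debounced check would run. Each edit restarts the
// timer; the check fires quietMs after the last edit in a burst.
function checkTimes(editTimes: number[], quietMs: number): number[] {
  const sorted = editTimes.slice().sort((a, b) => a - b);
  const fired: number[] = [];
  sorted.forEach((time, index) => {
    const next = sorted[index + 1];
    // Fire only when no later edit lands inside the quiet window.
    if (next === undefined || next - time >= quietMs) {
      fired.push(time + quietMs);
    }
  });
  return fired;
}

// Two bursts of typing: three keystrokes, a pause, then two more.
const fired = checkTimes([0, 50, 100, 900, 950], 300);
console.log(fired); // [400, 1250]: five edits, two checks
```

The window is the tuning knob: long enough that a burst of typing produces one check, short enough that the result still arrives while the change is in the engineer's head.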

Murphy-Hill's qualifier is the more useful warning: feedback the developer can't act on is noise.[4] The fix is to read every linter rule and either turn it on with intent or turn it off; warnings that are never read are debt. Pair this with TA1 Linter as Law and the rule is sharp: a warning is a decision the team has not yet made; an error is a rule the team has agreed to enforce.

Ousterhout's broader critique — that constant feedback substitutes for design thinking — is true for the engineer who lives in the loop and never steps back. The reply is the same as for any tool: the tool serves the design, not the other way round. The lint loop is fast feedback on the small; the spec, the AC, and the architecture review are the large; both belong in the engineer's day, not one or the other.

In production work, the team that fires the lint and the type-check on save is the team whose PR queue moves through review in hours rather than days. P3 Definition of Done closes the loop at the merge boundary; P4 closes it at the keystroke. The two compose into a single discipline whose first job is to make defects visible early.

§7

Notes

[1] Barry W. Boehm, Software Engineering Economics (Prentice Hall, 1981). The cost-of-defect curve: a defect found at design costs one unit; in code, ten; in test, one hundred; in production, a thousand. Forty years on, the data has not aged.
[2] Noah Shinn et al., “Reflexion: Language Agents with Verbal Reinforcement Learning”, NeurIPS 2023. The agent grounded in machine-checkable feedback improves; the agent grading itself does not. Generalised by Madaan (Self-Refine) and Gou (CRITIC).
[3] Cal Newport, Deep Work: Rules for Focused Success in a Distracted World (Grand Central, 2016). The deep-work steelman against constant feedback. The reply is to tune the latency, not abandon the principle.


Last revised: 2026-04-27