TS7 Test Isolation

§0b

Opinion

I've had the “but my unit needs the database” argument with engineers at every level for years, and the answer is the same one Khorikov landed on in print: it does not.[1] A unit that needs the database is not a unit. The instinct to reach for a real Postgres connection inside a Vitest test comes from the right place (TS4 Real-Dependency E2E says a real DB does belong somewhere), but it belongs in the E2E layer, not the unit one. Trying to do both inside one test layer is what produces the seven-second test suite that nobody runs locally.

The mocking-lens framing is the sharper one. Sandi Metz's 3×3 matrix[2] names exactly where mocks belong: outgoing commands at the boundary, never sent-to-self. Freeman and Pryce gave it the most-quoted line in the literature, “only mock types you own”, and read seriously, “types you own” means “the port at the edge.”[3] When a test mocks five private collaborators, the test is not failing; it is screaming that F1 Single Responsibility has been broken upstream. The mock count is a design diagnostic.

The TS4 ↔ TS7 tension is real and it is the testing pillar's equivalent of the F3 DRY ↔ S1 Wait-for-Three tension in style. Both rules are true at their respective layers. Unit tests touch nothing; E2E tests touch everything you ship. The boundary between them is what counts as the unit. Conflating the two is what produces tests that are slow because they connect to a real DB, brittle because they mock private helpers, and useless because they prove neither isolation nor integration. Pick a layer; honour its contract; trust the other layer to do its job. Pair the rule with TS6 Behaviour Testing and the unit shape becomes obvious: the unit is whatever a caller can address without reaching past your code.

Copy a note and link

Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.

Test the unit. Mock the edges. Trust the framework. Unit tests touch nothing outside the unit: no DB, no network, no clock. Inside the unit, real code runs against real code, so internal collaborators are never mocked. If you mock your own code, the code is wrong; the unit is what you wrote, everything else is replaced or trusted.

/tenet/test-isolation/TS7

§0c

AI eyes only

Rule: mock at process boundaries only. Real code talks to real code inside the system.

Reject: vi.mock of an internal collaborator. Reject: mocking adjacent modules to “speed up” the unit test. Reject: tests that pass against an empty implementation.

Generate: real implementations for everything inside the system. Mock only outbound process boundaries (HTTP, database, filesystem, queue). Use dependency injection at the boundary so the real and the fake share the same interface.

Diagnostic: every vi.mock target is a process boundary. If it targets an internal module, the test is testing the mock, not the system.
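
As an illustration of the "real and fake share the same interface" instruction, here is a minimal sketch rather than the project's actual code. The port `VisitorStore`, the unit `greetVisitor`, the helper `formatGreeting`, and the fake `InMemoryVisitorStore` are all hypothetical names invented for this example.

```typescript
// Hypothetical port: the only place the unit reaches across the process boundary.
interface VisitorStore {
  findBurrowName(visitorId: string): Promise<string | undefined>;
}

// Internal helper: real code inside the system, so it is never mocked.
function formatGreeting(burrowName: string | undefined): string {
  return burrowName === undefined
    ? "Welcome, new visitor"
    : `Welcome back to ${burrowName}`;
}

// The unit receives the port as a parameter, so production wiring and tests
// can pass different implementations of the same interface.
async function greetVisitor(store: VisitorStore, visitorId: string): Promise<string> {
  const burrowName = await store.findBurrowName(visitorId);
  return formatGreeting(burrowName);
}

// In-memory fake for unit tests. A production adapter (not shown) would
// implement the same interface on top of the database driver.
class InMemoryVisitorStore implements VisitorStore {
  private readonly burrows: Map<string, string>;

  constructor(burrows: Map<string, string>) {
    this.burrows = burrows;
  }

  async findBurrowName(visitorId: string): Promise<string | undefined> {
    return this.burrows.get(visitorId);
  }
}

// Usage inside a test: no vi.mock call is needed, and the real helper runs.
const store = new InMemoryVisitorStore(new Map([["v-12", "Burrow 12"]]));
const greetingPromise = greetVisitor(store, "v-12"); // resolves to "Welcome back to Burrow 12"
```

Because the fake is an ordinary object that satisfies the port's type, the compiler flags any drift between the fake and the real adapter, which is one reason to prefer injection over module-level mocking at the boundary.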

§0d

Why?

Tests run in seconds. A hermetic unit test starts and finishes inside a single Vitest worker; no service container, no migration, no fixture seeding. The whole suite stays cheap to run on every save.
Failures are deterministic. No flake from network jitter, clock drift, leftover database rows, or environment variable surprise. When the test fails, the cause is in the test or in the unit — nowhere else.
Mock count is a design diagnostic. A unit that needs five mocks is signalling that F1 Single Responsibility is broken upstream — the rule turns a testing pain point into design feedback.
The TS4 ↔ TS7 contrast clarifies the suite. Unit tests own “the unit is correct” proof; TS4 Real-Dependency E2E owns “the assembled system is correct” proof. No layer carries both questions, so neither bottlenecks.
Stop testing the framework. React already tests its renderer; the ORM already tests its query builder; the HTTP client already parses its responses. Trusting their tests cuts coverage of code you don't own and focuses coverage on code you do.
Reviews focus on the right question. Reviewers stop asking “is this mock correct?” and start asking “is the unit doing the right thing?” — the question that actually matters.
Coding agents stop mocking everything. With the rule in CLAUDE.md and a lint guard against vi.mock on local imports, the agent writes the test against the real implementation first.
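
The lint guard mentioned in the last item is not spelled out on this page. As a rough stand-in, the diagnostic can be sketched as a plain function that scans a spec file's source for `vi.mock` targets and reports relative paths that are missing from an allow-list of boundary modules. The allow-list contents, the name `findInternalMockTargets`, and the regex approach are all assumptions made for illustration; a production guard would more likely be a custom ESLint rule working on the AST.

```typescript
// Hypothetical allow-list of boundary adapters that tests may replace.
const BOUNDARY_MODULES: ReadonlySet<string> = new Set([
  "./database",
  "./mailer",
  "./clock",
  "./random",
]);

// Returns the relative vi.mock targets that are not on the boundary allow-list.
// Bare package names (for example "axios") are ignored because they are not local code.
function findInternalMockTargets(
  testSource: string,
  boundaries: ReadonlySet<string> = BOUNDARY_MODULES,
): string[] {
  // Captures the first string argument of vi.mock("...") or vi.mock('...').
  const pattern = /\bvi\.mock\(\s*(["'])([^"']+)\1/g;
  const offenders: string[] = [];

  let match: RegExpExecArray | null = pattern.exec(testSource);
  while (match !== null) {
    const target = match[2] ?? "";
    const isLocal = target.startsWith("./") || target.startsWith("../");
    if (isLocal && !boundaries.has(target)) {
      offenders.push(target);
    }
    match = pattern.exec(testSource);
  }
  return offenders;
}

// Example: one boundary mock, one mock of the module under test.
const sampleSpec = `
vi.mock("./database");
vi.mock("./greet-hedgehog-visitor", () => ({}));
`;
const flagged = findInternalMockTargets(sampleSpec); // ["./greet-hedgehog-visitor"]
```

This sketch misses `vi.doMock`, template-literal paths, and aliased imports, so it is best read as a statement of intent for the guard, not a replacement for it.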

The receipts

Origins, quoted passages, evidence, the strongest counter-argument and the reply.

§1

Origins

Three rules folded into one. The hermetic half traces to Google's testing blog: Hermetic Servers[7] in 2012, Mike Bland's 2014 post on goto fail and Heartbleed[8] as a culture-of-test essay, and Andrew Trenk's Test Sizes taxonomy[10] that put hermeticity in the title row of every test-pyramid diagram inside Google. Bazel uses the same word as a first-class build property; the vocabulary crossed from internal Google culture into the wider tooling ecosystem.[9]

The boundary half is older. Sandi Metz's Magic Tricks of Testing[2] at RailsConf 2013 introduced the canonical 3×3 matrix: incoming query, incoming command, outgoing query, outgoing command, sent-to-self. Steve Freeman and Nat Pryce's Growing Object-Oriented Software, Guided by Tests[3] gave it the most-quoted line: “only mock types you own.” Mark Seemann's commands-vs-queries refinement[11] made the verb taxonomy clean. Martin Fowler's Mocks Aren't Stubs[6] named the classicist-vs-mockist debate that the rule resolves.

The ownership half lives in the “don't test what you don't own” tradition that Metz and Hyrum Wright argue from opposite sides: Metz says trust the framework; Hyrum's Law says you depend on its observable behaviour whether you mean to or not.[12] The reconciliation: trust the framework's contract; verify your code against your code. The PUP standards/unit-testing.md file owns the merged formulation.

§2

Quotes

Only mock types you own.

Steve Freeman & Nat Pryce · GOOS (2009)

Mock outgoing command messages. Stub outgoing query messages. Send-to-self messages: do not test.

Sandi Metz · The Magic Tricks of Testing (RailsConf 2013)

Mocks should only be used for unmanaged dependencies. Anything else and your tests couple to implementation details.

Vladimir Khorikov · Unit Testing (2020), Ch. 5

Hermetic, deterministic, isolated. The three properties that make a test worth running twice.

Mike Bland · Goto Fail, Heartbleed, and the Culture of Test (2014)

§3

Evidence

Twenty external sources, ranked by author authority. The first five, listed here, are the canon; the rest include the qualifiers and the named opposers.

01
“Hermetic Servers” · Supports
Google Testing Blog · 2012
The post that made the word load-bearing in the public testing literature. A hermetic test runs the same way regardless of host, network, filesystem, or clock.
02
“The Magic Tricks of Testing” (RailsConf talk) · Supports
Sandi Metz · 2013
The 3×3 matrix. Mock outgoing commands at the boundary; never mock sent-to-self. The single most-cited source for the boundary-mocking lens.
03
Growing Object-Oriented Software, Guided by Tests · Supports
Steve Freeman & Nat Pryce · 2009
The London-school reference. “Only mock types you own” — read seriously, that means mock at the port boundary, never on helpers you control.
04
Unit Testing: Principles, Practices, and Patterns · Supports
Vladimir Khorikov · 2020
The cleanest modern synthesis. Ch. 5 Mocks and test fragility, Ch. 8 Why integration testing? — argues for unmanaged-dependencies-only mocking and the unit/integration layer split.
05
“Mocks Aren't Stubs” · Supports
Martin Fowler · 2007
The classicist-vs-mockist taxonomy. The mature classicist position is the boundary-only reading.

The sources fall into three lineages. The OOP school (Metz, Freeman & Pryce) names the boundary. The Google school's “Hermetic Servers” piece names the hermetic property. The classicist response (Fowler, Khorikov) writes the modern consensus. They agree on the answer: mock the edges, trust the framework, isolate the unit.

§4

Examples

Viewing: TypeScript.

Avoid

File: greet-hedgehog-visitor.spec.ts

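// Before: five mocks, one of them a private helper.
// Assumes both functions are exported from the module under test.
import { expect, it, vi } from "vitest";
import { formatHedgehogGreeting, greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");
// Anti-pattern: replaces a helper that lives inside the unit being tested.
vi.mock("./greet-hedgehog-visitor", () => ({
  formatHedgehogGreeting: vi.fn().mockReturnValue("hi"),
}));

it("greets a returning visitor", async () => {
  await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  // Asserts on the mock, not on any behaviour a caller could observe.
  expect(formatHedgehogGreeting).toHaveBeenCalled();
});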

Prefer

File: greet-hedgehog-visitor.spec.ts

// After: mock only the boundaries; let the real helper run.
import { expect, it, vi } from "vitest";
import { greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");

it("greets a returning visitor with their burrow name", async () => {
  const greeting = await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  // Asserts on the value the caller receives.
  expect(greeting).toBe("Welcome back to Burrow 12, dormouse-friend");
});
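
The Prefer example asserts on a returned value. Where the behaviour under test is an outgoing command, such as sending mail, Metz's rule quoted in §2 suggests stubbing the query and asserting on the command. The following is a hand-rolled sketch of that split with no Vitest APIs involved; `Mailer`, `BurrowLookup`, `sendGreeting`, and `RecordingMailer` are hypothetical names, not part of the module shown above.

```typescript
// Hypothetical outgoing-command port: sending mail crosses the process boundary.
interface Mailer {
  send(to: string, body: string): void;
}

// Hypothetical outgoing-query port: reading the visitor's burrow name.
interface BurrowLookup {
  burrowNameFor(visitorId: string): string;
}

// Unit under test: composes the greeting (internal logic) and sends it (outgoing command).
function sendGreeting(
  lookup: BurrowLookup,
  mailer: Mailer,
  visitorId: string,
  email: string,
): void {
  const body = `Welcome back to ${lookup.burrowNameFor(visitorId)}`;
  mailer.send(email, body);
}

// Stub for the query: returns a canned answer; the test asserts nothing about it.
const stubLookup: BurrowLookup = { burrowNameFor: () => "Burrow 12" };

// Recording fake for the command: the test asserts on what was sent.
class RecordingMailer implements Mailer {
  readonly sent: Array<{ to: string; body: string }> = [];

  send(to: string, body: string): void {
    this.sent.push({ to, body });
  }
}

const mailer = new RecordingMailer();
sendGreeting(stubLookup, mailer, "v-12", "visitor@example.test");
// mailer.sent now holds exactly one message, addressed to visitor@example.test.
```

In a Vitest spec the final check would be an `expect` on `mailer.sent`; the string formatting inside `sendGreeting` is exercised for real either way.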

§4b

Enforcement

Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.

Rule	Tool	Catches
vitest/no-conditional-tests	eslint-plugin-vitest	if-wrapped tests that skip in some environments — non-determinism leaks into the suite via env-shape.
vitest/no-conditional-in-test	eslint-plugin-vitest	if/switch inside test bodies — usually the smell of a test that exercises multiple scenarios that should be split.
vitest/no-disabled-tests	eslint-plugin-vitest	test.skip / xtest left in main — silent green that pretends to be coverage.
vitest/no-focused-tests	eslint-plugin-vitest	test.only / fit left in main — partial-suite green that hides whatever else broke.
vitest/no-mocks-import	eslint-plugin-vitest	manual imports from a __mocks__ directory instead of the real module path.
vitest/expect-expect	eslint-plugin-vitest	tests with no expect() call — empty-green smell.
no-restricted-imports (db drivers)	ESLint core	pg, mysql2, pg-promise, mysql or any /db/* import inside unit-test files — pins the hermeticity boundary at lint time.
no-only-tests/no-only-tests	eslint-plugin-no-only-tests	.only on describe / it across the whole repo — the dual of no-focused-tests for non-vitest test runners.

eslint.config.mjs configuration snippet

import tseslint from 'typescript-eslint';
import vitest from 'eslint-plugin-vitest';
import noOnlyTests from 'eslint-plugin-no-only-tests';

export default tseslint.config({
  files: ['**/*.spec.{ts,tsx}', '**/*.test.{ts,tsx}'],
  plugins: { vitest, 'no-only-tests': noOnlyTests },
  rules: {
    'vitest/no-conditional-tests': 'error',
    'vitest/no-conditional-in-test': 'error',
    'vitest/no-disabled-tests': 'error',
    'vitest/no-focused-tests': 'error',
    'vitest/no-mocks-import': 'error',
    'vitest/expect-expect': 'error',
    'no-only-tests/no-only-tests': 'error',
    'no-restricted-imports': ['error', {
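      paths: [
        { name: 'pg', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'pg-promise', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql2', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
      ],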
      patterns: [
        { group: ['**/db/**'], message: 'Unit tests must not import database modules directly. Test against the boundary stub.' },
      ],
    }],
  }
});

§4c

AI rules

Paste destination

File: .cursor/rules/ts7-test-isolation.mdc

---
description: Prickles TS7 — Test Isolation
globs: "**/*.{spec,test}.{ts,tsx,js,jsx}"
alwaysApply: false
---

## Prickles TS7 — Test Isolation

The unit is what you wrote. Everything outside the unit is replaced or trusted: no DB, no network, no clock, no filesystem, no env.

Mock at the edges, never inside. Stubs and fakes belong at ports — adapters that cross the process boundary. Helpers inside the same module never get mocked; refactor instead.

Trust the framework. React's renderer, the ORM's query builder, the HTTP client's parser — these have their own tests. Don't write yours.

If a unit needs five mocks to test, the unit is doing five things. Extract until the mocks become unnecessary.

Repo layout, CI, and ESLint wiring for these paths live on /implementation — not repeated on every tenet.

§5

Counter-argument

Counter

The honest pushback comes from two directions. The London-school strict reading of GOOS argues every interaction with collaborators should be mocked, including internal ones, because that is how tests drive design.[3] The integration-tests-are-better camp (J. B. Rainsberger's Integrated Tests Are A Scam[5] in its sharpest form) argues isolated unit tests give false confidence; only the assembled system is the contract worth checking. Both arguments converge on the same diagnostic question: if isolation is so cheap, why does the test suite still ship bugs?

§6

Counter-argument retort

The London-school argument is half right. The authors who taught it, Freeman and Pryce themselves, qualified it: only mock types you own, and read in context, that means the port at the boundary, not the helper inside the module.[3] Fowler's Mocks Aren't Stubs, Seemann's commands-vs-queries refinement, and Khorikov's book all converge on the boundary-only reading.[6] Tests still drive design: a unit that needs five mocks is signalling five responsibilities, and that signal is more useful than the mocks ever were.

Rainsberger's Integrated Tests Are A Scam is the more interesting counter, and it is the engine behind the TS4 ↔ TS7 pairing.[5] The reply isn't “don't do integration tests”; it is “do both, at the right layer.” Unit tests run in seconds and prove the unit behaves; E2E tests run in minutes and prove the assembled system behaves. TS4 Real-Dependency E2E owns the second layer. TS7 owns the first. The contradiction at the headline level (“mock everything” vs “mock nothing”) dissolves once you say which layer you're in. The cost of conflation is the test suite that runs against a real database, mocks private helpers, takes seven seconds to start, and proves neither property.

The hermetic-tests authority is unusually deep for a rule this concrete. Google's testing blog has used the word as load-bearing since 2012,[7] Mike Bland's post-mortem on Goto Fail and Heartbleed names hermeticity as the cultural property that prevents both classes of bug,[8] and Bazel codifies hermeticity as a first-class build property.[9] The rule isn't a Prickles invention; it is the long-standing consensus position among shops that ship at scale.

The pragmatic test: pick the slowest test in your unit suite. If it does anything outside the unit, you have a layer-confusion bug. Move it to the integration layer or refactor the unit so the dependency moves to a port. The tests get faster; the failures get specific; the design pressure lands where it should.
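
To make "refactor the unit so the dependency moves to a port" concrete, here is a small sketch using the system clock as the dependency. It assumes a unit that previously called `new Date()` inline; the `Clock` interface, `isBurrowOpen`, and `fixedClock` are hypothetical names chosen for the example.

```typescript
// Hypothetical port for the system clock.
interface Clock {
  now(): Date;
}

// Production adapter: the only code that touches the real clock.
const systemClock: Clock = { now: () => new Date() };

// The unit takes the clock as a parameter instead of calling new Date() itself.
function isBurrowOpen(clock: Clock, openingHourUtc: number, closingHourUtc: number): boolean {
  const hour = clock.now().getUTCHours();
  return hour >= openingHourUtc && hour < closingHourUtc;
}

// Test double: a fixed clock makes the result independent of when the suite runs.
function fixedClock(isoInstant: string): Clock {
  const instant = new Date(isoInstant);
  return { now: () => new Date(instant.getTime()) };
}

const openInMorning = isBurrowOpen(fixedClock("2026-05-03T09:00:00Z"), 8, 20); // true
const openAtNight = isBurrowOpen(fixedClock("2026-05-03T23:30:00Z"), 8, 20); // false

// Production call site: the same function, wired to the real clock.
const openAllDay = isBurrowOpen(systemClock, 0, 24); // true for any instant
```

The unit test no longer depends on wall-clock time, so it stays in the unit layer; anything that needs the real clock or a real database moves to the integration layer instead.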

§7

Notes

[1] Vladimir Khorikov, Unit Testing: Principles, Practices, and Patterns (Manning, 2020). Ch. 5 “Mocks and test fragility” and Ch. 8 “Why integration testing?” give the cleanest argument for boundary-only mocking and the unit/integration layer split.
[2] Sandi Metz, “The Magic Tricks of Testing”, RailsConf 2013. The 3×3 matrix: incoming query / incoming command / sent-to-self / outgoing query / outgoing command. Mock outgoing commands; never mock sent-to-self.
[3] Steve Freeman & Nat Pryce, Growing Object-Oriented Software, Guided by Tests (Addison-Wesley, 2009). The most-quoted single line in the boundary-mocking literature: “only mock types you own.”

Last revised: 2026-04-27