Case file — TS7

Test Isolation

If you mock your own code, the code is wrong.

A unit test that opens a database connection isn't a unit test, and a unit test that mocks a private helper isn't testing the unit. Both fail the same way: the test is asking the wrong scope a question.

By Adam Lewis · Published 3 May 2026 · Reading 12 min · Version v1.0 · Confidence High
§0b

Opinion

I've had the “but my unit needs the database” argument with engineers at every level for years, and the answer is the same one Khorikov landed on in print: it does not. [1] A unit that needs the database is not a unit. The instinct to reach for a real Postgres connection inside a Vitest test comes from the right place (TS4 Real-Dependency E2E says that a real DB does belong somewhere), but it belongs in the E2E layer, not the unit one. Trying to do both inside one test layer is what produces the seven-second test suite that nobody runs locally.

The mocking-lens framing is the sharper one. Sandi Metz's 3×3 matrix [2] names exactly where mocks belong: outgoing commands at the boundary, never sent-to-self. Freeman and Pryce gave it the most-quoted line in the literature, “only mock types you own”, and read seriously, “types you own” means “the port at the edge.” [3] When a test mocks five private collaborators, the test is not failing; it is screaming that F1 Single Responsibility has been broken upstream. The mock count is a design diagnostic.
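The matrix is easier to hold in your head with one unit in front of you. What follows is a minimal sketch, not anyone's canonical code: `VisitorDirectory`, `Mailer` and `Greeter` are hypothetical names invented for illustration, and the mock is hand-rolled so the example runs without a test framework.

```typescript
// All names below (VisitorDirectory, Mailer, Greeter) are hypothetical.

// Outgoing query: the unit asks a collaborator for data. Stub it.
interface VisitorDirectory {
  burrowNameFor(visitorId: string): string;
}

// Outgoing command: the unit causes a side effect across the boundary. Mock it.
interface Mailer {
  send(to: string, body: string): void;
}

class Greeter {
  private readonly directory: VisitorDirectory;
  private readonly mailer: Mailer;

  constructor(directory: VisitorDirectory, mailer: Mailer) {
    this.directory = directory;
    this.mailer = mailer;
  }

  // Incoming message: assert on what it returns and on the command it sends.
  greet(visitorId: string): string {
    const greeting = this.format(this.directory.burrowNameFor(visitorId));
    this.mailer.send(visitorId, greeting);
    return greeting;
  }

  // Sent-to-self: runs for real, is never mocked, and is covered through greet().
  private format(burrowName: string): string {
    return `Welcome back to ${burrowName}`;
  }
}

// Stub: a canned answer, with no assertions made against it.
const directoryStub: VisitorDirectory = {
  burrowNameFor: () => "Burrow 12",
};

// Hand-rolled mock: records the outgoing command so the test can verify it.
const sentMail: Array<{ to: string; body: string }> = [];
const mailerMock: Mailer = {
  send: (to, body) => {
    sentMail.push({ to, body });
  },
};

const greeting = new Greeter(directoryStub, mailerMock).greet("v-12");
```

A test would assert on `greeting` and on the single entry in `sentMail`. Nothing asserts that `format` was called, because that is the sent-to-self cell.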

The TS4 ↔ TS7 tension is real and it is the testing pillar's equivalent of the F3 DRY ↔ S1 Wait-for-Three tension in style. Both rules are true at their respective layers. Unit tests touch nothing; E2E tests touch everything you ship. The boundary between them is what counts as the unit. Conflating the two is what produces tests that are slow because they connect to a real DB, brittle because they mock private helpers, and useless because they prove neither isolation nor integration. Pick a layer; honour its contract; trust the other layer to do its job. Pair the rule with TS6 Behaviour Testing and the unit shape becomes obvious: the unit is whatever a caller can address without reaching past your code.

Copy a note and link

Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.

Test the unit. Mock the edges. Trust the framework. Unit tests touch nothing outside the unit — no DB, no network, no clock, no internal collaborators. If you mock your own code, the code is wrong; the unit is what you wrote, everything else is replaced or trusted.

/tenet/test-isolation/TS7
§0c

AI eyes only

Rule: mock at process boundaries only. Real code talks to real code inside the system.

Reject: vi.mock of an internal collaborator. Reject: mocking adjacent modules to “speed up” the unit test. Reject: tests that pass against an empty implementation.

Generate: real implementations for everything inside the system. Mock only outbound process boundaries (HTTP, database, filesystem, queue). Use dependency injection at the boundary so the real and the fake share the same interface.

Diagnostic: every vi.mock target is a process boundary. If it targets an internal module, the test is testing the mock, not the system.
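A minimal sketch of “the real and the fake share the same interface”, under stated assumptions: `VisitLog`, `KeyValueDriver`, `DriverVisitLog`, `InMemoryVisitLog` and `welcome` are hypothetical names, and `KeyValueDriver` stands in for whatever client library actually crosses the process boundary. It is kept synchronous for brevity; a real driver would most likely be async.

```typescript
// Port: the only storage contract the unit knows about. (Hypothetical name.)
interface VisitLog {
  visitsFor(visitorId: string): number;
  record(visitorId: string): void;
}

// Assumption: a stand-in for a real client library at the process boundary.
interface KeyValueDriver {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Production adapter: implements the port on top of the driver.
class DriverVisitLog implements VisitLog {
  private readonly driver: KeyValueDriver;

  constructor(driver: KeyValueDriver) {
    this.driver = driver;
  }

  visitsFor(visitorId: string): number {
    return Number(this.driver.get(`visits:${visitorId}`) ?? "0");
  }

  record(visitorId: string): void {
    this.driver.set(`visits:${visitorId}`, String(this.visitsFor(visitorId) + 1));
  }
}

// Test fake: the same port, held in memory, with no I/O.
class InMemoryVisitLog implements VisitLog {
  private readonly counts: Record<string, number> = {};

  visitsFor(visitorId: string): number {
    return this.counts[visitorId] ?? 0;
  }

  record(visitorId: string): void {
    this.counts[visitorId] = this.visitsFor(visitorId) + 1;
  }
}

// Unit under test: real code that only ever sees the port.
function welcome(log: VisitLog, visitorId: string): string {
  const returning = log.visitsFor(visitorId) > 0;
  log.record(visitorId);
  return returning ? "Welcome back" : "Welcome";
}
```

In this shape the unit test hands `welcome` an `InMemoryVisitLog`; production wiring hands it a `DriverVisitLog`. No `vi.mock` call is needed because the boundary is an argument, not an import.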

§0d

Why?

  • Tests run in seconds. A hermetic unit test starts and finishes inside a single Vitest worker; no service container, no migration, no fixture seeding. The whole suite stays cheap to run on every save.
  • Failures are deterministic. No flake from network jitter, clock drift, leftover database rows, or environment variable surprise. When the test fails, the cause is in the test or in the unit — nowhere else.
  • Mock count is a design diagnostic. A unit that needs five mocks is signalling that F1 Single Responsibility is broken upstream — the rule turns a testing pain point into design feedback.
  • The TS4 ↔ TS7 contrast clarifies the suite. Unit tests own “the unit is correct” proof; TS4 Real-Dependency E2E owns “the assembled system is correct” proof. No layer carries both questions, so neither bottlenecks.
  • Stop testing the framework. React already tests its renderer; the ORM already tests its query builder; the HTTP client already parses its responses. Trusting their tests cuts coverage of code you don't own and focuses coverage on code you do.
  • Reviews focus on the right question. Reviewers stop asking “is this mock correct?” and start asking “is the unit doing the right thing?” — the question that actually matters.
  • Coding agents stop mocking everything. With the rule in CLAUDE.md and a lint guard against vi.mock on local imports, the agent writes the test against the real implementation first.
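The “lint guard against vi.mock on local imports” mentioned in the last bullet can be approximated in a few lines. This is a rough sketch, not a published ESLint rule: `findInternalMocks` is a hypothetical helper, the regex only sees string-literal specifiers, and the allowlist of boundary modules is an assumption each repo would have to supply.

```typescript
// Hypothetical helper: returns every relative module passed to vi.mock()
// that is not on the caller-supplied allowlist of boundary modules.
function findInternalMocks(
  testSource: string,
  boundaryModules: readonly string[],
): string[] {
  // Matches vi.mock("...") with single, double, or backtick quotes and
  // captures the module specifier. Dynamic specifiers are not detected.
  const viMockCall = /\bvi\.mock\(\s*["'`]([^"'`]+)["'`]/g;
  const offenders: string[] = [];

  let match: RegExpExecArray | null;
  while ((match = viMockCall.exec(testSource)) !== null) {
    const specifier = match[1];
    const isLocal = /^\.{1,2}\//.test(specifier); // "./x" or "../x"
    if (isLocal && boundaryModules.indexOf(specifier) === -1) {
      offenders.push(specifier);
    }
  }
  return offenders;
}
```

Run against a spec file's text in a pre-commit hook or CI step, a non-empty result is the signal that a test is mocking the system's own code.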
The receipts
Origins, quoted passages, evidence, the strongest counter-argument and the reply.
§1

Origins

Three rules folded into one. The hermetic half traces to Google's testing blog: “Hermetic Servers” (testing.googleblog.com, 2012) [7], Mike Bland's 2014 post on goto fail and Heartbleed (mike-bland.com) [8] as a culture-of-test essay, and Andrew Trenk's Test Sizes taxonomy (Google Testing Blog, 2010) [10], the Small/Medium/Large scheme that put hermeticity in the title row of every test-pyramid diagram inside Google. Bazel uses the same word as a first-class build property (“Hermeticity”, bazel.build, 2015–) [9]; the vocabulary crossed from internal Google culture into the wider tooling ecosystem.

The boundary half is older. Sandi Metz's The Magic Tricks of Testing [2] at RailsConf 2013 introduced the canonical 3×3 matrix: incoming query, incoming command, outgoing query, outgoing command, sent-to-self. Steve Freeman and Nat Pryce's Growing Object-Oriented Software, Guided by Tests [3] gave it the most-quoted line: “only mock types you own.” Mark Seemann's commands-vs-queries refinement, “Mocks for Commands, Stubs for Queries” (blog.ploeh.dk, 2013) [11], made the verb taxonomy clean. Martin Fowler's Mocks Aren't Stubs (martinfowler.com, 2007) [6] named the classicist-vs-mockist debate that the rule resolves.

The ownership half lives in the “don't test what you don't own” tradition, which Metz and Hyrum Wright argue from opposite sides: Metz says trust the framework; Hyrum's Law (hyrumslaw.com, 2010) [12] says that with enough users you depend on its observable behaviour whether you mean to or not, which is why “trust the framework” needs discipline rather than blind faith. The reconciliation: trust the framework's contract; verify your code against your code. The PUP standards/unit-testing.md file owns the merged formulation.

§2

Quotes

Only mock types you own.

Steve Freeman & Nat Pryce · GOOS (2009)

Mock outgoing command messages. Stub outgoing query messages. Send-to-self messages: do not test.

Sandi Metz · The Magic Tricks of Testing (RailsConf 2013)

Mocks should only be used for unmanaged dependencies. Anything else and your tests couple to implementation details.

Vladimir Khorikov · Unit Testing (2020), Ch. 5

Hermetic, deterministic, isolated. The three properties that make a test worth running twice.

Mike Bland · Goto Fail, Heartbleed, and the Culture of Test (2014)
§3

Evidence

Twenty external sources, ranked by author authority. The first five, listed here, are the canon; the rest include the qualifiers and the named opposers.

  1. Google Testing Blog · 2012
    The post that made the word load-bearing in the public testing literature. A hermetic test runs the same way regardless of host, network, filesystem, or clock.
  2. Sandi Metz · 2013
    The 3×3 matrix. Mock outgoing commands at the boundary; never mock sent-to-self. The single most-cited source for the boundary-mocking lens.
  3. Steve Freeman & Nat Pryce · 2009
    The London-school reference. “Only mock types you own”: read seriously, that means mock at the port boundary, never on helpers you control.
  4. Vladimir Khorikov · 2020
    The cleanest modern synthesis. Ch. 5 Mocks and test fragility, Ch. 8 Why integration testing? Argues for unmanaged-dependencies-only mocking and the unit/integration layer split.
  5. Martin Fowler · 2007
    The classicist-vs-mockist taxonomy. The mature classicist position is the boundary-only reading.

Eighteen sources, three lineages. The OOP school (Metz, Freeman & Pryce) names the boundary. The Google school's “Hermetic Servers” piece names the hermetic property. The classicist response (Fowler, Khorikov) writes the modern consensus. They agree on the answer: mock the edges, trust the framework, isolate the unit.

§4

Examples

Viewing: TypeScript.
Avoid
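File: greet-hedgehog-visitor.spec.ts
// Before: five mocks, one of them a private helper.
import { expect, it, vi } from "vitest";
import { formatHedgehogGreeting, greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");
// The anti-pattern: the unit's own helper is replaced, so the test can only observe the mock.
vi.mock("./greet-hedgehog-visitor", () => ({
  formatHedgehogGreeting: vi.fn().mockReturnValue("hi"),
}));

it("greets a returning visitor", async () => {
  await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  expect(formatHedgehogGreeting).toHaveBeenCalled();
});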
Prefer
File: greet-hedgehog-visitor.spec.ts
// After: mock only the boundaries; let the real helper run.
import { expect, it, vi } from "vitest";
import { greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");

it("greets a returning visitor with their burrow name", async () => {
  const greeting = await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  // Assert on the observable result, not on which helper was called.
  expect(greeting).toBe("Welcome back to Burrow 12, dormouse-friend");
});
§4b

Enforcement

Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.

Rule | Tool | Catches
vitest/no-conditional-tests | eslint-plugin-vitest | if-wrapped tests that skip in some environments; non-determinism leaks into the suite via env-shape.
vitest/no-conditional-in-test | eslint-plugin-vitest | if/switch inside test bodies; usually the smell of a test that exercises multiple scenarios that should be split.
vitest/no-disabled-tests | eslint-plugin-vitest | test.skip / xtest left in main; silent green that pretends to be coverage.
vitest/no-focused-tests | eslint-plugin-vitest | test.only / fit left in main; partial-suite green that hides whatever else broke.
vitest/expect-expect | eslint-plugin-vitest | tests with no expect() call; empty-green smell.
no-restricted-imports (db drivers) | ESLint core | pg, mysql2, pg-promise, mysql or any /db/* import inside unit-test files; pins the hermeticity boundary at lint time.
no-only-tests/no-only-tests | eslint-plugin-no-only-tests | .only on describe / it across the whole repo; the dual of no-focused-tests for non-vitest test runners.
eslint.config.mjs configuration snippet
import tseslint from 'typescript-eslint';
import vitest from 'eslint-plugin-vitest';
import noOnlyTests from 'eslint-plugin-no-only-tests';

export default tseslint.config({
  files: ['**/*.spec.{ts,tsx}', '**/*.test.{ts,tsx}'],
  plugins: { vitest, 'no-only-tests': noOnlyTests },
  rules: {
    'vitest/no-conditional-tests': 'error',
    'vitest/no-conditional-in-test': 'error',
    'vitest/no-disabled-tests': 'error',
    'vitest/no-focused-tests': 'error',
    'vitest/no-mocks-import': 'error',
    'vitest/expect-expect': 'error',
    'no-only-tests/no-only-tests': 'error',
    'no-restricted-imports': ['error', {
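      paths: [
        { name: 'pg', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'pg-promise', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql2', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
      ],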
      patterns: [
        { group: ['**/db/**'], message: 'Unit tests must not import database modules directly. Test against the boundary stub.' },
      ],
    }],
  }
});
§4c

AI rules

File: .cursor/rules/ts7-test-isolation.mdc
---
description: Prickles TS7 — Test Isolation
globs: "**/*.{spec,test}.{ts,tsx,js,jsx}"
alwaysApply: false
---

## Prickles TS7 — Test Isolation

The unit is what you wrote. Everything outside the unit is replaced or trusted: no DB, no network, no clock, no filesystem, no env.

Mock at the edges, never inside. Stubs and fakes belong at ports — adapters that cross the process boundary. Helpers inside the same module never get mocked; refactor instead.

Trust the framework. React's renderer, the ORM's query builder, the HTTP client's parser — these have their own tests. Don't write yours.

If a unit needs five mocks to test, the unit is doing five things. Extract until the mocks become unnecessary.

Repo layout, CI, and ESLint wiring for these paths live on /implementation — not repeated on every tenet.

§5

Counter-argument

Counter

The honest pushback comes from two directions. The London-school strict reading of GOOS argues every interaction with collaborators should be mocked, including internal ones, because that is how tests drive design. [3] The second is the argument over integrated tests. The integration-tests-are-better camp argues isolated unit tests give false confidence; only the assembled system is the contract worth checking. J. B. Rainsberger's Integrated Tests Are A Scam (Agile 2009; written up 2010) is the sharpest statement of the opposite pole: contract tests at the boundary are the only tests buying you proof. [5] Both directions converge on the same diagnostic question: if isolation is so cheap, why does the test suite still ship bugs?

§6

Counter-argument retort

Reply

The London-school argument is half right. Every authority who taught it, Freeman and Pryce included, qualified it: only mock types you own, and read in context, that means the port at the boundary, not the helper inside the module. [3] Fowler's Mocks Aren't Stubs (martinfowler.com, 2007), Seemann's commands-vs-queries refinement, and Khorikov's book all converge on the boundary-only reading. [6] Tests still drive design: a unit that needs five mocks is signalling five responsibilities, and that signal is more useful than the mocks ever were.

Rainsberger's “Integrated Tests Are A Scam” (Agile 2009; written up 2010) is the more interesting counter and it is the engine behind the TS4 ↔ TS7 pairing. [5] The reply isn't “don't do integration tests”; it is “do both, at the right layer.” Unit tests run in seconds and prove the unit behaves; E2E tests run in minutes and prove the assembled system behaves. TS4 Real-Dependency E2E owns the second layer. TS7 owns the first. The contradiction at the headline level (“mock everything” vs “mock nothing”) dissolves once you say which layer you're in. The cost of conflation is the test suite that runs against a real database, mocks private helpers, takes seven seconds to start, and proves neither property.

The hermetic-tests authority is unusually deep for a rule this concrete. Google's testing blog has used the word as load-bearing since “Hermetic Servers” (testing.googleblog.com, 2012) [7], Mike Bland's post-mortem on Goto Fail and Heartbleed (mike-bland.com, 2014) names hermeticity as the cultural property that prevents both classes of bug [8], and Bazel codifies hermeticity as a first-class build property (“Hermeticity”, bazel.build, 2015–) [9]. The rule isn't a Prickles invention; it is the consensus position across every shop that has shipped at scale.

The pragmatic test: pick the slowest test in your unit suite. If it does anything outside the unit, you have a layer-confusion bug. Move it to the integration layer or refactor the unit so the dependency moves to a port. The tests get faster; the failures get specific; the design pressure lands where it should.
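“Refactor the unit so the dependency moves to a port” is usually a smaller change than it sounds. Here is a minimal sketch using the clock as the dependency; `Clock`, `isBurrowOpen`, `systemClock` and `fixedClock` are hypothetical names, and the 09:00 to 17:00 UTC opening hours are an assumption made up for the example.

```typescript
// Port: anything that can answer "what time is it?". (Hypothetical name.)
type Clock = () => Date;

// Before, for contrast: the unit reads the system clock itself, so the
// result depends on when the test happens to run.
//
//   function isBurrowOpen(): boolean {
//     const hour = new Date().getUTCHours();
//     return hour >= 9 && hour < 17;
//   }

// After: the clock arrives through the port; the logic is unchanged.
// Assumption: the burrow is open from 09:00 up to, but not including, 17:00 UTC.
function isBurrowOpen(now: Clock): boolean {
  const hour = now().getUTCHours();
  return hour >= 9 && hour < 17;
}

// Production wiring: the real clock, supplied at the edge.
const systemClock: Clock = () => new Date();

// Test wiring: a fixed instant, so every run sees the same time.
const fixedClock = (iso: string): Clock => () => new Date(iso);
```

The unit test calls `isBurrowOpen(fixedClock("2026-05-03T09:00:00Z"))` and never touches the host's clock; production code passes `systemClock`.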

§7

Notes

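  [1] Vladimir Khorikov, Unit Testing: Principles, Practices, and Patterns (Manning, 2020). Ch. 5 Mocks and test fragility, Ch. 8 Why integration testing? The cleanest argument for boundary-only mocking and the unit/integration layer split.
  [2] Sandi Metz, “The Magic Tricks of Testing”, RailsConf 2013. The 3×3 matrix: incoming query / incoming command / sent-to-self / outgoing query / outgoing command. Mock outgoing commands; never mock sent-to-self.
  [3] Steve Freeman & Nat Pryce, Growing Object-Oriented Software, Guided by Tests (Addison-Wesley, 2009). The most-quoted single line in the boundary-mocking literature: “only mock types you own.”
  [5] J. B. Rainsberger, “Integrated Tests Are A Scam” (Agile 2009; written up 2010). The principal counter-argument to integrated testing. Argues contract tests at the boundary are the only test buying you proof.
  [6] Martin Fowler, “Mocks Aren't Stubs” (martinfowler.com, 2007). The taxonomy that named the classicist-vs-mockist debate and the article most cited as the source for the modern boundary-only consensus.
  [7] Google Testing Blog, “Hermetic Servers” (testing.googleblog.com, 2012). The post that made the word load-bearing in the public testing literature.
  [8] Mike Bland, “Goto Fail, Heartbleed, and the Culture of Test” (mike-bland.com, 2014). Frames hermeticity as the cultural property that prevents two of the most expensive bugs of the 2010s.
  [9] Bazel team, “Hermeticity” (bazel.build, 2015–). The vocabulary crossed from internal Google testing culture into Bazel's public build-system docs. Same property, build-side.
  [10] Andrew Trenk, “Test Sizes” (Google Testing Blog, 2010). The Small/Medium/Large taxonomy that puts hermeticity in the title row of every Google test-pyramid diagram.
  [11] Mark Seemann, “Mocks for Commands, Stubs for Queries” (blog.ploeh.dk, 2013). The verb taxonomy that makes the boundary rule operational.
  [12] Hyrum Wright, “Hyrum's Law” (hyrumslaw.com, 2010). The contrapositive: with a sufficient number of users you will be tested on every observable behaviour. Reason the “trust the framework” rule needs discipline rather than blind faith.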
Last revised: 2026-04-27