Test Isolation
If you mock your own code, the code is wrong.
A unit test that opens a database connection isn't a unit test, and a unit test that mocks a private helper isn't testing the unit. Both fail the same way: the test is asking the wrong scope a question.
Opinion
I've had the “but my unit needs the database” argument with engineers at every level for years, and the answer is the same one Khorikov landed on in print: it does not.[1] A unit that needs the database is not a unit. The instinct to reach for a real Postgres connection inside a Vitest test comes from the right place (TS4 Real-Dependency E2E says a real DB does belong somewhere), but it belongs in the E2E layer, not the unit one. Trying to do both inside one test layer is what produces the seven-second test suite that nobody runs locally.
The mocking-lens framing is the sharper one. Sandi Metz's 3×3 matrix[2] names exactly where mocks belong: outgoing commands at the boundary, never sent-to-self. Freeman and Pryce gave it the most-quoted line in the literature, “only mock types you own”, and read seriously, “types you own” means “the port at the edge.”[3] When a test mocks five private collaborators, the test is not failing; it is screaming that F1 Single Responsibility has been broken upstream. The mock count is a design diagnostic.
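A minimal sketch of that lens in code, using hand-rolled doubles so it runs without a test framework. The `PriceCatalog` and `ReceiptMailer` ports, and the checkout function itself, are hypothetical names invented for illustration; real ports would probably be async. The outgoing query gets a stub, the outgoing command gets a recording mock, and the sent-to-self helper runs for real.

```typescript
// Hypothetical ports at the process boundary; both names are invented for this sketch.
interface PriceCatalog {
  priceOf(sku: string): number; // outgoing query: returns data, changes nothing
}
interface ReceiptMailer {
  send(to: string, body: string): void; // outgoing command: side effect only
}

// Sent-to-self: an internal helper. It runs for real and is never mocked.
function sumPrices(catalog: PriceCatalog, skus: string[]): number {
  return skus.reduce((total, sku) => total + catalog.priceOf(sku), 0);
}

// The unit under test.
function completeCheckout(
  catalog: PriceCatalog,
  mailer: ReceiptMailer,
  email: string,
  skus: string[],
): number {
  const total = sumPrices(catalog, skus);
  mailer.send(email, `Total: ${total}`);
  return total;
}

// Stub for the outgoing query: canned answers, no expectations attached.
const stubCatalog: PriceCatalog = {
  priceOf: (sku) => (sku === "apple" ? 3 : 5),
};

// Hand-rolled mock for the outgoing command: records each message sent.
const sentReceipts: Array<{ to: string; body: string }> = [];
const mockMailer: ReceiptMailer = {
  send: (to, body) => {
    sentReceipts.push({ to, body });
  },
};

const checkoutTotal = completeCheckout(stubCatalog, mockMailer, "a@example.com", ["apple", "pear"]);
```

A test built on this would assert on `checkoutTotal` and on what landed in `sentReceipts`, and would say nothing about `sumPrices`.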
The TS4 ↔ TS7 tension is real and it is the testing pillar's equivalent of the F3 DRY ↔ S1 Wait-for-Three tension in style. Both rules are true at their respective layers. Unit tests touch nothing; E2E tests touch everything you ship. The boundary between them is what counts as the unit. Conflating the two is what produces tests that are slow because they connect to a real DB, brittle because they mock private helpers, and useless because they prove neither isolation nor integration. Pick a layer; honour its contract; trust the other layer to do its job. Pair the rule with TS6 Behaviour Testing and the unit shape becomes obvious: the unit is whatever a caller can address without reaching past your code.
Copy a note and link
Grab this short comment and drop it into a PR comment or an LLM chat to prompt the right change.
Test the unit. Mock the edges. Trust the framework. Unit tests touch nothing outside the unit — no DB, no network, no clock, no internal collaborators. If you mock your own code, the code is wrong; the unit is what you wrote, everything else is replaced or trusted. /tenet/test-isolation/TS7
AI eyes only
Rule: mock at process boundaries only. Real code talks to real code inside the system.
Reject: vi.mock of an internal collaborator. Reject: mocking adjacent modules to “speed up” the unit test. Reject: tests that pass against an empty implementation.
Generate: real implementations for everything inside the system. Mock only outbound process boundaries (HTTP, database, filesystem, queue). Use dependency injection at the boundary so the real and the fake share the same interface.
Diagnostic: every vi.mock target is a process boundary. If it targets an internal module, the test is testing the mock, not the system.
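One way the “dependency injection at the boundary” instruction could look, sketched under assumptions: `VisitorStore`, `InMemoryVisitorStore`, `formatGreeting` and `greetVisitor` are hypothetical names, and the port is synchronous only to keep the sketch short. A production adapter wrapping the real database client would implement the same interface.

```typescript
// Port at the process boundary. Name and shape are assumptions for this sketch.
interface VisitorStore {
  findBurrow(visitorId: string): string | undefined;
}

// In-memory fake for unit tests. It shares the interface with the real adapter.
class InMemoryVisitorStore implements VisitorStore {
  private readonly burrows = new Map<string, string>();

  add(visitorId: string, burrow: string): this {
    this.burrows.set(visitorId, burrow);
    return this;
  }

  findBurrow(visitorId: string): string | undefined {
    return this.burrows.get(visitorId);
  }
}

// Internal helper: real code talking to real code, never mocked.
function formatGreeting(burrow: string): string {
  return `Welcome back to ${burrow}`;
}

// Unit under test: it receives the port, so the test needs no vi.mock at all.
function greetVisitor(store: VisitorStore, visitorId: string): string {
  const burrow = store.findBurrow(visitorId);
  return burrow === undefined ? "Welcome, new visitor" : formatGreeting(burrow);
}

const visitorStore = new InMemoryVisitorStore().add("v-12", "Burrow 12");
const returningGreeting = greetVisitor(visitorStore, "v-12");
const firstTimeGreeting = greetVisitor(visitorStore, "v-99");
```

Because the assertions target the returned strings, an empty `greetVisitor` implementation would fail them, which is the property the third Reject line asks for.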
Why?
- Tests run in seconds. A hermetic unit test starts and finishes inside a single Vitest worker; no service container, no migration, no fixture seeding. The whole suite stays cheap to run on every save.
- Failures are deterministic. No flake from network jitter, clock drift, leftover database rows, or environment variable surprise. When the test fails, the cause is in the test or in the unit — nowhere else.
- Mock count is a design diagnostic. A unit that needs five mocks is signalling that F1 Single Responsibility is broken upstream — the rule turns a testing pain point into design feedback.
- The TS4 ↔ TS7 contrast clarifies the suite. Unit tests own “the unit is correct” proof; TS4 Real-Dependency E2E owns “the assembled system is correct” proof. No layer carries both questions, so neither bottlenecks.
- Stop testing the framework. React already tests its renderer; the ORM already tests its query builder; the HTTP client already parses its responses. Trusting their tests cuts coverage of code you don't own and focuses coverage on code you do.
- Reviews focus on the right question. Reviewers stop asking “is this mock correct?” and start asking “is the unit doing the right thing?” — the question that actually matters.
- Coding agents stop mocking everything. With the rule in CLAUDE.md and a lint guard against vi.mock on local imports, the agent writes the test against the real implementation first.
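The lint guard in the last bullet is not spelled out here, so the following is only a rough sketch of the check such a guard might perform. It is a regex scan, not an AST-based ESLint rule, and the allow-list of boundary modules is a hypothetical example borrowed from the module names used elsewhere on this page.

```typescript
// Hypothetical allow-list: relative modules that wrap a process boundary.
const BOUNDARY_MODULES: ReadonlySet<string> = new Set([
  "./database",
  "./mailer",
  "./clock",
  "./random",
]);

// Returns every relative vi.mock target that is not an allow-listed boundary.
function findInternalMocks(
  testSource: string,
  boundaries: ReadonlySet<string> = BOUNDARY_MODULES,
): string[] {
  const pattern = /vi\.mock\(\s*["']([^"']+)["']/g;
  const offenders: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(testSource)) !== null) {
    const specifier = match[1];
    // Only relative specifiers count as "local imports" in this sketch.
    if (specifier !== undefined && specifier.startsWith(".") && !boundaries.has(specifier)) {
      offenders.push(specifier);
    }
  }
  return offenders;
}

const sampleTestFile = [
  'vi.mock("./database");',
  'vi.mock("./mailer");',
  'vi.mock("./greet-hedgehog-visitor", () => ({}));',
].join("\n");

const internalMocks = findInternalMocks(sampleTestFile);
```

A real guard would more likely be a lint rule working on the parsed syntax tree; the point of the sketch is only the decision it makes: boundary targets pass, anything else inside the repo is flagged.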
Origins
Three rules folded into one. The hermetic half traces to Google's testing blog: Hermetic Servers in 2012,[7] Mike Bland's 2014 post on goto fail and Heartbleed as a culture-of-test essay,[8] and Andrew Trenk's Test Sizes taxonomy, which put hermeticity in the title row of every test-pyramid diagram inside Google.[10] Bazel uses the same word as a first-class build property; the vocabulary crossed from internal Google culture into the wider tooling ecosystem.[9]
- [7] “Hermetic Servers” (testing.googleblog.com, 2012). The post that made the word load-bearing in the public testing literature.
- [8] “Goto Fail, Heartbleed, and the Culture of Test” (mike-bland.com, 2014). Frames hermeticity as the cultural property that prevents two of the most expensive bugs of the 2010s.
- [9] “Hermeticity” (bazel.build, 2015–). The vocabulary crossed from internal Google testing culture into Bazel’s public build-system docs. Same property, build-side.
- [10] “Test Sizes” (Google Testing Blog, 2010). The Small/Medium/Large taxonomy that puts hermeticity in the title row of every Google test-pyramid diagram.
The boundary half is older. Sandi Metz's Magic Tricks of Testing[2] at RailsConf 2013 introduced the canonical 3×3 matrix: incoming query, incoming command, outgoing query, outgoing command, sent-to-self. Steve Freeman and Nat Pryce's Growing Object-Oriented Software, Guided by Tests[3] gave it the most-quoted line: “only mock types you own.” Mark Seemann's commands-vs-queries refinement[11] made the verb taxonomy clean. Martin Fowler's Mocks Aren't Stubs[6] named the classicist-vs-mockist debate that the rule resolves.
- [6] “Mocks Aren’t Stubs” (martinfowler.com, 2007). The taxonomy that named the classicist-vs-mockist debate and the article most cited as the source for the modern boundary-only consensus.
- [11] “Mocks for Commands, Stubs for Queries” (blog.ploeh.dk, 2013). The verb taxonomy that makes the boundary rule operational.
The ownership half lives in the “don't test what you don't own” tradition that Metz and Hyrum Wright argue from opposite sides: Metz says trust the framework; Hyrum's Law says you depend on its observable behaviour whether you mean to or not.[12] The reconciliation: trust the framework's contract; verify your code against your code. The PUP standards/unit-testing.md file owns the merged formulation.
- [12] “Hyrum’s Law” (hyrumslaw.com, 2010). The contrapositive: with a sufficient number of users you will be tested on every observable behaviour. This is the reason the “trust the framework” rule needs discipline rather than blind faith.
Quotes
Only mock types you own.
Mock outgoing command messages. Stub outgoing query messages. Send-to-self messages: do not test.
Mocks should only be used for unmanaged dependencies. Anything else and your tests couple to implementation details.
Hermetic, deterministic, isolated. The three properties that make a test worth running twice.
Evidence
Twenty external sources, ranked by author authority. The first five are the canon; the rest include the qualifiers and the named opposers. Each links out to its primary source.
- 01 “Hermetic Servers” (Supports). The post that made the word load-bearing in the public testing literature. A hermetic test runs the same way regardless of host, network, filesystem, or clock.
- 02 The 3×3 matrix. Mock outgoing commands at the boundary; never mock sent-to-self. The single most-cited source for the boundary-mocking lens.
- 03 The London-school reference. “Only mock types you own”: read seriously, that means mock at the port boundary, never on helpers you control.
- 04 The cleanest modern synthesis. Ch. 5 (Mocks and test fragility) and Ch. 8 (Why integration testing?) argue for unmanaged-dependencies-only mocking and the unit/integration layer split.
- 05 “Mocks Aren't Stubs” (Supports). The classicist-vs-mockist taxonomy. The mature classicist position is the boundary-only reading.
Eighteen sources, three lineages. The OOP school (Metz, Freeman & Pryce) names the boundary. The Google school's “Hermetic Servers” piece names the hermetic property. The classicist response (Fowler, Khorikov) writes the modern consensus. They agree on the answer: mock the edges, trust the framework, isolate the unit.
Examples
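// Before: five mocks, one of them a private helper.
import { expect, it, vi } from "vitest";
import { formatHedgehogGreeting, greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");
// Anti-pattern: replaces a helper that lives inside the unit under test.
vi.mock("./greet-hedgehog-visitor", () => ({ formatHedgehogGreeting: vi.fn().mockReturnValue("hi") }));

it("greets a returning visitor", async () => {
  await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  // Asserts on the mock, not on anything a caller can observe.
  expect(formatHedgehogGreeting).toHaveBeenCalled();
});

// After: mock only the boundaries; let the real helper run.
import { expect, it, vi } from "vitest";
import { greetHedgehogVisitor } from "./greet-hedgehog-visitor";

vi.mock("./database");
vi.mock("./mailer");
vi.mock("./clock");
vi.mock("./random");

it("greets a returning visitor with their burrow name", async () => {
  const greeting = await greetHedgehogVisitor({ visitorId: "v-12", arrivedAt: "2026-05-03T09:00Z" });
  // Asserts on the value the caller receives.
  expect(greeting).toBe("Welcome back to Burrow 12, dormouse-friend");
});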
Enforcement
Apply these rules in eslint.config.mjs. The full enforcement across every tenet lives on the implementation page.
| Rule | Tool | Catches |
|---|---|---|
| vitest/no-conditional-tests | eslint-plugin-vitest | if-wrapped tests that skip in some environments — non-determinism leaks into the suite via env-shape. |
| vitest/no-conditional-in-test | eslint-plugin-vitest | if/switch inside test bodies — usually the smell of a test that exercises multiple scenarios that should be split. |
| vitest/no-disabled-tests | eslint-plugin-vitest | test.skip / xtest left in main — silent green that pretends to be coverage. |
| vitest/no-focused-tests | eslint-plugin-vitest | test.only / fit left in main — partial-suite green that hides whatever else broke. |
| vitest/expect-expect | eslint-plugin-vitest | tests with no expect() call — empty-green smell. |
| no-restricted-imports (db drivers) | ESLint core | pg, mysql2, pg-promise, mysql or any /db/* import inside unit-test files — pins the hermeticity boundary at lint time. |
| no-only-tests/no-only-tests | eslint-plugin-no-only-tests | .only on describe / it across the whole repo — the dual of no-focused-tests for non-vitest test runners. |
eslint.config.mjs configuration snippet
import tseslint from 'typescript-eslint';
import vitest from 'eslint-plugin-vitest';
import noOnlyTests from 'eslint-plugin-no-only-tests';

export default tseslint.config({
  // Apply only to unit-test files.
  files: ['**/*.spec.{ts,tsx}', '**/*.test.{ts,tsx}'],
  plugins: { vitest, 'no-only-tests': noOnlyTests },
  rules: {
    'vitest/no-conditional-tests': 'error',
    'vitest/no-conditional-in-test': 'error',
    'vitest/no-disabled-tests': 'error',
    'vitest/no-focused-tests': 'error',
    'vitest/no-mocks-import': 'error',
    'vitest/expect-expect': 'error',
    'no-only-tests/no-only-tests': 'error',
    // Pin the hermeticity boundary: no real database drivers in unit tests.
    'no-restricted-imports': ['error', {
      paths: [
        { name: 'pg', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql2', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'pg-promise', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
        { name: 'mysql', message: 'Unit tests must not import a real database driver. Mock at the boundary or use the integration layer.' },
      ],
      patterns: [
        { group: ['**/db/**'], message: 'Unit tests must not import database modules directly. Test against the boundary stub.' },
      ],
    }],
  },
});
AI rules
.cursor/rules/ts7-test-isolation.mdc
---
description: Prickles TS7 — Test Isolation
globs: "**/*.{spec,test}.{ts,tsx,js,jsx}"
alwaysApply: false
---
## Prickles TS7 — Test Isolation
The unit is what you wrote. Everything outside the unit is replaced or trusted: no DB, no network, no clock, no filesystem, no env.
Mock at the edges, never inside. Stubs and fakes belong at ports — adapters that cross the process boundary. Helpers inside the same module never get mocked; refactor instead.
Trust the framework. React's renderer, the ORM's query builder, the HTTP client's parser — these have their own tests. Don't write yours.
If a unit needs five mocks to test, the unit is doing five things. Extract until the mocks become unnecessary.
Repo layout, CI, and ESLint wiring for these paths live on /implementation and are not repeated on every tenet.
Counter-argument
The honest pushback comes from two directions. The London-school strict reading of GOOS argues every interaction with collaborators should be mocked, including internal ones, because that is how tests drive design.[3] The integration-tests-are-better camp (the camp J. B. Rainsberger's Integrated Tests Are A Scam[5] was written against, and the sharpest place to see its position stated) argues isolated unit tests give false confidence; only the assembled system is the contract worth checking. Both arguments converge on the same diagnostic question: if isolation is so cheap, why does the test suite still ship bugs?
- [5] “Integrated Tests Are A Scam” (Agile 2009; written up 2010). The principal counter-argument to integrated testing. Argues contract tests at the boundary are the only test buying you proof.
Counter-argument retort
The London-school argument is half right. The authorities who taught it, Freeman and Pryce themselves, qualified it: only mock types you own, and read in context, that means the port at the boundary, not the helper inside the module.[3] Fowler's Mocks Aren't Stubs, Seemann's commands-vs-queries refinement, and Khorikov's book all converge on the boundary-only reading.[6] Tests still drive design: a unit that needs five mocks is signalling five responsibilities, and that signal is more useful than the mocks ever were.
Rainsberger's “Integrated Tests Are A Scam” is the more interesting counter and it is the engine behind the TS4 ↔ TS7 pairing.[5] The reply isn't “don't do integration tests”; it is “do both, at the right layer.” Unit tests run in seconds and prove the unit behaves; E2E tests run in minutes and prove the assembled system behaves. TS4 Real-Dependency E2E owns the second layer. TS7 owns the first. The contradiction at the headline level (“mock everything” vs “mock nothing”) dissolves once you say which layer you're in. The cost of conflation is the test suite that runs against a real database, mocks private helpers, takes seven seconds to start, and proves neither property.
The hermetic-tests authority is unusually deep for a rule this concrete. Google's testing blog has used the word as load-bearing since at least 2012,[7] Mike Bland's post-mortem on Goto Fail and Heartbleed names hermeticity as the cultural property that prevents both classes of bug,[8] and Bazel codifies hermeticity as a first-class build property.[9] The rule isn't a Prickles invention; it is the consensus position across every shop that has shipped at scale.
The pragmatic test: pick the slowest test in your unit suite. If it does anything outside the unit, you have a layer-confusion bug. Move it to the integration layer or refactor the unit so the dependency moves to a port. The tests get faster; the failures get specific; the design pressure lands where it should.
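As an illustration of “refactor the unit so the dependency moves to a port”, here is a small sketch using the clock, the cheapest boundary to leak. The `Clock` interface, `systemClock`, `fixedClock` and the opening-hours rule are all hypothetical and exist only for this example.

```typescript
// Before (layer-confused): the unit reads the wall clock itself, so the
// answer depends on when the test happens to run.
//
//   function isBurrowOpenNow(): boolean {
//     const hour = new Date().getUTCHours();
//     return hour >= 9 && hour < 17;
//   }

// After: the clock is a port. The name and shape are assumptions for this sketch.
interface Clock {
  now(): Date;
}

// Production wiring: the real adapter behind the port.
const systemClock: Clock = { now: () => new Date() };

// Unit under test: same logic, but the time source is passed in.
function isBurrowOpen(clock: Clock): boolean {
  const hour = clock.now().getUTCHours();
  return hour >= 9 && hour < 17;
}

// Test-side fake: always reports the same instant.
const fixedClock = (iso: string): Clock => ({ now: () => new Date(iso) });

const openMidMorning = isBurrowOpen(fixedClock("2026-05-03T10:00:00Z"));
const openLateEvening = isBurrowOpen(fixedClock("2026-05-03T20:00:00Z"));
```

Production code would call `isBurrowOpen(systemClock)`; the unit test passes a fixed clock and gets the same answer on every run.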
Notes
- [1] Vladimir Khorikov, Unit Testing: Principles, Practices, and Patterns (Manning, 2020). Ch. 5 (Mocks and test fragility) and Ch. 8 (Why integration testing?) give the cleanest argument for boundary-only mocking and the unit/integration layer split.
- [2] Sandi Metz, “The Magic Tricks of Testing”, RailsConf 2013. The 3×3 matrix: incoming query / incoming command / sent-to-self / outgoing query / outgoing command. Mock outgoing commands; never mock sent-to-self.
- [3] Steve Freeman & Nat Pryce, Growing Object-Oriented Software, Guided by Tests (Addison-Wesley, 2009). The most-quoted single line in the boundary-mocking literature: “only mock types you own.”