Golden Tests

Golden tests compare current program output against a previously approved reference artifact. That artifact is usually called a golden file, snapshot, baseline, or approved output.

The basic loop:

1. Run code against a controlled input.
2. Capture the output.
3. Compare it to a checked-in reference file.
4. Fail the test if the output differs.
5. Review the diff.
6. Either fix the regression or approve the new golden.
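
The loop above can be sketched as a small helper. This is a minimal illustration, not the API of any particular tool: check_golden, render_report, and the update flag are hypothetical names chosen for this example.

```python
import difflib
from pathlib import Path


def check_golden(actual: str, golden_path: Path, update: bool = False) -> None:
    """Compare `actual` to the golden file, or rewrite the golden when `update` is set."""
    if update:
        # Approving new output: overwrite the reference so code review can inspect the change.
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(actual, encoding="utf-8")
        return

    if not golden_path.exists():
        raise AssertionError(f"missing golden file {golden_path}; rerun with update=True")

    expected = golden_path.read_text(encoding="utf-8")
    if actual != expected:
        # Fail with a unified diff so the reviewer sees exactly what changed.
        diff = "".join(
            difflib.unified_diff(
                expected.splitlines(keepends=True),
                actual.splitlines(keepends=True),
                fromfile=str(golden_path),
                tofile="actual output",
            )
        )
        raise AssertionError(f"output differs from golden file:\n{diff}")


def render_report(items: list[str]) -> str:
    """Hypothetical code under test: renders a bullet list."""
    return "".join(f"- {item}\n" for item in items)
```

A test would call check_golden(render_report(["a", "b"]), Path("testdata/report.golden")). One common convention, assumed here rather than taken from any specific tool, is to set update from an environment variable such as UPDATE_GOLDEN so that approving new output is an explicit, separate step.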

This makes golden tests a good fit for behavior that is easier to review as a whole output, visually or as text, than to assert field by field.

Terminology

Term	Meaning
Golden file	Checked-in reference output used as the expected result.
Snapshot	Serialized reference value, often text or structured data.
Approval test	Test where a human approves the captured output as correct.
Visual regression test	Golden test where the reference is usually an image.
Characterization test	Test that captures current behavior, often for legacy code.

The names overlap, but in practice “golden test” is common in compiler, CLI, and renderer contexts; “snapshot test” is common in many language ecosystems; and “approval test” emphasizes the human review workflow.

The central idea

A normal unit test says:

For this input, this specific property must hold.

A golden test says:

For this input, the whole produced artifact should still match this approved artifact.

This is useful when the output is large enough that many small assertions would obscure the intent.
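
To make the contrast concrete, here is a small sketch of both styles applied to the same value. The build_invoice function and its data are hypothetical, and the approved text is kept inline only to keep the example self-contained; in a real suite it would live in a checked-in file.

```python
import json


def build_invoice(customer: str, quantities: dict[str, int], prices_cents: dict[str, int]) -> dict:
    """Hypothetical code under test: builds a nested invoice structure."""
    lines = [
        {"sku": sku, "qty": qty, "total_cents": qty * prices_cents[sku]}
        for sku, qty in sorted(quantities.items())
    ]
    return {
        "customer": customer,
        "lines": lines,
        "total_cents": sum(line["total_cents"] for line in lines),
    }


invoice = build_invoice("acme", {"bolt": 3, "nut": 10}, {"bolt": 50, "nut": 10})

# Property-style unit test: each assertion pins one fact about the result.
assert invoice["customer"] == "acme"
assert len(invoice["lines"]) == 2
assert invoice["total_cents"] == 250

# Golden-style test: serialize the whole artifact and compare it in one step.
APPROVED = """\
{
  "customer": "acme",
  "lines": [
    {
      "qty": 3,
      "sku": "bolt",
      "total_cents": 150
    },
    {
      "qty": 10,
      "sku": "nut",
      "total_cents": 100
    }
  ],
  "total_cents": 250
}
"""
# sort_keys and a fixed indent keep the serialized form stable across runs.
actual = json.dumps(invoice, indent=2, sort_keys=True) + "\n"
assert actual == APPROVED
```

The property assertions state intent precisely but cover only what they name; the golden comparison covers every field, at the cost of failing on any change, intended or not.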

Most snapshot testing tools follow the same pattern: render or produce a value, store a reference on first run, and compare on subsequent runs. If the values differ, either the implementation changed intentionally and the reference must be updated, or the new output reveals a bug.¹

The same pattern applies for file outputs: tests generate output files, compare them to checked-in golden versions, and fail when they differ.²

Use Cases

Golden tests work best when the output is important, reviewable, and deterministic.

Golden tests are regression nets

Golden tests are especially useful when refactoring code that already works. Approval testing documentation describes approval tests as taking a snapshot of results and confirming that they have not changed; this is useful when asserting complex objects or rich output directly would be awkward.³

Determinism

Golden tests need stable output.

Generated values like dates and IDs are a common source of unstable snapshots. Use matchers, serializers, or explicit replacement for any non-deterministic segments.⁴
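
Explicit replacement can be sketched as a scrubbing step applied before comparison. This is an illustrative alternative to the property matchers cited above, not a port of them; the scrub function, the placeholder strings, and the two patterns are assumptions for this example, and a real project should scrub only the formats its output actually contains.

```python
import re

# UUIDs in the canonical 8-4-4-4-12 hexadecimal form.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)
# ISO 8601 style timestamps such as 2024-05-01T12:30:00Z.
ISO_TIMESTAMP_RE = re.compile(
    r"\b\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)


def scrub(text: str) -> str:
    """Replace run-specific values with fixed placeholders before golden comparison."""
    text = UUID_RE.sub("<UUID>", text)
    text = ISO_TIMESTAMP_RE.sub("<TIMESTAMP>", text)
    return text


# Two runs that differ only in generated values scrub to the same text.
run_a = "created 3f2b8c1e-9d4a-4b7f-8a2e-1c5d6e7f8a9b at 2024-05-01T12:30:00Z"
run_b = "created 0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d at 2025-01-15T08:00:42.123+02:00"
assert scrub(run_a) == scrub(run_b) == "created <UUID> at <TIMESTAMP>"
```

Scrubbing hides the replaced values from the golden entirely, so anything that must be checked about them (format, ordering, uniqueness) needs a separate assertion.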

I prefer to use golden tests for interfaces, not internals.

References

1. Jest, “Snapshot Testing.” https://jestjs.io/docs/snapshot-testing
2. goldenfile crate documentation. https://docs.rs/goldenfile/latest/goldenfile/
3. ApprovalTests.com. https://approvaltests.com/
4. Jest, “Snapshot Testing: Property Matchers.” https://jestjs.io/docs/snapshot-testing#property-matchers