state machines in test (Chow 1978)

Ethnography of T. S. Chow (1978) — Testing Software Design Modeled by Finite-State Machines

What the 1978 Paper Is (and Isn’t)

  • Core idea: Chow proposes an automata-theoretic testing method for software designs modeled as finite-state machines (FSMs). Tests are derived from the model itself, not from a prototype.
  • Publication: IEEE Transactions on Software Engineering, Vol. SE-4(3), May 1978, pp. 178–187.
  • Impact: Cited over 2,000 times; canonical in FSM-based testing and the origin of the W-method lineage.

Vocabulary & Concepts Introduced

  • Testing tree; P and W sets: The transition cover P (access sequences read off the testing tree) and the characterization set W (sequences whose outputs distinguish states) form the basis for deriving tests; a minimal derivation sketch follows this list.
  • Coverage concepts: Defines n-switch covers, distinguishing sequences, and equivalence relations between implementation and specification models.
  • Assumptions: Deterministic, complete, minimal FSM; shared input alphabet; resettable system.
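
For concreteness, here is a minimal sketch of W-method test derivation, assuming a deterministic, complete, minimal Mealy machine and an implementation with no extra states (the m = n case). The turnstile model, state names, and inputs below are invented for illustration, not taken from Chow's paper.

```python
# A minimal sketch of W-method test derivation (not Chow's full algorithm),
# assuming a deterministic, complete, minimal Mealy machine and m = n.
from itertools import product

# transitions[(state, input)] = (next_state, output) -- invented turnstile FSM
transitions = {
    ("locked",   "coin"): ("unlocked", "ok"),
    ("locked",   "push"): ("locked",   "blocked"),
    ("unlocked", "coin"): ("unlocked", "refund"),
    ("unlocked", "push"): ("locked",   "open"),
}
initial = "locked"
inputs = ["coin", "push"]

def transition_cover():
    """P: the empty sequence plus an access sequence for every transition,
    read off a breadth-first testing tree rooted at the initial state."""
    access, frontier, cover = {initial: ()}, [initial], {()}
    while frontier:
        nxt = []
        for s in frontier:
            for x in inputs:
                target, _ = transitions[(s, x)]
                cover.add(access[s] + (x,))
                if target not in access:
                    access[target] = access[s] + (x,)
                    nxt.append(target)
        frontier = nxt
    return sorted(cover, key=len)

# W: "push" alone distinguishes the two states ("blocked" vs. "open").
W = [("push",)]

# W-method test suite for m = n: every concatenation p + w, p in P, w in W.
suite = [p + w for p, w in product(transition_cover(), W)]
```

Chow's general construction concatenates P with Z = W ∪ X·W ∪ … ∪ X^(m−n)·W when the implementation may have up to m > n states.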

Audience & Lineage

  1. Protocol/Conformance Testing (Telecom): FSM testing became central to SDL/Estelle and TTCN standards.
  2. Algorithmic Theory: Theoretical work by Lee & Yannakakis, Yannakakis tutorials, and ADS/Wp/HSI extensions formalize Chow’s foundations.
  3. Model-Based Testing Tools: Later applied to hierarchical, communicating, and extended FSMs (EFSMs).

Practices & Artifacts

  • Teaching materials: W-method slides still used in ISTQB and university courses.
  • Survey canon: Lee & Yannakakis (1996) synthesized FSM testing methods into a taxonomy still used today.
  • Modern extensions: Partial FSMs, grey-box approaches, incremental retesting, and online I/O testing adapt Chow’s base.

Misconceptions & Debates

  • Scope creep: Many sources incorrectly apply Chow’s guarantees to non-deterministic or incomplete FSMs.
  • Attribution: “W-method (Chow, 1978)” compresses the lineage—subsequent refinements (Wp, HSI) expanded it.

Timeline

Year          Development
1978          Chow introduces testing trees, n-switch coverage, and real-system examples
1980s–1990s   Expansion to communicating FSMs and protocol conformance
1996          Lee & Yannakakis survey consolidates FSM testing principles
2000s–2020s   Grey-box, EFSM, incremental, and online testing generalizations

Ethnographic Significance

  • Cognitive translation: Made automata theory actionable for testers.
  • Pedagogical ritual: W/P sets on whiteboards; foundational exercise in test design education.
  • Cultural backbone: Citations to Chow (1978) signify rigor in model-based or conformance testing papers.

Key Scholarly References

  • Chow, T. S. (1978). Testing Software Design Modeled by Finite-State Machines. IEEE Transactions on Software Engineering, SE-4(3), 178–187.
  • Lee, D., & Yannakakis, M. (1996). Principles and Methods of Testing Finite State Machines: A Survey. Proceedings of the IEEE, 84(8), 1090–1123.
  • Yannakakis, M. (1998). Testing of Finite State Systems. Tutorial, ATVA/Verimag.
  • Rechtberger, V., et al. (2022). Prioritized Variable-Length Test Cases Generation for Finite State Machines. arXiv:2203.09596.
  • Pan, Z., et al. (2022). PTSG: A Test Generation Tool Based on Extended Finite State Machine. arXiv:2209.10255.

Cultural Summary

Chow’s 1978 paper is the ur-text of finite-state-based testing. It bridges automata theory, software engineering, and conformance practice.
Its descendants—W, Wp, HSI, and EFSM test generation—form the “grammar” through which testing academia and industry alike describe behavioral verification.

Nice, we’re going full nerd. 😈

I’ll keep it dense but systematic so you can actually use this.

I’m going to:

  1. Define the codes for principles.

  2. Reuse the numbering from my big heuristic list (e.g. 1.1 = “Build a model before tests”, 2.3 = “Transition-pair coverage”, etc.).

  3. For each range of heuristics that clearly share the same theoretical basis, I’ll map them to:

    • Chow principle(s)
    • W / Wp principle(s)
    • Coverage class
    • Test purpose class

Grouping by ranges keeps this from becoming a 20-page wall of text; within each range, the mapping calls out individual heuristics wherever they differ.


0. Legend

Chow principles (C-*)

  • CR – State Reachability
  • CT – Transition Coverage
  • CD – State Distinguishability
  • CN – Negative / illegal transition detection
  • CCTS – Complete test set construction

W / Wp principles (W-*)

  • Wset – Characterizing set (W)
  • Wpset – Reduced characterizing set (Wp)
  • TVer – State verification sequences
  • TTour – Transition tour
  • TPTour – Transition-pair tour
  • CSeq – Checking sequence (end-to-end conformance sequence)

Coverage classes (Cov-*)

  • CovS – State coverage
  • CovT – Transition coverage
  • CovTP – Transition-pair coverage
  • CovP – Path (multi-step) coverage
  • CovN – Negative / forbidden coverage

Test purposes (TP-*)

  • CF – Conformance (does it match the abstract machine?)
  • FD – Fault detection (find logic defects)
  • RB – Robustness (survive invalid / hostile behavior)
  • CP – Completeness (is the test set “complete” for the spec?)

1. Model Construction Heuristics (1.1–1.11)

1.1 Build a model before building tests
1.2 Define a minimal set of states
1.3 Identify all transitions
1.4 Document transition guards
1.5 Document transition actions
1.6 Identify unreachable states
1.7 Identify dead-end states (1.6–1.7 are sketched in code after the mapping below)
1.8 Collapse equivalent states
1.9 Model unexpected or error states
1.10 Capture asynchronous transitions
1.11 Use hierarchical / extended state machines

  • Chow: CCTS (all), CR (1.1–1.3, 1.6, 1.7, 1.10), CT (1.3, 1.4, 1.5, 1.10), CD (1.2, 1.8), CN (1.9)
  • W/Wp: CSeq (all), TVer (1.2, 1.8), Wset/Wpset (1.2, 1.8), TTour (1.3, 1.4, 1.5), TPTour (1.10)
  • Coverage: CovS (1.2, 1.6, 1.7, 1.8), CovT (1.3–1.5, 1.9, 1.10), CovP (1.10, 1.11)
  • Test purpose: CF, CP for all; FD (1.6–1.10), RB (1.9–1.11)
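
As promised in the list above, here is a minimal sketch of heuristics 1.6 and 1.7, reusing the invented dict-based transition format from the W-method sketch and assuming a complete FSM (every (state, input) pair defined).

```python
# Minimal sketches of 1.6 (unreachable states) and 1.7 (dead-end states),
# over the same dict-based Mealy-machine format as the earlier sketch.
def reachable_states(transitions, initial, inputs):
    """1.6: any modeled state missing from this set is unreachable."""
    seen, stack = {initial}, [initial]
    while stack:
        s = stack.pop()
        for x in inputs:
            target, _ = transitions[(s, x)]
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def dead_end_states(transitions, states, inputs):
    """1.7: states every one of whose transitions loops back to itself."""
    return {s for s in states
            if all(transitions[(s, x)][0] == s for x in inputs)}
```

Unreachable and dead-end states usually signal either a modeling mistake or dead functionality; both are worth resolving before deriving P and W. Full equivalence collapsing (1.8) is classical FSM minimization, e.g. partition refinement.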

2. Coverage Heuristics (2.1–2.10)

2.1 State coverage
2.2 Transition coverage
2.3 Transition-pair coverage (2.1–2.3 are sketched in code after the mapping below)
2.4 Path coverage
2.5 Loop coverage
2.6 Negative transition coverage
2.7 Forbidden-state coverage
2.8 Reset coverage
2.9 Boundary-path coverage
2.10 Rare-event transition coverage

  • Chow:

    • CR – 2.1, 2.4, 2.8, 2.9
    • CT – 2.2, 2.3, 2.5, 2.10
    • CD – partially 2.9 (boundary cases often used to distinguish behavior)
    • CN – 2.6, 2.7
    • CCTS – all of them (coverage criteria are the backbone of “complete” test sets)
  • W/Wp:

    • TTour – 2.2, 2.5, 2.10
    • TPTour – 2.3
    • CSeq – 2.4, 2.8, 2.9
    • Wset/Wpset/TVer – used implicitly wherever we check resulting states (2.1, part of 2.4–2.9)
  • Coverage:

    • CovS – 2.1
    • CovT – 2.2, 2.5, 2.10
    • CovTP – 2.3
    • CovP – 2.4, 2.8, 2.9
    • CovN – 2.6, 2.7
  • Test purpose:

    • CF – 2.1–2.5, 2.8–2.10
    • FD – all
    • RB – 2.5–2.7, 2.10
    • CP – 2.1–2.5, 2.8–2.9
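
A minimal sketch of measuring heuristics 2.1–2.3 for a given test suite, in the same invented dict-based model format as the earlier sketches; a real tool would also report which states, transitions, and pairs were missed.

```python
# Measure state (2.1), transition (2.2), and transition-pair (2.3) coverage
# achieved by a suite of input sequences.
def coverage(transitions, initial, suite):
    states = {s for s, _ in transitions}   # complete FSM: keys name every state
    states_hit, trans_hit, pairs_hit = {initial}, set(), set()
    for seq in suite:
        state, prev = initial, None
        for x in seq:
            nxt, _ = transitions[(state, x)]
            step = (state, x)
            if prev is not None:
                pairs_hit.add((prev, step))   # two consecutive transitions
            trans_hit.add(step)
            states_hit.add(nxt)
            prev, state = step, nxt
    return {
        "state":      len(states_hit) / len(states),
        "transition": len(trans_hit) / len(transitions),
        "pairs_hit":  len(pairs_hit),         # compare to all feasible pairs
    }
```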

3. Test Generation and Path Selection (3.1–3.9)

3.1 Prefer shortest distinguishing paths
3.2 Prefer longest legal paths for stress
3.3 Prefer randomized walk
3.4 Prefer weighted random walk (3.3–3.5 are sketched in code after the mapping below)
3.5 Prioritize transitions connected to error states
3.6 Test all transitions from every state
3.7 Test all sequences entering each state
3.8 Use state-machine fuzzing
3.9 Use path explosion heuristics to limit sequences

  • Chow:

    • CR – 3.2–3.4, 3.7, 3.9
    • CT – 3.2–3.6, 3.8
    • CD – 3.1 (explicitly), 3.5–3.8 (using failures as distinguishing signals)
    • CN – 3.5, 3.8
    • CCTS – 3.1–3.7, 3.9 (they are strategies for building effective complete-ish sets)
  • W/Wp:

    • Wset/Wpset – 3.1, 3.7 (shortest distinguishing sequences)
    • TVer – 3.1, 3.7
    • TTour – 3.2, 3.3, 3.4, 3.6
    • TPTour – 3.2, 3.5 (often error transitions cluster around certain pairs)
    • CSeq – 3.2, 3.7, 3.9
    • Randomized and fuzzing approaches (3.3, 3.4, 3.8) are essentially probabilistic checking sequences
  • Coverage:

    • CovS – 3.7
    • CovT – 3.2, 3.3, 3.4, 3.5, 3.6, 3.8
    • CovTP – 3.2, 3.5
    • CovP – 3.1, 3.2, 3.3, 3.4, 3.7, 3.9
    • CovN – 3.5, 3.8
  • Test purpose:

    • CF – 3.1–3.4, 3.6, 3.7
    • FD – all
    • RB – 3.2–3.5, 3.8
    • CP – 3.1–3.4, 3.6, 3.7, 3.9
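
A minimal sketch of a weighted random walk (3.4), which subsumes the uniform walk (3.3) and can encode prioritization of error-adjacent transitions (3.5) through the weight function. The weight interface is an invented convention, not from Chow.

```python
# A weighted random walk over the same dict-based model format; the caller's
# weight(state, input) function encodes the bias, e.g. higher weight for
# inputs whose transitions enter error states (3.5).
import random

def weighted_walk(transitions, initial, inputs, weight, length, seed=None):
    rng = random.Random(seed)          # seeded for reproducible test runs
    state, seq = initial, []
    for _ in range(length):
        ws = [weight(state, x) for x in inputs]
        x = rng.choices(inputs, weights=ws, k=1)[0]
        seq.append(x)
        state, _ = transitions[(state, x)]
    return seq
```

Uniform weights recover 3.3, and growing `length` pushes toward the long stress paths of 3.2.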

4. State Distinguishability / Oracles (4.1–4.8)

4.1 Define state invariants
4.2 Define transition postconditions (4.1–4.2 are sketched in code after the mapping below)
4.3 Define error signatures
4.4 Define UI signatures
4.5 Define API signatures
4.6 Define file/DB signatures
4.7 Ensure oracle correctness
4.8 Use multi-source oracles

  • Chow:

    • CD – all of them (this is the distinguishability layer)
    • CT – 4.2 (postconditions per transition)
    • CN – 4.3 (error signatures)
    • CCTS – 4.1–4.2, 4.7 (you cannot construct a “complete” test set without good oracles)
  • W/Wp:

    • Wset/Wpset – 4.1–4.6 (characterizing sets are exactly “state signatures”)
    • TVer – 4.1–4.3, 4.7–4.8
    • CSeq – 4.2, 4.7–4.8
  • Coverage:

    • CovS – 4.1, 4.4–4.6
    • CovT – 4.2
    • CovN – 4.3
  • Test purpose:

    • CF – all
    • FD – all
    • RB – 4.3, 4.8
    • CP – 4.1, 4.2, 4.7
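
A minimal sketch of 4.1 and 4.2 as executable oracles while replaying a sequence against a system under test; the `sut` adapter and the invariant/postcondition tables are hypothetical names, not a real API.

```python
# Replay an input sequence against the SUT, checking three oracle layers:
# expected outputs, per-transition postconditions (4.2), and per-state
# invariants (4.1). All adapter names here are hypothetical.
def check_run(sut, model, initial, seq, invariants, postconditions):
    state = initial
    for x in seq:
        observed = sut.apply(x)                 # drive the real system
        nxt, expected = model[(state, x)]
        assert observed == expected, f"wrong output for {x!r} in {state}"
        assert postconditions[(state, x)](sut), \
            f"postcondition failed on {state} / {x!r}"
        state = nxt
        assert invariants[state](sut), f"invariant broken in {state}"
    return state
```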

5. Negative & Invalid Transitions (5.1–5.5)

5.1 Attempt every illegal transition (sketched in code after the mapping below)
5.2 Attempt transitions in invalid order
5.3 Attempt missing mandatory transitions
5.4 Introduce malformed or corrupted inputs
5.5 Inject timing or race transitions

  • Chow: CN (primary), CT (we still traverse transitions), CD (using error behavior to distinguish defect states), CCTS (negative cases needed for completeness)
  • W/Wp: Wset/Wpset & TVer (we need to know that illegal transitions do not move us to legitimate states), CSeq (these are necessary pieces of conformance checking sequences)
  • Coverage: CovN, plus CovT/CovP where sequences are involved
  • Test purpose: FD, RB primarily; CF/CP secondarily (protocol-level conformance includes correct rejection)
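
A minimal sketch of 5.1: drive the system to each state, fire every input the model marks illegal there, and require rejection without a silent state change. The adapter methods, the `access`/`legal` tables, and the "rejected" convention are invented for illustration.

```python
# Attempt every illegal transition (5.1): the SUT must reject the input
# and remain in the same state. All adapter names are hypothetical.
def test_illegal_transitions(sut, states, inputs, legal, access, observe):
    for s in states:
        for x in inputs:
            if x in legal[s]:
                continue                   # only the illegal inputs here
            sut.reset()
            for a in access[s]:            # bring the SUT into state s
                sut.apply(a)
            result = sut.apply(x)          # the illegal input
            assert result == "rejected", f"{s} accepted illegal {x!r}"
            assert observe(sut) == s, f"illegal {x!r} moved SUT out of {s}"
```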

6. Concurrency / Asynchronous (6.1–6.6)

6.1 Model concurrent transitions as parallel state machines (sketched in code after the mapping below)
6.2 Interleave transitions from different machines
6.3 Test race-sensitive transitions
6.4 Test delayed transitions
6.5 Test rollback or recovery states
6.6 Test idempotent transitions

  • Chow:

    • CR – 6.1, 6.2 (reachability in product machine)
    • CT – 6.2–6.6
    • CD – 6.3, 6.5, 6.6 (states that differ only under specific interleavings)
    • CN – 6.3, 6.4 (illegal interleavings / timings)
    • CCTS – 6.1–6.6 (complete test set must include concurrency behaviors)
  • W/Wp:

    • TTour/TPTour – 6.2–6.4
    • Wset/Wpset/TVer – 6.3, 6.5, 6.6 (observing subtle state differences)
    • CSeq – long interleaving sequences that check conformance under concurrency
  • Coverage: CovT, CovTP, CovP, CovN (for illegal interleavings)

  • Test purpose: FD, RB (strong), CF/CP for systems whose spec includes concurrency semantics
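
A minimal sketch of 6.1: compose two component machines into a product machine, so the reachability, coverage, and W-style sketches above apply directly to their interleavings (6.2). This assumes disjoint input alphabets (each input steps exactly one component); shared synchronizing inputs would need a different composition rule.

```python
# Build the interleaved product of two dict-based Mealy machines with
# disjoint input alphabets; each input advances exactly one component.
def product_machine(t1, t2, inputs1, inputs2):
    states1 = {s for s, _ in t1}           # complete FSMs: keys cover states
    states2 = {s for s, _ in t2}
    prod = {}
    for s1 in states1:
        for s2 in states2:
            for x in inputs1:              # machine 1 steps, machine 2 idles
                n1, out = t1[(s1, x)]
                prod[((s1, s2), x)] = ((n1, s2), out)
            for x in inputs2:              # machine 2 steps, machine 1 idles
                n2, out = t2[(s2, x)]
                prod[((s1, s2), x)] = ((s1, n2), out)
    return prod
```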


7. APIs / Protocols (7.1–7.6)

7.1 Model the protocol as an FSM from the spec
7.2 Verify protocol handshake sequences (sketched in code after the mapping below)
7.3 Test retransmission and retry transitions
7.4 Test forbidden sequences
7.5 Test version-negotiation states
7.6 Test security transitions

  • Chow:

    • CR – 7.1–7.3, 7.5–7.6
    • CT – 7.2–7.3, 7.5–7.6
    • CD – 7.2, 7.5, 7.6 (distinguishing handshake outcomes, modes, roles)
    • CN – 7.3, 7.4 (retransmission, forbidden sequences)
    • CCTS – all (this is classical protocol conformance territory)
  • W/Wp:

    • Wset/Wpset – 7.1, 7.5, 7.6 (state identification across protocol modes / versions)
    • TVer – 7.2, 7.5, 7.6
    • TTour – 7.2–7.3
    • CSeq – 7.2–7.4, 7.6
  • Coverage: CovS (7.1, 7.5, 7.6), CovT (7.2–7.3, 7.6), CovP (7.2–7.3, 7.5), CovN (7.4)

  • Test purpose: CF (strong for all), FD (all), RB (7.3, 7.4, 7.6), CP (7.1–7.3, 7.5–7.6)
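
A minimal sketch of 7.2: a handshake is a fixed path through the protocol FSM, so it can be verified by replaying the expected input/output exchange through a `send` adapter. The messages and replies here are invented, not from any real protocol.

```python
# Verify a handshake (7.2) by replaying the spec's expected exchange.
# The message table and the send adapter are hypothetical.
HANDSHAKE = [             # (message to send, response the spec requires)
    ("HELLO",      "HELLO-ACK"),
    ("VERSION 2",  "VERSION-OK"),
    ("AUTH token", "AUTH-OK"),
]

def verify_handshake(send):
    for msg, expected in HANDSHAKE:
        reply = send(msg)
        assert reply == expected, f"handshake diverged at {msg!r}: got {reply!r}"
```

Forbidden sequences (7.4) reuse the negative-transition harness from Section 5 at the protocol level.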


8. UI / Mobile / Navigation (8.1–8.6)

8.1 Model each UI screen as a state
8.2 Model gestures/actions as transitions
8.3 Test all navigation paths
8.4 Test deep-link entry to every screen (sketched in code after the mapping below)
8.5 Test cross-screen invariants
8.6 Test screen rotation and lifecycle events

  • Chow:

    • CR – 8.1, 8.3, 8.4
    • CT – 8.2–8.4, 8.6
    • CD – 8.1, 8.5 (distinguishing screen states via UI invariants)
    • CN – 8.3–8.4, 8.6 (attempt invalid paths / lifecycle transitions)
    • CCTS – all (model-based UI navigation)
  • W/Wp:

    • Wset/Wpset – 8.1, 8.5 (screen signatures)
    • TVer – 8.3–8.5
    • TTour – 8.3, 8.6
    • CSeq – 8.3–8.4
  • Coverage: CovS (8.1, 8.4), CovT (8.2–8.3, 8.6), CovP (8.3–8.5), CovN (8.6 when invalid)

  • Test purpose: CF, FD (all), RB (8.4, 8.6), CP (8.1–8.5)
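
A minimal sketch of 8.4 combined with 8.1's screen signatures; `open_deep_link` and `visible_signature` stand in for whatever UI driver is in use (Appium, Espresso, and similar), and the screen table is invented.

```python
# Enter every screen directly via its deep link (8.4) and check the
# screen's signature (8.1). The driver functions are hypothetical.
SCREENS = {                        # screen state -> expected UI signature
    "home":     "Home",
    "settings": "Settings",
    "profile":  "Your profile",
}

def test_deep_links(open_deep_link, visible_signature):
    for screen, expected_title in SCREENS.items():
        open_deep_link(f"app://{screen}")      # bypass normal navigation
        assert visible_signature() == expected_title, \
            f"{screen} signature wrong"
```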


9. Distributed / Cloud Lifecycles (9.1–9.6)

9.1 Model resource lifecycle states explicitly
9.2 Validate asynchronous state propagation (sketched in code after the mapping below)
9.3 Test API + backend state alignment
9.4 Test cancellation, timeout, and partial failure
9.5 Test concurrent operations
9.6 Test resource recovery transitions

  • Chow:

    • CR – 9.1–9.2, 9.4–9.6
    • CT – 9.1–9.6
    • CD – 9.2–9.3, 9.6
    • CN – 9.4–9.5
    • CCTS – all (this is exactly “lifecycle conformance”)
  • W/Wp:

    • Wset/Wpset – 9.2–9.3 (distinguish “really deleted” vs “logically deleted”, etc.)
    • TVer – 9.2–9.3, 9.6
    • TTour – 9.1–9.2, 9.4–9.6
    • CSeq – 9.1–9.4, 9.6
  • Coverage: CovS (9.1–9.2, 9.6), CovT (9.1–9.2, 9.4–9.6), CovP (9.2–9.4, 9.6), CovN (9.4–9.5)

  • Test purpose: CF, FD (all), RB (9.2, 9.4–9.6), CP (9.1–9.3, 9.6)
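
A minimal sketch of 9.2: asynchronous propagation means the observable state may lag the model, so the oracle polls until the target state appears or a deadline passes. `read_state` is a hypothetical adapter; the timings are arbitrary defaults.

```python
# Poll for eventual state convergence (9.2); the adapter and timings
# are invented for illustration.
import time

def await_state(read_state, expected, timeout=30.0, interval=1.0):
    current = None
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read_state()
        if current == expected:
            return current
        time.sleep(interval)
    raise AssertionError(
        f"state never reached {expected!r}; last saw {current!r}")
```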


10. Property-Based Stateful Testing (10.1–10.5)

10.1 Use random sequences of operations
10.2 Use shrinking to isolate minimal failing transitions
10.3 Include guard functions
10.4 Include arbitrary invalid transitions
10.5 Maintain a mirrored abstract model (10.1–10.3 and 10.5 are sketched together in code after the mapping below)

  • Chow:

    • CR – 10.1, 10.5
    • CT – 10.1, 10.3
    • CD – 10.2, 10.5
    • CN – 10.4
    • CCTS – 10.1–10.5 (PBT is a modern way to approximate complete test behavior)
  • W/Wp:

    • Wset/Wpset – 10.2 (shrunk minimal failing sequences behave like short distinguishing sequences)
    • TVer – 10.5 (mirrored model)
    • TTour – 10.1, 10.3
    • CSeq – 10.1, 10.5
  • Coverage: CovT, CovP, CovN (10.4)

  • Test purpose: FD (strong), RB (10.1, 10.4), CF/CP (10.3, 10.5)
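
A minimal sketch of 10.1–10.3 and 10.5 using the Hypothesis library's stateful testing: random rule sequences (10.1), automatic shrinking of failures (10.2), preconditions as guards (10.3), and a mirrored abstract model (10.5). The counter SUT is an invented stand-in.

```python
# Property-based stateful testing with Hypothesis's RuleBasedStateMachine.
from hypothesis.stateful import (RuleBasedStateMachine, rule,
                                 invariant, precondition)

class Counter:
    """Stand-in system under test (invented)."""
    def __init__(self):
        self.n = 0
    def inc(self):
        self.n += 1
    def dec(self):
        self.n -= 1

class CounterMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.sut = Counter()
        self.model = 0                  # 10.5: mirrored abstract model

    @rule()                             # 10.1: Hypothesis picks rule sequences
    def inc(self):
        self.sut.inc()
        self.model += 1

    @precondition(lambda self: self.model > 0)   # 10.3: guard function
    @rule()
    def dec(self):
        self.sut.dec()
        self.model -= 1

    @invariant()                        # checked after every step
    def agrees(self):
        assert self.sut.n == self.model

TestCounter = CounterMachine.TestCase
```

Running `pytest` on this module executes `TestCounter`; a failing interleaving is shrunk (10.2) to a minimal rule sequence, which behaves much like a short distinguishing sequence for the defective state.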


11–20 (High-level mapping)

For the remaining groups, the mapping pattern repeats:

  • 11. Model Evolution (11.1–11.4)

    • Mostly CCTS + CF/CP (keeping model and implementation aligned) and CSeq/TVer.
  • 12. Models + Metrics (12.1–12.4)

    • All about CR/CT (coverage), CCTS, and identifying hot spots (FD).
  • 13. Cross-domain (13.1–13.4)

    • Reuse of models in code + tests is CCTS + CSeq; failure injection uses CN + RB.
  • 14. Understanding/Diagnosis (14.1–14.3)

    • Reverse-engineering a state machine from behavior is basically reconstructing Chow’s abstract machine to reason about CR/CT/CD again.
  • 15. Integration (15.1–15.3)

    • “Machine of machines” → product FSM: again CR/CT/CD, with CSeq for end-to-end integration.
  • 16. Security (16.1–16.4)

    • Heavy CN + CD; use W/Wp-style distinguishing to spot illegal privilege transitions.
  • 17. Fault tolerance & recovery (17.1–17.4)

    • Error states + recovery transitions: CN, CT, CCTS, with TTour + CSeq; RB is dominant.
  • 18. Performance / Stress (18.1–18.3)

    • Long TTour / CSeqs under load; CT/CR plus FD and RB for performance/failure modes.
  • 19. Observability (19.1–19.3)

    • Telemetry that exposes CR/CT/CD; logs act as external W-style “characterizing sets” for state.
  • 20. AI/ML Pipelines (20.1–20.3)

    • They are just another complex FSM: CR/CT/CD, plus CN for invalid pipeline transitions; CSeq for whole-pipeline tests.