Chapter 8: Property based testing

Maksim Pečorin edited this page Jun 15, 2015 · 7 revisions

Contents

  • Notes: Chapter notes and links to further reading related to the content in this chapter
  • FAQ: Questions related to the chapter content. Feel free to add questions and/or answers here related to the chapter.

Notes

The style of combinator library for testing we developed in this chapter was introduced in a 2000 paper by Koen Claessen and John Hughes, QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs (PDF). In that paper, they presented a Haskell library, called QuickCheck, which became quite popular in the FP world and has inspired similar libraries in other languages, including ScalaCheck. Many programmers who adopt this style of testing find it to be extraordinarily effective (see, for instance, this experience report on Tom Moertel's blog).

The Wikipedia page on QuickCheck and the Haskell wiki page are good places to start if you're interested in learning more about these sorts of libraries. QuickCheck sparked a number of variations, including the Haskell library SmallCheck, which exhaustively enumerates test cases up to a given depth rather than sampling them at random.

Although property-based testing works quite well for testing pure functions, it can also be used for testing imperative code. The general idea is to generate lists of instructions, which are then fed to an interpreter of these actions. We then check that the pre- and postconditions are as expected. Here's a simple example of testing the mutable stack implementation from Scala's standard library (API docs):

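forAll(Gen.listOf(Gen.choose(1,10))) { l =>
  val buf = new collection.mutable.ArrayStack[Int]
  val beforeSize = buf.size // precondition: a fresh stack is empty
  l.foreach(buf.push)       // the "instructions" are the integers to push
  beforeSize == 0 && buf.size == l.size // postcondition: one element per push
}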

In this case, the "interpreter" is the push method on ArrayStack, which modifies the stack in place, and the "instructions" are simply the integers from the input list. But the basic idea can be extended to testing richer interfaces. For instance, we could generate instructions that either push or pop elements from an ArrayStack (perhaps represented as a List[Option[Int]]), and write a property stating that sequences of push and pop operations preserve the invariants of ArrayStack (for instance, the final size of the stack should be the number of push calls minus the number of pop calls). Care must be taken to craft generators that produce valid sequences of instructions (for instance, a pop without a corresponding prior push is not a valid input).
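The push/pop idea above might be sketched as follows. This is only an illustration under stated assumptions: it uses scala.util.Random directly rather than the Gen and Prop types from the chapter, it uses mutable.Stack as a stand-in for ArrayStack, and the names StackInstructions, genInstructions, sizeInvariantHolds, and check are hypothetical.

```scala
import scala.collection.mutable
import scala.util.Random

object StackInstructions {
  // Some(n) means "push n"; None means "pop".
  type Instruction = Option[Int]

  // Generate a valid instruction list: a pop is only emitted while the
  // simulated stack depth is positive, so no pop lacks a prior push.
  def genInstructions(rng: Random, maxLen: Int): List[Instruction] = {
    val len = rng.nextInt(maxLen + 1)
    var depth = 0
    List.fill(len) {
      if (depth > 0 && rng.nextBoolean()) { depth -= 1; None }
      else { depth += 1; Some(rng.nextInt(10) + 1) }
    }
  }

  // Interpret the instructions against a mutable stack, then check that
  // the final size equals the number of pushes minus the number of pops.
  def sizeInvariantHolds(instrs: List[Instruction]): Boolean = {
    val stack = mutable.Stack.empty[Int]
    instrs.foreach {
      case Some(n) => stack.push(n)
      case None    => stack.pop()
    }
    val pushes = instrs.count(_.isDefined)
    val pops   = instrs.count(_.isEmpty)
    stack.size == pushes - pops
  }

  // Run the property on a number of randomly generated instruction lists.
  def check(testCases: Int, seed: Long): Boolean = {
    val rng = new Random(seed)
    (1 to testCases).forall(_ => sizeInvariantHolds(genInstructions(rng, 50)))
  }
}
```

Tracking the simulated depth inside the generator is one way to guarantee validity by construction; filtering out invalid lists after generation would also work but discards many candidates.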

Similar ideas have been used for testing the thread safety of concurrent programs (see Finding Race Conditions in Erlang with QuickCheck and PULSE (PDF)). The key insight here is that thread-safe code does not allow the nondeterminism of thread scheduling to be observable. That is, for any partial order of instructions run concurrently, we ought to be able to find some single-threaded linear sequence of these instructions with the same observable behavior (this criterion is often called linearizability). For instance, if our ArrayStack were thread-safe, we would expect that if two push operations were performed sequentially, followed by two pop operations and two push operations performed concurrently, this should yield the same result as some deterministic linear sequence of these push and pop operations. There are some subtleties and interesting questions about how to model this and how to report and minimize failing test cases. In particular, doing it the "obvious" way ends up being intractable, because it requires searching through a combinatorial number of interleavings to find one that matches the observation. The Erlang paper linked above has more details; see section 4 in particular. You may be interested in exploring how to incorporate these ideas into the library we developed, possibly building on the parallelism library we wrote in the previous chapter.
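To make the "obvious" brute-force check concrete, here is a small sketch for the two-pushes-then-concurrent example above. It is a deliberate simplification and not the approach of the Erlang paper: the stack is modeled as a pure List, the only observation compared is the final stack contents (individual pop results are ignored), and all names (Linearizability, Op, run, interleavings, linearizable) are hypothetical.

```scala
object Linearizability {
  sealed trait Op
  final case class Push(n: Int) extends Op
  case object Pop extends Op

  // Run operations sequentially on a pure stack model (head = top).
  // Assumption: popping an empty stack is treated as a no-op.
  def run(ops: List[Op]): List[Int] =
    ops.foldLeft(List.empty[Int]) {
      case (stack, Push(n)) => n :: stack
      case (_ :: rest, Pop) => rest
      case (Nil, Pop)       => Nil
    }

  // All interleavings of two threads' operations that preserve the
  // order of operations within each thread.
  def interleavings(a: List[Op], b: List[Op]): List[List[Op]] =
    (a, b) match {
      case (Nil, _) => List(b)
      case (_, Nil) => List(a)
      case (x :: xs, y :: ys) =>
        interleavings(xs, b).map(x :: _) ++ interleavings(a, ys).map(y :: _)
    }

  // The observed final stack is acceptable if some sequential
  // interleaving, run after the sequential prefix, produces it.
  def linearizable(prefix: List[Op], threadA: List[Op], threadB: List[Op],
                   observed: List[Int]): Boolean =
    interleavings(threadA, threadB).exists(i => run(prefix ++ i) == observed)
}
```

For two threads with two operations each there are only six interleavings, but the count grows combinatorially with the number of threads and operations, which is why this direct search does not scale.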

Lastly, we mention that one design goal of some libraries in this style is to avoid having to explicitly construct generators. The QuickCheck library makes use of a Haskell type class to provide instances of Gen "automatically", and this idea has also been borrowed by ScalaCheck. This can certainly be convenient, especially for simple examples, though we often find that explicit generators are necessary to capture all the interesting constraints on the shape or form of the inputs to a function.
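As a rough illustration of how a type class can supply generators "automatically", consider the sketch below. It is not the actual QuickCheck or ScalaCheck API: the Arbitrary trait, its sample method, the instances, and this forAll are hypothetical, and scala.util.Random stands in for the chapter's Gen.

```scala
import scala.util.Random

object ArbitrarySketch {
  // Hypothetical type class: evidence that we know how to produce a random A.
  trait Arbitrary[A] { def sample(rng: Random): A }

  object Arbitrary {
    // Integers in the range -100 to 100 (an arbitrary choice for this sketch).
    implicit val arbInt: Arbitrary[Int] = new Arbitrary[Int] {
      def sample(rng: Random): Int = rng.nextInt(201) - 100
    }

    // Lists of up to 9 elements, built from the element type's instance.
    implicit def arbList[A](implicit ev: Arbitrary[A]): Arbitrary[List[A]] =
      new Arbitrary[List[A]] {
        def sample(rng: Random): List[A] =
          List.fill(rng.nextInt(10))(ev.sample(rng))
      }
  }

  // The generator is found by implicit resolution from the type A alone.
  def forAll[A](testCases: Int, seed: Long)(p: A => Boolean)(
      implicit arb: Arbitrary[A]): Boolean = {
    val rng = new Random(seed)
    (1 to testCases).forall(_ => p(arb.sample(rng)))
  }
}
```

With these definitions, a call such as `ArbitrarySketch.forAll[List[Int]](100, 1L)(l => l.reverse.reverse == l)` needs no explicit generator. The trade-off is the one noted above: the instance fixes a single distribution per type, so properties with constraints on their inputs still call for a hand-written generator.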

FAQ