Generating Unit Tests from Specs --
The Next Next Generation.
The 509 lecture notes from Monday April 21st listed the following points as very common refrains about manual test case generation, as performed by humans:
Using formal method specifications has often been cited as one such "better way". One particular approach among many uses the JML specification language. The most recent tool to support semi-automated test generation for JML is JMLUnitNG. It is a prototype, with the following specific shortcomings:
Another inherent problem with JML and JMLUnitNG is that they only work for programs written in Java. For many environments, including software engineering courses taught at Cal Poly, this is a significant drawback.
An alternative approach to JML is to define a simple, language-independent notation for model-based specification. This simple notation, suitably syntactically sugared, can be embedded in comments above methods in a number of programming languages, Java included.
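As a rough illustration of what "embedded in comments" could look like, here is a minimal Python sketch that pulls spec clauses out of the comment lines above a method. The clause format ("pre: ...;" and "post: ...;"), the accepted comment prefixes, and the function name extract_spec are all assumptions made for this example, not a settled design.

```python
import re

# Hypothetical clause format: "pre: <expr>;" or "post: <expr>;" on a comment line.
# The optional prefix covers "//", "#", and the "*" of a block comment.
CLAUSE = re.compile(r"^\s*(?://|#|/?\*+)?\s*(pre|post)\s*:\s*(.*?)\s*;?\s*$")

def extract_spec(comment_lines):
    """Collect pre/post clauses from the comment lines above a method."""
    spec = {"pre": [], "post": []}
    for line in comment_lines:
        match = CLAUSE.match(line)
        if match:
            keyword, expr = match.groups()
            spec[keyword].append(expr)
    return spec

# The same spec embedded in a Java-style and a Python-style comment.
java_comment = [
    "/**",
    " * Increment the counter.",
    " * pre: x >= 0;",
    " * post: x' = x + 1;",
    " */",
]
python_comment = [
    "# Increment the counter.",
    "# pre: x >= 0;",
    "# post: x' = x + 1;",
]
```

Both comment styles yield the same clauses, which is the point of keeping the notation independent of the host language.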
Instead of using Java-specific tools to parse the specs and generate testing code, the following implementation approach is taken:
The sections that follow have further details of the mixed-language notation and the tools that support it.
Projects for CSC 509 will be working on a tool to support this new approach.
The practical and concrete purpose of these goals is to provide a tool that can be used by students in CSC 307 and 309 in the Cal Poly computer science department.
Look at JML and other spec languages and do these things:
A particular note about the second point is that we will eliminate specification language features that are necessary for formal verification but not (necessarily) needed for black-box test generation.
Provide support for the following languages:
To promote ease of use, the lexical analyzer and parser can have a modicum of intelligence to understand language differences. E.g., Java and C use "&&" for logical 'and' whereas Python and Ruby use the keyword "and". This kind of difference can be handled lexically in a number of ways, including a requirement (at least for version 1 of the tool) that the extension of a program file be an accurate indication of the programming language that the file contains.
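One way to handle this lexically is sketched below in Python: the file extension selects a per-language table that maps each spelling of a logical operator onto a single canonical token. The table contents, the canonical token names (AND, OR, NOT), and the function names are assumptions for illustration only.

```python
import os
import re

# Assumed version-1 rule: the file extension names the language.
LANGUAGE_BY_EXTENSION = {".java": "java", ".c": "c", ".py": "python", ".rb": "ruby"}

# Per-language spellings mapped onto one canonical token set (hypothetical names).
LOGIC_TOKENS = {
    "java": {"&&": "AND", "||": "OR", "!": "NOT"},
    "c": {"&&": "AND", "||": "OR", "!": "NOT"},
    "python": {"and": "AND", "or": "OR", "not": "NOT"},
    "ruby": {"and": "AND", "or": "OR", "not": "NOT"},
}

# A deliberately small tokenizer: operators, identifiers (optionally primed),
# integers, and any other single non-space character.
TOKEN = re.compile(r"&&|\|\||[<>=!]=|[A-Za-z_]\w*'?|\d+|\S")

def language_of(filename):
    """Look up the language by extension; raises KeyError if unsupported."""
    ext = os.path.splitext(filename)[1].lower()
    return LANGUAGE_BY_EXTENSION[ext]

def normalize(expr, filename):
    """Tokenize a spec expression and canonicalize its logical operators."""
    table = LOGIC_TOKENS[language_of(filename)]
    return [table.get(tok, tok) for tok in TOKEN.findall(expr)]
```

With this in place, "x > 0 && y > 0" in a .java file and "x > 0 and y > 0" in a .py file produce the same token list, so the parser proper never sees the difference.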
The short immediate answer to "Why" is to keep things simple. More specific answers below address why it's OK to eliminate some JML feature by discussing more specifically what that feature does and why it's not necessary to the goal of using specs for light-weight test generation.
Keywords | Pros | Cons |
pre, post | short and sweet, most typically used in literature and other discussions, language independent | possibly more difficult to recognize lexically, but requiring a colon token probably takes care of this |
precondition, postcondition | very clear, possibly easier to recognize lexically | verbose |
requires, ensures | used a good deal in extant implementations, consistent with JML | seemingly not the most widely used in the literature and therefore gratuitously lacking in mnemonic value, longer than "pre:" and "post:" |
@pre, @post | consistent with JML | annoying otherwise |
A widely used notation in the literature is the "prime" notation, represented as a postfix single quote operator. For an input/output variable x, the variable name by itself refers to the input value, while the variable name suffixed with an apostrophe refers to the output value. I.e., x by itself refers to the input value, x' refers to the output value.
Using this notation, a postcondition that specifies incrementing the input/output variable x looks like this:
post: x' = x + 1;
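To suggest how such a postcondition might serve as a test oracle, here is a minimal Python sketch. It snapshots the object state before the call (the unprimed values), runs the method, and then evaluates the postcondition against both states. The Counter class, the check_post helper, and the encoding of the postcondition as a function of old and new are assumptions for this example; the sketch also assumes that "=" in the spec denotes equality.

```python
import copy

class Counter:
    """Toy class under test (hypothetical)."""
    def __init__(self, x):
        self.x = x

    def increment(self):
        self.x += 1

def check_post(obj, method_name, post):
    """Run a no-argument method and evaluate its postcondition."""
    old = copy.deepcopy(vars(obj))   # unprimed names: values before the call
    getattr(obj, method_name)()
    new = vars(obj)                  # primed names: values after the call
    return post(old, new)

# post: x' = x + 1;
def increment_post(old, new):
    return new["x"] == old["x"] + 1
```

A generated test would then just assert check_post(Counter(4), "increment", increment_post) for each chosen input value.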
The keyword "return" is used in Java and other languages as a statement, making it syntactically unusable as a term in a Boolean expression. That's why in JML, for example, the keyword "\result" is used instead.
Since the notation we're devising here need not run through the compiler for any supported language, we're free to use return as we wish. Hence, it can in fact be used as-is in a postcondition. Its type is whatever type is returned by the method/function being specified.
The spec language notation being considered here needs some form of conditional expression, as distinguished from the conditional statements that are used in programming languages. In Java and C, this is done with the "... ? ... : ..." operator. This separate syntax distinguishes it clearly from the "if ... else ..." statement syntax.
Since again we're not going to use the native compiler for any target language, we do not need to worry about lexical or syntactic conflicts with the if and else keywords as used in statements. Furthermore, the particular syntax chosen by C and Java for conditional expressions is not universal, and pretty opaque to look at for those unfamiliar with it (or even for those who are familiar with it).
Taking these facts into consideration, we'll use the keywords "if", "then", and "else" for conditional expressions. It's clear that the keyword "then" has fallen out of favor with the advent of C-flavored programming language syntax. However, there is a long programming language history of "then" as a keyword, and it's not a complete anachronism given for example that Ruby allows it as an option. Overall, for clarity of purpose in our context, we believe that "if-then-else" is a viable programming-language-neutral syntax.
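As a small illustration, a postcondition for an absolute-value function might be written "post: return = if x >= 0 then x else -x;". The Python sketch below shows one way a test generator could render that clause as an executable oracle. The example spec, the function name abs_post, and the choice to bind the spec's return to a parameter named result are assumptions made here, since return is a reserved word in Python too.

```python
def abs_post(x, result):
    """Oracle for the hypothetical clause:
       post: return = if x >= 0 then x else -x;
    The spec's 'return' is bound to the parameter 'result'."""
    return result == (x if x >= 0 else -x)

def run_oracle(func, inputs, post):
    """Call func on each input and collect the inputs that violate post."""
    return [x for x in inputs if not post(x, func(x))]
```

Calling run_oracle(abs, [-3, 0, 7], abs_post) gives an empty list of violations, while a function that forgets to negate would be reported for its negative inputs.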
At the moment, it's an open question if we need or want to include other logic operators in the proposed notation. These include in particular logical implication and equivalence, denoted in JML as "==>" and "<==>" respectively. These are pretty decent to look at and may be reasonable to include. We'll make a decision soon.
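If these operators were included, executing them would be straightforward. The Python sketch below defines them as ordinary Boolean functions; the names implies and iff are assumptions for this example. Note that, written this way, both operands are evaluated before the call, so there is no short-circuiting.

```python
def implies(p, q):
    """p ==> q: false only when p holds and q does not."""
    return (not p) or q

def iff(p, q):
    """p <==> q: true when both sides have the same truth value."""
    return bool(p) == bool(q)

# Hypothetical clause: post: (x > 0) ==> (return > 0);
def positive_post(x, result):
    return implies(x > 0, result > 0)
```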
There's no question that the biggest notational addition, as well as conceptual hurdle, is quantification. The simple words "forall" and "exists" have been around since the 60s with Boyer-Moore logic. Here's a quick example that's like JML, but a bit more text-booky with the use of '|' for "such that".
forall (int i | i >= 0 && i < n) l.get(i) >= 0
This reads "for all integers i, such that i is at least 0 and less than n, the ith element of list l is greater than or equal to 0."
Syntactically, forall and exists are the obvious choices. The much more difficult implementation issue with quantifiers is executability. In the testing tool we're considering here, we must provide some form of quantifier execution, since we want to use postconditions directly as test oracles, and postconditions may often contain quantification.
For any specification language that provides quantifier execution, there must be some restriction on unbounded quantification. The vast majority of executable spec languages simply disallow it. An interesting approach to quantifier execution is presented in the Cal Poly MS thesis by Paul Corwin. The approach presented in this thesis is very much applicable to the form of test tool we're considering here. Further discussion is most definitely coming here ... .
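In the meantime, here is a minimal Python sketch of the simplest restriction: quantification only over an explicitly finite domain, with the "such that" part acting as a filter. This is plain bounded enumeration, not the approach from the thesis mentioned above, and the function names and signatures are assumptions for this example.

```python
def forall(domain, such_that, body):
    """True if body(i) holds for every i in the finite domain satisfying such_that(i)."""
    return all(body(i) for i in domain if such_that(i))

def exists(domain, such_that, body):
    """True if body(i) holds for at least one i in the finite domain satisfying such_that(i)."""
    return any(body(i) for i in domain if such_that(i))

# forall (int i | i >= 0 && i < n) l.get(i) >= 0
def all_non_negative(l):
    n = len(l)
    # The domain is deliberately wider than needed; the filter narrows it.
    return forall(range(-5, n + 5), lambda i: i >= 0 and i < n, lambda i: l[i] >= 0)
```

Because the body is evaluated only for values that pass the filter, the list is never indexed out of range even though the enumerated domain is wider than the list.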
6. Specific Things that Corrigan Can Do to Get Started
I think the main focus of your work in 509 should be on parsing a spec written in the language-independent notation and generating some basic tests for it. A couple of things that are most likely beyond the scope of a 509 project are
The ideas presented above were discussed in CSC 509 on Monday 28 April, using a highly condensed set of slides. During the discussion of the slide material, a number of the students who have experience with JML commented on the ideas presented in the slides. There was general agreement on the slide points; however, most felt that the payoff of automatic test generation would have to be substantial before they would consider the unmandated use of JML to be worth their while. Here "substantial" means at least the following:
A particular comment of note was from Austin Wylie. He commented that "white box" definition of specs, i.e., after the code was written, seemed in some cases to be redundant, in particular for methods with simple logic. For example, the postcondition for a simple set method is simply a different, and unfamiliar, notation for what's in the code body. This observation might lead one to consider including simple code generation "recommendations" for such simple method bodies.
It's not clear if this could work for anything but trivially simple methods. However, it's worth giving some thought to the idea of inducing programmers to use formal specs for testing purposes, and using that inducement as a back door to doing some simple spec-based code generation that may interest the programmers.
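To make the trivially simple case concrete, here is a Python sketch that recognizes a postcondition of the form "field' = name" and recommends a one-line Java body for it, declining anything more complicated. The pattern, the function name recommend_body, and the Java-style output are all assumptions for illustration; nothing here is a proposed feature of the tool.

```python
import re

# Matches only the simplest shape: a primed field set equal to a bare name.
SIMPLE_ASSIGN = re.compile(r"^\s*(\w+)'\s*=\s*(\w+)\s*;?\s*$")

def recommend_body(postcondition):
    """Suggest a Java-style setter body for a trivial postcondition, else None."""
    match = SIMPLE_ASSIGN.match(postcondition)
    if match is None:
        return None
    field, value = match.groups()
    return f"this.{field} = {value};"
```

For the clause "name' = n;" this yields "this.name = n;", which shows how close such a postcondition is to the code body it describes. For "x' = x + 1;" it yields nothing.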