8. Formal Specification Examples

The five object composition primitives in SpecL are just that -- primitive. In particular, using just these five primitives, there is no way to specify any of a number of important properties about objects and operations. For example, in the PersonDatabase object suppose it is necessary to specify that the database cannot have duplicate entries. Or, suppose it is necessary that the database behave as a queue-like structure, such that new records are added on a first-come-first-served (FIFO) basis. Neither of these or similar properties can be defined with the use of composition primitives alone. The use of formal specification is necessary in such cases.

One method to describe properties such as these is to use English. It is in fact useful to describe all properties of an entity in its English description. However, it is not sufficient to rely solely on natural language to specify critical properties. To do so soon leads to well-known problems of ambiguity and imprecision. Therefore, it is necessary to use the more formal language of mathematics to obtain a truly precise and unambiguous specification.

Another method to describe properties is to use a computer programming language. This has the advantage of being fully precise and unambiguous. However, using a programming language for specification is fundamentally contrary to the purpose of specification. Namely, a specification should be as free as possible of implementation detail. The purpose of a programming language is precisely for the expression of implementation detail.

To overcome the disadvantages of English and programming languages, researchers have developed a number of formal techniques expressly for use in specification. The two formal techniques supported primarily in SpecL are algebraic and predicative specification. With the algebraic technique, formal properties are specified as a set of equations associated with an object. Hence, algebraic specifications can be considered object-oriented. In the predicative technique, formal properties are specified as preconditions and postconditions on operations. Hence, predicative specification can be considered operation-oriented.

It is important to note that the two techniques provide different approaches to formal specification. A system can be fully specified using only the algebraic approach, it can be fully specified using only the predicative approach, or it can be specified using a combination of the two approaches.

A potential problem with either of these forms of specification is the introduction of implementation biases into a specification. As examples below illustrate, these formal specification techniques tread a fine line between specification and implementation. In fact, what many software engineers call a specification language others may call a very high-level programming language.

The gist of the problem is that the more precise we try to become with a specification, the more we tend to constrain what the implementation can look like. However, this violates the general principle that a specification is as free from implementation details as possible.

Another problem is that in some cases we find that in order to state a specification sufficiently precisely, we need to interject auxiliary functions, thereby violating the general rule that a specification should contain only entities directly visible to the end user.

In summary, specification is a continual battle between risking imprecision by saying too little versus risking the addition of implementation details by saying too much. The experienced specifier learns how to wage this battle successfully.

8.1. Equational Specification

An equational specification formally defines an object in terms of equations between the operations of the object. Equational specifications are desirable because they require no underlying data model. Rather, the definition of an object is stated entirely in terms of its abstract operations. This model-free property of equational specification is generally not satisfied by other forms of formal specification. For example, the predicative style of specification defined below in Section 8.4 is not model-free. Rather, predicative specification relies on a list-based data model to provide a basis for defining formal preconditions and postconditions. Without reliance on some underlying model, the predicative style of specification would not be complete.

The process of defining an equational specification is divided into three major steps:

  1. Define the operation signatures for the object's operations
  2. Categorize the operations into constructors, destructors selectors, and initializers
  3. Define the equations

To illustrate these steps, consider the definition of a database object that has a set-like structure. That is, it has the property that duplicate entries are not allowed. Here is an equational definition for such a database:

object SetDB is
    components: Elem*;     (* Note the minimal representation *)

    operations:
        Insert: (SetDB, Elem) -> (SetDB),  (* Constructor operation *)
        Delete: (SetDB, Elem) -> (SetDB),  (* Destructor operation *)
        Find: (SetDB, Elem) -> (boolean),  (* Selector operation *)
        EmptyDB: () -> (SetDB);            (* Initializer operation *)

    equations:
        var s: SetDB; e, e': Elem;

        Find(EmptyDB()) == false;
        Find(Insert(s, e), e') ==
            if e=e' then true else Find(s, e');
        Delete(EmptyDB(), e) == EmptyDB();
        Delete(Insert(s, e), e') ==
            if e=e'
            then Delete(s, e')
            else Insert(Delete(s, e'), e);

end SetDB;
The first two lines of this definition use the standard SpecL notation for defining an object. When defining an object equationally, its component structure should always be an expression of the form "C*", where C is a single object name. In the case of the SetDB, the components are Elem*, where Elem is some type of element defined elsewhere.

The operations section of the SetDB definition contains the full signatures for its operations. The general form of a signature in the operations sections is

operation name : (list of inputs) -> (list of outputs)
where the input and output lists contain zero or more object names.

Before defining the equations, an object's operations are organized into the following categories:

  1. Constructor and initializer operations that build an object out of smaller parts. These two categories can be grouped under the common heading of constructors.
  2. Selector and destructor operations that access or remove some component of an object. These two categories can be grouped under the common heading of selectors.
  3. Derived operations that can be defined formally in terms of one or more constructors or selectors.
The comments in the SetDB definition indicate the categories of each of the operations.

The important characteristic of constructor operations is that they are additive. That is, the resulting output of a constructor is a combination of the inputs. In contrast, selector operations are subtractive. That is, the resulting output is some smaller piece of the input. In general, an equationally defined object should have at least one each of the constructor, initializer, destructor, and selector operations.

Once operations have been fully defined and categorized, the equations themselves are defined. The general format of an equation is the following:

functional_expr == quantifier_free_expression;
where a functional_expr is defined above, and a quantifier_free_expression is an expression that contains no quantifiers or set operations. The "==" operator separates an equation into a left-hand side (LHS) and a right-hand side (RHS).

The following guidelines are useful for constructing a correct equational specification:

  1. The equations define what the selector operations do to the constructor operations, and not vice versa. That is, the outermost operation name in the LHS of an equation should always be a selector, not a constructor.
  2. There should be one equation that defines what each selector does to each constructor. Hence, if there are c constructors and s selectors, then there are c * s equations.
  3. Equations are defined in an inductive style. Specifically, equations are defined in groups, where the first of the group specifies what a selector does with an initializer. The remaining equations in the inductive group specify what a selector does with the other constructors.
Consider how these guidelines were used to construct the SetDB equations:
  1. The SetDB selectors are Find and Delete. Hence, each of the SetDB equations has one of these as the outermost operation in the LHS of the equation.
  2. Since there are 2 selectors and 2 constructors, there are a total of 4 equations.
  3. The inductive style of definition indicates that two groups of equations are appropriate. The first group of two equations specifies what Find does to EmptyDB and Insert. The second group of equations specifies what Delete does to EmptyDB and Insert.
The most difficult part of an equational definition is constructing the second part of the inductive definition. That is, defining the equations for what the selectors do with the non-initializing constructors. To examine this part of process, let us review what an inductive style definition is.

To begin, it is assumed that the reader is familiar with the concept of proof by mathematical induction. Readers entirely unfamiliar with this topic should consult an appropriate reference. Recall that an inductive proof is a two step process, in which a proposition p(x) is proved for all values of x. The steps are:

  1. Proof of the base case: set x=0 and prove p(x).
  2. Proof of the inductive step: assume p(x) is true for x=n and then prove that p(x) is true for x=n+1.
An inductive equational definition involves a similar two-step process. That is, we define one equation for a base case and one for an inductive step. In the base-case equation, there is an analog to setting the induction variable to 0 -- we define the equation for an object that has 0 components. That is, the base case equation defines what a selector does to an initializer, since an initializer constructs an object with zero components. In the SetDB example, the base case equations are:
Find(EmptyDB()) == false;
and
Delete(EmptyDB(), e) == EmptyDB();
These equations define what the selectors do with a SetDB of size 0, that is a SetDB constructed with the EmptyDB initializer. Both these equations are quite intuitive -- if we try to find something in a SetDB of size 0, we receive a false result back. If we try to delete something from a SetDB of size 0, we get back the same 0-sized SetDB we started with. As in a mathematical induction proof, the base case is generally easy to establish.

The two inductive step equations in the SetDB example are:

        Find(Insert(s, e), e') ==
            if e=e' then true else Find(s, e');
and
        Delete(Insert(s, e), e') ==
            if e=e'
            then Delete(s, e')
            else Insert(Delete(s, e'), e);
These equations specify what the selectors do with a SetDB of size n+1. The derivation of n+1 comes from an assumption about the set s that is the first argument to Insert, Find, and Delete. The assumption is that s contains n elements. Hence, the constructed object Insert(s, e) contains n+1 elements -- the n elements assumed to be in s plus the one new element e that is inserted.

Both of the inductive-step equations use a recursive style of definition. This is very common in inductive-step equations. The recursion always involves some form of conditional, typically an if-then-else. The test of the condition checks the n+1 case, then the recursion handles the other n cases. Consider again the case for Find:

        Find(Insert(s, e), e') ==
            if e=e' then true else Find(s, e');
To paraphrase this equation in English, the LHS says "consider what the Find selector does with a SetDB of size n+1". The RHS then says, "if the last item that was inserted is the item we're trying to find, then we've found it so return true; otherwise, the item we're looking for, if present, must be one of the n items in the SetDB s, so we recursively apply Find to look for it."

By far the most difficult equation in the SetDB example is the last one that defines the inductive step for Delete. The equation is not necessarily intuitive in terms of exactly how it defines set-like behavior. An important technique to determine if a set of equations does what we want is to formally test the equations. To gain an understanding of how the last equation works, we will apply this testing technique shortly. Before doing so, we examine some additional equational definitions.

8.2. Some Additional Equational Definitions

The SetDB example defined equations for the database property that duplicate entries are not allowed. Given below are additional equational definitions that specify other database structural properties. These are all properties that are considered appropriate to be known by an end user.

8.2.1. A Bag-Like Object

Suppose the desired definition of database is one that does allow duplicate entries, that is, it behaves like a mathematical bag instead of a set. To obtain the definition of a bag from that of a set, only one change is necessary -- the then clause of the last SetDB equation is changed from

then Delete(s, e')    (* for a set *)
to
then s               (* for a bag *)
The upcoming section on testing equations will discuss why this change does what is claimed.

8.2.2. A LIFO-Structured Database

Suppose we choose to define a LIFO DB, that is, a DB that behaves like a stack. A stack is much simpler than a set since no recursive searching is necessary for the selector operation. For a stack, the selector is the Top operation. Here is the equational definition:

object StackDB is
    components: Elem*;

    operations:
        Push: (StackDB, Elem) -> (StackDB), (* constructor operation *)
        Pop: (StackDB) -> (StackDB),        (* destructor operation *)
        Top: (StackDB) -> (Elem),           (* selector operation *)
        EmptyStackDB: () -> (StackDB);      (* initializer operation *)

    equations:
        var s: StackDB; e: Elem;

        Top(EmptyStackDB()) == EmptyElem();
        Top(Push(s, e)) == e;
        Pop(EmptyStackDB) == EmptyStackDB();
        Pop(Push(s, e)) == s;

end StackDB;

8.2.3. A FIFO-Structured Database

Next we consider the definition of a DB with a FIFO property that is, the definition of a queue-like DB. This definition is intermediate in terms of complexity between a SetDB and StackDB.

object QueueDB
    components: Elem*;

    operations:
        Enq: (QueueDB, Elem) -> (QueueDB),  (* constructor operation *)
        Deq: (QueueDB) -> (QueueDB),        (* destructor operation *)
        Front: (QueueDB) -> (Elem),         (* selector operation *)
        EmptyQueueDB: () -> (QueueDB);      (* initializer operation *)

    equations:
        var q: QueueDB; e:Elem;

        Front(EmptyQueueDB()) == EmtpyElem();
        Front(Enq(q, e)) ==
            if q = EmptyQueueDB()
            then EmptyQueueDB()
            else Front(q);
        Deq(EmptyQueueDB) == EmptyQueueDB();
        Deq(Enq(q, e)) ==
            if q = EmptyQueueDB
            then EmptyQueueDB
            else Enq(Deq(q), e);

end QueueDB;

8.2.4. A Keyed Database

Finally, we consider the definition of a keyed database, in which duplicates are not allowed. The difference between this and SetDB is a more realistic version of the Find operation. No real database would have Find return a boolean. Rather, Find should return a whole element, which would be located by some Key. The following specification defines this form of database. The equations have the same fundamental structure as SetDB, but with the addition of a Key argument where appropriate.

object DB is
    components: Elem*;

    operations:
        Insert: (DB, Key, Elem) -> DB;  (* constructor operation *)
        Delete: (DB, Key) -> (DB);      (* destructor operation *)
        Find: (DB, Key) -> (Elem);      (* selector operation *)
        EmptyDB: () -> (DB);            (* initializer operation *)

    equations:
        var d: DB; e: Elem; k,k': Key;
        Find(EmptyDB, k) == EmptyElem();
        Find(Insert(d, e, k), k') ==
            if k=k' then e else find(d, k');
        Delete(EmptyDB, k) == EmptyDB;
        Delete(Insert(d, e, k), k') ==
            if k=k'
            then Delete(d, k')
            else Insert(Delete(d, k'), k, e);
end DB;

8.3. Testing Equational Specifications

The first step in testing an equational specification is to understand precisely how equationally defined objects are represented. Since an equationally-defined object has no concrete representation, we have no real way to "get our hands on it." We cannot draw a picture of an object that has no concrete representation. The only means to represent such an object is by using its constructor operations.

Consider the representation of SetDB objects. To build a SetDB, the component type Elem must be defined. Assume for simplicity object Elem is defined as a number. Given this, a SetDB object will contain numeric values. As a concrete example, the SetDB containing elements 1, 2, and 3 is represented as a functional expression containing three applications of the Insert constructor:

Insert(Insert(Insert(EmtpyDB(), 1), 2), 3)
In general, any equationally-defined object can be represented as a sequence of constructor operations, applied at the base to an initializer operation.

To apply a selector operation to a constructed object, the constructed object simply appears as an argument in the appropriate selector argument position. For example, to delete the element 2 for the above set, the Delete operation is applied as follows:

Delete(Insert(Insert(Insert(EmtpyDB(), 1), 2), 3), 2)
Functional expressions such as this are commonly called "terms". A term is any number of equationally-defined operations, applied properly according to their signature definitions.

The general method to test a set of equations is to consider the equations as reduction rules and apply these rules to a constructed object. In general, a reduction rule is of the form


LHS -> RHS
where the "↔" symbol is read "reduces to". To treat an equation as a rewrite rule, the "==" is simply replaced by a "↔". We do not actually change the "==", but rather we view the equation as a rewrite rule, as if the "==" were "↔".

A reduction rule is said to be applied to some subject. In the case of an equational rewrite rule, the subject is a term. Rewrite rule application involves matching the LHS of the rule to some part of the term, and then replacing the matched term with the RHS of the rule. This replacement of the matched LHS with a RHS is the actual rewriting process.

The goal of equational term reduction is to remove all selector operations from a term, leaving only constructors. This goal makes sense when we consider the form of the equations. Recall that equations are written in terms of what a selector does to a constructor, not the other way around. The reason that we do not define equations for what constructors do is that they need not be reduced. In other words, a term containing only constructors is considered fully reduced. In this way, an actual object is represented fundamentally by a series of constructor operations. Whenever a selector is applied to a constructed object, the selector is "reduced out" to produce a term that again only contains constructors.

Consider the reduction of the Delete term just above:

Delete(Insert(Insert(Insert(EmtpyDB(), 1), 2), 3), 2)
The goal of the reduction is the following object
(Insert(Insert(EmtpyDB(), 1), 3)
which represents a SetDB containing a 1 and 3 -- the 2 has been deleted as expected. This goal was reached by a series of rule applications which matched an equation LHS to some part of the term, and then replaced the matched LHS with the RHS of the equation. Let us trace through the reduction steps for the above term. We first start with the subject term:
Delete(Insert(Insert(Insert(EmtpyDB(), 1), 2), 3), 2)
To find a match, we consult the SetDB equations, looking for a LHS match. The process we apply is pattern matching. That is, we consider the LHS of each equation to be a pattern that we attempt to match to some part of the term. The matching technique attempts to match each operation name in a LHS with a the same name in the term. The variables in the LHS are matched to components in the term of the appropriate type.

Consider how the match is attempted on the above term with each of the four SetDB equations. The LHSs of the first two equations contain a Find operation. These two equations can be eliminated from consideration immediately, since there is no occurrence of Find anywhere in the subject term (i.e., there are only Delete, Insert, and EmptyDB). The LHS of the last two SetDB equations are possible candidates for a match, since they start with Delete. The third equation cannot match however, since there is no pattern in the subject term that will match.

Delete(EmptyDB(), ...)
This is because the only occurrence of Delete in the subject term is applied to Insert, not EmptyDB, and the operation names "Insert" and "EmptyDB" do not match.

We are left with the fourth SetDB equation. This equation does in fact match, as follows:
LHS Subterm Matched Subject Subterm
Delete Delete
Insert outermost application of Insert
s Insert(Insert(EmptyDB(), 1), 2)
e 3
e' 2
where a subterm is some part of a term. Given this matching, the rule can now be applied, i.e., reduced. The reduction involves systematic substitution of the matched LHS with the appropriate subterms of the RHS of equation 4. The RHS of equation 4 is the following:

if e=e'
then Delete(s, e')
else Insert(Delete(s, e'), e);
When the RHS of an equation contains an if-then-else, the replacement will be selected from either the then clause or the else clause. This requires the the evaluation of the if predicate. In this case, the if predicate is
e = e'
Based on the above subterm matches, this predicate evaluates to
3 = 2
which is false. Hence, the RHS replacement is the term given in the else clause of the if-then-else, which is
Insert(Delete(s, e'), e);
That is, we must substitute this term for the matched LHS. Doing this, the first step of the overall reduction process yields the following results:
Delete(Insert(Insert(Insert(EmptyDB, 1), 2), 3), 2)  =>
    Insert(Delete(Insert(Insert(EmptyDB, 1), 2), 2), 3)
Notice what has happened here: the Delete has been moved inward in the term passed the Insert. Also, the 2 has been moved inward passed the 3. This sort of manipulation is the gist of what term writing accomplishes. Namely, the term is rewritten by (re)moving pieces according to the equations.

Since the term above still contains a selector operation, it is not yet fully reduced. We therefore apply the same matching process as was used for the first reduction step. As in that case, the first two equations are eliminated immediately as possible matches. The LHS of the third equation also fails to match as before. We again match the fourth equation, this time as follows:
LHS Subterm Matched Subject Subterm
Delete Delete
Insert 2nd to the outermost application of Insert
s Insert(Insert(EmptyDB(), 1), 2)
e 2
e' 2
The difference in this case is that the evaluation of the if predicate in the RHS of equation 4 is now

2 = 2
which evaluates to true this time. Hence, now instead of the else clause, we substitute the then clause, which is:
then Delete(s, e')
Doing this, the second step of the reduction process yields the following results:
Insert(Delete(Insert(Insert(EmptyDB, 1), 2), 1), 2) =>
    Insert(Delete(Insert(EmptyDB(), 1), 2), 3)
The resulting term is not yet fully reduced, since it still contains the Delete selector. Two additional reduction steps will occur to arrive at complete reduction. The next step will match equation 4 again, as in the preceding two steps. On this match, the if predicate, e=e', will test
1 = 2
which is false, thereby causing substitution of the else clause. Doing this, the third step of the reduction yields:
    Insert(Delete(Insert(EmptyDB(), 1), 2), 3) =>
        Insert(Insert(Delete(EmptyDB(), 1), 2), 3)
The final step of the reduction uses SetDB equation 3. This is because the pattern matching finds "... Delete(EmptyDB() ..." in the term, which matches the LHS of equation 3. Given this, the final step of the reduction is
    Insert(Insert(Delete(EmptyDB(), 1), 2), 3) =>
        Insert(Insert(EmptyDB(), 1), 3)
In addition to testing that SetDB equations are correct, the above reduction sequence reveals precisely how the SetDB equations work. It should now be clear why equation 4 of SetDB is written as it is. Namely, it is up to the Delete equation to make sure that the set property is maintained. Any number of Insert operations can be applied to a SetDB term, including Insert's of the same element. While this temporarily violates the set property, the Delete equation makes sure that set behavior is maintained by deleting all of the same elements that may have been inserted. While this may seem to be an inefficient manner in which to maintain the set property, we do not care about efficiency at all, only that the desired property is properly maintained.

8.4. Predicative Specification

Predicative specification is a complementary form of specification to equational. In the predicative approach, preconditions and postconditions are associated with an operation. The precondition specifies a predicate that must be true before the operation begins. The postcondition specifies a predicate that must be true after the operation completes.

Consider an alternate definition of a set-like database:

object SetDB
    components:  Elem*;
    operations: Insert, Delete, Find, EmptyDB;

    (* Note that we do not need full operation signatures, nor
     * equations, since we are now specifying with pre and post-
     * conditions in place of the equations we used above.
     *)

end SetDB;

obj Elem;

operation Insert
    inputs: d:SetDB, e:Elem;
    outputs: d':SetDB;
    precondition: Find(d, e) = false;
    postcondition: d' = d + e;
end Insert;

operation Find
    inputs: d:SetDB, e:Elem;
    outputs: b:boolean;
    precondition: (* An empty precond means true *) ;
    postcondition: b = e in d;
end Find;

operation Delete
    inputs: d:SetDB, e:Elem;
    outputs: d':SetDB;
    precondition: e in d;
    postcondition: d' = d - e;
end Delete;

operation EmptyDB
    inputs: ;
    outputs: d':SetDB;
    precondition: ;
    postcondition: forall (e:Elem) not Find(d', e);
         (* See discussion below about removing this precond. *)
end EmptyDB;
It is instructive to compare and contrast this predicative definition of SetDB with the equational definition given in the preceding section. Both definitions specify the same meaning in different forms. The following are are some important points about specification with pre and post conditions in particular.

By definition, failure of a precondition or postcondition is an error. Abstractly this means simply that the operation fails and produces a value of nil. Concretely (e.g., in a system user's manual), this means that the end-user should see some form of appropriate error message.

A specification can be weakened or strengthened by the selective removal or addition of pre and/or post conditions. E.g.,

op Delete is
    in: d:DB, e:Elem;
    out: d':DB;
    precond: Find(d, e) = true;
    postcond: Find(d', e) = false;
end Delete;
is relatively stronger than
op Delete is
    in: d:DB, e:Elem;
    out: d':DB;
    postcond: Find(d', e) = false;
end Delete;
in that the former specification says that it is an error to try to delete an element that is not already in the database, whereas the latter is non- committal.

8.4.1. Quantification

In equational specification, recursion is used to specify a single equation that can be applied in an indefinite (potentially infinite) number of cases. That is, a recursive equation can be applicable to objects from size 1 to an infinite size. The predicative style of specification uses quantification to define predicates that apply to objects of an indefinite size.

Consider the following basic example of universal quantification:

operation FindAllYoungFolks is
    inputs: pdb: PersonDatabase;
    outputs: nl: NameList;
    postcondition:
        forall (p: PersonRecord)
            if (p in pdb) and (p.a < 40)
            then p.n in nl;
    description: (*
        FindAllYoungFolks produces a list of the names of all
        persons in the PersonDatabase whose age is less than 40
    *);
end FindAllYoungFolks;

object PersonDatabase is
    components: PersonRecord*;
    description: (* Same as earlier definitions. *)
end PersonDatabase;

object PersonRecord is
    components: n:Name and a:Age and ad:Address;
    description: (* Same as earlier definitions. *)
end PersonRecord;

object NameList is
    components: Name*;
end NameList;
The postcondition of the FindAllYoungFolks operation uses forall to quantify over the input PersonDatabase. An English paraphrase of the postcondition is as follows: " For each PersonRecord, p, in the input, if the age of person p is less than 40, then the name of person p is in the output name list". This form of quantification is very typical of postconditions on operations that produce list-structured objects as outputs.

The following objects and operations further exemplify the use of universal and existential quantification in SpecL. Consider an operation that merges set- like databases, of the type specified earlier:

operation MergeDBs
    inputs: d1:SetDB, d2:SetDB;
    outputs: d3:SetDB;
    postcondition:
        forall (e: Elem)
            if (e in d1) or (e in d2)
            then e in d3;
    description: (*
        The MergeDBs operation merges two databases of the same type
        of element.  The postcondition states that the result of the
        merge is that any element that is in either input d1 or d2
        must be in the output d3.
    *)
end MergeDBs;
There is no needed precondition. The postcondition states that if an element is in either of the input DBs, then it is in the output DB. An interesting question is the strength of this postcondition. In particular, does it guarantee that there are no duplicates in the output d3? Since the "in" operator is not constructive, based on this postcondition alone there is no way to state for certain whether d3 has duplicates or not. That is, this postcondition is too weak to guarantee no duplicates. Thus, we must look elsewhere.

The "elsewhere" we look is in the specification of the other DB operations. Specifically, the specifications for SetDB Find and Delete are:

operation Find is
    inputs: d:SetDB, e:Elem;
    outputs: b:boolean;
    precondition: (* An empty precond means true *) ;
    postcondition: b = e in d;
end Find;

operation Delete is
    inputs: d:SetDB, e:Elem;
    outputs: d':SetDB;
    precondition: e in d;
         (* See discussion below about removing this precond. *)
    postcondition: not Find(d,e);
end Delete;
What Find ensures is that if at least one copy of an element is in a SetDB, then it can be found. What Delete ensures, is if there are one or more copies of an element in a SetDB, then after the Delete none can be found. Therefore, whether duplicates are ever physically present in a SetDB is immaterial. The specifications of Find and Delete uphold the critical set property that if an element has been added at least once, then it can be found, and if it is deleted, it cannot be found. Hence, even if MergeDBs does add duplicates, Find and Delete will maintain the set property when subsequently invoked.

Consider another form of database merge:

object PairedDB is
    components: ElemPair*;
end PairedDB;

object ElemPair is
    components: e1:Elem and e2:Elem;
    description: (* An ElemPair is just a pair of elements. *)
end ElemPair;

operation PairwiseMergeDBs
    inputs: d1:DB, d2:DB;
    outputs: dp: PairedDB
    precond: #d
    postcondition:
        forall (e1,e2: Elem)
            if (e1 in d1) and (e2 in d2)
            then
                exists (ep: ElemPair) (ep in dp) and
                           (ep.e1 = e1) and (ep.e2 = e2)
    description: (*
        PairwiseMergeDBs merges two databases of the same type of
        element into a paired database.  The precondition states
        that the two input DBs must have the same number of
        elements.  The postcondition states that the resulting
        output, dp, must consist of pairs of all the elements that
        are the two input databases.
   *)
end PairwiseMergeDBs
It is important to note that universal quantification does not deliver elements in any specific order. Given this, the postcondition for the PairwiseMergeDBs example may be weaker than it appears at first reading. Namely, it specifies that dp contains all possible pairs of elements from d1 and d2, where the paired elements were selected from the databases in no guaranteed order. Suppose, for example, that d1 = [r1, r2, r3], and d2 = [r4, r5, r6]. Then the postcondition specifies that the output dp =[ {r1,r4}, {r1,r5}, {r1,r6}, {r2,r4}, {r2,r5}, {r2,r6}, {r3,r4}, {r3,r5}, {r3,r6} ], where the order of the pairs in dp is not specified.

Suppose that only ordered pairs were desired, such that each pair contains the ith element from each of the inputs. This could be specified in a number of ways, including as follows:

operation PairwiseMergeDBs
    ...
    postcondition:
        forall (e1,e2: Elem)
            if (e1 in d1) and (e2 in d2)
            then
                exists (ep: ElemPair) ((ep in dp) and
                PositionOf(ep.e1, d1) = PositionOf(ep.e2, d2))
    ...
where PositionOf is the following auxiliary function that outputs the ordinal position of an Elem within a DB:
function PositionOf(e:Elem, d:DB):(n:number) =
    if d = nil
    then 0
    else if e = d[1]
         then 1
         else PositionOf(e, d[2:#d]) + 1;

8.4.2. Combining Predicative and Equational Specification

The following example shows how a sorted, keyed database can be defined using a combination of equational and predicative specification:

object Elem
    components: k: Key, ElemValue;
    description: (*
        A DB element with an internal key.
    *);
end Elem;

object Key = string;
object ElemValue = ... ;

object SKDB
    components: Elem*;
    operations:
        Insert (SKDB, Elem) -> (SKDB),
        Delete (SKDB, Key) -> (SKDB),
        Find (SKDB, Key) -> (Elem),
        EmptySKDB () -> (SKDB),
        FindNth (SKDB, number) -> (Elem),
        Size (SKDB) -> number,
        SortDB (SKDB, Key) -> (SKDB);
        EmptyDB () -> (SKBD);
    equations:
        var d: SKDB; var e:Elem; var i:number;
        Find(EmptyDB(), k) == EmptyElem;
        Find(Insert(d, e), k') ==
            if e.k=k' then e else find(d, k');
        Delete(EmptyDB(), k) == EmptyDB;
        Delete(Insert(d, k), k') ==
            if k=k'
            then Delete(d, k')
            else Insert(Delete(d, k'), k, e);
        FindNth(EmptyDB(), i) == EmptyElem;
        FindNth(Insert(d,e), i) ==
            if i < 1 then EmptyElem
            else if i = 1 then e
            else FindNth(d,i-1);
        Size(EmptyDB()) == 0;
        Size(Insert(d,e)) == Size(d) + 1;
end SKDB;

operation SortDB
    inputs:    d:SKDB, k:Key;
    outputs:   d':SKDB;
    precondition: ;
    postcondition: if Size(d') > 1 then
                       forall (i:integer | (i >= 1) and (i < Size(d')))
                           FindNth(d',i).k < FindNth(d',i+1).k;
    description: (* Sort a database *);

end SortDB;

operation FindNth
    inputs: d:SKDB, n:number;
    outputs: e:Elem;
    description: (* Find the DB element at position n. *);
end FindNth;

operation Size
    inputs: d:SKDB;
    outputs: i:integer;
    description: (* Compute the size, in elements, of the given DB. *);
end Size;
Of particular note in this example is the use of the universal quantifier in the SortDB postcondition. An English paraphrase of this postcondition is as follows: "If the output database has more than one element, then for each position i in the database, such that i is between the first and the second to the last positions, the key of the ith element is less than the key of i+1st element, where the key at position n is delivered by the FindNth operation."

8.4.3. More on Auxiliary Functions

A negative critique of keyed database specification is that FindNth might best be left invisible to end-users. That is, there may be no reason that users need to find the nth element in a DB. Rather, users simply need the DB sorted, and to be able to find elements by key. Even if this is the case, we still need FindNth in order to fully specify the definition of SortDB.

It was noted earlier that specification is a battle between saying too little and saying too much. In the case of FindNth, the battle has been lost if FindNth is not a necessary user-level operation, since FindNth is still a necessary specification-level function. In the example above, FindNth was specified equationally as a visible operation. The alternate definition as an auxiliary function is the following:

function FindNth(d:SKDB, n:number):(e:Elem) = d[n];

In some cases, it may turn out that what at first appears to be an auxiliary function is in fact a legitimate user-level operation. When this happens, the process of formalizing a specification has helped uncover an incompleteness originally present before the formal definition was considered. In other cases, auxiliary functions should best be left invisible to the user, in which case they remain genuinely auxiliary. In all cases, specifiers must consider carefully when auxiliary functions are necessary.

A noteworthy property of auxiliary functions is that their definitions may be constructive. That is, the body of the function constructs an actual value that the function outputs. Such constructive definitions are in contrast to operations that are defined solely in terms of pre- and post-conditions. A postcondition does not construct a value, but rather states a property that an assumed constructed value must meet.

The astute reader will notice that with the introduction of constructive functions, SpecL has the expressive power of a functional programming language, such as LISP or ML. Hence, using constructive functions, it is possible for specifications to become very much like programs. This is clearly not the intent. Specifiers should always be mindful that the fundamental goal of specification is to define what a system does, not how it works. Constructive auxiliary functions should be used judiciously, so that SpecL specifications remain free of unnecessary program-level detail.

8.5. Inheritance of Formal Specifications

As noted earlier in the manual, part of the rule for inheritance involves inheritance of operations, which in turns involves inheritance of equations and pre/post conditions. The inheritance of operations and equations is straightforward. Viz., all operations and equations of a parent class are inherited by its instances, with systematic name substitution of instance name for parent name.

The inheritance of pre and post conditions warrants further discussion to clarify it fully. Consider the following example.

object GenericDB
    components: GenericRecord*;
    operations:
        AddRecord(GenericDB,GenericRecord)->GenericDB;
        DelRecord(GenericDB,GenericKey)->GenericDB;
        UpdateRecord(GenericDB,GenericRecord)->GenericDB;
        FindRecord(GenericDB,GenericKey)->GenericRecord;
    description: (*
        A GenericDB is the parent for more specific databases, which will
        contain more specific forms of records.
    *);
end GenericDB;

object GenericRecord
    components: key:GenericKey;
    description: (*
        A generic record contains only a key field, which will be common to all
        instances.  The key is used to access the record in a GenericDB.
    *);
end GenericRecord;

object GenericKey
    description: (*
        A generic key is componentless, to be specialized further by its
        instances.
    *);
end GenericKey;

object PersonDB extends GenericDB
    where:
        GenericRecord = PersonRecord;
        GenericKey = PersonId;
    description: (*
        A PersonDB is a specialization of GenericDB.  Note use of the where
        attribute, which means that UserRecord replaces all occurrences of
        GenericRecord in the defintion of GenericDB, and PersonId replaces all
        occurences of GenericKey.
    *);
end UserDB;

object PersonRecord extends GenericRecord
    components: name:Name and address:Address and age:Age and Sex;
    description: (*
        A PersonRecord specializes a GenericRecord by adding appropriate
        components for Name, etc.
    *);
end UserRecord;

object PersonId extends GenericKey = number
    description: (*
        A PersonId specializes GenericKey to be a number.  In so doing, the
        access key for a PersonRecord can be some unique numeric id, such as
        social security number, for example.
    *);
end PersonId;
op AddRecord(gdb:GenericDB, gr:GenericRecord)->(gdb':GenericDB)
    pre: FindRecord(gdb, gr.key) = nil;
    post: gdb' = gdb + gr;
end;

op DelRecord(gdb:GenericDB, key:GenericKey)->(gdb':GenericDB)
    pre: FindRecord(gdb, key) != nil;
    post: gdb' = gdb - FindRecord(gdb, key);
end;

op UpdateRecord(gdb:GenericDB, gr:GenericRecord)->(gdb':GenericDB)
    pre: FindRecord(gdb, gr.key) != nil;
    post: gdb' = gdb - FindRecord(gdb, gr.key) + gr;
end;

op FindRecord(gdb:GenericDB, key:GenericKey)->(gr':GenericRecord)
    pre: ;
    post: (gr' in gdb) and (gr'.key = key);
end;
Here, pre and post conditions have been specified for the operations defined on GenericDB. When the UserDB object is instantiated, its effective definition includes the following derived operations, which are generated by the normal name substitution of the inheritance mechanism.
op AddRecord(gdb:PersonDB, gr:PersonRecord)->(gdb':PersonDB)
    pre: FindRecord(gdb, gr.key) = nil;
    post: gdb' = gdb + gr;
end;

op DelRecord(gdb:PersonDB, key:PersonId)->(gdb':PersonDB)
    pre: FindRecord(gdb, key) != nil;
    post: gdb' = gdb - FindRecord(gdb, key);
end;

op UpdateRecord(gdb:PersonDB, gr:PersonRecord)->(gdb':PersonDB)
    pre: FindRecord(gdb, gr.key) != nil;
    post: gdb' = gdb - FindRecord(gr.key) + gr;
end;

op FindRecord(gdb:PersonDB, key:PersonId)->(gr':PersonRecord)
    pre: ;
    post: (gr' in gdb) and (gr'.key = key);
end;
As usual, these operations are effective in that they do not appear explicitly in the specification, but are automatically generated by the inheritance mechanism. While an explicit definition of these operations is not required, it may be necessary for further formal specification. I.e., each of these operations may need precondition strengthening to define input constraints formally. For example, a strengthened precondition for AddRecord could be defined as follows:
op AddRecord(gdb:PersonDB, gr:PersonRecord)->(gdb':PersonDB)
    pre: #gr.name <= 30 and #gr.address <= 50 and
             gr.age >= 0 and gr.age <= 200;
end;
This precondition constrains the name field to be <= 30 characters, the address to be <= 50 characters, and the age to be between 0 and 200. As noted in the inheritance rule of Section 4.1, this precondition is anded to the precondition in the parent operation. Hence, the effective definition of AddRecord becomes the following:
op AddRecord(gdb:PersonDB, gr:PersonRecord)->(gdb':PersonDB)
    pre: (FindRecord(gdb, gr.key) = nil) and
             (#(gr.name) <= 30) and (#(gr.address) <= 50) and
                 (gr.age >= 0) and (gr.age <= 200);
    post: gdb' = gdb + gr;
end;





Prev: formal-specs | Next: refs | Up: index | Top: index