SpecL/FMSL/RSL Implementation Ideas
To be formal, the type of the empty-list value "[]" is ?T*. Similarly, the type of the valueless function value "''" is op(?T)->(?T). The latter definition may be unnecessary, overly complicated, and stupid. If it really is, then we'll just say that '' is a synonym for nil for any opaque type or valueless function. If we need to make this work formally, or even reflexively in SpecL, then it would appear that we need some way to define an "opaque" type in the SpecL type algebra.
At this point, I'd say that the easy way out is to represent '' formally as a special-case overloading of a bit of syntactic sugar, that's easy enough to define in the formal semantics, but does not need to be definable directly in SpecL, e.g., as something like this would-be definition:
obj '' = (op(?T)->(?T)) or ?T|is_opaque(?T);
They are officially "valueless" operations, that always return nil, and have no type. This raises the question of whether the keyword "nil" should be usable as a type ident. The bottom line is that I say NO. Rationale follows.
While NilType is in fact an internal interpreter type, it makes things too loose to allow it to be used in a source specification. Here's why. Having any data ident be declared NilType is useless, because there's no way to meaningfully allocate storage for it.
I was initially tempted to say that an op with no explicit return type was an "opaque" type, but that seems not to be sensible. What we can say is that the sym lit '' is the only existing value for a valueless function. We can have '' be a synonym for nil, but for operation types, in the same way that [] is a synonym for nil for list types. There seems to be some degenerate consistency here, though it may be more complicated than it's worth. See the preceding item in these notes for further discussion.
As if we need yet another unsettled "idea", here's one. It may be worthwhile to consider an exception to the would-be rule that set can only be used on declared vars, that exception being the use of set on output parms. So, motherfucker, consider it.
At this point, I think it's largely futile to worry about having any substantial agreement between SpecL and UML terminology. Where it doesn't significantly compromise SpecL syntax and/or semantics, SpecL/UML agreement is fine. Otherwise, F it.
A particular case in point is the long-standing anxiety over the UML use of the term "attribute" compared to the SpecL use. Having read Wikipedia, of all places (the article on the Entity-attribute-value model), I've become fully comfortable with the SpecL sense of the term being worth keeping, and hence we will NOT rename SpecL "attributes" to "properties".
We'll have an appropriate section in the ref man that discusses the issue thoroughly, perhaps even with a small dose of whimsy. The idea is this: "Look at the UML, understand it in terms of the underlying SpecL, and don't worry about the differences in UML versus SpecL terminology. A UML diagram is-a UML diagram. By any other name, it would smell just as bad."
Keep grammar rules, but don't make them synonymous with obj defs. Keep sym lits, as the way to represent something that has a mnemonic name, but is not modelwise considered to be a string. (We could forget the second of these if we redo the latest term-factor example with double-quoted strings for terminals instead of sym lits, but having looked at it yet again, sym lits are better, if not significantly so, as terminal symbols in a grammar, as well as other places that we've cited often enough, including
A FIRM DECISION: We should in fact have separate syntax for a new type of entity called a "rule". It can look like a conventional grammar rule, so that the latest and greatest version of the term-factor attribute still works. I thought for a bit about having a new 'rule' keyword, but that would make grammar rules look stupid. Another good reason to separate grammar rules from obj defs is that they're really not the same entities, despite the areas of similarity. In particular, embedded actions really don't make sense for obj defs, and inheritance doesn't seem to make much if any sense for grammar rules.
There is the kitchen-sink-esque nature of having grammar rules as a separate language feature, but I think it can be ((reasonably) well) rationalized on the grounds that SpecL is, after all, billed as a "multi-paradigm" specification language. To wit, the semantic paradigms that are supported are:
"... There is a simple rule to remember about specifications that use variables -- if you want your specification to be purely functional, don't have any variable declartions."
"So, what is the differece between a let expression with and wihtout an explicit type declaration? It's subtle, having to do with inheritance. Also worth noting is the issue of testing value universes. (These both need to be explained thoroughly.)"
For the purposes of overload resolution, a type var anywhere in a signature subsumes overloading at the type var position. E.g., the following results in a function redef error:
    op O1(x:?X, y:integer, z:string);
    op O1(x:integer, y:integer, z:string);

as does

    op O1(x:?X, y:integer, z:string);
    op O1(x:?Y, y:integer, z:string);
For even more simplicity, it may well be worthwhile to say that a type var anywhere in a signature subsumes overloading at all. E.g., the following also results in a redef error, or more clearly, an error described as a redef of a polymorphic function:
    op O1(x:?X, y:integer, z:string);
    op O1(...);   // For *any* zero or more parameters in "..."
And for even further simplicity, we will nix coarity overloads. (Motherfucker -- this decision was made firmly 8.5 years ago, in the 21sep00 entry! That is one shocking and scary observation. Time to get things done, once and for all.)
Given how ill-used the keywordless, space-delimited obj definitions are likely to be, and how not-really-that-messy a file like term-factor-does-work.rsl looks, I think it's time to simplify the fuck out of things by getting rid of the syntax that allows objects to be declared without an "obj" keyword, and with space-delimited component exprs.
Once this is done, parens will be a better means to declare parameterized types than angle brackets, since the latter most likely causes a syntactic clash, if not visual confusion, with '>' as the inheritance shortcut.
And another (?final for now?) syntactic observation is that we may well want to allow all type expression ops in an op signature, to allow the ready definition of functions like the ML for loops. See classes-archives/530/examples/ml/for.ml and the rudimentary attempt to duplicate this in newer-inputs/for.fmsl.
OK, how about we screw ML, and have genuine single-assignment lets, meaning in particular that you can't reassign a let var the way you can an ML value. This will make things nicer pretty much all around, since more than one appearance of a let var on the LHS of a let is a redeclaration error. Lets can still, of course, appear on the RHS of another let, as long as they're declared lexically before their RHS use. This makes lets like let* in Lisp, in that order is in fact relevant.
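For concreteness, here's a minimal sketch of the proposed single-assignment rule (the particular bindings are made up):

    let x = 10;       -- OK: first and only binding of x
    let x = x + 1;    -- ERROR: x reappears on the LHS of a let (redeclaration)
    let y = x * 2;    -- OK: x is declared lexically before this RHS use, a la Lisp's let*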
Provide different levels of axiom validation:
OK, here are conclusions related to the 3feb09 discussion of this topic:
A let var is declared, and subsequently appears on the LHS of a set; the typecheck error message in this case is "cannot use set on a local variable declared with let". Note that the converse of this will not cause a clash, since a global variable name appearing on the LHS of a let will redeclare that var name locally, and hence hide the global.
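A hedged sketch of both cases, with made-up names:

    let x = 10;
    set x = 20;               -- ERROR: cannot use set on a local variable declared with let
    var g: integer;
    op f() = (let g = 5; g);  -- OK: the let redeclares g locally, hiding the global g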
Doing work on the interp has led to reconsideration of these semantics, and their implementation. Here are some initial notes, with more to follow, perhaps on a different date.
For the implementation of value execution, it can be done as the last pass in the type checker, after all other checking has been done. Alternatively, it could be done as the first pass in the interpreter. Whether values are constants, or "lettable", is discussed further below. I'm strongly inclined to keep them constants.
To avoid confusion between 'let' and ':=', we could say that lets are not legal at the top level of the interp, which can be done however it needs to happen. Probably only a single expr should follow the '>' "prompt", instead of an expr list. This should take care of lets not being allowed at the top level. Explicit paren-enclosed expr lists will still be allowed at the top level, but the lets, if any, in these contexts will be local only to those anonymous scopes, and therefore not survive out to the top level. What this means, also, is that only global var decls and assignment can be used at the top level to store intermediate sandbox values. This may well be fine.
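A sketch of what a top-level session might then look like, assuming the single-expr prompt:

    > let x = 10;         -- rejected: lets are not legal at the top level
    > (let x = 10; x);    -- OK, but x is local to the parenthesized scope
    > x;                  -- error: x is unknown at the top level
    var z: integer;
    > z := 10;            -- global decls and assignment keep sandbox values around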
I've considered making the semantics of FMSL values like those of ML; however, the problem with this is that it allows values to be redeclared, which is not really what we want in FMSL. For example, if FMSL vals worked like ML, we could do the following:

    val x = 10;
    op f() = x;
    > x;
    > f();
    val x = x + 1;
    > x;
    > f();

which outputs

    10
    10
    11
    10

But this seems (has always seemed) rather disingenuous, since x is really acting more like a variable than a value in this case. The non-variable behavior here is that f retains the historical value of x, rather than values of x bound subsequent to f's declaration. Here's what things could look like in FMSL to get the same semantics:

    let x = 10;
    op f() = x;
    > x;
    > f();
    let x = x + 1;
    > x;
    > f();

The deal here is that let is a functional single-assignment thing that temporally redeclares a variable, such that any previous uses refer to a different value than a subsequent redeclaration. This seems a bit better than ML, though it does still have the "temporal surprise" issue.
Let's try to rationalize things here, in terms of values, variables, and lets. First, a value declaration is a constant through the lifetime of a specification. It is always an error to re-declare a value identifier. Also, a value declaration is a top-level entity declaration, NOT an expression. This means a value declaration cannot appear in an expression sequence, in particular in a function body. The value identifier itself is an expression value, and can appear in any r-value expression context.
A variable is like a value, declarationwise. I.e., a variable declaration is top-level only, and cannot be re-declared. It models a piece of data that is shared by all operations and functions. A variable identifier is an expression value, and can appear in any r-value or l-value context.
A let is both a declaration and an expression. It (re)declares an identifier in the current scope, and may appear in any expression context. Specifically, a let of the form

    let name = expr

declares name in the current scope, if it is not already declared. It then binds the value of the expr to name. A let name can appear in any subsequent r-value context, where it is evaluated by immediate macro expansion. This means, in particular, that when a let variable appears as an r-value within the body of an operation declaration, the evaluated value of the let variable is placed within the function body, NOT the unevaluated let variable name. The most important consequence of this is that if and when the operation is subsequently invoked, it (effectively) uses the most recently bound value of the let name, not any subsequent value that the let variable may have been re-bound to. This behavior is akin to the behavior of val in ML.
A let expression, as with any other expression, CANNOT appear in a top-level declaration context, although it can be executed in the top-level execution scope. It's the latter case that gets a little funky, surprisewise. E.g.,
    > let x = 10;
    var x:integer;
    ERROR: you cannot redeclare the sweet and innocent let variable "x" to be a nasty and evil global variable
    > x := 10;
    ERROR: you cannot change the sweet and innocent let variable "x" with the nasty and evil assignment operator ":="
    var y: integer;
    > let y = 10;
    ERROR: you cannot change the nasty and evil global variable "y" with the sweet and innocent let operator

I'm inclined to use these as actual error message texts, so that the whimsy might help diminish the nastiness of the surprise.
Another possibility is to syntactically restrict let exprs to be at the beginning of an expression list. This would be consistent with Lisp and ML, and make things much simpler with regard to mixing and matching global vars and lets.
It may well have already been done/thought-of in the literature. Anyway, the idea is to find uses of global vars and turn them into explicit function parms. I think it can be done statically, with a transitive closure on the calling chain of each function using one or more globals.
A quick look at the Wikipedia article on parallel computing is a bit depressing in this regard, given that there appears to be a lot of shit I don't know about, even as early as Lamport's 1979 work on the sequential equivalence of parallel programs. Anyway, the idea of "purifying", or at least having the type checker check for pureness, may be worthwhile, if not groundbreaking.
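To make the globals-to-parms idea concrete, a hedged before/after sketch (the names are made up):

    var g: integer;
    op f(x:integer) = x + g;         -- f uses global g directly
    op h(y:integer) = f(y) * 2;      -- h reaches g transitively through f

    -- after the transformation, g is threaded through the calling chain explicitly:
    op f(x:integer, g:integer) = x + g;
    op h(y:integer, g:integer) = f(y, g) * 2;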
To avoid CJ culture shock, consider using/allowing curly braces for expression sequences. This would overload them with their use as the tuple constructor, but this is probably OK. One potential ramification of this is the assignment of function values to function-valued variables, but this might just work out fine. This brings up the issue of free vars in a function body, which presumably we should deal with like Lisp, but have the type checker issue a warning of the form "Variable X is unbound in function F."
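A quick hedged illustration of the warning (g and y are made up):

    op g(i:integer) = { let x = i + 1; x + y };
    -- braces as an expression sequence; typechecker warning:
    -- "Variable y is unbound in function g."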
Based on the several entries of today's date that follow, plus other entries further down the list, and in the interest of simplification, here are what I hope are some definitive decisions:
What do we do about cases where the name of a tuple field or parm is not a plain ident, as in
    ManagerRecord > PersonRecord = Employee*;
    InterspersedStuff = (Tag or Text)*;
I believe the answer must be that typename field access can only work when the type of a component is a plain ident, which does not include a type var. This is a bit of a duh, in that the designation typename field access means that we're referring to the field by its named type, not some type expression.
I'm not sure at this point if this realization is a potential nail in the coffin of typename field access. I think it might well be. In reading again the 23apr03 entry, I'm thinking it's not really as compelling an example as it seemed at the time. Further, having a good IDE, with tuple component completion, could go a long way towards lessening the need for typename field access.
So let's do an explicit pro/con analysis for typename field access:
Pros:
- simple, in particular to the novice, e.g., the intro SE student
- can be cleaner and shorter
Cons:
- violates simplicity principle
- notationally violates the use of type names in expressions, even if it's a shorthand that does not violate the underlying semantics
- does not apply uniformly to all types, specifically, to tuples with non-ident field types
- gets complicated for tuples with two or more fields of the same type
- a decent IDE, including specldoc inclusion of a show/hide names button, can go quite a ways towards lessening the benefit of typename field access

So far, it looks pretty clear that the cons have it. The sketch below illustrates the two structural cons.
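For concreteness, a hedged sketch of where typename field access works and where it breaks down (the types, and the mr.(Employee*) notation in particular, are hypothetical):

    obj PersonRecord = Name and Address;
    op GetName(pr:PersonRecord) = pr.Name;        -- OK: field picked out by its plain-ident type name
    obj Interval = integer and integer;
    op Lo(iv:Interval) = iv.integer;              -- ambiguous: two fields share the same type
    obj ManagerRecord = Name and Employee*;
    op Staff(mr:ManagerRecord) = mr.(Employee*);  -- no plain ident to refer by; some such notation would be needed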
If we go for nicely parameterized objects, I don't think Object as the "top" type is necessary. Not having it makes us more functional "feeling" than OO feeling, for what that's worth. Further, it avoids the weak typing hole exemplified in newer-inputs/min-max.fmsl.
The idea exemplified just below, of no keywords at all, is just fine. I can even live with using angle brackets for parameterized objects. As far as keyword elimination goes, what just might work best of all is to eliminate the abbreviated keywords, e.g., "obj", and "op". This seems to be a reasonable compromise between retaining keywords for clarity, if the user so chooses, versus simplifying by eliminating excess keywords.
And with a decent IDE, the shorter keywords aren't really all that useful. The keywordless forms are good for on-the-board use and quick typing, while the full-length keyword forms will work fine in a template-driven IDE.
The other nice thing about advertising the keywordless forms is that it makes grammar definitions just a normal thing, without doing something funky to allow grammars as a special case. In the "I can even live with it" file again, I'd put using just plain '=' as the LHS/RHS separator, instead of having "::=" as yet another "is" keyword.
Here's an example of full-blown keywordless and variableless defs:
    PersonRec = Name and ID;
    DB = PersonRec*;

    Add(PersonRec, DB)->DB
        pre: PersonRec.Name != nil and not (PersonRec.Name in DB);
        post: NoJunkNoConfusion(PersonRec, DB, DB');
    end;

    NoJunkNoConfusion(item:?T, in_list:?T*, out_list:?T*) =
        item in out_list and
        forall (item' in in_list) item' in out_list and
        #out_list = #in_list + 1;

    GenericList<?T> = ?T*;
Note here that:
With this, I still think it's fine to say that 'and', ',', and ' ' are synonymous ways to delimit list items.
An advantage of having no obj/op keywords at all is that it allows them to be used as var names, avoiding the possibly confusing error when using a keyword as a user-defined ident. However, I think this is a small enough advantage that retaining obj/op keywords is highly arguable. An advantage of retaining them is that they can help a definition read more clearly.
Regarding (still) allowing component and parm names, it can be argued for on the basis of style, particularly if one wants to make expressions less verbose-looking, as in the following equivalent to the above:

    PersonRec = n:Name and id:ID;

    Add(db:DB, pr:PersonRec)->db':DB
        pre: pr.name != nil and not (pr.name in db);
        post: pr in db';
    end;

And more importantly, we must have names when field types are not plain idents (see "A Not-Yet-Fully-Explored ... (4dec07)" item above).
In the end, as long as we don't make things absurdly complicated and/or sacrifice understandability severely (if at all), having optional syntaxes will be fine. And we can have a smart(er) syntax checker that warns of inconsistent notations, if the user wants to turn on such warnings. Call it "notational lint" mode.
Candidates to get rid of include "object", "operation", "function", "operations", "precondition", "postcondition".
Problems with this include:

    op O(A, B) -> C
        pre: A < B;
        post: C = A + B;
    end O;
    op P(D, E) -> E
        pre: D < E;
        post: E' = D + E;
    end P;

A possibly big problem here is that we're using type names directly in exprs. We could do this:

    op O(A, B) -> C
        pre: in.A < in.B;
        post: out.C = in.A + in.B;
    end O;

but I think that's fucked up.
But maybe as long as it's clear what we're doing here, referring to vars by their type names could be OK. We'll just have to do some definitive examples to be completely sure.
Despite the 26nov07 thoughts on the matter, I think having axiom (and theorem) names is a perfectly reasonable thing to do. It'll provide a consistency with other entity defs. At least that's the thought for the moment.
'=' binds a name to an expression. Specifically,
Define fully the semantics of relational operators on all types. To wit (for starters, at least):
4dec07 Update: Given that we're going with angle brackets for generics, the revised version of the example that follows is this:

    obj GenericList<?T> = ?T*;
    op Op1(gl:GenericList<integer>, i:integer) = gl + i;
    op Op2(l:?T*, x:?T) = l + x;

and the discussion that follows is now obsolete, insofar as it deals with the use of parens in the syntax of generic type instantiation. It's retained here for historical and rationale purposes.
If it's not been stated explicitly (enough) so far, this needs to be done, to ensure that the co-existence is in fact doable. E.g.,
    obj GenericList(T?) = T?*;
    op Op1(gl:GenericList(integer), i:integer) = gl + i;
    op Op2(l:T?*, x:T?) = l + x;
The on-the-fly generic instantiation seems not to conflict with a constructor invocation, since the former is used in a declaration context, whereas the latter is used in an expression context.
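A small hedged sketch of the two contexts (X is a made-up tuple type):

    obj X = integer and string;
    obj GenericList(T?) = T?*;
    op Op1(gl:GenericList(integer)) = gl;   -- GenericList(integer): generic instantiation, in a declaration context
    op Op2(i:integer, s:string) = X(i, s);  -- X(...): constructor invocation, in an expression context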
Sparked by the need for a state machine definition as part of FIPS 140 compliance, I'm thinking it might not be that tough to add state machine syntax and semantics. The graphical forms are a state-transition table and an FSM graph.
Some would-be syntax:
    (* Input symbols: *)
    a; b; c;
    (* Output symbols: *)
    d; e; f;

    machine M1
        states: S0, S1, S2, S3;
        transitions:
            --> S0,            -- Initial state S0
            S0 --a--> S1,      -- Goto S1 from S0, on input 'a'
            S1 --b--> S2,      -- Goto S2 from S1, on input 'b'
            S0 --c/d--> S3;    -- Goto S3 from S0, on input 'c', producing output 'd'
        description: (* ... *);
    end M1;

    state S0
        inputs: a,b,c;
        outputs: d,e;
        description: (* ... *);
    end;
The presence or absence of inputs and outputs determines the type of machine it is:
If we have the gumption, the type checker can perform the following (types of) checks:
We may want to consider the keyword "symbol" as an equivalent way of defining opaque types. I.e.,
    symbol x;
    symbol a,b,c;
    symbol d
        description: (* ... *);
    end d;

is equivalent to

    object x;
    object a;
    object b;
    object c;

and symbols are disallowed from having a components attribute.
HOWEVER -- Given that opaque objects can be declared as shown at the top of the preceding example, i.e.,
I really don't think the "symbol" keyword is necessary, at least for the state machine context.(* Input symbols: *) a; b; c; (* Output symbols: *) d; e; f;
Am I smoking something, or is '<' backwards as a UML-like mnemonicette for inheritance? I.e., shouldn't it be '>', so the inherited-from object is at the pointy end of the arrow? E.g.,
    object EmployeeRecord > PersonRecord = WageScale and EmployeeStatus;

versus

    object EmployeeRecord < PersonRecord = WageScale and EmployeeStatus;

or even

    object EmployeeRecord > PersonRecord <> WageScale and EmployeeStatus;
Also, we should consider allowing extends and '=' to be interchangeable in order within an object definition, as in
    object EmployeeRecord = WageScale and EmployeeStatus extends PersonRecord;

versus

    object EmployeeRecord extends PersonRecord = WageScale and EmployeeStatus;
I'm not sure there was ever a big clamour for naming axioms, and so, given the potential unclarity of the name/axiom-body separator syntax (i.e., ':' versus '='), I'm not really sure we need to have axiom names, after all. What I'd like to see is a compelling example where axiom naming is useful.
The arguments against it include:

    Name (auto-gen'd)
    Expression
    Description

which is nicely analogous to the data dict entry for an object, viz.,

    Name (declared)
    Components
    Description
Do unions of incompatible types imply runtime type checking? Reread Tennent and look at functional languages about unions, including ML (datatypes), Haskell, and Z.
As we've at least hinted at before, I think I like the idea of having different levels, or strengths, of type checking. At the strongest level, we could say that un-overloaded equality is not defined for unions of incompatible component types. For sure, look again fully at what ML does to require that functions over datatypes account for all alts of an ML datatype; see, e.g., 530/ml/lispval*.ml. The warning message from the ML compiler is "match nonexhaustive", the specl analog of which is the preceding idea of "weaker" type checking.
Figure out if type vars should have the '?' at the beginning, end, either, or one or more times anywhere in the identifier string.
Also figure out whether type vars should be required in defining parameterized (aka, generic) types. See, e.g., the versions with and without the syntactic use of type vars in newer-inputs/parameterized-types.fmsl.
Also figure out if just plain '?' is OK as a type var, as long as we don't care about neighboring types constraints. However, it may be the case that '?' outside of the context of an ident may cause syntax problems. Need to check it out.
Also figure out if we should limit '?' in an ident to type idents, i.e., have separate lexical/syntactic categories for ident and type_ident.
Re. the rationale for the term "object" instead of "class", I think we can rationally argue that specl objects are not really classes in what might be called the widely-conceptualized sense, for at least the following reasons:
It is clear that the English word "object" is highly overloaded when used in SE circles. The sense most closely aligned with its use in specl comes from the phrase "objects and operations", used in the context of software requirements analysis and modeling. Yadamotherfuckingyada.
Despite what it says below in the 27oct07 item, I think "primitive" is in fact a better term than "atomic". It's akin to the "attribute" versus "property" discussion. I.e., since I don't see any particularly compelling reason to favor "atomic" type over "primitive" type, we can go with the latter on the grounds of overall clarity.
Now to the question at hand -- can we extend primitive types? I'm not sure we've come to a definitive conclusion on this, but we clearly need to. What needs to be done is a scan of items in this file, and elsewhere, followed by a DECISION. You know what one of those is, right motherfucker?
One possible, seemingly benign change in SpecL terminology (yes, I think I'll start calling it that now) is changing specl "attribute" to "property". This avoids confusion with UML's use of the term "attribute", and there is some historical precedent for the term "property" in spec lang contexts such as this.
4dec07 Update: We will in fact use '=', as discussed in the "Syntax Rationale" entry of 1dec07.
With the full advent of '=' instead of 'is', we should consider if it makes better semantic sense to have the tuple component initializer be ':=' instead of '='. What brought this on was the odd-looking one-tuple definition in newer-inputs/const-components.fmsl:
    obj OneTupleOfIntegerInitializedToTen = integer = 10;

with the thinking that

    obj OneTupleOfIntegerInitializedToTen = integer := 10;

looks (somewhat) better. The question to be answered is if component initialization is more like an equality definition or an assignment.
I just had a somewhat sickening thought about possibly (re...re)changing the syntax of object declaration (back to) allowing (having) ':' be the (only) separator in a short-form obj definition. A quick hack to the grammar revealed that it appears to work.
But fuck it -- I think '=' is just fine for binding a type name to a type "value". We can go on about this, but I really don't think we need to.
4dec07 Update: With the use of '>' instead of '<' as inheritance sugar, there is no extra s/r conflict when using '<' ... '>' to bracket generic object parms.
Consider strongly replacing the current where attribute stuff (aka, kludge) with a more conventional syntax, as illustrated, e.g., in newer-inputs/parameterized-types.fmsl. A syntactic issue is the conflict between '<' as a type parm bracket versus the inheritance symbol, which in fact does cause an s/r conflict in the current grammar, when just hacked in. There are a number of syntactic options, including using a different bracketing, or changing or eliminating '<' as inheritance sugar.
Further thought on this syntactic issue is that regular parens should be fine for parameterized types, as shown in the second example in parameterized-types.fmsl, and the commented-out RHS in obj_heading, dated with today's date (4nov07). We need to make sure it all works out in terms of type refs, but hopefully it's OK. There's something to be said for regular parens instead of angle brackets, on at least two accounts. First, regular parens are consistent with the terminology "parameterized types", akin to the notation for parameterized functions. Also, using regular parens avoids the introduction of additional bracketing syntax, when it is arguably unnecessary.
One potentially very good thing about this more explicit form of type parameterization is that it may well at least partially address the more "featureful" form of composition in UML 2.1, vis-a-vis UML 1.5, as discussed in ../related-work/uml/general-info/UML-2-comp-model.pdf. This was rather worrisome on first reading, but generalized parameterized typing seems to go in this direction, in that the type parm can be used in relational attributes, in addition to just the components, which (seemed to be) the case for where instantiation. More thought is necessary here, including the redefines and subsets association constraints that are most likely related to all this.
Re. the semantics of generalized type parameters, my initial thought (hope) is that it can be very close if not identical to the current semantics of where clause instantiation. This obviously needs to be worked out. We of course need to study (and understand) fully Java generics.
In looking some more at the UML specs, the thought occurs, again, to consider using terminology that's more consistent with UML, e.g., "class" instead of (in addition to) "object", "association(al)" instead of (in addition to) "relation(al)". What I would very much prefer is to spell out the terminological diffs between FMSL and UML, and rationalize why they exist. And I think this preferred course is in fact quite doable.
Whither next and prev ops on enum values? It's a bit kludgy, but perhaps we can say that if all union components are of type "the T x", for some type T, then there are next and prev ops defined. This implies there's an order to union (and tuple) components, which in fact there is, per the ML-like notion of #n to access the nth component. What we're saying here is that if a union (and, what the heck, a tuple) has components of the same type, then there are next and prev functions available.
We of course need to define "same type" formally, and there are some subtype and type compatibility things going on here we need to deal with. E.g., if all of the elements of a union (tuple) are compatible with each other directly or indirectly, there are next/prev ops. Indirectly compatible means something like "compatible with other type", as in forall i "the T xi" are compat with T.
What we might be able to do is generalize this to all unions and tuples being fully indexable, in the normal square-bracket way. What we do is make the coarity of the index op be the union of the component types, modulo union type simplification. This notion of union simplification is related to (or may subsume) the notion of indirect compatibility mentioned in the preceding paragraph. E.g., "union of T or T or T" simplifies to "T"; "union of the integer 1 or the integer 2 or the integer 3" simplifies to "integer". Sounds promising (but so do a lot of other motherfucking things).
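A hedged sketch of the indexing-plus-simplification idea (the types are made up):

    obj Triple = T and T and T;
    op Second(t:Triple) = t[2];          -- coarity: T or T or T, which simplifies to T
    obj Mixed = integer and string;
    op Pick(m:Mixed, i:integer) = m[i];  -- coarity: integer or string; no simplification possible
    obj Smalls = the integer 1 or the integer 2 or the integer 3;
    -- indexing coarity here simplifies to integer, so next/prev make sense on Smalls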
Allow multiplicity in Relational Attributes. See, e.g., newer-inputs/attr-multiplicity.fmsl.
Make the functional basis for relational values be ref.
I don't remember right now why we currently allow attributes to have general expressions as values (see the parser syntax for obj_attribute, which includes expr as a RHS alternative). Given that we seem to have gone whole-hog into concrete-valued objects, we don't really need this alt any more, and it causes a reduce/reduce conflict when we add the syntax for relational multiplicity shown in the newer-inputs example cited just above. So, we should get rid of it, if at all possible.
As noted in the LOG entry of this date, the "possibly starred" biznis in parser.y seems rather kludgy. As I recall, it was done incrementally to allow stars in signatures.
With the coming of refs, we need to revisit this area, and see if we should go a route like "possibly starred and/or reffed", or what the fuck. We may just want to rethink the ML-style of auto-tuple args, though as I recall this had some potentially serious problems, like the concept of a truly multi-arg function in a dataflow diagram. Anyway, we need to get closure on this.
Related to this is the idea of the difference between constant-valued tuple components, versus initialized tuple components, as illustrated in the following examples:
I think it's pretty clear at this point, if it hasn't been said (and discussed) till now, that the verbosely mnemonic object names of this example spell things out. Now we must make sure this shit is spelled out in the ref man, and wherever the fuck else it should be.
Regarding the fabled "ref man", I think it's time to have an official two- or maybe even three-pass version, akin to the "Gentle Introduction to Haskell" versus the full Haskell ref man. I think we can readily motivate the gentle intro, including including some phrasiology like "But can I do this ...", a preliminary lists of subheadings for which includes:
I think we should fill this in here, and include it in some form in the ref man. Here's the list so far:
With all the thinking we've done, the question for a newbie might be ``Why allow both "value X = const-val" and "object X = const-val"''. A decent answer for allowing the latter is to say that it's a degenerate case of allowing a type to have one or more constant values as components, without creating one or more special-case exceptions for what is and is not legal. Moreover, as long as there's no harm in allowing this, things are just fine.
The benefit of allowing values in general to be object components is that it provides a natural and straightforward way to define enumerated types, as can be done in most PLs. In addition, one can easily model other forms of enumerations, all stemming from one basic idea.
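A couple of hedged examples of what such enumerations might look like, using the "the"-prefixed constant-value form from elsewhere in these notes:

    obj SmallPrime = the integer 2 or the integer 3 or the integer 5;  -- enum of integer constants
    obj Answer = "yes" or "no" or "maybe";                             -- string-valued enum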
Given that both UML and ML have refs, I think we should consider seriously adding them to FMSL (cum SpecL). The ML syntax and semantics looks a bit funky, so we could go with something like this:
    obj A = integer;
    obj B = boolean;
    obj C = string;
    obj ARef = ref A;
    obj ABC = A and B and C;
    obj ABCRefs = ref A, ref B, ref C;

    op Op(a:A, b:B, c:C, aref:ARef, rt:RefTuple) = (
        a = 10;              -- normal int var and val
        aref = new A(10);    -- ref var and heap val
        aref = A(10);        -- equiv OR replacement for the two preceding;
                             -- see discussion below
        aref = new A();      -- ref var and heap val, uninit'd
        @aref = 10;          -- deref to access "ref to int"
        (@aref).A = 20;      -- deref then field select
        aref.A = 20;         -- equiv to prev expr, courtesy of
                             -- auto-deref-on-field-select-of-ref-to-tuple
        aref = new integer() -- ERROR: new can only be applied to user-defined types
        let a = 100;         -- single assignment
        set a = 200;         -- mutating assignment
        a := 200;            -- syntactically sugared mutating assignment

        -- a bit more syntax
        obj X = rt:ref T and lt:T*;
    )
    aref = bref      -- shallow equality
    @aref = @bref    -- deep equality
But wait on the following "but wait". What's emerging here are at least a couple ideas:
BOGUS: But wait just a farging minute. Java has no explicit refs in it, and we seem to be able to do just fine with it. But then, there's the setf stuff in Lisp, that we really should consider. Hmm, this needs plenty of thought. :SUGOB
OK, here's what I think are the reasons we want to keep sym lits, and I'm getting to the point where I don't think the reasons are strong enough to keep them:
    object Sex = "Male" or "Female";

I mean, it says what we want, we can have a strongly-typed semantics for it, and we don't need to mess around with several other ways to define the same thing with opaque types and/or sym lits.
Bottom line -- let's go for some simplicity here.
It would appear that what we have with generics is a reasonable deal for a modeling language, in that it allows some useful things to be defined in terms of overloading and inheritance.
We've thought in the past about somewhat gratuitously throwing in the Java- style Object. I don't think we can just do this without doing one of two things:
Re. type inference, my general inclination is that it's likely to be more trouble than it's worth, if what we're after is a clean spec lang. However, I believe that type inference might be a bit easier in FMSL, given that we do not want ML-style fully typeless op sigs. I.e., even for an op with a fully type-variable signature, we still need to declare explicitly the type of each parm. However, type inference is still likely to be an implementation pain, and the question remains if it's really worth it in the spec-based domain we're going for.
Re. dynamic typing, we just might be able to argue for dynamic typing, as a gateway to an underlying procedural program. In fact, we might have sections in the ref man, after all of the functional sections, with a title like "Dynamic Typing and Procedural Computation", wherein we start out like this:
"The features described thus far in the manual have defined a fully functional, statically-typed language. In this (and perhaps following) sections are defined procedural features (assignment, looping, refs (maybe)) and dynamic typing features (Object and down casting)."
Lastly, for now, it might just be fine to have both type inference and dynamic typing, if we can work out the details OK. But you know, motherfucker, life's getting shorter these days.
With the overload-based generics stuff we have going on now, operator overloading might just fit in, even somewhat sweetly. What's come to mind just now (if not in the past) is an example like this:
    obj GenericDB = GenericRecord*;
    op CompareRecords(gr1:GenericRecord, gr2:GenericRecord) = gr1 = gr2;
    obj NamedRecord < GenericRecord = name:string;
    op CompareRecords(nr1:NamedRecord, nr2:NamedRecord) = nr1.name = nr2.name;

which could be more conveniently defined like this:

    obj GenericDB = GenericRecord*;
    obj NamedRecord < GenericRecord = name:string;
    op "<"(nr1:NamedRecord, nr2:NamedRecord) = nr1.name < nr2.name;

the convenience being not having to define the CompareRecords op at all.
This could be nice, but at the price of op overloading details that may take some time. The underlying hope is that the current generic rules and semantics are effectively unchanged by this style of op overloading.
One issue is that comparison is built-in for all types (remembering that it's tautologically false for all function types). Given this, we may want to limit overloading to the comparison ops, though it's probably not necessary to do this. What we can say is that op overloading has the potential to undermine the clarity of a spec, given that the appearance of an op like "CompareRecords" is potentially more clear than just "<", even in the context of record comparisons. And of course there's my frequent peeve about the non-transparency of inheritance altogether, in that the op named CompareNamedRecords would be the clearest of all.
Ah, but a quick check of parser.y shows that strings are still valid op names. So all we gotta do now is update the type checker. (That'll be easy -- chuckle motherfucking chuckle.)
We should probably disallow inheritance from atomic types.
I was thinking that "primitive type" might be a better term than "atomic type", given that strings really aren't atomic. However, we can in fact think of string as atomic as a type, in the sense that it is not composed of other types. The fact that string values don't feel atomic is not really relevant to the string type being atomic. And, in fact, we can think of integer values as being unatomic, in that we can decompose them into separate digits using division. So, I'd say, the term "atomic type" is fine.
So the following doesn't sound so bad, and aside from potential confusion of auto-decl of sym lits with opaque decls, might just work out fine:
An opaque type has exactly one value, and that's the symbolic literal of the type's name.
To say in ref man -- Careful:
    obj TenAndTwenty = 10 + 20;

does NOT define TenAndTwenty to be the value 30, but rather the two-tuple containing 10 and 20.
OR, we might just screw the way-old data dictionary syntax that allows '+' as an operator, and go back to only ',' and 'and' for tuple creation. This is actually probably a pretty decent idea. Think about it.
3nov07 update: I have thought about it, and I think I do want to get rid of '+' as a component expression operator. Among other things, I recall '+' meaning logical or in some dialects of boolean logic, with a multiplication-like "cross" operator meaning something like tupleness.
Did a bit of reading in the Haskell manual and gentle introduction. It would be very informative to do some more Haskell reading, and maybe even refer to it in appropriate places in the ref man.
One particular thing that stood out today was the statement that types are not at all first-class objects in Haskell. In FMSL, it would appear (I'm pretty sure, anyway), that types are first-class in the case of union (and thereby) class types. (BTW, if we haven't said it explicitly already, the definition of "class type" should simply be a type that is extended from).
The use of union (class) types that makes them first class comes from the '?.' and '?<' operators, where type names are used as expression operands. This needs to be expounded upon in the ref man.
Given what appears to be the most recent thinking on this subject, it's not clear if we still want the one-to-one correspondence of opaque types and sym lits. Also, we may want to de-emphasize the age-old way of defining enums using opaque types. (We can't entirely get rid of opaque-type-based enums, since if we have opaque types, we can always make unions out of them.)
A way to do this is to introduce a new atomic type called "symbol". With this, the issue pretty much boils down to a binary decision. Viz., is the type of 'x' "symbol" or "the opaque type x"?
There are a number of questions and/or issues that need to be resolved here, and I'm purposely leaving them unresolved at this particular junction (there's grading to do). Said questions/issues include:
    if (color = 'Blue')

as opposed to having to do type interrogation, as in

    if (color.?Blue)
    obj Sex = Male or Female;
    obj Male;
    obj Female;
    var s:Sex;
    ...
    if (s = 'Male')

Just looking (yet again) at this in this context, it looks pretty funky to say that there's a symbolic value that's created with the same name as an (opaque) type.
OneAndTwo = 1 and 2;
In working out ideas in calendar/specification/ideas/duration-bounds.fmsl, q.v., it occurs to me that having nil do double duty as both the empty value and the error value is weak. So, we should say that vars can assume three kinds of value -- normal values, nil (the empty value), and error.
Examples of expressions that produce error as a value are:
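(Hedged guesses at what such expressions might be; the particular examples are mine, not a settled list:)

    1 / 0        -- divide by zero
    l[#l + 1]    -- indexing a list l past its end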
Based on the 24may02 and related entries, all opaque types are unique, and incompatible with any other types, except Object (?nil?, if in fact there exists a nil type??). Anyway, the important thing is that opaque types are effectively unique. And, as I'm now working on specs for the new dftool, it has occurred to me that having opaque types be unique may be pretty handy for dfd editing, in that when a new edge is created between two untyped nodes, a new type is created of the form
obj edge-name ;
And to keep things really motherfucking simple, I think we're ready to get rid of "is" altogether as a keyword. It'll be a pain to change the test suite, but worth it overall.
NIXED: I think it's best as is, since, among other things, an ax declaration like this "ax A1 = 1 = 1" looks funkier than "ax A1: 1 = 1"; also, an axiom is not really like an entity that's being bound to a value, but rather a formula that's being labeled. And even if it is, the funky-look thing wins out here.
While we're at it, I think we need to change the syntax of named axioms to be

    axiom [<name> '='] expr

instead of the current

    axiom [<name> ':'] expr

:DEXIN
We're going back to the 13nov05 bottom motherfucking line below. The problem with the "is/has" business is that it's just too fucking hard to use these words so they consistently make intuitive sense, and yet result in a formal definition that works nicely. So, the deal is that we'll sacrifice the intuitive use of "is-a" for an otherwise consistent typing framework. The intuitive background we can refer to is that of standard data dictionary defs, which use "=" as the separator. Also, in my real world, inheritance takes a way back seat to composition, and multiple inheritance takes a way way back seat. So providing a more intuitive "is-a" notation for inheritance, at the sacrifice of cleanliness elsewhere, is really not worth it.
Examples:
    obj DB has GenericRecord*;
    obj EmployeeRecord has Name and ID;
    obj SupervisorRecord is GenericRecord has Supervisee*;
    obj StaffRecord is EmployeeRecord has Supervisor;
    obj Name is string;  -- hmm, is this counterintuitive?
    obj ID has number;   -- hmm, how bout this? -- sure
    obj DayName has Sunday or Monday or Tuesday or Wednesday or
        Thursday or Friday or Saturday;

    obj DB = GenericRecord*;
    obj EmployeeRecord = Name and ID;
    obj SupervisorRecord > GenericRecord = Supervisee*;
    obj StaffRecord > EmployeeRecord = Supervisor;
    obj Name = string;   -- hmm, is this counterintuitive?
    obj ID > number;     -- hmm, how bout this? -- probably not

    obj DB = GenericRecord*;
    obj EmployeeRecord = Name and ID;
    obj SupervisorRecord extends GenericRecord is Supervisee*;
    obj StaffRecord extends EmployeeRecord is Supervisor;
    obj Name = string;   -- hmm, is this counterintuitive?
    obj ID > number;     -- hmm, how bout this? -- probably not
For a bit of detail, try this. We can explain the two flavors of "is" as follows:
What these things mean technically from a language perspective is that the is clause can syntactically be one of the following:
Given the extant semantics that one-tuples and is-a types are semantically equiv, is and has as keywords can be used interchangeably to define is-a types.
I'm thinking "singleton type" might be a good name for type defined with is that has only a single name or non-tuple composition expression and no extension with has.
For non-tuple types, consider allowing is, particularly for simple enums and function types. What we can be saying here is that any one-tuple type can be defined with "is".
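A hedged sketch of the one-tuple reading of is (the examples are mine):

    obj Name is string;                      -- singleton type: a one-tuple of string
    obj DayName is Sunday or Monday;         -- simple enum via is
    obj Callback is op(integer)->(boolean);  -- function type via is
    -- given the one-tuple/type-equality semantics, "obj Name is string"
    -- and "obj Name = string" come to the same thing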
Sorry, but "Prim" as a name is (a) a bit too cute; (b) used for a bunch of other things, including "Prim's algorithm" and Source Forge's "Permission Record Information Machine".
"fmsl", on the other hand, shows up nowhere at Source Forge, and is even pretty much nowheresville on Google, with fmsl.net and fmsl.com still available. And fmsl.org is the Franklin Men's Softball League -- pretty much off the beaten path.
Anything we want to consider primitive that's above function invocation, typing, arithmetic, and quantifier-free boolean logic, we'll define equationally. So, here's the list of such things:
16jan06 one motherfucking more bottom motherfucking line:
Given the poignant confusion about "is-a" and "has-a" vis-a-vis RSL's use of "is", I really think we ought to see if we can live with "is" and "has" as the keywords, without '=' or '>' at all. One of the main reasons for '=' is data dict compatibility, but I'm pretty darn comfortable just farging this.
So, for good clean simplicity, one more motherfucking time, try this:
    Keyword/Attr Name | Meaning
    ------------------+-----------------------------------------------------------
    has, components:  | Defines the components of an object. In terms of typing,
                      | it binds the type defined by the component expression to
                      | an object ident.
    is, parents:      | Defines the parents of an object, thereby defining the
                      | components of an object to be those of its parents,
                      | unbundled and anded if multiple parents, and then anded
                      | to the object's own components. In terms of typing, it
                      | binds the type "parent-type<1> OR ... OR parent-type<n>
                      | OR parent-types-unbundled and component-type" to an
                      | object ident.
With this style of defs, we can leave the current type checking semantics intact, except that we will disallow inheritance from atomic types, perhaps including opaque types. The latter bit would be consistent with the 24may02 conclusion that opaque types are at the "bottom" of the type hierarchy. However, not being able to inherit from an opaque type may be funky in terms of incremental development, because it precludes defining an object as a placeholder and then refining it later. Also, Java (and one suspects other OO PLs) allows componentless classes without any problem.
SO, we need to think through whether to allow inheritance from opaque types. I think it has to be OK. Whether it's OK to inherit from a non-opaque atomic type remains to be determined.
In terms of type rules, whether we do it by spelling or otherwise, I think it's pretty darn clear at this point that any two different opaque types are incompatible. I'm not sure if we've thought of this in structural terms, but I think it makes sense at that level too, in a perhaps odd kind of way. The deal is that two different opaque types each have no structure, and types with no structure will be considered incomparable, and therefore not equiv (i.e., not compat).
13nov05 bottom motherfucking line: here are THE keywords and symbols:
    Keyword/Symbol/Attr Name    | Meaning
    ----------------------------+-------------------------------------------------
    is, =, components: (for     | Binds a type to an object identifier, an
    object), body: (for         | expression to an operation identifier, or a
    operation)                  | value to a value identifier; in the case of a
                                | type binding, it defines the one or more
                                | components of the object; in the case of an
                                | expression binding, it defines the functional
                                | body of the operation.
    extends, >, parents:        | Defines the one or more objects from which
                                | another object inherits, or operations from
                                | which an operation inherits; values do not
                                | inherit.
Symbol (concise) examples:
    obj DB = GenericRecord*;
    obj EmployeeRecord = Name and ID;
    obj SupervisorRecord = Supervisee* > GenericRecord;
    obj StaffRecord > EmployeeRecord = Supervisor and Status;
    -- or --
    obj StaffRecord = Supervisor and Status > EmployeeRecord
    -- or --
    obj StaffRecord > EmployeeRecord <> Supervisor, Status
    obj Name = string;
    obj ID = number;
Keyword (verbose) examples:
    obj DB is GenericRecord*;
    obj EmployeeRecord is Name and ID;
    obj SupervisorRecord extends GenericRecord is Supervisee*;
    -- or --
    obj SupervisorRecord is Supervisee* extends GenericRecord;

    obj StaffRecord extends EmployeeRecord is Supervisor and Status;
    -- or --
    obj StaffRecord is Supervisor and Status extends EmployeeRecord;

    obj Name is string;
    obj ID is number;
You know, it may well be time to dispense with the "has" and "is" business, and just use symbols. The good thing about this is that the semantics of one-tuples being equivalent to type equality is just fine. If we go this route, then only the concise examples below work.
Use has and is for keywords instead of is and inherits from, resp. Also, make = and > synonyms for has and is, with the use of > in particular suggesting the directionality of the UML arrow for inheritance.
Concise examples:
    obj DB = GenericRecord*;
    obj EmployeeRecord = Name and ID;
    obj SupervisorRecord = Supervisee* > GenericRecord;
    obj StaffRecord > EmployeeRecord = Supervisor and Status;
    -- or --
    obj StaffRecord = Supervisor and Status > EmployeeRecord
    -- or --
    obj StaffRecord > EmployeeRecord <> Supervisor, Status
    obj Name = string;
    obj ID = number;
Verbose examples:
    obj DB has GenericRecord*;
    obj EmployeeRecord has Name and ID;
    obj SupervisorRecord is GenericRecord has Supervisee*;
    obj StaffRecord is EmployeeRecord has Supervisor;
    obj Name is string;  -- hmm, is this counterintuitive?
    obj ID has number;   -- hmm, how bout this?
Consider using <- for assignment, including for parameter initialization. Regarding parm initialization, there's the interesting issue of default inputs of the form "Untitled n", for n = 1 ... .
Probably, or at least maybe, get with the program of using "/*" and "//" as comment symbols.
Perhaps replace user-defined attributes with just plain relationships.
Build a GUI editor, with this kind of dialog for objects:

    === Object Editor ===
    Name:
    Components:
    Extends:
    Description:
    In Module:
    V More
    Operations:
    Equations:  + -
    V More
    Relations:
      Name:
      Entities:  + - ...
    V More:
    Properties:
      Name:  Type:  ...  + -

and this kind for operations:

    === Operation Editor ===
    Name:
    Inputs:
    Outputs:
    Description:
    Module:
    V More
    Precondition:
    Postcondition:
    V More
    Components:
    Dataflow:
    V More:
    Properties:
      Name:  Type:  ...  + -

and this for modules:

    Name:
    Exports:
    Imports:
    Entity List:
    V More
    Theorems:
    Axioms:

and this for values:

    Name:
    Type:
    Value:
    Attributes:
Based on consistent 205 misuse, I'd suggest putting back the singular of "input" and "output", and for consistency, "component". Then, with the good Icon-style more accurate syntax error messages, we can be dandy.
For the CJ types, allow '.' to be used as an op selector, for explicitly-declared ops. E.g.,
    obj X
        ops: a,b,c;
    end;
    obj Y;

    op a(X)->X;
    op b(X,Y)->X;
    op c(Y)->X;
    op d(Y)->Y;

    op main(x:X, y:Y) = (
        x.a();
        x.b(y);
    );
In order for '.' to be usable as an op selector, the obj being '.'ed must have declared an op of the name on the RHS of the '.', and that op must have exactly one input of the type of the obj being dotted. And then, the form "val.op" means treat val as the (sole) input of type typeof(val) to op, and do not supply that input in the normal way within the parenthesized list of inputs. All the complicatedness of this reveals just how silly dotted selection of ops really is.
By these rules, in the examples above, x.a() and x.b(y) work fine. However, x.c cannot work in any form, because c has no input of type X.
A new bit of checking to add to op decls is that an op listed in an obj's ops decls must have the obj in at least one place in its signature. By this rule, it's OK to list c in X's ops, since it has X as an output, even though we can't say x.c, because X is not one of c's inputs. However, d cannot be legally listed in X's ops, since it has X nowhere in its signature.
The answer to the "I don't know ..." in the next paragraph is "I do know, and it's now decided". Viz., we can have any mixture we like of type versus value components of an object. As the comment in parser.y explains, we're ready to go with this syntactically, the typechecker just needs to be fixed.
I don't know if we've already decided this, but it would seem that there's a juncture between objects and values at the point where an object is declared with 100% concrete values. However, by the "the" prefix spelling rule, an object can never be used as a value, since its type will never be prefixed with "the". Hence, while the following two defs amount to the same entity in some sense, they are still different because the object version cannot be used as a value and the value version cannot be used as a type.
    object XO is 1 and "xyz" and true;
    value XV = {1, "xyz", true};

The object version XO is not in fact particularly useful, because it can hold only one value, namely XV. Further, ... (dropped off). However, it's fine to allow it, as a general rule.
For objects with one or more constant value components, the constructor does not have args for those components. E.g., for
    obj X is "abc" and i:integer and "xyz"

the auto-gen'd constructor is

    op X(i:integer)->X
Fuck, the above applies to unbundling, since the previously-defined deal with auto-gen'd constructors is that their signatures are exactly the object's type, not its unbundled components. The unbundling is a courtesy. The real issue here is what the fuck happens to a tuple constructor for constant components. E.g., for the type X above, is {1} a sufficient constructor, or do there need to be some kind of placeholders for the constant components, as in {,1,}? I think I like the latter, but it's not currently syntactically legal. Figure it all the you-know-what out. There's a LOG entry of this date to this effect.
And here's another potentially major thought about auto-gen constructor ops -- do we fucking need them at all given structural equiv? I.e., if we have to pass a value constructor in as the arg to a constructor op, why don't we just use the value op directly, instead of having to wrap a constructor call around it? The only reason I can think of is to force the type to be a particular name, but I don't think this is necessary given that this happens at binding anyway. As I recall, the only time we need name equiv is related to inheritance, and I don't see clearly how the value stuff is going to be a problem here. So, check out the following example:
    obj X = integer and string and boolean;
    val x1 = X({1, "abc", true});  -- the canonical (and stupid) constructor call
    val x2 = X(1, "abc", true);    -- slightly better, courtesy of auto-unbundling
    val x3 = {1, "abc", true};     -- OK, but doesn't guarantee x3 is type X
    val x4:X = {1, "abc", true};   -- does the trick, without the constructor at all
    val x5:X = {1, "abc", 1};      -- just-for-giggles test of type checking;
                                   -- fuck me, it fails as of 17dec04, and so needs
                                   -- to be FIXED; there's a LOG entry for it

I think the comments there say it the fuck all. I.e., if we're a smart motherfucker, we should dump motherfucking auto-gen'd constructors altogether, unless there's some lurking reason to have them around that I've forgotten about, in my reinvention-of-the-motherfucking-wheel way. Fuck me, I need to go confirm this, but I sure the fuck hope this is a nice new discovery to simplify things.
Conclusion first, followed by blather. So, here's the part 1 conclusion, about extended quantifier forms. MOTHERFUCKER. It would appear that after all of the bullshit and anguish I've gone through over elseless ifs, the normal truth-table definition of implication, aka elseless if, is in fact "if x then y else true". This is because when x is false, "if x then whatever" is always true. Looking at the alternative form of "not p or q" makes this quite clear, since "not p" is true whenever p is false. So, the conclusion to the discussion below about needing to be 100% sure on the equivalences for extended quantifier forms is that we are now in fact 100% sure. MOTHERFUCKER.
And here's part 2 of the conclusion about elseless ifs in postconds. Since elseless ifs amount to "go true for free" cards in the cases where the if is false, using elseless ifs in postconds is just plain too weak. The deal is that it leaves output values unspecified for all of the cases where the if expression is false. This should in fact be covered in the ref man, to say that elseless ifs, in general, and in specific cases as examples, are fundamentally too weak for postconds. What we have to say in general is that a postcond must explicitly specify a condition for all possible values in the range of each output var. What exactly this means fully is part of what we have to work out, but elseless ifs in postconds are a significant part of it.
The blather related to the preceding two conclusion paragraphs now starts, from here to the next item.
We need to be 100% sure about the current equivalences that are in the ref man when explaining the extended quantifier forms, e.g., "forall (x:t | y) p" <==> "forall (x:t) if y then p". The problem I'm worried about right at the moment is what happens when y is false. I.e., should the equivalence in fact be "forall (x:t) if y then p else true"? I think the fuck so, and I need to figure it totally the fuck out, pretty much right the fuck now.
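For the record, here are the settled equivalences in one place (this just consolidates the part 1 conclusion above, in the doc's own notation):

    forall (x:t | y) p   <==>   forall (x:t) if y then p else true
                         <==>   forall (x:t) (not y) or p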
In general, we may need to come to grips with the one-time-considered-to-be-an-oddity position of Lois Brady that elseless ifs don't make sense in (pre/post) logic. This may well be true, particularly for postconds, since the idea there is that we need to cover all possible values for outputs.
Let's push on this a bit. Consider

    op Op(i:integer)->j:integer
        post: if i >= 0 then j = 20;

Given the current rules we've (sort of) got, what about when i < 0? Well, according to the standard truth table for implication, we have this

    i    j    i >= 0   j = 20   (i >= 0) => (j = 20)
    -1   19     0        0          0 => 0 = TRUE
    -1   20     0        1          0 => 1 = TRUE
     0   19     1        0          1 => 0 = FALSE
     0   20     1        1          1 => 1 = TRUE

So OK, this really is not bogus per se. What it says is that the only time the postcond fails is when i >= 0 but j != 20. It succeeds whenever i < 0 or when both i >= 0 and j = 20. What this means is that i < 0 is a "don't care" case as far as the value of j is concerned.
Now, to push on this further, are such "don't care" cases really sensible in the case of postconds, or do we really have to cover all possible output values in a postcond?
Well, we might say that if a postcond rules out a particular case, then an elseless if might be OK, as in

    op Op(i:integer)->j:integer
        pre: i >= 0;
        post: if i >= 0 then j = 20;

But this is in fact bogus, or at least redundant, since the postcond can never happen if i < 0, given the precond. Hence in this case, the if clause in the postcond is unnecessary.
What I think we've arrived at is that "don't care" cases are in fact bogus in postconds. I.e., we need to have the form be like this:

    op Op(i:integer)->j:integer
        post: if i >= 0 then j = 20 else j = ... ;

since otherwise the value of j is fully unspecified when i < 0. But ... (dropped off).
I think the conclusion there about "x <bool op> nil = nil" may well be fucked up, based on much of what we've said in 205 notes, if not elsewhere. Specifically, we need to look at the way we describe what goes wrong with unbounded quantification in sorting to make things completely right. As I recall, the explanation does in fact say that "x <bool op> nil = false".
Consider a postcond of the form "output-parm = expr".
The general way to get a constructive formulation is if you can solve (easily) for the output var. Presumably this rules out any quantified exprs as constructive, since I don't think solving for a var inside a quantifier body is possible. Let's try. I.e., what would it mean to solve for l in the following expr?

    forall (i in [1..#l-1]) l[i] < l[i+1]

Farg, I think it's pretty clear that it doesn't mean anything. What I need to do is come up with a bit more precise explanation of the impossibility of solving for l in such a case, based on some reasonable definition of "solve for". Nevertheless, I think the answer is clear that unless we have something like constructive normal form, solving is going to be difficult to impossible.
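To make the contrast concrete, here's a hedged sketch (op names are hypothetical) of a solvable "output-parm = expr" postcond next to a quantified one of the sort just shown:

    op Double(i:integer) -> j:integer
        post: j = 2 * i;    -- constructive: solve directly for j
    op Sort(l:integer*) -> l':integer*
        post: forall (i in [1 .. #l'-1]) l'[i] < l'[i+1];
                            -- analytic: no evident way to "solve" for l'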
OK, in a lattice-theoretic sense, Object is Top. So, how about this as the lattice of Prim types:

                          Object
           tuple     union      list      op
      integer   real   string   boolean   opaque
                           nil
Exists as Search (13nov04)
See new-inputs/exists-as-search.rsl for the motivation for a new choose operator, with the following semantics:
    exists (x:t) p <=> not forall (x:t) not p

    (choose (x in l) p) (lambda (Object, Object*,
        ... farg, this needs more thot; and I did not just
        ... fall off here; we'll hopefully deal with this
        ... eventually; I added a LOG ref as a nudge
    not (forall
OK, I'm not sure what kind of shaky ground it may put me on, but I think I'd like to go with using algebraic specs to define anything we've got that's primitive, but isn't part of pure boolean logic. See the 10nov05 entry in implementation notes for details. The point of this here is that we'll define exists and choose equationally, which should help us deal with the formal "bootstrapping".
How about this. We can allow the definition of "value spaces" explicitly to bound otherwise unbounded quantification. I.e., we provide some built-in syntax and/or operators to do this.
This can then be combined with the cute trick for testing-based value spaces in the following way. When during testing values are sent to ops, they become part of the value spaces for whatever types they are.
And we might even be able to introduce this cute trick -- every time a constant of a particular type appears in a spec, it automatically becomes part of that type's value space. In this way, simply mentioning a constant value anywhere in a spec adds it to a value space. And this may in fact be a very cute way of avoiding some extra syntax/ops for defining value spaces. We could just say "define some constants" to do it.
Now the just "define some constants" could be tedious for things like integer, but hey, I think we have the solution to this already in the '..' list constructor form. E.g.,
    value IntegerValueSpace = [1 .. 65536];
    value CardValueSpace = [ { ... };

The deal is that the specific value names don't (typically) get used anywhere, but their appearance forces the internal value spaces to be expanded.
Well, two considerations now come to mind. First, it would probably be nice to do some form of lazy eval of forms like [1 .. 65536], or at least say we're thinking about it when we warn folks that we don't (at present) do it.
A second consideration is one of convenience in defining something like CardValueSpace. It might be nice if we could somehow embed '..' in component values, so as to say that we want multiple values of, say, an integer component without having to define each one explicitly as a constant value.
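For instance, a would-be syntax for this (purely hypothetical -- nothing like it is currently legal) might look like:

    value CardValueSpace = { [1 .. 13], 'Spades' };
        -- '..' embedded in a component position, standing in for
        -- thirteen distinct tuple values at once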
What this may lead us to is the following idea -- whenever a value is created in any context during spec evaluation, it too becomes a (permanent) part of its type's value space. This has (potential) ramifications for garbage collecting, in that it seems to imply that we can never garbage collect any values, or at least can only garbage collect duplicate copies of values. (And it occurs to me that I'm reminded about why doing the interpreter in Java is a good idea, so we can in fact have garbage collection.)
Anyway, at this point, I think this idea is potentially quite promising, we just need to work out some more details.
Allow ops to be invoked via the "." op, under the following circumstances:
Question -- does it make sense to use the '.' op in an output list? It can if we say that it's a shortcut for changing only the right operand of the '.'. Whether we want to make this happen syntactically is a currently open question.
A pretty serious issue with formal process modeling, even just at the process- step signature level, is that we'll need the highly generic "user input" as a formal input in the signature in order for things to work. I'm not altogether sure we want to open the fully formal can-o-worms for process modeling, as was noted in the Fall 04 version of the process chapter. Anyway, I'll do some more work on chapter 3 and see where things go.
We need some kind of equality operator that says tuple X = tuple Y, except for one or more specific components. The idea of "let ... in ..." comes to mind, but seems a bit cumbersome. Also comes to mind is some Icon-like multi-char equality op, but that may be funky, as always. What would be nice is some intuitive-looking operator symbol, with some way to have a list following.
OK, we've come to the syntax that's exemplified in newinputs/except-tst.rsl. One thing worth noting is that we decided to require that the left operand of "~=" be a name, which disallows cases such as the following:
    forall (x in l) x[i] = whatever except x[i].j = 10;

In such cases, a let must be used to get the desired effect, i.e.,
    forall (x in l) (let xi = x[i]; xi = whatever except xi.j = 10);
If it's not already clearly stated, it would appear that the semantics of list addition are such that adding an element or a listified element has exactly the same effect. Think this through and document it fully if not already.
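A minimal illustration of the claimed equivalence (assuming '+' is the list-addition op in question, and the [...] list-literal syntax used elsewhere in these notes):

    val l1 = [1, 2] + 3;     -- adding a bare element
    val l2 = [1, 2] + [3];   -- adding a listified element
    -- per the proposed semantics, l1 = l2 = [1, 2, 3]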
It seems to me that we most likely need to define the operational semantics in terms of tagged values, since this is the only type-safe/meaningful way to compare (with '=' and 'in') two objects that inherit from some other common parent object. The top Object type is the degenerate case, in that equality always type checks correctly statically for two values of type Object, but dynamically, equality, specifically v1 = v2, must be defined as follows:
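Presumably, per the two-step tag-then-value check for unions described elsewhere in these notes, the definition goes something like this sketch (tag and val being assumed accessors on the internal tagged-value representation):

    v1 = v2  <=>  tag(v1) = tag(v2) and val(v1) = val(v2)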
The bottom line, as I think has always been the case, is that using inheritance weakens static type checking, and using Object weakens the heck out of it. The same goes for union types, and in fact it's the other way round, since inheritance is defined in terms of unions. And BTW, the definition of inheritance in terms of unions, if it will/can stand in the face of the existence of Object, will have to go something like this: every time a new type is defined, the definition of Object is dynamically extended with the newly-defined type, in the context of the semantics of the program being type checked/executed.
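In sketch form (illustrative only), the extension scheme amounts to this:

    obj A; obj B; obj C;
    -- after which the semantics behave as if we had
    -- obj Object is A or B or C or ... ;
    -- with the union growing every time a new type is defined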
Screw the ungrammaticalness of exprs like "item isa Appointment". The isa operator makes better sense as an expression operator, so as to avoid overloading confusion with "is" as a type definition non-terminal. So, we will in fact get rid of `?' as an infix operator, but we'll leave isa.
One thing I don't think we got fully explained below (we may have, but whatever) is the difference between isa when used as the former `?' versus `?<' operator. Here's the deal, I'm pretty sure: When used on a top-level union, it's the former `?'; when used on an inheriting obj, it's the former `?<'; when used on anything else, including a non-inheriting tuple, it's an error. One way or another, we'll always be able to distinguish between a top-level union versus an inheriting obj, since the latter must always be a tuple. Even if it's a one-tuple, by virtue of inheriting from an opaque parent, it still won't be considered a top-level union for the purposes of `isa'. We need to work out the complete details of this, but one way or another we'll make it work, even if it's a bit funky for a seeming top-level union that isn't because it also inherits. The tricky bit has to do with the fact that we can now use a component type name as the right operand of `isa', instead of a component name, which means that there's probably no way to treat an inheriting-from-opaque top-level union as a union for the purposes of `isa' except by using an explicit name or position access operator. This is fine, given that an inheriting-from-opaque top-level union is a rare and funky thing.
See updates to jan03 item below.
This item regards the jan03 and 23may02 items on accessing tuple fields.
The jan03 item concludes that we should nix this type of reference, in part because it's only "a tad" more convenient. Unfortunately, I think the "tad" part is incorrect, based on the experience of adding the weekly-recurring clause to the precondition of the ScheduleAppointment operation, q.v. in schedule.rsl.
The more serious reason to nix typename reference to fields, as stated in the jan03 item, is the inconsistency of this type of ref when it comes to op args. However, at this point, I'm strongly inclined to say that the convenience issue outweighs the inconsistency issue. Given below is a piece of evidence for this inclination, in the form of excerpts from the object defs necessary to support the aforementioned precondition clause in ScheduleAppointment. The first excerpt uses explicitly-named field reference whereas the second excerpt uses typename reference. As can be seen, there's a bunch of clerical, after-the-fact component-name caca required in the first excerpt compared to the second. The actual time spent on the caca is not trivial, since it took me a few passes to get things to compile. The time-to-compile was to some extent based on just hunting down the full ref chain, with or without type names. However, the mere typing of the extra name syntax consumed some valuable time. So, today's conclusion is to bring back typename access to fields, the rules for which are detailed in the 23may02 item.
Here are the excerpts:
    object Appointment inherits from ScheduledItem
        components: start_time:StartTime and duration:Duration and
            recurring:RecurringInfo and location:Location and
            security:Security and priority:Priority and
            remind_info:RemindInfo and details:Details;
        ...
    object RecurringInfo is
        components: is_recurring:IsRecurring and interval:Interval and
            details:IntervalDetails;
        ...
    obj Interval is
        components: weekly:Weekly or biweekly:Biweekly or
            monthly:Monthly or yearly:Yearly;
        ...
    object IntervalDetails is
        components: weekly:WeeklyDetails or monthly:MonthlyDetails;
        ...
    object WeeklyDetails is
        components: sun:OnSun and mon:OnMon and tue:OnTue and wed:OnWed and
            thu:OnThu and fri:OnFri and sat:OnSat;
        ...
    operation ScheduleAppointment is
        inputs: cdb:CalendarDB, appt:Appointment;
        outputs: cdb':CalendarDB;
        ...
        precondition:
            ...
            if appt.recurring.is_recurring and appt.recurring.interval?weekly
            then appt.recurring.details.weekly.sun or
                 appt.recurring.details.weekly.mon or
                 appt.recurring.details.weekly.tue or
                 appt.recurring.details.weekly.wed or
                 appt.recurring.details.weekly.thu or
                 appt.recurring.details.weekly.fri or
                 appt.recurring.details.weekly.sat

versus
    object Appointment inherits from ScheduledItem
        components: StartTime and Duration and RecurringInfo and Location and
            Security and Priority and RemindInfo and Details;
        ...
    object RecurringInfo is
        components: IsRecurring and Interval and IntervalDetails;
        ...
    obj Interval is
        components: Weekly or Biweekly or Monthly or Yearly;
        ...
    object IntervalDetails is
        components: WeeklyDetails or MonthlyDetails;
        ...
    object WeeklyDetails is
        components: OnSun and OnMon and OnTue and OnWed and OnThu and
            OnFri and OnSat;
        ...
    operation ScheduleAppointment is
        inputs: cdb:CalendarDB, appt:Appointment;
        outputs: cdb':CalendarDB;
        ...
        precondition:
            ...
            if appt.RecurringInfo.IsRecurring and
               appt.RecurringInfo.Interval is Weekly
            then appt.RecurringInfo.Details.Weekly.OnSun or
                 appt.RecurringInfo.Details.Weekly.OnMon or
                 appt.RecurringInfo.Details.Weekly.OnTue or
                 appt.RecurringInfo.Details.Weekly.OnWed or
                 appt.RecurringInfo.Details.Weekly.OnThu or
                 appt.RecurringInfo.Details.Weekly.OnFri or
                 appt.RecurringInfo.Details.Weekly.OnSat
Shit, the processing of imports rears its ugly head again. The LOG file has notes about how we'd like to do them in V4, including some discussion of making them like Java, which concludes with saying we shouldn't try to be like Java, given the difference between what's importable to a module versus a package.
So anyway, I'm inclined to go back to a way-old idea that can be characterized as "on-demand" import checking, in order to avoid two possible problems with the "pass-3" style of import checking we're now doing. The benefit of on-demand checking is that it can avoid the following problems:
So, superseding all current design and implementation of imports, including what's in LOG, here's a clear and simple import scheme:
    M.t.f   -- the type returned is M.F, where F is the type of field f of t,
            -- defined in module M
So, here's the import-checking algorithm we have in mind:
Evidently, there is a problem with reference-counting versus mark-and-sweep garbage collectors vis a vis closures. This may well mean that we want to implement the interpreter in Java, using JNI access to the C-built parse tree and symbol table.
This item regards the 23may02 item on accessing tuple fields. The original title of this item started with "OK -- Let's ...", but now it's "NOT -- Let's ...", which means we've decided to press on with the 23may02 ideas. The reason is that it's significantly more convenient for 205 students, as a visit with one (a 205 student) again made clear today (27may03).
In answer to the "major problem arises ..." comment in the next paragraph, an answer is to do the ML thing, where op args can be treated as a single tuple- valued arg. If we name the args-as-a-tuple value some keyword, like say "args", then the problem described in the next paragraph is eliminated. Let's do.
While it might be cute and a tad more convenient, a major problem arises when we want to have the equivalent notation for op args, but we cannot because in the case of op args, there's no qualifying obj name, which means we'd have type names within the exprs of conditions and function values. Given this, I think I'm fine with the following strategy in the reference manual, which is pretty much there already. Viz., have a section after the basic intro to obj and op structure called "Referring to Obj Components" (which is a bit stronger than the current "Names and Types" section). We don't really need this extra strengthening, but whatever. At this point, the important thing is to make peace with the idea once and for all that we need explicitly-declared component names to refer to tuple fields.
In all likelihood, this decision does not affect the recent idea of making the right operand of "is" (formerly "isa") a type name rather than a field name. The deal is that for unions, the type name is used for tag query (as it should be it seems pretty clearly now), and the field name is used to access the value as (of) a particular type.
OK, I may just have solved the problem with "isa" as a funky operator name, particularly when its right operand starts with a vowel (as in "item isa Appointment"). How 'bout we use just plain "is" as an operator that applies uniformly to types, never to values. In this way, its use in object defs and also in type query expressions is consistent. Check it out:
    object X is A and B and C;      -- In an object declaration, "is" equates
    object A is string;             -- a name with a type expression
    object B is integer;
    object C is boolean;

    value x = {"xyz", 123, true};   -- In a value declaration, "=" equates a
                                    -- name with an expression

    value function Op(X)->X = ... ; -- The reason "=" makes sense for function
                                    -- values is the construct following the
                                    -- "=" is in fact a value

    object Y is a:A or b:B or c:C;
    var y:Y;
    ...
    if (y is A) then ...            -- Here, "is" appears in the context of a
                                    -- value expression, however it is being
                                    -- used here as a runtime type query, hence
                                    -- the use of "is" makes good sense.
    ...
    if (y is a) then ...            -- This form is now ILLEGAL, since is
                                    -- should apply uniformly and consistently
                                    -- to types, not to values.
So, with our new-found consistent use of "is", can we make the following happen?
    obj Days is Sunday or Monday or ... or Saturday;
    var day:Days;
    ...
    if day = 'Sunday' then ...

Well, this can work if we make the equality type checking rule go like this:
one operand is a union type and the other is compat (but probably not subtype compat) with one of the elements of the union. At runtime, the equality evaluation entails the two steps of checking the type, then the value.

What's potentially misleading here is that in the case of enum literals, as in the Days example, the runtime value check is unnecessary, since there's exactly one value for each opaque type. In an example like this,
    obj IntOrString is integer or string;
    var ios:IntOrString;
    ...
    if ios = 1 then ...

the need for both the runtime type check and value check is evident. At compile time, the expression ios = 1 is legal because the type of 1 is one of the IntOrString union members. At runtime, the evaluation of ios = 1 involves first checking the current tag of ios to ensure that it is integer, then performing the normal numeric equality evaluation.
Now, all of this made me think for a sec that we might not need the `.' operator at all for unions, since we can do the injection (or is it projection) now so neatly. However, we do in fact still need the `.' on unions in order to bind a union value to an explicitly-declared var of one of the union members, e.g.,
    op X(i:integer) ...
    ...
    X(ios.integer)   -- X(ios) won't work, by the (now) normal rules
                     -- of injection/projection
One last bit of syntax -- the use of "is" in an axiom declaration is not really consistent with the above observations. Using "=" as the ax declaration separator would probably be most consistent, but it causes some syntax problems, since the right operand of the ax declaration is an expression that can, of course, contain '=' as an operator. Therefore, I think ':' is a reasonable ax declaration separator; while not perfectly consistent with the other uses of "is" and ":", it's fine given the infrequency of ax decls in the kind of specs I'm likely ever to be doing. Also, upon a bit further reflection, using "=" in an ax declaration wouldn't be all that consistent with the notion of "=" for value binding, since the axiom name isn't usable as a runtime value. Hence, the name binding in an ax declaration is sort of an odd-ball case anyway, so ":" as the separator is probably as good a choice as any, if not in fact the best choice.
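So, for the record, a named ax decl would look something like this (the name and body here are made up; the unnamed axiom form appears later in these notes):

    axiom ReflexiveEquality: forall (x:X) x = x;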
I'm not sure it's ever been this clear, so if not, for the zillionth time, here is an explanation of what's legal projectionwise and injectionwise.
    obj X is A or B or C;
    op Main(x:X, a:A, b:B) =
        let x = a;   -- legal because x is already vague, and may hold
                     -- values of three alternative types
        let a = x;   -- illegal because a can only hold values of type A
    obj A; obj B; obj C;
How about if we don't require that the components of a union be distinct, but just make them that way, in the same way that a set-union operator does. E.g., the type "string or integer or string" is equivalent to (and reduces to) "string or integer", simply because we automatically throw out the second string component. What I'm concerned with here is a potential problem that stems from the combination of structural equiv and inheritance as unions. Viz., in a definition of the form
    obj Parent is integer and string;
    obj Kid1 < Parent is integer;
    obj Kid2 < Parent is integer;

In this case, the union that defines the inheritance contains two instances of the same type, which by the up-to-now thinking was an illegal union. Something has to happen to make this OK, which could be one (or more) of the following:
There is however the fundamental conceptual appeal of sym lits as distinct from string data in pretty much exactly the same sense of its appeal in Lisp.
Bottom line -- I think it would be stupid to get rid of symbolic literals entirely. What we can do is describe them clearly and rationally in the ref man and say something along the following lines: ``The use of sym lits as enum idents is in some sense conceptually more "accurate" than the use of strings for the same purpose, but if the specifier does not care about this, then using strings as enum idents is just fine.'' The accuracy is concretely evident in the preferred GUI for a particular type of data. Viz., if the preferred GUI is a free-form text-entry area, with a limited set of possible values that are parsed but not enumerated in a combo-box-style menu, then the more accurate representation is a union of string values. On the other hand, if the preferred GUI is a completely fixed list of identifiers, then the more accurate representation is a union of sym lits. [And see below about a mixed-mode free-form entry plus combo-box style GUI.]
Given this, all of the following defs are legal, and have essentially the same semantics:
definition using sym lits as enum idents:
obj Sex1 is 'Male' or 'Female';
definition using strings as enum idents:
obj Sex2 is "Male" or "Female";
definition using opaque types as enum idents:
obj Sex3 is Male or Female;
obj Male;
obj Female;
definition using defined string values as enum idents:
obj Sex4 is Male or Female;
val Male = "Male";
val Female = "Female";
Today's conclusion: I think I really like the idea of the concrete GUI hinting at the preferred specification representation. But fuck, I was just about to write the following as an example of the specification representation for a string-value combo box:
meaning that the "Alt X" strings are the initial given values and the "or string" part deals with a user-entered value. But this is not a legal union type in the latest FMSL semantics, since string subsumes all of the other values. But wait, if we mix and match sym lits with strings, we could have the following, which looks kind of sweet and may (help to) solve the age-old problem of the precise FMSL representation for such things:obj Selections is "Alt 1" or "Alt 2" or ... "Alt n" or string;
    obj Selections is 'Alt 1' or 'Alt 2' or ... 'Alt n' or string;

The reason that this may be a very accurate representation is that it can be paraphrased as follows:
a Selections value is one of the fixed alternatives 'Alt 1' through 'Alt n', or else an arbitrary string entered by the user.

The potential downside representationally here is that we don't consider Selections to be just a plain string with a set of system-supplied default possibilities, but rather a mixed-type union. But hey, this may be precisely the kind of semantics we've been looking for all along.
The I-don't-think-ever-explicitly-defined alternative to the mixed sym-lit string definition is something like the following:
    obj Selections is string;
    obj SelectionsBuiltInDefaults is "Alt 1" or "Alt 2" or ... or "Alt n";

The problem we've perennially had with such (assumed) defs is that an abstract op that takes an input of type Selections, with the input tracing to a combo box, needs (seemingly) to take SelectionsBuiltInDefaults as well. Either that, or we need to get into the definition of "abstract" view specs, as we did e.g. with Rick Myers, and postulate some op that outputs SelectionsBuiltInDefaults to the display, and then some select-from-defaults-list op that produces an integer (say) selector that is used by the actual abstract op as an input.
The mixed sym-lit/string type seemingly solves this perennial problem, but perhaps at the price of notational obscurity and complexity. Again, we can say in the ref man that the specifier may choose the representational form that suits her. And as always, we suggest that the same form be used consistently throughout a given specification.
I'll conclude with this thought. It seems potentially a bit hokey, or at best notationally strained, that

    obj Selections is "Alt 1" or "Alt 2" or ... "Alt n" or string;

is illegal but

    obj Selections is 'Alt 1' or 'Alt 2' or ... 'Alt n' or string;

is not. I.e., it's a potentially hokey "convenience" that sym lits can be used to represent combo boxes so "elegantly". We need to think about it (but not too fucking much).
Oh, and one final reason to keep (and use) sym lits -- as we've observed before, they're a good way to represent graphical icons symbolically, where a string overrepresents to some extent.
We may also want to fuck where clauses hard too. Check inputs/stack.rsl for a decent-looking parameterized-type style of generics, that can easily (and cleanly) replace the where clause crap. Then again, the idea that we can use type vars directly in a comp expression, without declaring them, may yet have its appeal. Definitely need to think about it.
Fuck symbolic literals -- hard. The point is that we'll allow the way-old definition of the form

    obj Sex is "Male" or "Female"

given the fact that we can have non-boolean concrete values in a parts spec. The fundamentally important part about this is that a definition of such a form is NOT of type "string or string", but rather of type "(the string "Male") or (the string "Female")". Hence, the following does not type check, for the indicated reasons:

    var s1:string := "Male";
    value s2 = "Male";
    op IsMale(s:Sex) = s = "Male";
    op main() = (
        IsMale(s1);   -- ERROR: s is not compat with s1
        IsMale(s2);   -- ERROR: s is not compat with s2
    )

The point is that the following definition

    obj S is "xyz";

defines S to be of type "the string "xyz"", not plain string.
And OK, this has all been made abundantly clear already in the 2jun00 entry, and others below it.
Based on what's said in exists-as-search.rsl it seems pretty darn clear at this point that nil does NOT = false. Based on all the formal-semantic error propagation work we've done, plus the fact that a pre-cond-failing op is supposed to return nil (NOT false), we need to have nil be a distinct value for all types, including boolean. On this basis, making an unconditional equivalence between nil and false is too much.

Now, we may want to make some kind of C-ish equivalence in the context of some boolean ops, but I don't think we even want to do this. I.e., the expressions "nil ... (dropped off).

I suppose we need to do one last search through all that's been said on this subject, but right now I'm pretty-to-quite happy with the decision.
OK, I just reread the 22mar02 injection and became thoroughly depressed. I think the notion that we can use inheritance to achieve what we're after is dead, when what we're after is genuine statically-typed polymorphism and generics. What needs to happen is the introduction of type variables, of the form we've been musing about for some time.
When this happens, I think we pretty much turn the opaque type business on its head, in terms of how we now see it. Viz., we currently see an opaque type as a topish kind of type from which all other types inherit, with the hope that this scheme makes opaque types act like type vars. As the 22mar02 injection makes clear, this appears to be a forlorn hope. So, what we'll do now with opaque types is put them (back) on the bottom where they belong. They're already there to some extent, in that we have the notion (though I'm not sure it's yet fully implemented) that a symbolic literal of an opaque type name is the only value of that opaque type, and further that that value is unique and unequal to any other value whatsoever. The reason that this is a complete head-turning is that instead of opaque types being compatible with all other types, they're now not compatible with any other types. E.g., the equality check in the following axiom can never type check correctly:
    object X;
    object Y;
    axiom (forall x:X, y:Y) x = y;

But wait, I just ran this through the type checker and it's fine, i.e., it does not type check.
But wait again (to reject the supposedly "very cool (re?)revelation here" below). Given the statement "... a symbolic literal of an opaque type name is the only value of that opaque type, and further that that value is unique and unequal to any other value whatsoever", I believe (seemingly CONTRARY TO the "very cool (re?)revelation" below) that the "x = y" equality check in the immediately preceding example should in fact NOT type check correctly. To be consistent with the values of distinct opaque types being distinct, the types themselves should in fact be distinct. The reason the "very cool (re?)revelation" works in the context of type-var-test.rsl is that there's inheritance going on that's not present in the immediately preceding example. The bottom line is that the immediately preceding example should NOT in fact type check, but the type-var-test example should. The way that the immediately preceding example could type check correctly is if we liberalized the type checking rules for equality to say that any types can be compared for equality, but that if the types are different, equals automatically returns false. My immediate reaction to this is that it subverts type checking pretty badly, but it's worth thinking through in the context of the "very cool (re?)revelation" that now follows immediately.
OK, I've just had a very cool (re?)revelation here. The revelation is that we shouldn't view the 22mar02 example an incorrect result from <, but rather as a vacuously false result from <. To recap, here's the type-var-test.rsl ref'd in the 22mar02 injection:
    obj TypeVar;
    obj X < TypeVar = integer and string;
    obj Y < TypeVar = string and integer and string;
    op CompareXY(x:TypeVar, y:TypeVar) -> boolean = ( x < y );
    op main(x:X, y:Y) -> boolean = CompareXY(x, y);

What we were depressed about here is the fact that given the significant structural difference between x and y, there's no way we can be doing a meaningful runtime comparison in the expression x < y. But hey, that's just fine here. What static type checking says in a case like this is that the only thing that can be compared in an op like CompareXY is whatever components x and y have in common, which in this case is no components at all, given that they inherit from an opaque type. Hence, in this case, the x < y comparison is always vacuously false, by the presumed rule that 'TypeVar' < 'TypeVar' is false. And the reason that x < y reduces to 'TypeVar' < 'TypeVar' in this case is because in the context of CompareXY, where the inputs are both TypeVar, the only thing we can be comparing between x and y is the common TypeVarness, which means that, as they should be, the other components of x and y are ignored, i.e., unseen, in the context of CompareXY. If the body of CompareXY were changed to x = y, basically the same thing would happen, except the result would be vacuously true, given that in this case we'd be comparing 'TypeVar' = 'TypeVar'.
So, what this all boils down to is that the only really wrong thing in type-var-test.rsl is the mnemonic suggestion of the name "TypeVar" for the opaque type, because this type name isn't really acting like a first-class type var at all. (And this is in fact what we said in the 22mar02 revelation, but were (hopefully) unnecessarily depressed about at the time.) And when all of this gets sorted out, it'll probably be just dandy to have an explicit section in the ref man that addresses this issue head on, perhaps even titling the section something like "Opaque Types are Not Type Variables". Cool.
Now, if we do want real type vars, we're gonna have to add them to the language, which is what this discussion is all about. So, here's what a for-real type var test would look like (the code follows the next paragraph):

The proposed error message in it indicates what the problem with CompareXY here is. Viz., the body of CompareXY requires that there is a "<" operator defined between the two args of CompareXY. Further, the signature of CompareXY requires that the actuals be of the same type. (Error-messagewise, we may want to tone things down here in the actual implementation, since the type failure in this case is not affected by the lack of a less-than op, but just by the fact that the inputs to CompareXY must be the same type. I'm sure we can work out these details just fine. At this point, the error message example is useful to explain the thinking here, and we'll take it as such.)
obj JustSomeOpaqueType;
obj X < JustSomeOpaqueType = integer and string;
obj Y < JustSomeOpaqueType = string and integer and string;
obj $TypeVar; -- Just for pre-type-var-implementation compilation
op CompareXY(x:$TypeVar, y:$TypeVar) -> boolean = ( -- $ (for emacs)
x < y
);
op main(x:X, y:Y) -> boolean = CompareXY(x, y); -- ERROR: actual parameters
-- are incompatible with type
-- variable constraints:
-- exists <($TypeVar,$TypeVar)
obj $TypeVar1; -- Just for pre-type-var-implementation compilation
obj $TypeVar2;
op CompareXYOK(x:$TypeVar1, y:$TypeVar2) -> boolean = (
x < y; -- Should be OK once type vars are properly implemented
);
op "<"(x:X, y:Y) = x.integer < y.integer; -- Should be OK once we get the
-- new/old field ref stuff working
op main2(x:X, y:Y) -> boolean = (
CompareXYOK(x, y); -- Should be OK once we get overloading finished
CompareXYOK(1, "xyz"); -- ERROR: actual parameters are incompatible with
-- type variable constraints:
-- exists <($TypeVar,$TypeVar)
CompareXYOK(y, x); -- ERROR: actual parameters are incompatible with
-- type variable constraints:
-- exists <($TypeVar,$TypeVar)
);
Now, in the CompareXYOK op, things can work out OK because of the use of two different type vars in the signature and the provided overload of "<". This means that the first call to CompareXYOK works, but the second does not, given that there's no built-in or user-defined overload of "<" for types integer and string. The third doesn't work either, given that the provided overload of "<" is not commutative.
In the Wirthian mode, we should test our new-found inspiration by writing some (more) programs. A particularly good example of where things can come together is in the specs for a Redraw op on graphic canvases, where the power of virtual-function dynamic binding is so evident. A nagging worry about fully static type checking has always been having to do the virtual-binding dispatch logic explicitly by using the `?<' and `.<' operators in a big switch statement that fans out on all subtypes of a parent type. What's particularly worrisome about this is the fact that whenever we add a new subtype, we need to go to all of the places where such dispatch logic exists and add a new case for the newly-defined subtype. This seems like exactly the kind of place where the language should be helping us out. As we know, the problem with the kind of help we get from the O-O crowd is that type checking must go dynamic, which we definitely want to avoid.
Well, I think I may have just come up (and it's probably a reinvention of someone else's smart wheel) with the way to do it with genuine type vars, including quantification over types. The idea is that instead of dynamically dispatching over a subtype, we'll statically dispatch using a type-valued quantification. Here's an example using the Redraw op:
    object GraphicObject is ...;
    object Line < GraphicObject is ...;
    object Rectangle < GraphicObject is ...;
    object Ellipse < GraphicObject is ...;
    object Canvas is graphics:GraphicObject* and ... ;

    op Redraw(canvas:Canvas)->canvas':Canvas
        post: forall (g in canvas.graphics)
                  forall ($GraphicSubtype in GraphicObject)
                      if g ?< $GraphicSubtype
                      then ProperlyDrawn(g.<$GraphicSubtype)
                      end

Here's the same example with slightly different syntax, which I think I prefer:
    object GraphicObject is ...;
    object Line < GraphicObject is ...;
    object Rectangle < GraphicObject is ...;
    object Ellipse < GraphicObject is ...;
    object Canvas is graphics:GraphicObject* and ... ;

    op Redraw(canvas:Canvas)->canvas':Canvas
        post: forall (g in canvas.graphics)
                  forall (?GraphicSubtype in GraphicObject)
                      if g isa ?GraphicSubtype
                      then ProperlyDrawn(g.<?GraphicSubtype)
                      end

Shit, if this is correct, or can be made to be, it's about as scarily invigorating as last night's rediscovery was depressing. What it means is that the type checker is doing the dispatch for us, but it's not at runtime. This is exactly what we're after here. I have a slight feeling that there may be some hidden conceptual problem lurking in here. However, given that we've defined inheritance strictly and completely in terms of unions, it seems pretty clear to me that when we're quantifying over types, it reduces to quantifying over the list-valued semantic representation of a parent type, which is just a list of its inheriting types. I.e., we can see with the formal semantics in mind that this should work out fine.
As a bit of formalization, we can say that the new `in' form of the quantifier expression is equivalent to the following more basic form:
    forall ($GraphicSubtype:object) $GraphicSubtype ?< GraphicObject

I think a really important observation here is that we're NOT going to define a Javaesque top Object or a generic any type. This means that the keyword "object" in the context of the above quantifier expression is not being used as a type name, but as a meta-type indicator that can only be used when the name part of a name/type pair is a type variable name, not a value-variable name. I.e., types are not first-class values here (though we might imagine that they could be, meaning we could allow both static and dynamic typing together, but I definitely think we need to keep this can-o-worms closed till we (at least) get the static type var stuff worked out, including, one would hope, an attribute/denotational semantics thereof). What I'm pretty clear we're looking at here is a clarification of our entity categories, that allows us to have value-valued variables (what we have now) as well as type-valued variables, but the twain does not meet between the two, and type-valued variables are only computable at compile time, not at runtime. To ease the lurking-qualm feeling a bit more, I think we're on pretty safe ground here given ML, the only potential problem being the issue of inheritance, which again, given the straightforward union-based semantics thereof, should work out. Also, we should read the "adding inheritance to ML" paper that we came across in POPL during the 530 paper scan.
Change the precedence of and and or to be less than equality ops, as in CJ. DONE.
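E.g., with the new precedence, an expr like the following parses without parens (illustrative only):

    a = b and c = d    -- now parses as (a = b) and (c = d)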
For convenience, the components of an object can be referenced directly by their type names. For example, given the following definition
    object ABC is A and B and C;

access to the components of ABC can be made as follows
    operation Op(abc:ABC)
        pre: abc.A = abc.B and abc.C = ... ;
    end;

When an object has two or more components of the same type, the components are disambiguated using a positional suffix of the form #n, as in the following example
    object PairOfA is A and A;
    operation Op(pa:PairOfA)
        pre: pa.A#1 = pa.A#2
    end

The positional order is from left to right for each component of the same type. In this example, pa.A#1 refers to the first component in PairOfA, and pa.A#2 refers to the second component.
Using the '#' disambiguator is the only way to reference components by name in objects with multiple same-type components. For example, a reference to PairOfA's A component without a '#1' or '#2' is an error.
As another example, consider this definition:
    object ABCABA is A and B and C and A and B and A;

The components of an identifier a of type ABCABA can be referenced as follows:
    Reference   Which Component
    a.A#1       The first component of type A
    a.A#2       The second component of type A
    a.A#3       The third component of type A
    a.B#1       The first component of type B
    a.B#2       The second component of type B
    a.C         The only component of type C

where first, second, and third refer to the left-to-right position of components of the same type.

(An aside on auto-gen'd constructor ops: the auto-gen'd op type name is "OpFooXandYandZtoXandY". Somewhat ugly, but (a) quite consistent with intuitive type spellings of signatures; (b) fine given how infrequently users are likely to use auto-gen'd op type names; and (c) fine because this form of auto-gen'd name is unlikely to coincidentally match the name of a user-defined type, particularly if normal RSL conventions are used, which disallow the use of lowercase "and" as a name word separator.)

In an object with inherited components of the same type, the positional order is from the top of the inheritance hierarchy downward, and from left-to-right in multiple inheritance order. Consider the following example:
    object Top is x1:X and x2:X;
    object Middle1 inherits from Top is x3:X and x4:X;
    object Middle2 is x5:X and x6:X;
    object Bottom inherits from Middle1 and Middle2 is x7:X and x8:X;

In this case, object Bottom has a total of eight components of type X. The ordinal position of these components is indicated by the names x1 through x8. That is, these names indicate the order in which components are referenced by a disambiguating suffix. For an identifier b of type Bottom, b.X#i references the component that is named xi, for i between 1 and 8. This example also illustrates that a component can be referenced by either its type name or its explicitly-declared name.

In a tuple with one or more value components, those components can be referenced directly by that literal value. For example
    object OneAndTwo is 1 and 2;
    operation Op(ot:OneAndTwo)
        pre: ot.1 != ot.2;
    end;

A value component can only be referenced by its specific literal value, not by a variable or constant that contains the value. For example, the following is illegal based on the preceding definition of the tuple OneAndTwo:

    operation Op(ot:OneAndTwo, i:integer)
        pre: (i = 1) and (ot.i != ot.2);
    end;

While it may be untypical, literal components of the same value can be referenced using '#' in the same way as other components. For example,
    object OneAndOne is 1 and 1;
    operation Op(oo:OneAndOne, i:integer)
        pre: oo.1#1 + oo.1#2 = i;
    end;

Here, since object OneAndOne has two literal components of the same integer value, they can be referenced using '#' to disambiguate them.

There is one additional form of reference for tuple components -- they can be referenced directly by ordinal position, without a name at all. For example,
    object ABC is A and B and C;
    operation Op(abc:ABC)
        pre: abc#1 = abc#2 and abc#3 = ... ;
    end;

A numeric suffix of the form '#n' refers to the nth component of a tuple, in left-to-right order of component declaration. Hence, abc#1 refers to the A component of ABC, abc#2 to the B component, and abc#3 to the C component. Probably not: the positional suffix '##' refers to the last component of a tuple.

The '#' operator can be used in a sequence of component references in the same way as the '.' operator. Consider this example:
    object ABC is A and B and C;
    object A is D and E;
    object E is F and G;

For a variable a of type ABC, a#1#2#1 is a reference to the F component of type E.

A sub-tuple can be accessed using a range of lower and upper ordinal positions. For example,
    object ABCDEFG is A and B and C and D and E and F and G;
    operation ChangeC(a:ABCDEFG, c:C) = a#1..2 + c + a#4..;

Operation ChangeC concatenates the first two components of a with a single component c, then with the components in the fourth through last positions of a. The range form #n.. refers to the sub-tuple from position n to the last position.

The use of '#' as a positional component reference is of lower precedence than its use as a component name disambiguator. Consider this example:
    obj AABC is a1:A and a2:A and b:B and c:C;
    obj A is d1:D and d2:D;

and a variable a of type AABC. In this case, the expression "a.A#2" might be considered ambiguous, since it could refer to a.a2 or to a.a1.d2. The rule used to disambiguate such cases is that '#' is interpreted first as a name disambiguator if necessary, and second as an absolute position index. Therefore in this example, a.A#2 is interpreted as a reference to a.a2, not to a.a1.d2. In fact, the interpretation of "a.A#2" as a reference to a.a1.d2 is not possible, since "a.A" by itself is an illegally ambiguous reference to one of the two components of type A. To access a.a1.d2 using '#', the legal expression is a.A#1#2.

It is worth noting that the use of '#' as a component reference cannot be confused with its use as a length operator. '#' is used as a component referencer when its left operand is a tuple value or a type name. '#' is used as a length operator when its right (and only) operand is a list, string, or numeric value.
To summarize the different forms of component reference, consider the following definition:
    object ABCA is a:A and b:B and C and A and 5;
    object A is ...

and an identifier a of type ABCA. There are four forms of reference by which components of a can be accessed:
    Type of Reference                      Examples
    explicitly declared component name     a.a, a.b
    unique component type name or value    a.B, a.C, a.5
    positionally-disambiguated type name   a.A#1, a.A#2
    absolute component position            a#1, a#2, a#3, a#4, a#5

The positional ordering of components applies to union objects as well as to tuples. Consider for example
    object U is i:integer or s:string or b:boolean;

and an identifier u of type U. The type-name and positional references to u's components are defined as follows:
    Explicit-Name   Type-Name   Positional
    Reference       Reference   Reference
    u.i             u.integer   u#1
    u.s             u.string    u#2
    u.b             u.boolean   u#3

Given the higher precedence of and composition over or composition, care must be taken when using positional references in an object defined as a combination of and and or operators. Consider the following example:
    object X is integer or string and real or boolean;

Given the and/or precedence rules, the default structure of X is

    integer or (string and real) or boolean

That is, X is a three-element union, not a two-element tuple. For an identifier x of type X, the following table defines the legal type and positional references:

    Reference    Meaning
    x.integer    the value of x as an integer
    x#1          same as x.integer
    x#2          the value of x as the two-tuple (string and real)
    x#2.string   the string value of the first component of x#2
    x#2.real     the real value of the second component of x#2
    x.boolean    the value of x as a boolean
    x#3          same as x.boolean

Since the right-hand operand of '.' must be an identifier, the second component of x is only accessible positionally. That is, there is no such form as

    x.(string and real)

to access the second union component of x. As explained earlier, such an anonymous component can be given a name, as in the following definition

    i:integer or sr:(string and real) or b:boolean

which allows the second component to be referenced by explicit name as x.sr.

Given the various forms of tuple reference, one might ask under what circumstances the different forms are convenient. This clearly needs to be finished, with some compelling rationale and examples. We should definitely note that consistency of notation is a very good idea, to make specs understandable.
Editorial Note: There is definitely a trade-off working here, between notational convenience versus complexity. An argument against convenience goes like this: "Since the main point of SpecL is to be precise and help specs be understandable, it may not be a good idea to have things work in some sense by accident. I.e., being able to refer to tuple components by their type names can cause understandability problems in a number of subtle ways."
23mar02 -- Limits on (Multiple) Inheritance
There can be no circularities in an inheritance hierarchy. E.g., the following is illegal:
    object X inherits from Y;
    object Y inherits from X;

An object cannot inherit more than once from any other object, directly or indirectly. E.g., the following is illegal

    object X is ...;
    object Y inherits from X and X;

as is the following

    object Top is ...;
    object Middle inherits from Top is ...;
    object Bottom inherits from Top and Middle;

22mar02
See the "(22mar02 injection)" under the 14ded00 entry below.
4dec01 -- Aux Functions as Vaporware
In thinking about parsing, it seems to me that I might rather not write the aux functions to do this, but instead spec them out. This leads to the concept that one can write aux functions constructively as bodies or analytically with pre and post conds. Hmm, does this work? If so, let's make sure we can explain and rationalize it.
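A hedged sketch of the two styles (names and ops invented for illustration):

    -- constructive: the aux function has an explicit body
    function Square(i:integer) -> integer = i * i;

    -- analytic: the aux function is spec'd with a postcond and no body
    function Square2(i:integer) -> j:integer
        post: j = i * i;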
And one more observation on parsing -- we need to figure out how the objects-as-BNF-rules scheme fits into things wrt parsing. I.e., can we assume that we automatically have a parser if we write the grammar for something?
The motivation for this comes from Ciera asking about parsing a day/time string into a day/time tuple object. I started writing this up using parsing aux functions in ~/classes/205/examples/specs/parse-day-time.rsl, and started to see how ugly things were getting in terms of reinventing the parsing (or even just lexing) wheel.
4dec01 -- Yet More on Exists as Selector and nil <=> false
OK, in working today with Keiko Tamura on part of the specs for EClass, I had the distinct desire to be able to use exists as an Object-valued operator. If we go back to (or just stick with) the axiom that nil === false, then we could make exists Object-valued. Notwithstanding what's said in ./if-then-else-truth-table and ~/work/rsl/testing/implementation/acceptance/new-inputs/exists-as-search.rsl, I'm pretty comfortable at this point with saying that nil === false.
9nov01 -- More Real Formal Stuff
See the discussion in new-inputs/exists-as-search.rsl
We should define formally that the type of nil is Object. Then we need to read everything we've said about the equivalence of nil and false, and make sure things are consistent -- in particular, the remarks to the effect of "I don't know if I'm happy with the notion that (false = nil) = true". Get this ironed out ASAP!
28sep01 -- Type Any, Yet Again
In recent thinking, e.g., retention-and-no-junk.rsl, q.v., we've been considering using the Javaesque identifier "Object" to be the top of the RSL type lattice. While this does have some appeal, particularly in that it helps students and others familiar with Java relate, I'm at the very moment leaning back to calling the top type "any". The reason is that we have a potential conceptual problem with where atomic types fit in the type hierarchy. Java in fact has this same problem, in that the atomic types don't inherit from Object. Given these observations, I think it might in fact be misleading to use a capitalized typename as the top of the lattice, rather than a keyword or at least keyword-like identifier such as any.
Whatever we call it, there is the seemingly ever-unresolved issue of what operators are applicable to any. I think retention-and-no-junk.rsl provides at least one clearly legal op that can be applied to any, viz., in, as in o in out_list, where o is of type any and out_list is of type any*.
So, if we fully enumerate the ops that are applicable to any, and explain how it serves as the top of the inheritance hierarchy, and in some sense the archetype opaque type, then the ever-unresolvedness will be resolved, and life will be wonderful. You know what to do, boy.
27jul01 -- Real-Life Example to Press Understanding of Where Instantiation

The reason the instantiating where type needs to be a subtype of what it's replacing is to avoid type-breaking mutation. This is because the where clause is always used in the context of inheritance, which means that if we let an instantiating type arbitrarily change an inherited component, then a ref to that component when a value is in a parent-type var could be bogus. E.g.,
    obj Parent = o1:O1 and o2:O2;
    func Foo(p:Parent)->string = p.o1.s;
    obj O1 = i:integer and s:string;
    obj O2;

    obj Child < Parent
        where: O1 = O3;   -- ERROR: this must be prevented per latest
                          -- semantics that say instantiating type must
                          -- be subtype of type it replaces
    end;
    obj O3 = i:integer and s:O1;

    func Foo2(c:Child)->string = c.o1.s;  -- This egregiously breaks typing,
                                          -- since if the bogus where clause
                                          -- actually worked, c.o1.s is no
                                          -- longer a string, but rather an O1

    -- We be in trouble here, since both the preceding and next lines should
    -- not be working.  I.e., we can't have it both ways -- c.o1.s is either a
    -- string or an O1, but not both.  Something's wrong somewhere, either
    -- because we haven't finished type checking the return type of a function
    -- body, or the where clause instantiation is bogus, or some of both.
    func Foo3(c:Child)->string = c.o1.s.s;

    func Foo4(c:Child)->string = c.o1.s.s.s;  -- Sanity check -- this is in
                                              -- fact an error, as it should be

    func Foo5(c:Child)->string = c;  -- OK, it looks like the problem is that
                                     -- the type checking of func bods against
                                     -- return types is not done yet.  Never-
                                     -- the-less, the above bogosity
                                     -- observations still hold and need to be
                                     -- fixed.

    -- OK, the preceding misses the point that we're trying to get at.  Here
    -- it is.
    func Foo6(c:Child)->string = Foo(c);  -- Breaks types if where
                                          -- instantiation worked as it
                                          -- appears it should.

    -- The deal with Foo6 is that its call to Foo with c as an arg breaks
    -- types inside Foo, since c is sent in as a Parent, but when it gets
    -- there, its mutated o1.s component ends up masquerading as a string when
    -- it's really another O1.  This be the reason that where instantiators
    -- must be subtypes of what they sub for.

The "real-life" place where this has just arisen is in the discovery that recurring info needs to be different in Meetings versus MeetingRequests in the Cal Tool specs. Hopefully you get the idea well enough in terms of what needs to be done about this, both in the Cal Tool spec and RSL type checker. Get the fuck to it, dickhead.
8jun01 -- Why It's OK to Use Type Name in a Tag Expression
Since or'd components must be distinct, it would be OK to use the type name as the 2nd operand of '?'. This is way-old stuff, but it keeps coming back. The rule could go like this: "If an or'd component has an ident type name, then that type name can be used as the right operand of a '?'".
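A made-up sketch of the would-be rule in action (Card, Number, and Face are all invented):

    obj Number = n:integer;
    obj Face   = name:string;
    obj Card   = Number or Face;   -- or'd components with ident type names

    func Value(c:Card)->integer =
        if c?Number                -- the type name as the right operand of '?'
        then c.n
        else 10;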
1mar01 -- Yet More on the Inverted Pyramid Issue
We want/need a way to create a function out of pre/postcond logic. There's mention of this in the book formal-spec chapter, but it's a bit equivocating. The deal is that we need syntax of the form

    OpX.:post(...)

where the ... are all of the inputs (? and outputs?) to OpX. The deal is that the pre and post attributes are defined functions from any to boolean, where the any-valued input signature specializes to the arity (? and coarity?) of each op. The question marks here have to do with not yet understanding how we invoke a function in a postcond and then say that function's postcond is true. In particular, does the function denoted by OpX.:post(...) actually run OpX, or do we need to run OpX first, e.g., in a let clause, then call OpX.:post with the same inputs as we ran OpX with, plus its returned outputs? Hmm, functionally, it seems at the moment that the former case can't work, given that it would involve some kind of call-by-var semantics we don't have. What we need is a notation that's not dreadfully unwieldy, but still functional. THIS NEEDS TO BE WORKED OUT. (Sorry, but I'm not really dropping off here, I just need to get home and wait until a bit later to figure this out.)
8jul02 update: Let's do this for sure. The javaization of the notation should allow ".pre" and ".post" as suffixes to a class' methods. This syntax is fine, since it doesn't conflict with an existing Java use of `.', I'm pretty sure.
16sep02 update: I'm not really sure what I was getting at in the paragraphs a couple back, i.e., with the question mark biznis. The deal is, it seems to me, that the functionized postcond is a boolean-valued function in its own right, to be used in the postcond of some other function as a convenience. So, the call-by-var semantics is not really an issue. I.e., here's what goes on: In op B's postcond we want to say that op A's postcond is true. But in order to say this, B must produce some or all of the same outputs as A. Therefore, the way we use the A.post function is to pass it the outputs of B so that they are validated as they would be for A. That is, what is true of A's outputs is true of B's. The reason, I think, that raising the call-by-var issue is really a non-issue is that, functionally, the only way that B can have the same effect as A is for it to produce the same output(s).
But hey, why do we need A.post at all then? Why don't we just equate whichever of B's outputs we care about to the return value(s) of A? While this means that B's postcond "calls" A, why should this be a problem as long as the call is in the context of a boolean expression? Hmm, have we really misunderstood things this badly for so long? I.e., has there never been a need for a functionized postcond at all?? Fuck, this needs to be worked out by examining all of the contexts in which I thought that functionized postconds were necessary, wherever the fuck they are. Per the above note, we should start with the mentioned text chapter.
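To make the two alternatives concrete, here's a rough sketch, with invented ops A, B1, and B2, using the javaized ".post" suffix from the 8jul02 update:

    op A(i:integer)->(o:integer)
        post: o = i + 1;

    -- Functionized-postcond style: B1's output is validated as A's would be.
    op B1(i:integer)->(o:integer)
        post: A.post(i, o);

    -- Direct-call style questioned just above: equate B2's output to A's
    -- return value, inside a boolean expression.
    op B2(i:integer)->(o:integer)
        post: o = A(i);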
15dec00 -- Semi-Epiphany
OK, in pondering the 14dec00 TODO LOG entry for typechk.c, I think I'm onto the idea of tagged values at runtime, basically as we've always been onto it, but with an, I think, new embellishment based on spelling-based type checking. The embellishment is that values with discernible named types will be tagged with that type name; otherwise they're tagged with the structured type.
Well, in writing down the preceding paragraph, I realized that there's nothing really new there at all. Viz., we'll have tagged values at runtime that are tagged with a type struct that can be an ident type or not. Now, what needs to be clarified is exactly when values get tagged with ident types. The answer is under the following circumstances (a small sketch follows the list):
- when bound to a particular op arg of a named type;
- when built by the built-in constructor for an obj;
- perhaps other ways, but none that I can think of right now.
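Here's the promised sketch of the first two circumstances (Point and Dist are invented):

    obj Point = x:integer and y:integer;

    func Dist(p:Point)->integer = p.x + p.y;   -- a value bound to p gets
                                               -- tagged "Point" (op arg of
                                               -- a named type)

    -- The value Point(3, 4) is tagged "Point" (built-in constructor), whereas
    -- the bare tuple {3, 4} is tagged only with its structured type.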
Now, the other thing to talk about today is why it's OK for '<' to be fully overloaded for all types, including opaque types. The reason is that we'll do structural typing for everything, unless there's an explicit overload of '<' for some type(s) that want it. In this way, it seems that we get the best of both worlds: full structural comparisons on values that are tagged with unnamed types, versus specialized comparisons on values that are tagged with ident types for which an overload of one or more comparison ops has been defined.
I think that's all there is to it, and so at this point, I'm not quite sure just how epiphanous it was, but it does seem that there were some things said explicitly that have not been so said up to now.
14dec00 -- Nearing Conclusion on 7dec98 BIG BUG
OK, from what I can tell, we do want to say the following:
- forall type T, opaque type OT, T < OT

We've now added this as a (currently the last) type compat rule in the formal-type-rules file.
The deal is that since an opaque type has no structure at all, it can have no ops that do anything structural to it, and hence any op that takes any opaque type as an input can have any other type sent to it, since there's nothing the op can do to the opaque type that could be illegal to do to any non-opaque type sent to it.
It would appear that the preceding paragraph effectively answers "Yes" to the question of whether an opaque type is the same thing as a type var. It is the same thing by the very reasoning in the preceding paragraph.
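A tiny sketch of the rule at work (OT and Id are invented):

    obj OT;                   -- opaque: no structure at all

    op Id(x:OT)->OT = x;      -- nothing structural can be done to x

    -- Under "forall T, T < OT", calls like Id(5) and Id("abc") are both
    -- type-correct, since Id can't do anything to x that would be illegal
    -- for any non-opaque argument type.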
(22mar02 injection -- Look fucker, an opaque type is not the same as a type var in the ML sense, since there's no guarantee that two args of the same opaque type will be dynamically the same. See, e.g., new-inputs/type-var-test.rsl. Also, the deal that an opaque type has "no structure at all" and is therefore OK as the parent of any other type is bullshit given structural equality. The deal is that structural equality must know the structure of two types, and given the subtype compat defs we've allowed, all the built-in relational ops must be defined in terms of dynamic types. This is pretty fucking depressing right at the moment, given the type correctness of the "x < y" line in the new type-var-test.rsl file. Again, what this means is that the deal with ML's equality type stuff is biting us again. I was thinking there may be some way to strengthen static type checking by doing some kind of dynamic binding. Also, I'm questioning right now the idea that we need to say something as strong as X < OT, forall X, OT. Why not just X < OT when that's the way it's explicitly declared? Fuck, this is getting really annoying going around in fucking circles seemingly endlessly here. One final piss-off tonight -- do the spelling rules really mean that X* < Object*, forall X? We need to work this out, since we just added this to typechk.c, where we let the basetypes of two arrays use assmntCompat instead of compat.)
We might ask at this point what good an opaque type is if they're all really the same structurally. Well, the answer has to be in the name itself, and hence there can only be a meaningful distinction between opaque types where the name itself makes a difference. This appears to be precisely the case when we're instantiating a where clause, since we're using the name in the where clause syntax. This seems therefore to obviate the need for any additional '<' or '<=' syntax in a where in addition to the normal '=' syntax. The deal in this regard is that, again, since all types are compat with an opaque type, we don't need the '<' or '<=' syntaxes for where clause substitution.
Now, another thought is that the '<' in the above quantifier expression is strictly '<', not '<='. By the current type compat rule, it seems that '<' is strict in the sense that it implies compatibility, not equivalence. This needs (a bit) more thought.
If all this is true, it fixes the big BIG BUG problem, and it should be implemented, as soon as we do the last bit of thinking about it. So, get the fuck to the thinking, and thence implementation.
10dec00 -- Quick Reminder about DD Generator
In ref man revision, don't forget to describe how html tags can be used in entity descriptions to improve formatting in DD. Very cool javadoc-like stuff, this.
3dec00 -- Some Quick Thoughts on RSL Manual Updates and Related Topics
In the manual, include an appendix of the current standard libraries.
It's pretty clear that we're going to have an RSL library of common forms. This will require that we finish the implementation of multiple inheritance and the new module import/include forms. Candidates for libraryhood currently include the following:
- object OrderedCollection, with ops to sort, etc.
- function SumList (see new-inputs/sum-list.rsl)
- function AddWithRetentionAndNoJunk
- object Color
- object Text

Upon looking at the AddWithRetentionAndNoJunk function, this seems like a very nice way to allow reuse of logic without having to go the (multiple) inheritance route. We should describe both ways (i.e., reusing with inheritance versus aux functions) and be clear about why inheritance sucks. (Or then maybe it doesn't, given the most recent musings in sum-list*.rsl about the seeming similarity if not equivalence of the OBJ theory/view concept with the Java interface concept.)

Here's some verbiage for the manual about how to clearly explain the should-be simple concept of enumerations:
In a number of programming languages, including C and C++, there is the concept of an enumeration. Here is a typical example:

    enum DaysOfTheWeek = {Monday, ..., Friday}

[Hmm, maybe we should just not worry about PLs (too much), and just explain enumerations directly in the several RSL forms. OK, here comes.]

A very typical use of or composition is to define what are typically referred to as enumeration types in programming languages. Here is an example:

    enum DaysOfTheWeek = {Monday, ..., Friday}

From here we describe that Monday, etc. are the enum list, and then describe how to represent them as either strings or symbolic literals.
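A sketch, with invented obj names, of the two representations just mentioned (whether literal strings are allowed directly as components is still up in the air per notes further down):

    obj DayOfTheWeek is 'Monday' or 'Tuesday' or 'Wednesday'
                     or 'Thursday' or 'Friday';       -- symbolic literals

    obj DayOfTheWeekStr is "Monday" or "Tuesday" or "Wednesday"
                        or "Thursday" or "Friday";    -- plain strings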
Here's some verbiage for the ref man about the difference between strings and symbolic literals:
The difference between double-quoted strings versus single-quoted symbolic literals is subtle but significant. First off, if you don't want to worry about the difference, you can get along just fine in RSL by using strings everywhere that you might otherwise use a symbolic literal, and just forget about symbolic literals entirely. If you care, read on.
Now go on to describe how symbolic lits are more abstract; that the string/symbolic-lit distinction is precisely the same as the string/symbolic-atomic distinction in Lisp; that when modelling views or other objects that have abstractly atomic but concretely non-atomic structure, symbolic lits are a more accurate model (albeit subtly so); also that when modeling grammars, symbolic literals provide the means to define abstract versus concrete syntax.
And fuck me if that doesn't all make some pretty good sense.

3dec00 -- Some Details on the New import/include/export Scheme
To get the effect of nested modules, we do like this:

    module Outer;
        export *;
        include Inner1, Inner2, Inner3;
    end Outer;

    module Inner1;
        export X;
        obj X;
    end Inner1;

As in Java, we don't physically nest modules inside of each other, but create a form of "virtual" nesting via include. Now, it seems to me that the only reason we need nesting is for namespace control, which Java does by qualifying the name of the package that a file's worth of stuff is in. For us, we get the same effect by a definition of the Outer form above, since we can say

    module User;
        import Outer;
        op Use(x:Outer.Inner1.X) = ...;

The point here is that if we assume that the names "Inner1" and "X" are both potentially widely-used symbols, then we want qualification not only of "X" but of "Inner1" as well. If we didn't care about this, then we'd write User like this:

    module User;
        import Inner1, Inner2, Inner3;
        op Use(x:Inner1.X) = ...;

We should note that export makes available not only the entities directly listed, but the transitive closure of the components of the entities for those who use the '*' form of importation.
In summary, we have the following consistent and orthogonal semantics:
- "import" and "include" both make exported symbols visible in module.
- The (only) difference between the two is is that "import" is qualified and "include" is unqualified.
- The '*' form means do a transitive closure on all components. We need to be careful and fully clear here about what happens when we transitively close through a series of nested modules. It shouldn't be a problem, we just have to make things completely clear. We may want to go back to the Mod-2 definitions where this was spelled out, including what I recall was a difference between version 1 and version 3, or something like that, as well as the idea of import/export of a record working transitively on the record fields.
30nov00 -- More Ideas on Executability
Problems and issues include:
- Implementing unbounded quantification by quantifying over all allocated object values, where allocation takes place by invocation of a constructive op. The problem here is that if we execute via only validation invocation, we don't really construct objects. Obviously, we need to think about this. Oh, I guess I have thought about it -- it's down in the next section under "I had a flash today".
- How to show, or distinguish, validation invocation in a DFD. A thought is to specify an overall mode for DFD (and postcond) invocation, rather than a special form as shown below. However, such global modes don't seem particularly appealing. Anyway, we need to think about this some more.
14nov00 -- Ideas on Executability
One form can be providing input/output pairs and having the pre and postconds evaluated. I'd like to have first-class RSL invocation syntax for this form of "validation invocation", which might look something like this:
    op Foo(i:integer, s:string)->(i':integer, s':string)
        pre:  i > 0;
        post: (i' = 10) and (s' = "abc");

    op Test() =
        begin
            Foo(1,"xyz")?->(10,"abc");   (* Produces value {true, true} *)
            Foo(1,"xyz")?->(9,"abc");    (* Produces value {true, false} *)
            Foo(0,"xyz")?->(10,"abc");   (* Produces value {false, nil} *)
        end

The idea is that the invocation form

    op-name(value, ...)?->(value, ...)

always produces a tuple of type boolean and boolean. This looks awfully sweet to me at this point. Note that when the precond evals to false, the postcond must eval to nil.
Dealing with quantifier eval will of course be an issue. I had a flash today that we could deal with unbounded quantifiers by saying we quantify over all the concrete objects of a given type that have been instantiated over the course of a test, where instantiation means that an object has been created and bound to a currently live location (i.e., in a live var directly or within a composite value that's bound to a live var). In this way, if we wanted to have a large pool of active values of a particular type over which to quantify, where the pool holds what we consider to be a representative collection of values, we could do something like this:
    obj IntAndString is integer and string;

    op GenInstancesOfIntAndString(i:integer, s:string)->IntAndString* =
        ( if i > 10000
          then []
          else [i, string(i)] + GenInstancesOfIntAndString(i+1, "") );

    op Test() =
        begin
            var int_and_string_pool:IntAndString*;
            ...

Another way to do things might be to quantify over all values of a given type ever created. In this way, we wouldn't need to stick the values in a list, though we could if we wanted. This is sort of MLish, in the sense that values live forever. And I think we have a firm handle on "created" -- value creation happens whenever an op that outputs a particular type of object is successfully executed.
This hints at the idea that we could define some sort of quantifier-generator function that defines over which specific values of a given type we quantify. We could define such a function as a set of constraints over all values within a given type. This sounds interesting, and definitely worthy of further thought.
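Purely as a sketch of the constraint-function idea (the obj, the generator name, and its signature are all invented):

    obj RangedInt is i:integer;

    -- Quantification over RangedInt would range only over values satisfying
    -- this constraint function.
    func RangedIntGen(v:RangedInt)->boolean =
        (v.i >= 0) and (v.i <= 10000);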
19oct00 -- Making it up as we go along, some more
Hmm, it just occurred to me that we may want to make quantification of types use name equivalence. We need to think (probably hard) about this.
19oct00 -- Making it up as we go along
Well, we've now decided, I hope correctly, that a forall with a failing quantification clause, e.g. forall (x in []), returns true. The rationale is threefold. First, it seems by far more convenient and intuitive than returning false. Also, it's consistent with the truth table for implication, which returns true whenever the antecedent is false. Finally, if we look closely at the supposed reduction translation of the "in" form of forall, the implication truth table business seems to be supported.
16nov01 update. Here's the "look closely" part in the last sentence of the previous paragraph. Consider that

    forall (x:O | p1(x)) p2(x)  <=>  forall (x:O) if p1(x) then p2(x)

Now, if p1(x) is never true, this means that "if p1(x) then p2(x)" is always true, which means that the quantifier succeeds. What this means intuitively is that if the quantifier variable clause yields no values at all, the forall quantifier is (vacuously) true. I.e., anything is true about nothing. This last statement has the same counterintuitive feel that logical implication has, viz., false implies true. Be that as it may, things do indeed work out here the way they're formulated, where "work out" means being sound wrt standard predicate logic, most particularly the truth table for implication, as funky as we sometimes think it is.

Complete discussion of this needs to go into the ref man.
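A degenerate case for the ref man, perhaps (the op wrapper is just illustrative):

    op VacuousTest()->boolean =
        forall (x in []) x > 0;   -- the quantification clause yields no
                                  -- values, so this is (vacuously) true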
19oct00 -- Making if-then-else formal
Though I don't think I've openly acknowledged it explicitly before, the problem with if-then-else is when it is used as something other than a bool-valued function. We really do need to clear this up formally. I.e., is if-then really equiv to implies, or do we need to define it in Lisp logic terms?
As of 15nov01, the answer to this is in ./if-the-else-truth-table.
19oct00 -- Clarifying Condition Inheritance in Ops
I actually don't remember the exact rules for op inheritance in the O-O sense. It occurred to me that it might be a very nice feature to say that an overloaded op only inherits pre- and postconds where the specifier explicitly requests such inheritance. We could do this by constraining the conditions under which effective ops are generated via inheritance. Viz., the effective op generation takes place only for ops that are listed explicitly in the operations clause of a parent class. In this way, one can allow logic not to be inherited, in the case that an effective op didn't want the inheritance.
We need a good example of this. The crux of what we're saying here is that an inheritance-generated effective op would not want some or all of the input constraints that were defined for the parent-object op. Again, I can't think of a good example right now, but it is worth investigating, and I think worth including as a feature to allow more specification flexibility.
I'm not entirely sure about the last point (i.e., that flexibility is always necessarily a good thing), so the issue definitely requires more thought.
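For what it's worth, here's a rough sketch of the shape such an example might take (all names and conditions invented):

    obj Stack
        components: elems:integer*;
        operations: Push;   -- Push listed explicitly ==> effective ops
                            -- generated for children inherit its conds
    end Stack;

    op Push(s:Stack, x:integer)->(s':Stack)
        pre:  x >= 0;                      -- parent-level input constraint
        post: s'.elems = [x] + s.elems;

    obj LooseStack < Stack;
    -- If Push were NOT listed in Stack's operations clause, LooseStack's
    -- effective Push would come without the x >= 0 precond.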
21sep00 -- Dumping Coarity-Only Overloading
Per the 7jul00 LOG entry, it's time to dump coarity-only overloading, for the reasons listed there. Implementationwise, there's not much to dump, since we're far from a complete implementation. What we need to get rid of in the implementation are the current quick hack that allows coarity-only overloading for a very specific case, plus all the hooks that were put in to support the future full implementation of coarity-only overloading.
Conceptually, the decision is now final. Viz., there is no coarity-only overloading in RSL.
5jul00 -- Resurrecting Formal Dataflow Defs
This is pretty bizarre, but I thought this was written down. Anyway, here's the idea. Oh fuck, I just found it (the example I was thinking about) in se-book/semi-formal-spec. Anyway, again, here's some more pertinent discussion.
We probably want to disallow anything but AND'd op components. If not, then we need to specify the semantics of OR and STAR'd components. There has been some thinking (in the pretty dark past) about defining these as non-determinacy and unfolded loops, respectively. However, with OR we'd have to reconcile with the connections to make sense of it. Similarly, it would seem, for STARs. Anyway, I think we'll do best just to disallow anything but ANDs and say that some future version of RSL may lift this restriction. One argument for never allowing ORs is that non-determinism sucks pretty much (at least for my meager brain at this point, and probably forever).
2jun00
In the new semantics, we need to be clear in the ref man about the following:

    obj TheStringXYZ is "xyz";

    val TheValueXYZ = "xyz";

    val TheStringABC:TheStringXYZ = "abc";   -- ERROR: the LHS type is
                                             -- "the string xyz" but the RHS
                                             -- type is "the string abc",
                                             -- which aren't compat by the
                                             -- normal type rules

OK, forget about it in just the ref man, we need to be clear about it period. Question: "What's the type of TheStringXYZ and what is it compat with?" Ans: The type is "the string xyz" and as a type it's compat (but not equiv) with string. The tricky bit we're getting at here is the following: A value-constrained type cannot have unconstrained values assigned to it. I.e., an ident of type TheStringXYZ cannot have any string other than "xyz" assigned to it.

This value-constrained type-definition rule does not apply to value definitions (well, now it does -- see "But wait" below). I.e., when a value identifier I is assigned a literal L that denotes a specific value of some type T, I is defined to be of type T, not (the T V) (actually not -- see again the "But wait" below). E.g., TheValueXYZ defined just above is of type string, not the string xyz. But since values are constants, they can never be rebound, so there's no issue of what it would mean to (re)assign some value other than "xyz" to TheValueXYZ. This effectively means that TheValueXYZ acts just like a variable of type TheStringXYZ, even though technically they're of different types. The point is that, by the compat rules, TheStringXYZ is compat (but not equiv) with string, and TheValueXYZ is equiv to string, but since it's a constant it cannot be rebound.
But wait! I think we want to fix the small anomaly in the preceding paragraph by saying that a value is in fact of a value-constrained type. Since value-constrained types are upward compat with their "parent" types, this won't put any damper on how a value can be used. E.g., TheValueXYZ is in fact of type the string xyz, not of the more general type string. This does not constrain how TheValueXYZ can be used, and seems to allow for a more straightforward implementation where all constant literals are formally defined (and implemented) to be of value-constrained types, which means that the implementation of a value declaration just copies the value-constrained type of the constant into the type of the constant ident.
Now, what we don't have to worry about now that obj's and val's are distinct, is that TheStringXYZ is a value -- it's not. Therefore it cannot be bound to any ident, which means we don't have to worry about what it's compat with in any binding context. Well, this isn't exactly (if at all) right, but there's still not a problem. Viz., an ident of type TheStringXYZ can be bound to an ident of type string, per the normal type rules. I.e., TheStringXYZ is compat with string. So fine.
Now here's an interesting question -- does a var of a purely value-restricted type ever need to be, or can it even be, bound? E.g., what's up with a var decl of the following form?
    var the_string_xyz:TheStringXYZ;

We might say that since it can only hold one (non-nil) value, that value is automatically bound to it when it's declared. What I don't like about this is that it makes an exception to the rule that there's no auto-binding (except perhaps to nil) of vars. What this means is that the_string_xyz could in fact have the value nil, until it's bound to "xyz". And the reason this seems a bit funky is that we have to assign a value explicitly.

And here's (perhaps) a bit more funkiness. When we declare a tuple field to be a specific value, should that tuple field as a var be automatically bound to that value? I think the confusion I'm verging on can be resolved with the following example:
    obj AnIntAndAnXYZ is i:integer and the_string_xyz:"xyz";

    obj AnIntAndAnXYZ is i:integer and the_string_xyz:TheStringXYZ;

Hmm, maybe this example can't quite resolve things (later note: but read on, because it can). The question is: are these two types equiv? It looks like the spelling rules say yes. In either case, the deal is that we want a value of this tuple type to have a constant value in the second element. But if we declare an ident of this tuple type, it has to be explicitly bound just like any ident, with the restriction that the second field can only be bound to the value "xyz". So it looks like there isn't any diff between the var and tuple case. Either way, a type with one or more value-constrained components is still just a type, not a value. If it's a degenerate case of all pure-valued components, so be it. The fact that we call it "degenerate" means that its behavior can be (a bit) funky. The funkiness in this case is that it still needs to be explicitly bound to a value, even though there's only one non-nil value it can possibly have.

Now, it appears that a union op turns a pure-valued type into a variable type, in the sense that more than one value can be bound to it. E.g., compare
    obj OneAndTwo is 1 and 2;

    obj OneOrTwo is 1 or 2;

The first can only be bound to one non-nil value, whereas the second can be bound to two. I'm not sure if we need to say anything more about this, other than to be clear that union types create a form of value multiplicity (even) when the union components are pure values.

Now, whither the concept of "constant"? Well, I think we can say definitively that a value is synonymous with "constant". But as the preceding discussion (hopefully) makes clear, an object (aka, type) is not (can never be) a constant, since an object (aka, type) is never a value.
Now for the ref man explanation. When an obj definition is a specific value, we've created a type of object that can only hold the single specific value that it is defined to be. On the other hand, when we create a value of a specific literal-denoted type, the literal in general denotes a broader type than its specific value. It's only when that type is bound to a value-restricted ident that the type becomes less general. Some further examples here would almost certainly be in order.
One of the examples can make completely clear why the ERROR comment above is true. E.g., the value "abc" can be bound to a string, because by type rule (b) "the string abc", which is the type of the value "abc", is compat with "string". However, there is no rule that makes "the string abc" compat with "the string xyz". QED.
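The two bindings side by side:

    var s:string       := "abc";   -- OK: "the string abc" is compat with
                                   -- string, by compat rule (b)

    var x:TheStringXYZ := "abc";   -- ERROR: no rule makes "the string abc"
                                   -- compat with "the string xyz"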
And finally, let's probably give a ref man example such as the following
    obj ABC_Type is "abc";

    var abc:ABC_Type := "abc";           -- holds the value "abc"

    val ABC = "abc";                     -- denotes the value "abc"

    val strictly_ABC:ABC_Type = "abc";   -- denotes "abc", but in a stupid way

which illustrates that declaring an object to be of a single value is just a round-about way of declaring a constant value, where the "round about" part is that one must also declare and assign a variable of that type in order to have a usable value on hand. The only (rather useless) distinction between the variable abc and the value ABC is that the variable can hold the value nil but the constant will never denote nil.

And finally finally, we should probably have an example that illustrates that a fully constant tuple type isn't any more useful than a value-constrained scalar type, in that it can only hold values of one type for all of its components, but that as a type it does not "contain" values at all, and therefore a variable (or constant) of the type must be declared.
Yet Further Analysis Related to 19oct99 Note (31may00)
This item, i.e., all of the discussion under this 31may00 heading, appears to draw a pretty definitive conclusion about subtyping and compatibility. As we'll see, the discussion here has led to an update of the spelling-based formal type rules, and succeeds (I'm pretty sure correctly) in demystifying the lingering unclarity about subtyping versus subsetting.
So, let's start by reconsidering the example presented in the "More Analysis" item just below:
    obj OneOrTwo is 1 or 2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            oot = 1;    (* Fine *)
            oot = 2;    (* Fine also *)
            oot = 3;    (* Clearly not fine *)
            i = oot;    (* NOT fine *)
            oot = i;    (* Not fine, perhaps not so clearly *)
        end;

The change here is that the assignment "i = oot" is now considered NOT fine, because we can reason by the (pre-31may00) spelling-based formal type rules that it doesn't work. Here's the reasoning:
- The type spelling of oot is "union of (the integer 1), (the integer 2)".
- The type spelling of i is "integer".
- By application of the spelling-based compat rules, "union of (the integer 1), (the integer 2)" is not compat with "integer". What the potentially applicable rules do say is:
- "the integer 1" is compat with integer (by compat rule b)
- "the integer 2" is compat with integer (again by compat rule b)
- "the integer 1" is compat with OneOrTwo (by compat rule c)
- "the integer 2" is compat with OneOrTwo (again by compat rule c)
- I.e., we can stick a "the integer X" into an integer or into a OneOrTwo. But just because we can do this does not mean that we can stick a OneOrTwo into an integer, or vice versa.
Now, here's an interesting slip I originally made at point c in the above reasoning:
...
- This means that oot is compat with i, but not equiv.
I.e., a value of the type spelled "the integer X", for any X, can be bound to an identifier of the type spelled "integer".
However, a value of the type spelled "integer" cannot be bound to an identifier of the type spelled "the integer X", for any X.

This reasoning is stupid and in fact totally going around in circles (based on the pre-31may00 type rules, which are about to be amended). It's stupid because the subpoints i and ii don't really support the superpoint c. In order for this to happen, one of the subpoints would have to say ``A value spelled "union of (the integer 1), (the integer 2)" is compat with "integer"'', which is not true by the current (pre-31may00) formal type rules. The reasoning is going around in circles since it's the (kind of) reasoning that was (apparently) used to think originally (i.e., in and around 19oct99) that "i = oot" was OK. (Hmm, or even fuck!)
Anyway, let's investigate what it would take to get OneOrTwo to be compat with integer. It looks like we'd have to have a (new) type rule of the form:
    a union type is compat with type X if all components of the union type
    are compat with X

What's going on here is that we're considering a union type in which all elements have a common compat-parent, but none is directly compat with each other. This appears to be essentially the subtyping rule for compatibility. I.e., each element of a common-compat-parent union acts like a subtype in the sense that none is compat with each other but they all share a parent with which they are compat. (This is just a rethinking of the basic OO rule that subtypes are compat with parent types.)

Now, another way to look at things is that what we've got going here are two ways for a type to be compat with another: (1) via being an element of a union, which makes the element type compat with the union type, which by the union-based definition of inheritance is the same as subtyping; (2) via being a type of the form "the X Y", which makes the "the X Y" type compat with X (which I guess can be called the "subsetting" rule (and see the conclusion of this note item below)).
Next consider that it's silly to have an explicitly-declared union of types that have a common inherited parent, because that union type would be completely redundant with the automatically-generated inherited parent type. (Recall again that with the union-based definition of inheritance, a parent type is in fact a union of its subtypes.)
Consider, however, that unlike with subtypes that are explicitly declared, there is no explicit set of (infinite) declarations that say a type X is defined as the (infinite) set of all types of the form "the X Y". So, it does make sense to have an explicitly-declared union of value-style (i.e., of the form "the X Y") types that share a common compat parent. And, since there are only two ways to be compat (as opposed to equiv), the kind of union we're talking about must be value-style. And it looks like what we've uncovered here is another way to define a collection of types with a common parent. Viz., the set of all types of the form "the X Y" have the common (compat-)parent "X". Given this, it seems reasonable to add the above new compat rule, since it provides a measure of uniformity for things. What's a little funky is the idea that it's silly to have a union of types that all inherit from the same type, but that's what we get for having the automatic rule that maps inheritance to subtyping.
To make everything crystal clear here, I think we must also consider the relationship between automatically-defined parent-class union types and the new type rule. An automatically-defined parent-class union type is of the form
"union of (tuple of original components), (tuple of original components, child 1 components), ..., (tuple of original components, child n components)".Now, by the new type rule, what is this auto-gen'd type comapt with, if anything? I.e., does there exist a type T with which each of the auto-gen'd components is compat? Well, except in the degenerate case when all child types are exactly the same type, it appears that the answer is "no". This is because each component type is at type-level an explicit tuple, so the new rule cannot be (recursively) applied to any of the component types to get a compatible type. And unless each child type is the same, there will exist some component of one of the tuples that is distinct from the others, hence precluding the exisitence of a common parent-compat type T.So, now that we've reasoned that such a new type rule would be OK, the question is do we want to bother with it? I.e., does it really buy us anything useful? Well, by our reasoning, the only kind of union types we're dealing with to which the new rule would usefully apply are those with value-style components. For these types, is it a big deal (or even much of a deal at all) to be able to do the kind of bindings that the new rule would allow? Consider again the example we've been working with
    obj OneOrTwo is 1 or 2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            (* ... *)
            i = oot;    (* NOT fine *)
            (* ... *)
        end;

Is it useful to make this type of assignment (and type-comparable parm bindings) legal? Well, if we don't make them legal, what would we have to do to make them happen? I.e., how (if at all) can we assign a value of type OneOrTwo to a var of type integer? Well, it looks like we cannot do such a whole-variable assignment, but rather must do explicit projection of the component elements. And to do this in the current structure of the language, it looks like we must give each component a name. So, e.g., what we're trying to do would look like this:

    obj OneOrTwo is one:1 or two:2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            (* ... *)
            i = oot.one;
            i = oot.two;
            if oot?one then i = 1 else i = 2;
            (* ... *)
        end;

But this looks pretty brain damaged conveniencewise. However, what's all this accomplishing anyway? I.e., why do we want to assign a restricted integer value to an unrestricted integer? It's akin to, if not essentially the same as, being able to assign an enumeration literal to an integer, which in a nicely strongly typed language need not or ought not be allowed. The point is the following. By creating a type to be one of a restricted set of constant values, why not just require that we deal with it in a little island of restricted value manipulation that includes only explicit references to the constant values?
Well, a good answer to the last question is not in binding, but in conditional logic. E.g., consider the following example:
    obj OneOrTwo is 1 or 2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            i = SomeIntFunction(...);
            if (i = oot)
            then (* ... *)
            else (* ... *);
        end;

Without the new rule, it seems we'd have to do the following:

    obj OneOrTwo is 1 or 2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            i = SomeIntFunction(...);
            if ((i = 1) and (oot = 1)) or
               ((i = 2) and (oot = 2))
            then (* ... *)
            else (* ... *);
        end;

which looks pretty painful.

So, I've done it. I've added the new rule in the formal-type-rules file. It seems a bit hokey, given the power of the other rules versus the relative special-caseness of the new rule, but I think we can live with it. To mitigate things in this regard a bit, the only reason that compat rule c is all that powerful is because of the unionization of subtyping, without which we'd have some more rules about subtype compat. Anyway, it's done (at least for now) and let's live with it (at least for now). Don't forget that we still have to implement all of the formal type rules!!
So now, I think we can at last, once and for FUCKING ALL, defuckingmystify the issue of subtyping not being subsetting. The demystification, it appears, has been in the spelling-based formal type rules all along. Here's the deal:
- We have the following two distinct type compat rules:
- a type spelled "the X Y" is compat with X;
- the type X is compat with any union type of which X is a component;
- Rule i addresses the subsetting part of the issue whereas rule ii addresses the subtyping part of the issue. The reason that ii addresses subtyping is, again, due to the unionization of inheritance.
- The deal is, subtyping and subsetting are not the same thing, but by these rules they do individually imply the same thing.
- In strictly logical terms:
- let st be the proposition subtyping
- let ss be the proposition subsetting
- let c be the proposition compatibility
- The compat rules say effectively the following:

    (st => c) and (ss => c)

- But this logic certainly does not imply that st <=> ss, or even a one-way implication between st and ss.
The point is that we've apparently been suffering in some anti-logical fog about all of this, thinking that there's some kind of relationship between ss and st, but not being completely (if even nearly) clear on what it is. Well, it now appears clear that the relationship is that they both imply compatibility, which in logical terms leaves them otherwise unrelated. Cool (I think and I hope).
The ``More Analysis'' Called for at the End of 19oct99 Note
One of the things we need to get straight in terms of type safety is that a type with specific enumerated values of, say, type int, is NOT compat with type int. E.g.,
    obj OneOrTwo is 1 or 2;

    op main(oot:OneOrTwo, i:integer) =
        begin
            oot = 1;    (* Fine *)
            oot = 2;    (* Fine also *)
            oot = 3;    (* Clearly not fine *)
            i = oot;    (* Fine *)
            oot = i;    (* Not fine, perhaps not so clearly *)
        end;

What we have going here appears to be inheritance-oriented subtyping in reverse, at least in terms of the cardinality of the types. E.g., when T1 < T2, #T1 > #T2. What's curious about this, which in fact has been a very long-standing curiosity, is that subtyping IS NOT subsetting, since the cardinalities are reversed.
Here We Go Again (19oct99)
(3dec00 addendum to the addendum -- I already know I'm a stupid dickhead. See the 31may00 item above.)
(3dec00 addendum -- This was already figured out earlier, stupid dickhead. So, the "I'm not sure" heading is irrelevant. Viz., it is the case that the latest spelling-based semantics allow values to be part of types.)
I'm not sure if the latest semantics (defined below and in ...) allow an obj to be defined as a specific set of values. This will let us get back to the old way of doing enum literals like

    object Sex = Male or Female;

    value Male   = "Male";
    value Female = "Female";

What seems to have been considered problematic before was to have this definition of type Sex be "string or string". As observed below, this isn't a good idea, since a union of the same types isn't all that sensible. But really, Sex should not be considered of type "string or string", but rather of type "value "Male" or value "Female"", which may or may not be a subtype of string or string, but is decidedly not equal to type string or string.

What still remains problematic is a syntax that would allow unnamed constants to appear directly in type defs, as in the following version of the above:
    object Sex = "Male" or "Female";

I believe the seat of the syntactic problem here is the redundancy of the "and" and "or" operators in comp exprs and value exprs. Hence, we can't have the fully general syntax we'd like of allowing any value expr as a component of a type. However, we can have numeric and string literals as syntactically allowable components, which renders the immediately preceding definition syntactically legal. (This syntax has been tested in a version of the parser, but is currently commented out since it causes type checking problems. See newtests/string-and-num-lits-as-components.rsl.)

There is a small potential problem of having an ident designate both a type and a value. Probably the easiest and most sensible way to avoid this problem is to simply disallow it. I.e., an ident cannot designate both a type and a value in the same scope. This may be inconsistent with some part of the current wanna-be formal type semantics, but I don't think it's a fundamental problem at all. I.e., the current semantics can be easily reconciled with the idea that type and value name spaces must be disjoint in a given scope.
What's nice about this thinking is that we've come full circle to some extent, in that we'll now allow the old notion of strings as enum literals. What may also be nice is to do away with symbolic literals entirely, if possible, since they really are excess baggage to a certain (if not large) extent. The old thinking in this regard was always to have strings be a single concept of actual string data as well as enumeration literals. We may be able to get back to this original RSL concept.
And here's one last little nugget in this topic area. Question: should a type like "value "Male" or value "Female"" be (automatically, definitionally, formally) a subtype of "string"? If so, by what formal rule? Similarly, should the type "value 1 and value 2" be a subtype of "int and int"? Again, by what rule?
An initial crack at a rule like this might go something like this:
- an or'd set of values of the same type is a subtype of that type
- an or'd set of values of different types is a subtype of an or of those types
- an and'd value set is a subtype of an and of the value types
This does require some more formal analysis to make sure it really works. (A concrete sketch of the three rules follows.)
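Concretely, the three would-be rules might sanction the following (obj names invented):

    obj MaleOrFemale is "Male" or "Female";   -- rule 1: subtype of string

    obj OneOrAbc     is 1 or "abc";           -- rule 2: subtype of
                                              --         integer or string

    obj OneAndAbc    is 1 and "abc";          -- rule 3: subtype of
                                              --         integer and string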
(And another 3dec00 Metanote: you're still a double (if not triple) stupid fucking dickhead, because the 31may00 item above totally says, and better so, what's in the next 3dec00 parenthetical remark.)
(Another 3dec00 Note: I think you're a stupid dickhead. I'm pretty sure that a careful reading of the (long) extant spelling-based compat rules naturally allows what we're getting at here. Specifically, the following compat rules

- a type spelled "the X Y" is compat with X (but very importantly, not vice versa)
- a type X is compat with any union type of which X is a component (but again, pretty obviously, not vice versa)

handle exactly the case we're talking about here, given that the spelling-rule type "the X Y" is the same thing as what we're calling "value X" above.
I'm pretty sure that the conclusion here is "wake the fuck up, the spelling-based type rules really are what we want". So, among other things, stop calling things "subtypes" when you shouldn't be.)
More Syntactic Cleanup
In addition to the stuff about value keyword ... damn my eyes -- I fell off here again.
Answer to the 7dec98 'BIG BUG' LOG Entry
The observation about a major conceptual bug in the 7dec98 LOG entry is evidently correct. The immediate answer, after some thought, is that rather than require that an instantiating type be a subtype of the generic type, we should require that all generic types be opaque. This makes sense from at least one major standpoint, viz., the basic structure of the where clause. When we say "X = Y" in the where clause, the intuitive interpretation is that of full substitution. In this sense, if we say that Y must be a subtype of X, the more sensible where clause notation would be "X > Y", which (perhaps flash) we might actually add. But let's discuss some more basics first.
The reason that requiring all generic types to be opaque makes sense is that when we do the where instantiation, we are intuitively substituting one type, in total, for another. Hence, that a generic type has any component properties at all seems stupid from this perspective, since any such properties are completely wiped out by the instantiation.
Now, we have apparently had in mind that instantiation is not really of the "wipe out" variety, but more of the "specialization" variety, wherein any properties that the generic type has are specialized by rather than replaced by the instantiating type. And this seems to make some sense, though this may be the first time that we've actually written it down in this way.
Now, what may be possible, either notationally, or via further clarification/extension of opaque type semantics, is to have it both ways. Viz., we will effectively, if not explicitly say the following: "If a generic type is opaque, then it can be instantiated with any type; if a generic type is not opaque, then the instantiating type must be a subtype of the generic type".
Now here comes the potentially cool part. Suppose we say that any type is in fact a subtype of any opaque type. In this way, we wouldn't have to state the instantiation rule conditionally (!).
So now what we need to do is to investigate if the rule that all types are (automatically) subtypes of an opaque type makes sense. We'll do this in the next item or two.
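A tiny where-clause sketch of the two cases of that conditional rule (all names invented):

    obj OT;                      -- opaque generic component
    obj Box = v:OT;

    obj IntBox < Box
        where: OT = integer;     -- OK: OT is opaque, so any type goes
    end;

    obj Num = n:integer;
    obj Box2 = v:Num;

    obj StrBox < Box2
        where: Num = string;     -- ERROR: string is not a subtype of Num
    end;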
Type Definition and Subtyping at the Margins
OK, en route to our goal of all types being subtypes of opaque types, let's do a little lemma-like reasoning, probably again, on the nature of opaque types. If we've not said so already, it seems reasonable to say that an opaque type is a zero-tuple. (It turns out that grepping for {0,zero}-tuple turns it up only once in typechk.c, where the comment is to revert objs that turn out to be 0-tuples back to opaque types. This comment is apparently consistent with what we're saying now, viz., that we formally regard opaque types as 0-tuples.)
Anyway, this being the case, here's an interesting postulate:
    obj X is T

and

    obj X < T

are the same definition. In the first case, X is a one-tuple of T. In the second case, X is a zero-tuple that inherits from T, which turns it into a one-tuple of T. OK, so this really isn't too startling.

So, what we're saying here is that at the margins, a one-tuple is-a opaque obj that inherits from one type that is-a identity for the defining/inherited-from type. And I don't think this is new at all in the fundamental semantics as currently defined in the typechecker, documentation, and our heads.
Back to the Latest 'Fundamental' Question
So, are we OK to say that any opaque type is automatically a parent type of any (non-)opaque type? Here are some more questions in this regard that we'll need to answer:
- Do we need the "(non-)" just above; i.e., can an opaque type be a subtype of another opaque type, including hence itself? I'm leaning "no" initially on this.
- Should we soften this idea to say that an opaque type is automatically allowed to be the parent type of any non-opaque type? In this way we're talking about the potentiality of being a parent type, which may possibly be better (if it is in fact any different formally).
- Should we add a "<" (or, per the change in syntax recommended below, "<=") form to the where clause, that explicitly specifies inheriting rather than replacing instantiation semantics? It seems that this would be cumbersome and potentially confusing notationally.
- Will this rule for opaque types mix OK with the probably-not-yet-fully-articulated idea that an opaque type is a form of type variable?
- Oh, and one last somewhat related, and I think still nagging question: does it make any sense to inherit from an atomic type?
I'm pretty happy for now to stop here, think about it some more, and come back to finish things up. I'll be back.
OK, I'm back now, with the following thoughts (and hopefully a conclusion).
As usual, being the total fuckup that I am, I did not conclude things here. The latest thinking, which is pretty close to a conclusion, is in the 14dec00 entry way above.
Possible Syntactic Superflash
OK, if we go ahead as stated below and put in value as a keyword, the "obj =" form of definition is now deprecated. How about if we now use "=" in place of "is" and lose "is" altogether from the syntax. The benefit is that we won't be dealing with the loaded English word "is" anymore, which has all the "is-a" inheritance baggage.
So, what we'll be left with are "=" and "<" for the two shorthand definition forms, or we might even want to use "<=" instead of "<". So, we'll have
    obj X = A and B and C

and

    obj X < Z = A and B and C

Some more examples, to feel things out:
    obj PersonRecord = Name and ID and Age;

    obj SalariedEmployee < PersonRecord = Salary and Step;

Having both short forms together does look a little hard to follow, but we can say that the user can always go to long form. We might want to throw in a semi-colon, as in
    obj X < Z; = A and B and C

    obj SalariedEmployee < PersonRecord; = Salary and Step;

but I'm not sure this helps much. Bottom line, I think the long-form notation is plenty fine to alleviate the possibly confusing look of using both "<" and "=" in the same short-form definition.

An initial reaction here is that the only (probably slight) drawbacks are that we lose the somewhat nice-feeling "is" for the earliest introduction to formal specs, and we change the semantics of "=" compared to earlier versions of RSL. The first I expect we can live with just fine, by saying "Hey, look, we're starting to get formal here, so we're OK using the slightly more 'formal looking' = A and B". Plus, the major win here is we lose the potentially misleading use of "is" as what amounts to "has". What we're saying is that we'll lose both "is" and "has" entirely, to avoid altogether the whole confusing pseudo-English definitional mess.
The second drawback is a bit more serious, but hey, probably not that many people (e.g., mostly Dan) ever used the "=" form of object defs anyway, so we'll be OK. Plus we'll write him a nice Emacs script to go through and change all of his RSL examples. Plus he'll probably agree with the idea of eliminating the confusion around the (mis)use of "is" in RSL. Another mitigating factor regarding the semantic change of the older "obj =" definition form is that we seem to have committed to the value keyword syntax, which totally replaces the old meaning of "obj =" anyway. So it appears that we in fact have serendipity here in doing together the change to the value form and the new meaning of "obj =".
Regarding using "<=" instead of "<", upon looking at it in the above examples, I think plain "<" looks better. Plus, "<=" when used together with "=" connotes a potentially confusing semantics. Admittedly, "<" by itself connotes a potentially misleading semantics, since if there are no specializing components, "<" does in fact mean "<=". However, on the whole, I think "<" wins over "<=", at least that's the current thinking.
So, the final bottom line is that we should do this, in conjunction with adding the value keyword.
Yet, Yet, ... Yet More on Types versus Values
Today's mindset: it seems pretty foolish not to syntactically distinguish between objects and values, given that they are semantically separate things. This separation is, after all, quite fundamental, since types are statically defined and elaborated whereas values are dynamically defined and evaluated. Therefore, I suggest, most humbly, that we just do it once and for all -- introduce a value keyword. This new keyword need not preclude partially instantiated types, necessarily, but the discussion on type spellings in
While we're at the new value keyword, let's just go whole hog with definitions for consistency and say that the currently unnamed entities -- axioms and theorems -- now get names. Hence, we'll have the following complete set of entity categories:
    object name ...
    operation name ...
    value name ...
    var name ...
    axiom name ...
    theorem name ...

Reminder
See below about changing the keyword pair "instance of" to the pair "inherits from". It's a good and important idea.
Caution Re. Structural Equiv and Axioms
It seems that structural equiv should not be used in the case of axioms that quantify over types. E.g., consider
    obj MonthlyDate is integer;

    axiom forall (md:MonthlyDate) (md >= 1) and (md <= 31);

In this case, the hoped-for meaning seems clearly to be that the axiom applies only to values of type MonthlyDate, not to all integers. Think clearly about this.

Long-Missing Discussion of Auto-Generation of Op Types
I believe that we've discussed this problem, but I do not find it explicitly explained in these notes. The problem is that the automatic generation of an obj from an op is made messy by op overloading. Viz., with two or more overloads of an op, we get overloaded objs, which cannot work exactly.
The problem with the thinking thus far in this area has been that the duality should and probably cannot be complete in this area. The statement of this would-be duality is
    For any obj named "X", there is an (automatically-generated) op named
    "X" that is its constructor, AND for any op named "X" there is an
    (automatically-generated) obj named "X" that is its op type.

The problem, I now believe, is with the second conjunct in this statement, viz., the auto-gen of an op type named "X". The problem is that this two-way naming cannot really work. What's confused here is that we have three semantic concepts going on when we need two for the duality. I.e., we have one semantic concept too many. The concepts are:

- Every obj has a dual op of the same name, called its constructor, the signature of which is the obj's type.
- Every op that has the same name as an obj and a signature that matches the obj's type is the constructor for that obj.
- Every op has a type, which is its type as an op type.

What we're saying here is that the last of these three concepts has to go when it comes to same naming. I.e., the op-type obj for an op should not (cannot) have the exact same name as the op. We'll talk some more about this below, but a good candidate name for this op type is the mangled name, with some kind of "Op" prefix or suffix, the prefix or suffix being necessary for opaque types where the mangled name and actual name of an op are identical.
To further clarify the matter, here are some points the complements of which are not true (that's twisted, eh?):
- Not every op is the constructor for some obj.
- Not every obj is the op type of some op.
So, what about the naming convention for auto-gen'd op types? How about
    Op<op-name><op-inputs>to<op-outputs>

E.g., for the op definition

    op Foo(X,Y,Z)->(X,Y)

the auto-gen'd op type would be named OpFooXYZtoXY.
What have we learned from studying UML? That the RSL treatment of relations can be generalized and sanitized as follows:
    obj X is Y[2] and Z[5..10];

which means that X is a tuple of 2 Y's and between 5 and 10 Z's.
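Another made-up instance of the same notation:

    obj Committee is Chair[1] and Member[3..7];
    -- a Committee is exactly one Chair and between 3 and 7 Members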
OK, let's look at a revised nomenclature for built-in obj and op attributes:
    object X inherits from Y and Z is A and B and C;

is the short form of

    object X
        inheritance: Y and Z;
        components: A and B and C;
    end X;

Here is a long form showing all of the built-in object attributes:

    object X
        inheritance: Y and Z;
        components: a:A and b:B and c:C;
        operations: P and Q and R;
        equations: P(x) == Q(y);
        actions: a.a1 = z.a1;
        description: (* ... *);
        where: D = E;
    end X;

Note here that we've dropped the "is" in the long form. Note also that a "value" attribute can be used in place of the components attribute for a fully concrete object.
Yet another possibility for solving the abstract versus concrete value dilemma is to allow only constant values in object value expressions, where constant values do not include (some) expression operators. This makes sense in general, given strong typing. Here are some more thoughts, rapidly.
We've always liked the idea of partially concrete types, where, e.g., components could be something like "integer and 2 and integer". I.e., one or more components can be concrete values. If we go with allowing only constant values syntactically as values, we can probably go back to the syntax that allows composition expressions to mix types and values. Recall that the problem with this syntax is that operators like "and" are ambiguous in a composition expression that allows both value and type expression operators.
The notion of a fully concrete object is that all of its components are values, not types. This is consistent with the nomenclature used in the type spelling rules where we say that the spelling of a value is "the ...".
BOTTOM LINE for this discussion: It appears that the current solution of allowing value identifiers in type expressions is probably the best one. However reasonable it may seem to disallow boolean operators in constant expressions, this really is an arbitrary restriction. After all, we cannot disallow all expression operators in value expressions, since this would prevent constructing concrete composite values with list and tuple operators. Given that we must admit some expression operators in concrete value defs, it can only be arbitrary which ones to eliminate after that. When we think about it, it really is just a syntactic coincidence that "and" has two meanings in type versus value expressions.
So I think things are just fine if we stick with the current syntax that allows value identifiers in type expressions, but does not otherwise allow mixing and matching of type and value expressions.
I believe it's also OK to continue with the short form "is" versus "=" notations. The former is for any form of abstract object, including partially concrete ones. The latter notation is reserved only for fully concrete objects. Syntactically, this is what allows us to segregate type and value expressions.
As hinted above, we might now consider dropping the "is" from the long form of definition. This will leave "is" strictly as short form syntax. I think I like this.
Given the graceful transition from abstract to concrete values, the components attribute should be usable for both the type and value parts of an object. Here's the ground rule for this: An object identifier denotes either a type or a value, never both. Hence, the components attribute can be used to hold an object's type or value, since an object will never have both at the same time. There is a potentially annoying syntactic problem with this, which has actually been around almost forever. Viz., the plural noun "components" seems an inappropriate name to use for the single value of an object, just as it does for a single type. E.g., if we define type X as identically type integer, the usage "components: integer" has always been strained. Similarly, if we define X as identically the integer 10, the use of "components: 10" again seems strained.
To some extent, at least formally anyway, we've alleviated the "components"-as-singular-attribute problem by defining a one-tuple type to be the same as the component type itself. This solves the problem for values as well. However, for something as fundamental as this, it seems hokey to have such a severe nomenclature problem.
Since we have worked out the semantics satisfactorily, we should be able to work out some accompanying notation. How about this: (1) we'll allow both the singular "component" as well as the plural "components" as a nod to objects that have exactly one component; we could even have the checker check for this; (2) we'll allow "value" to be substituted for "component(s)" when an object is a fully concrete value; the checker should in fact check for this, since we'll be evaluating the attribute differently in the two cases (i.e., type versus value expression). I was just about to say that I'm not even sure about the "value" attribute, since "component: 10;" would be as good as "value: 10;". However, we must recall the ambiguity between type and value exprs, hence the necessity for "value" as a separate attribute name. After all of this is done, if the user still has some doubt about the nomenclatural clarity of specifying type identity with a "component", we can sugar up the "one tuple of X = X" argument and take care of things pretty well.
We have toyed with the idea of a default value for an object, which could be present in addition to components. I'd rather not build such a thing in. Rather, we'll just specify an Initialize operation that creates objects and specifies certain concrete values objects must have for starters. See the next item for some further interesting discussion about this.
To make an object read-only, with a given concrete value throughout execution, specify it with an axiom as follows:
    obj X is ...;
    obj DefaultXVal = ...;
    axiom forall (x:X) (x = DefaultXVal);

Hmm, this is pretty cute.
In light of preceding discussion, this item is bogus.
Oh, please, say this is really the last word on the subject. We now will go to a separate value keyword, instead of the 'is' versus '=' object notation. Here's how we'll explain it.
There are two main kinds of entities -- objects and operations. Objects denote data types and operations denote functions. In addition, there are the following ancillary kinds of entities:
Here is some further discussion/explanation to rationalize the keyword syntax, in particular. In V4, let's have 'is' and '=' be synonyms. This sounds pretty good, even though it isn't quite in keeping with the so-called is-a versus has-a distinction. We can easily, I think, rationalize this with the "hey, we're in the specification language, not programming language business" argument. Given this, in V4 here's a revised bit of syntax table:
    keyword        equivalent symbol    meaning
    is             =                    is defined as; aka, is composed of
    instance of    <                    inherits from
This seems pretty clean syntactically, since it allows defs of the form
    obj X < Y = A and B and C;
    obj Y instance of Z is D and E and F;
Further rationalization goes like this. The reason we use the keyword "value" instead of "constant" is again because we're a specification language, not a programming language. This seems to be shades of ML, which is fine. (Global) variables break the functional semantics, but are considered necessary in order to specify certain state-based semantics, including concurrency based on shared variables.

In thinking about how to resolve op inheritance, it occurs that we are not doing obj inheritance properly. In particular, in order to be able to do struct equiv properly, we need to do the following:
See:
Farg -- all of the crap below is yet another circle. Try this. A definition of the following form
    obj X = "abc";

defines both a value and a type. Contrary to what we may have been thinking, however, the type of X is NOT string. Rather, its type is X, which is a unique opaque type that is a subtype of string. It seems that we may have had this idea before, but were unable to make it work (except maybe in the "THE, Ultimate" discussion below). I think the reason is we didn't have the one-tuple-of-X = X type equiv rule in place at the time.
To unify things completely, it might be nice to go back to the old, old idea that values can be first-class operands in components exprs. However, this leads to very serious overloading problems, as in:
    obj x is a and b;

wherein we cannot tell if obj x should be a tuple type or a boolean type. I.e., one interpretation of "a and b" takes "and" to be tuple construction, whereas the expression interpretation of "a and b" takes "and" to be boolean conjunction. This won't do. Therefore, I think we need to live with the slightly hokey

    f(s:Sex)->boolean = s = "male";
OK, so we go with "value =" instead of "object =" -- what really do we buy? Seems to me that if we don't really solve any big problem, we might want to stick with "object =" simply for backward compat and notational orthogonality. So, the question remains, do we solve any big problem simply by subbing "value" for "object"? As I speak, I'm getting the feeling that the answer is no. Essentially, what we're coming to is that an "obj =" definition does not in fact define a type at all, but rather a (const) variable. Hence, the stuff about "restricted" types need not exist. So what we're left with is simply a keyword thing. Viz., should we use "obj =" or "value ="? Is this really all there is to it? Let's play with the implementation and see, since we now need to fix the problem with rsl3/tests/dan-lawyers-and-judges.rsl.
OK, now we've implemented a bit. To be clear, what we've just implemented is that the "obj X = ..." form now defines X as of class C_Var, not C_Obj. This makes things pretty darn clear cut semantically, but has led to some new issues. One consequence of keeping the "obj =" form instead of the semantically potentially more descriptive "val =" form is the error message that says "Components of objects must be objects (X is not)". This error message will happen if X is a concrete value, which is quite confusing to the hapless user. Hence, for now, we've dealt with the problem by changing the text of this error message to "Components of objects must be *abstract* objects (X is not)". A little hokey, but maybe we can live with it.
Another potentially more troublesome consequence of the new C_Var implementation is what's happened to the tests/bnf.rsl example. Recall that this example is cited below in the nicely concluded discussion of symbolic literals. What's happened now is that bnf.rsl has gone bad (again?). Specifically,
    obj Operator is PlusSign | MinusSign | TimesSign | DivideSign;
    obj PlusSign = '+';
    obj MinusSign = '-';
    obj TimesSign = '*';
    obj DivideSign = '/';

now produces four error messages, since PlusSign, etc. are now concrete objs (a.k.a. vars), which are not legal in comp exprs. At the same time, bnf-v2.rsl is (still) OK. It contains just

    obj Operator is '+' | '-' | '*' | '/';
The seemingly ugly part here is that "equals-for-equals" semantics appears to have broken down, at least superficially. Viz., '+' is OK directly in a comp expression, but not indirectly via having been defined as the value of a concrete object. I believe that the semantics are still sound here, in that the symbolic literal notation is defined clearly as a special form of designation that is the only notation that designates either a type or a value, depending on context. Given this, with the declaration "obj PlusSign = '+'", PlusSign has now become a concrete value only, and since idents do not have the special dual designation property of sym lits, the use of such a concrete object ident as PlusSign will (and should) be illegal in a comp expression. All of this reasoning notwithstanding, I'm starting to think at this point, after this latest round of implementation, that the "value =" notation might now be better, since it clearly denotes when something is a value and not, therefore, an abstract object. Hmm, need to cogitate some more.
Once and for all, let's do the following:
- Throw out the syntactic form "obj X = ...", replacing it with "value X = ...". (Well, as can be seen above, we're still thinking about this one.)
- Implement the short-hand usage of 'E' lits in comp exprs.
- Implement full struct type equiv and the equivalence of obj exprs and ins and outs. (As a practical matter, we may never get to this; rather we'll just implement as much of it as necessary to get auto decl of constructor ops, and whatever other nice effects we need. We're still thinking about it as of now.)
- Implement the "top" opaque type any, from which all types implicitly inherit. This will obviate the need to implement '*' as a type, I think. The only op defined on any will be '=', which is always false for function types. (Still thinking about this one too. Some recent work with opaque types as vars (in particular in newtests/master-list.rsl) may obviate the need for a true top type.)
It seems to me that about the only reason we've been clinging to the "obj is" versus "obj =" notation, instead of an explicit value keyword, is for enum literals as strings. It seems further that the idea of enum literals as strings has pretty much gone away. Therefore, I don't think we need to mess with the idea of a "restricted" type as we were below, since I recall that this stemmed from the whole enums as strings mess.
Therefore, if we do stick with "obj = ..." instead of "value = ..." it's for notational orthogonality, not because we want to stick with restricted types or strings as enum literals. See discussion about the latter to follow.
In PLs, there are generally two separate constructs for enums versus unions. In RSL, there is the single or for both. This, it seems, has led to us going around in circles for so long about exactly how to represent enum lits. Let's see if we have a clarifying summary of where we've come to at this point, which is hopefully a sound solution to the problem of how to represent enum lits.
First off, enum lits are NOT strings and NOT integers. Actually, enum lits have two identities -- as types and as values. As a type, an enum lit is an opaque type. As a value, an enum lit is a constructed value of an opaque type, which is fundamentally denoted as E(), for opaque type E. Since enum lits are an oft-used construct, the syntactic sugaring of 'E' is provided.
Conceptually, the following rule may seem in order:
    An opaque type E is a type, NOT a value; an enum lit 'E' is a value, NOT a type.

This leads to the more general rule that

    No single denotation can represent both a type and a value.

A, perhaps the only remaining, problem with this is exemplified by the bnf-v2.rsl file, where enum lit values are used as type names. The problem with this bit of notational trickery is that it violates the preceding general rule, and leads back to the idea of restricted types, which we've never fully formalized properly. (But see bottom line below.)
A conclusion could be that we could probably live with the preceding rule, if it gives us a simple and sound semantics, even if we don't get quite as nice a bnf notation. Since BNFs in RSL are presumably not that common, this probably isn't a major problem.
However, if we want to allow the convenience of the notation in examples like bnf-v2.rsl, a (the?) fundamental question is:
    Can we devise a sound semantics that will support enum lit designations being used as both types and values?

Here's a crack at how we might implement the type checking of such:

- Whenever a 'E' lit appears in a type expression, enter an opaque type of that name ("E") in the Level 0 symtab if one does not already exist.
- Complain clearly if the enum lit is already def'd as a non-opaque type.
- Whenever 'E' appears in a value expression, it denotes a value of type E.
- Be sure to clarify in the users manual that 'E' is never = "E".

Actually, this may be a pretty sweet little solution to the enum lit thorn in the side. Viz., the following works fine now:

    obj Sex is 'Male' or 'Female';

and doesn't even look half bad. Also, it looks like it can be explained pretty easily. Hmm, maybe we've fixed things finally.
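For the record, here's a little Python model of the four checking rules above; the symtab representation and error text are assumptions, not the real checker:

    level0 = {}   # Level 0 symtab: name -> kind of type

    def check_enum_lit(name, context):
        # Rule 1: a 'E' lit in a type expression enters an opaque type
        # named E in the Level 0 symtab if one does not already exist.
        if context == "type":
            kind = level0.setdefault(name, "opaque")
            # Rule 2: complain clearly if E is already def'd as a
            # non-opaque type.
            if kind != "opaque":
                raise TypeError(f"'{name}' already def'd as non-opaque type")
            return ("type", name)
        # Rule 3: in a value expression, 'E' denotes a value of type E.
        # (Rule 4, for the users manual: 'E' is never the string "E".)
        return ("value", name)

    check_enum_lit("Male", "type")          # from: obj Sex is 'Male' or 'Female';
    print(check_enum_lit("Male", "value"))  # => ('value', 'Male')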
Also, bottom-lineishly: except for 'E' lits, no denotation is both a type and a value.
WRONG REASONING: Because such a constructor would always have to create a nil value, but given that nil is totally overloaded anyway, one might as well just use nil in the first place.
CORRECT REASONING: Because such a constructor will NOT always create a nil value, but rather for tuples will create a tuple value with potentially initialized fields, per the object definition. See note below about nice duality between obj's and op's vis a vis init'd name/type pairs.
See the very important discussion in newtests/one-tuple-niceties.rsl.
If we've not already, we need to decide if this is allowed. Apparently the 3.1 type checker thinks it's OK, since it passes tests/subtype-poly-tst.rsl, q.v.
As noted under the "POSSIBLE FLASH" heading below, we can hopefully implement the following rule: there is no distinction between a one-tuple of type T and type T itself. We'll talk more about this, and its lovely consequences, soon.
Consider
    function IsSorted(l:*)
        forall (i,j: integer | (i in [1..#l]) and (j in [1..#l]) and (i<j))
            l[i].id < l[j].id;

The idea is that by allowing '*' by itself to be a type designator, we're writing a generic function that can be applied to any list. Now, what happens if we overload this? Well, I think the same basic "lowest" rule for overload resolution can apply as with subtype polymorphism. Viz., such fully generic list functions are the "highest" in the chain, and will be overridden by any other functions of the same name that take more specific forms of lists.
We might even consider that any list object is automatically a subclass of generic type list, which means that it inherits things that are true for all lists. We could make something conceptually nice and uniform out of this, if we could get the different forms of instantiation done straight.
On the practical side, here's a sketch of the (seemingly ML-like) way that such generic (list) functions would have to be type checked:
- When we type check the generic itself, the base type of the list will have to be some type var, which checks like type Any (i.e., compat with everything, except maybe function types, but we'll have to see about this).
- At any call to the function, we'll have to (re)instantiate the function's body, with the appropriate generic instantiation.
- We need to give much better error messages than ML does, obviously. Something like the following would be nice:
    Generic function invocation fails because of the following type problems: ...

after which we simply proceed to list the error messages that normal type checking on the instantiated body produces, probably with line numbers for the site of the generic definition.
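Sketchwise, the call-site (re)instantiation and error reporting could look like this in Python; all the names and representations are illustrative, not the actual checker:

    def check_generic_call(generic, elem_type, check_body):
        # (Re)instantiate the generic's body at this call, binding its
        # type var to the actual list's base type, then type check the
        # instantiated body as usual; check_body yields error strings.
        errors = check_body(generic["body"], {generic["type_var"]: elem_type})
        if errors:
            print("Generic function invocation fails because of the "
                  "following type problems:")
            for e in errors:
                # Unlike ML, cite the site of the generic definition.
                print(f"  {e} (generic def'd at line {generic['line']})")
        return not errors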
The argument under "OLD DISCUSSION" below is evidently wrong, or at least confused. Contrary to what's said there, it seems that it is OK to use opaque types as type vars, particularly if we use the rule that {T} and T are the same type. Here's the deal.
Where the discussion below goes wrong is in the following statement:
    The earlier position was to treat opaque types as type vars, such that *any type* could bind to an opaque type, modulo varname use in a signature.

In particular, it's the italicized part that's wrong. Viz., we do not let any type bind to an opaque type; only explicitly declared subtypes can so bind. And this is precisely the normal subtype polymorphism rule -- that any type T1 can bind to any type T2 iff T1 <= T2.
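In other words, the binding check is just reachability over the declared-subtype relation. A minimal Python model, assuming a simple parent map built from "instance of" decls:

    # Declared subtype relation, e.g., from "obj AI instance of Any ...".
    parent = {"AI": "Any", "AR": "Any"}

    def binds(t1, t2):
        # T1 can bind to T2 iff T1 <= T2, i.e., T2 is an explicitly
        # declared ancestor of T1 -- NOT any type at all, as the old
        # type-var reading would have allowed.
        while t1 is not None:
            if t1 == t2:
                return True
            t1 = parent.get(t1)
        return False

    print(binds("AI", "Any"))       # True: declared subtype
    print(binds("integer", "Any"))  # False: no declared relation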
Hence, we're OK with considering opaque types to be a form of type var, but not the fully general form of type var, as in ML.

OLD DISCUSSION:
It seems from the discussions below that we've switched positions diametrically with respect to the equivalence rule for opaque types. Viz.,
Type latticewise, the current position has opaque types on the bottom of the lattice, whereas the earlier position has them at the top.
What this means at present is that we should stick with the current position, but add a syntactic notation for type vars, if necessary. We discuss the type var issue below.
So, the question is, do we need type vars, and if so for what? While we're at it, we need to fix all of the generic instantiation crap that's in the ref man. In particular, when thinking about type vars in generics we probably don't want to have where clause substitution types to have to be subtypes of what they substitute for (see the ref man).
The point of confusion is to exactly what extent the seemingly true statement "inheritance is not generics" is true, particularly in RSL. Let's look at the following very familiar example, done the current where-clause way:
    obj Stack is Elem*
        ops: Push, Pop, Top;
    end Stack;

    obj IntStack < Stack
        where: Elem = integer;
    end IntStack;

Let's try a more conventional parameterized type way:

    obj Stack(Elem?) is Elem?*
        ops: Push, Pop, Top;
    end Stack;

    obj IntStack is Stack(integer);

A key problem we're suffering with is the nature of inheritance as tuple creation. Even if a subobject does not add new components, it automatically becomes a tuple. I think this needs to be reconsidered.
A promising-looking idea would be to eliminate the immediate conversion of a subtype to a tuple. Rather, wait until the second inherited or defined component is found before converting to a tuple.
POSSIBLE FLASH (see full discussion above): make no semantic distinction between a one-tuple and a non-tuple. E.g., an object of type integer is the same as an object of type {integer}. One nice benefit of this approach is that the existing anomaly of not being able to access a single-tuple elem via '.' its name would go away. The rule would be that a one-tuple could be accessed with or without a field name selector. Hmm, this seems pretty-darn nice so far.
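Here's a tiny Python model of the no-distinction rule, normalizing one-tuples away before comparing types; tuples-as-Python-tuples is just my representation:

    def normalize(t):
        # Collapse one-tuples: {T} and T are the same type.  Tuple
        # types are modeled as Python tuples of component types.
        while isinstance(t, tuple) and len(t) == 1:
            t = t[0]
        if isinstance(t, tuple):
            return tuple(normalize(c) for c in t)
        return t

    def same_type(t1, t2):
        return normalize(t1) == normalize(t2)

    print(same_type(("integer",), "integer"))         # True: {integer} = integer
    print(same_type((("integer",),), "integer"))      # True, recursively
    print(same_type(("integer", "real"), "integer"))  # False: a real 2-tuple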
Another mess we've gotten ourselves into is not knowing the difference between generic instantiation via '<' versus 'is'. Consider the following example (excerpted from tests/stack.rsl):
    obj Stack(<) is < *;
    obj IntStack is Stack(integer);   (* BIG QUESTION: what's the diff *)
    obj IntStack < Stack(integer);    (* between these two??           *)

Lest we think where-clause notation might fix things here, it doesn't. I.e., we can have a where clause with either '<' or 'is'.
Here are some pertinent questions related to this issue:
- Should both forms of instantiation be allowed?
- If we use the parm'd types notation, should an instantiation form be allowed anywhere that a type is allowed, or should we allow it only in a parts_spec? FORGET THIS: this won't make any diff if op sigs are now the same as parts_specs.
In an effort to define automatic constructor ops, we need yet again to address precisely how to declare op args. We have discussed at length that we would like to allow a fully general parts_spec in the input and output fields of an op. To do this, we need to clarify if we have the ML-like single-arg model, or if not, exactly what we do have. Let's try the following:
The following discussion is historical only in that we have decided to be happy with '<' as the instance-of operation. We'll also add subclass (of) for some additional syntactic sugaring.
We'd like to use ':' as the instance-of operator. This leads to a syntactic ambiguity with uses of ':' to declare the type in a concrete obj decl. This, somewhat serendipitously, suggests that we should eliminate the latter use, i.e., the declaration of a concrete object of a particular named type. This leads to the idea of an automatically defined overload of the object constructor that takes an argument of the structured type of its components. Ah, but we probably already have this.
A motivating example for this is the biggie burger food menu, which in the past we have declared something like this:
    obj DefaultMenu:FoodMenu = [ ["Burger", 1.89, [...]], ... ];

If we eliminate this use of ':', we get the following:

    obj DefaultMenu = FoodMenu(
        MenuSection("Burger", 1.89, Accessories(...)), ... );

which probably looks better mnemonically anyway.
There seems to be really little problem with eliminating this use of ':', since it's only of consequence in concrete obj decls. In other binding contexts, we'll have a declared formal (LHS) to determine the type, and in effect do the casting for us. The "casting" to which we refer here is that of casting a structured value specified with the generic bracket constructors into a specific named type.
All of this is related, yet further, to the next heading on concrete values.
I'm thinking again that we should add the keyword "value" to the language. Ah shucks, I wish I could make up my mind about this! Maybe we can make "let" be a top-level (i.e., module-level) construct, so that values can be created that way. E.g., now we have:

    module Foo;
        object X = 10;
    end Foo;

With a value keyword, we'd have:

    module Foo;
        val X = 10;
    end Foo;

Allowing top-level let, we'd have:

    module Foo;
        let X = 10;
    end Foo;

I kinda like that last one (the let). Let's think about it (pun partially intended).
Summary:
Given below is an adaptation of the following DISL code:
    INTERFACE DummyName {
        Obj1 := DrawRect(342, 110, 472, 142);
        Obj0 := DrawRect(88, 112, 218, 144);
        MAPPINGS
            Obj0: LEFTDOWN: op0;
        OPERATION op0(s : Selection; l : integer; b : integer) {
            STIMULUS MoveSelection(s, l, b);
            RESPONSE {
                MoveSelection(Obj1, integer(3.250000 * real(l)),
                    integer(1.600000 * real(b)));
            }
        }
    }

Here's the RSL-like equivalent (see ../../rsl/alpha/newtests/sample-demo-code.rsl):

    (* interface *)
    module DummyName;
        from Demo import Selection;
        define object attribute leftdown;
        define operation attribute stimulus, response;

        object Obj1 = DrawRect(342, 110, 472, 142)
            leftdown: op0;
        end Obj1;

        object Obj0 = DrawRect(88, 112, 218, 144);

        operation op0(s:Selection, l:integer, b:integer) < DemoOp
            stimulus: MoveSelection(s, l, b);
            response: MoveSelection(Obj1, integer(3.250000 * real(l)),
                integer(1.600000 * real(b)));
        end op0;
    end DummyName;

where module Demo is:

    module Demo;
        export Selection;
        object DemoWorld;
        object Selection;
        operation DemoOp(DemoWorld)->DemoWorld;
    end Demo;

Note that this requires that we expand RSL syntax to allow attribute values to be exprs, which we've done. This, BTW, led to the new reduce/reduce error, which we've determined to be OK.
The latest idea is that we may be able to have MLish parm bundling and unbundling, as long as we do it at the top level of parm nesting only. Consider the following familiar-looking defs:

    obj X is A, B, C;
    op F1(X)->(X);
    op F1(X)->(A, B, C);
    op F2(A, B, C)->(X);
    op F3(A, B, C)->(A, B, C);

NEED TO EXPAND THIS, looking carefully at what's already been said below.
There are also some important, seemingly deeper thots on subtype poly vis a vis constructed types, e.g., with "[" ... "]" and "{" ... "}". If it's not completely clear in what's said below, these constructors construct types that are structurally compat with list and tuple types, but NOT subtype compat with ANY types. This is because these constructors build anonymous types, which cannot, by definition, be subtype compat. It seems like what could be emerging here is a version of name type equiv that relates to subtyping. I.e., we have a structural equiv rule at the level of straight types, but structural equiv does not extend into subtypes. Probably should get off butt and see what research there is about this out there. It seems likely that this kind of rule already exists, maybe from some time ago.
Thot: provide pre-def'd type vars "any" and "anyN", for all N, with the intent that any1, e.g., used in a signature is (must be) the same type in all signature occurrences, but is not necessarily the same as, e.g., type any2. Drawback here is using up ident names, but this may not be a problem. Need to think about it. The alternative, used in scattered examples, is to use '?' as an ident prefix or suffix to denote a type var. Prob with this is lexical/syntactic confusion with '?' as the infix tag selection operator.
It may just be possible to have it all, viz., to have full subtype polymorphism; structural equivalence; complete compatibility between comp exprs and op sigs; and overloading. Here are some observations and rules in this regard, which need further discussion.

Note that things get funky in cases such as the following, wherein we'll need some clear disambiguating rules:

    op Op(integer and string)->integer;
    op main() = (
        Op(1, "abc");       (* OK *)
        Op({1, "abc"});     (* Also OK, and equiv to the preceding. *)
    );
    obj ThreeInts is integer and integer and integer;
    op Op(integer and integer and integer and ThreeInts and integer*);
    op main() = ( Op(1,2,3, 4,5,6, 7,8,9,10) );
    obj X is string;
    obj Y < X is integer and integer;
    op DoX(X)->X;
    var y:Y;

    axiom ... and DoX("abc",1,2) and ... ;
        (* ^^^^^^^^ This won't work since ("abc",1,2) is in fact compat with
         * type Y, however it cannot easily be inferred to be a Y in this
         * context, and hence a subtype poly match of DoX won't happen. *)

    axiom ... and DoX(Y("abc",1,2)) and ... ;
        (* ^^^^^^^^^^^^^^^^^ This *will* work since Y("abc",1,2) is directly
         * discernible as a Y, and hence the subtype poly match can be
         * found. *)
Despite observations to the contrary, we may well want to allow single-elem tuples to be equiv (say, structurally) to types of the elem type. The reason is a good one, when one considers the existing examples of generic list- structured objects, such as the ref man generic DB example. Consider:
    obj GenericList is Elem*;
    obj IntList < GenericList
        where: Elem = integer;
    end IntList;
    op f(l:IntList, i:integer)->boolean = l[1] = i;
        (* !! Won't work, because IntList is of type (integer* and)
         * NOT of type integer*. Oops -- this ain't so nice. *)

The reason, I believe, that we were clinging to non-equiv of single-elem tuples and the elem type was to cling to name type equiv, which in turn was, among other possible reasons, to allow op ordering dependencies to be specified by unique types. We probably want to avoid this anyway, but if it's really necessary, say in a comm protocol specification, we can do it via unique opaque types, since these are always unique. I think we'll be hosed pretty badly if we say that all opaque types are equiv, so we can rely on their uniqueness. While this may make the spec of sequential control dependency a bit uglier than it might otherwise be, we can live with this since we want to frown on control dependencies anyway.
The stuff below about reintroducing direct op inheritance still seems good. We need to make the following refinement. An op definition will define a type all right, but a value-constrained type. The subtle problem we need to deal with is an invocation of a subop that is bound to a var of a parent type. E.g., in the example below we have:
    op class GenericGraphicsOp(Canvas)->Canvas;
    op Move(SelectClassParms) < GenericGraphicsOp;
    op Scale(ScaleClassParms) < GenericGraphicsOp;

    op ExecuteSelectClass(gop:GenericGraphicsOp, parms:GenericGraphicsOpParms) =
        gop.<Move(parms.<SelectClassParms);

    op main() = (
        let s = Selection(...);
        ExecuteSelectClass(Move, SelectClassParms(s, 10, 20));
    );

The call gop.<Move(parms.<SelectClassParms) is just fine in this context, since gop is bound to op Move. Suppose, however, that the body of ExecuteSelectClass was
    gop(parms);

The deal here is that we can't know that gop is bound to Move, yet at runtime we must be able to run something. So, we'll have some behavior inheritance. What this means is that subops inherit the body of their parent op. It seems that the runtime rule can be that any subop will inherit the body(ies) of all parent ops, in an analogous manner to how subobjs inherit fields. Then, in situations such as the above, when a subop instance is hanging out in a parent-type variable, the inherited parent op body can be executed by extracting it from the subop. Cool.
In general, the concept of behavior inheritance should always be on, to be precisely the dual of structural inheritance. To be precise, behavior inheritance will work akin to constructor behavior inheritance in C++. !BUT WOA HERE -- this can't work in a functional setting, since we seem to have no way to capture the return value of parent execution. So, maybe what has to go on here is to execute only parent behavior in cases such as the above where we have a subop value hanging out in a parent var. !NO BUT WOA AGAIN -- maybe we do in fact have a way to get at the return value of the parent. Viz., it's in the output parm(s) of the parent. E.g., adding parm names to the op defs above:
    op class GenericGraphicsOp(c:Canvas)->(c':Canvas);
    op Move(scp:SelectClassParms) < GenericGraphicsOp;
    op Scale(scp:ScaleClassParms) < GenericGraphicsOp;

Given this, we can refer to the output of GenericGraphicsOp as a parent op from within the definition of one of its children, say Move, as in:

    op Move(scp:SelectClassParms) < GenericGraphicsOp = c';

which simply means that Move returns just the c' that its parent GenericGraphicsOp returned. (This could be way cool, if it pans out.) Move could do more if it chose, including not referring to c' at all, in which case c''s return value from Move would be whatever its parent computed it to be. If Move wants to do more with c', then it can, as in

    op Move(scp:SelectClassParms) < GenericGraphicsOp = f(c');

for some function f.
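Modelwise, here's a little Python sketch of that runtime rule -- a subop with no body of its own runs the nearest ancestor body it inherits; the class shape and names are mine, purely illustrative:

    class Op:
        def __init__(self, name, body=None, parent=None):
            self.name, self.body, self.parent = name, body, parent

        def call(self, *parms):
            # Run this op's own body if it has one; otherwise extract
            # and run the inherited body of the nearest ancestor that
            # has one -- the dual of subobjs inheriting fields.
            op = self
            while op is not None and op.body is None:
                op = op.parent
            return op.body(*parms)

    generic = Op("GenericGraphicsOp", body=lambda c: f"parent body ran on {c}")
    move = Op("Move", parent=generic)   # op Move(...) < GenericGraphicsOp;
    print(move.call("canvas"))          # the inherited parent body runs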
OK, let's see if we can define some type checking rules that will make sense out of all of the possible ways that unions can be used, probably most particularly as enumeration types.

- Rule 1: If all of the components of a union type are unique types, then the checker can infer the necessary injection in binding contexts.
- Rule 2: If all of the components of a union type are the same type, then the checker can infer the necessary injection in binding contexts, which type is simply the component type. This rule will make the old-style enumerations work as expected.
- Rule 3: If the component types of a union are neither unique nor the same, then no injection inference is possible. PERHAPS we want to disallow this case altogether.

To make all of this work, how 'bout if we define concrete obj defs to mean the following.
    obj x = val
<==>
    obj x < typeof(val);
    obj 'val' < typeof(val);

where val must be a const expression. E.g.,

    obj x = "xyz"
<==>
    obj x < string;
    obj 'xyz' < string;

Do think about this some more.
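Here's a Python sketch of that desugaring, together with the typeof it leans on; the representations are made up, a model only:

    subtype_decls = []

    def typeof(val):
        # Map a const value to its built-in RSL type name.
        return {bool: "boolean", int: "integer",
                float: "real", str: "string"}[type(val)]

    def define_concrete_obj(name, val):
        # obj x = val  <==>  obj x < typeof(val); obj 'val' < typeof(val);
        t = typeof(val)
        subtype_decls.append((name, t))          # obj x < typeof(val)
        subtype_decls.append((f"'{val}'", t))    # obj 'val' < typeof(val)

    define_concrete_obj("x", "xyz")
    print(subtype_decls)   # => [('x', 'string'), ("'xyz'", 'string')]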
OK, from the real demo.rsl example, here's what we'd like to do:
    (*
     * Primitive Graphic Operations
     *)
    op Select(Canvas, Selection, Location)->Canvas;
    op Move(Canvas, Selection, Location)->c':Canvas;
    op Scale(Canvas, Selection, ScaleFactor)->Canvas;
    (* Etc., ... *)

    obj Location is Coord and Coord;
    obj Coord is integer;

    (*
     * Generic Graphic Op Parm Classes
     *)
    obj class GenericGraphicOpParms is Canvas;
    obj SelectClassParms < GenericGraphicOpParms is Selection and Location;
    obj ScaleClassParms < GenericGraphicOpParms is Selection and ScaleFactor;

    (*
     * The Generic Type of All Graphic Ops
     *)
    obj GraphicsOperation is op(GenericGraphicOpParms)->(Canvas);

Given such defs, can we reasonably expect the following form of application to work?
    op ExecuteSelectClass(gop:GraphicsOperation, c:Canvas, s:Selection,
            x:Coord, y:Coord) =
        gop(c, s, x, y);

    op main() = (
        let s = Selection(...);
        ExecuteSelectClass(Move, s, 10, 20);
    );

The key questions/points here are:
- Is the type of operation Move, as it is defined, compat with type GraphicsOperation? If so, can we define a sound bundling/unbundling scheme that will let us do what we hope for?
- If so, what kind of bundling/unbundling deal have we done, and how might it adversely affect overloading, in particular by increasing the amount of work we have to do to check for equivalent bundled versus unbundled proc defs during op definition checking?

Let's look a little closer at how things might work out here:
The hard part here seems to be the ">" inference step:

    type of GraphicsOperation = op(GenericGraphicsParms)
                              > op(Canvas, Selection, Coord, Coord)
                              = op(Canvas, Selection, Location)
    type of Move              = op(Canvas, Selection, Location)

Another way to look at things is that from a vanilla definition like

    op Move(Canvas, Selection, Location)->c':Canvas;

we can't necessarily infer

    op Move(GenericGraphicParms)->c':Canvas;

While we probably don't need to infer this at op definition time, we do at binding, which may still be troublesome, if not impossible. Another issue is that whatever resolution algorithm is used to check bindings will also need to be used to check redundant overloadings. We might try to ameliorate the situation by defining
    op Select(SelectClassParms)->Canvas;

as an overload of Select, i.e., leaving the original definition of Select in place. However, in order to have the kind of bundling we need to make the other stuff work, this should probably be a redundant overload of the extant, i.e., of
    op Select(Canvas, Selection, Location)->Canvas;

Hmm, things don't look real promising at this juncture. We could brute force it as follows:
    op ExecuteSelectClass(gop:GraphicsOperation, c:Canvas, s:Selection,
            x:Coord, y:Coord) =
        gop.<SelectClassOp(c, s, x, y);

but we don't have a SelectClassOp. Also, even if we did, is this too gross and is it type safe? OK, let's just try it in a more doable way and see if even that works:
    (*
     * Primitive Graphic Operations
     *)
    op Select(SelectClassParms)->Canvas;
    op Move(SelectClassParms)->Canvas;
    op Scale(ScaleClassParms)->Canvas;
    (* Etc., ... *)

    obj Location is Coord and Coord;
    obj Coord is integer;

    (*
     * Generic Graphic Op Parm Classes
     *)
    obj class GenericGraphicsOpParms is Canvas;
    obj SelectClassParms < GenericGraphicsOpParms is Selection and Location;
    obj ScaleClassParms < GenericGraphicsOpParms is Selection and ScaleFactor;

    (*
     * The Generic Types of Graphic Ops
     *)
    obj GenericGraphicsOp is op(GenericGraphicOpParms)->(Canvas);
    obj SelectClassOp < GenericGraphicsOp;
    obj ScaleClassOp < GenericGraphicsOp;

    op ExecuteSelectClass(gop:GenericGraphicsOp, parms:GenericGraphicsOpParms) =
        gop.<SelectClassOp(parms.<SelectClassParms);

    op main() = (
        let s = Selection(...);
        ExecuteSelectClass(Move, SelectClassParms(s, 10, 20));
    );

Hmm, we still seem to have a fundamental problem here. The crux of the problem is that, given the following defs
    obj O1;
    obj O2 < O1;
    obj OpType is op(O1);
    op F(O2);

it seems that it cannot be the case that F (as a value) is compat with type OpType, even though the signature of F is (subtype) compat with the signature of OpType. This lack of compatibility seems to pretty well nix the kind of generic op creation we're attempting above. For some concrete data in this area, see the C++ attempt at doing this, in ~/c++/demo-rsl.c. It appears that this is a genuine problem area, since we get the following extremely interesting warning at the point where we try to bind Move (the function value) to a parm of type GenericGraphicsOp:
    warning: contravariance violation for method types ignored

Fascinating. Here's a flash. Perhaps we want to go back to the idea of operation subclassing, with an interpretation that args are anded into subclasses. This seems like it might be the key to the idea of an op type that allows more args in its subtypes. E.g., try this:
    op class GenericGraphicsOp(Canvas)->Canvas;
    op Move(SelectClassParms) < GenericGraphicsOp;
    op Scale(ScaleClassParms) < GenericGraphicsOp;

    op ExecuteSelectClass(gop:GenericGraphicsOp, parms:GenericGraphicsOpParms) =
        gop.<Move(parms.<SelectClassParms);

    op main() = (
        let s = Selection(...);
        ExecuteSelectClass(Move, SelectClassParms(s, 10, 20));
    );

Hey, I think this, at last, has some promise. What we may well be opening the door to is that an op def automatically defines a type, in a dual way to an obj automatically defining a (constructor) op. E.g., check this out:
    op X(integer, integer)->integer;

automatically defines the type
    obj X is op(integer, integer)->integer;

Woe, this is getting a little scary, but maybe really cool. Given the remarks above about ops defining values, evidently, it appears that an op definition defines all three kinds of entities -- type, value, and op. Hopefully we can use context to sort out the name overloading. Hmm, this is getting quite interesting indeed. [But as we've discovered in the future, it can't work just like this; see discussion above about auto-gen of op types.]
Instead of inventing yet another imperative language, allow op decls of the form
    op (...) -> (...) = {
        /* Normal C code, not including data type decls */
    }

Note the use of curly braces to enclose the block, signaling that C code is contained therein. We could even enforce large-grain functionalism by disallowing global vars and pointers, which is pretty natural to do anyway.
Is it in fact possible to stick with the idea that types and values are the same thing? Specifically, can we meaningfully reconcile the "obj is ..." construction (aka, abstract objects) with the "obj = ..." construction (aka, concrete objects)?
The closest we've seemed to be able to get to it is to think of concrete objects as a restricted type. E.g., with
    obj Male = "male"

the type of Male is string, restricted to the specific value "male". At type checking time, this means that Male is treated exactly as if declared to be string. At runtime, it means that only the value "male" can be bound to a variable of type Male. If an attempt is made to bind another string value to a variable of type Male, the variable will be set to nil.
It seems that this is possible when the elements of a union are unique, named types, otherwise it's not. Rather than restrict union elems arbitrarily/artificially to be named types, we'll just have the checker do its work silently, making injection inferences when it can, and complaining, possibly not explicitly to the point, when it cannot. For a bit of supportive human kindness, see Tennent page 215.
That T2 is a subtype of T1 most emphatically does NOT mean that the value set of T2 is a subset of the value set of T1. In fact, it means the opposite. This concept is counter-intuitive when we consider the notion of subtyping integers and reals. E.g., with
    obj subint < integer;

it is not the case that type subint is a subrange of integers. Rather, subint is a single-elem tuple type, the element of which is an integer.
More could definitely be said here, and it should be.
OK, maybe it's OK. (Make up your mind, clown!)
We don't need it any more. Forget mangling; we'll type check overloading by checking each def alternative.
This is a summary of the discussion below on type checking op calls.
    obj Thingy;
    obj ThingyList
        ops:
            Insert(ThingyList,Thingy)->ThingyList;
            FindNth(ThingyList,integer)->Thingy;
    end;

Now, is the following call possible?

    Insert(1, Insert(2.5, Insert("xyz", null)));

Well, it just might be, and it might be type-safe as well, for the following reason. While the list can contain a bunch of different typed things, there's nothing that can be done with them unless we define some more ops on type Thingy.

OK then, let's try this:

    obj Thingy
        ops:
            "_*_"(Thingy,Thingy)->Thingy;
    end;

I.e., we've added a requirement to the definition of Thingy. Now, can we do the following?

    FindNth(Insert(1, Insert(2.5, Insert("xyz", null))), 1)
        * FindNth(Insert(1, Insert(2.5, Insert("xyz", null))), 3)

!!!!!!!!This question must be answered!!!!!!!! We'll do so at our next meeting. Or at least try to do so.
Here's a crack at a (partial) answer now -- the solution may lie in the variable nature of opaque types in an op signature. Viz., if there is > 1 occurrence of an opaque type in a signature, then as soon as the first parm of that type is bound, it fixes all other occurrences of that opaque type in the signature to be the initially-bound type.
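Sketched in Python, the fixing rule is a one-shot unification of sorts; the representation of signatures as lists of type names is assumed:

    def check_actuals(sig, actuals, opaque_types):
        # Bind actuals to formals; the first binding of an opaque
        # formal fixes all later occurrences of that opaque type in
        # the signature to the initially-bound type.
        bound = {}
        for formal, actual in zip(sig, actuals):
            if formal in opaque_types:
                fixed = bound.setdefault(formal, actual)
                if actual != fixed:
                    return False   # a later occurrence disagrees
            elif formal != actual:
                return False
        return True

    sig = ["Thingy", "Thingy"]
    print(check_actuals(sig, ["integer", "integer"], {"Thingy"}))  # True
    print(check_actuals(sig, ["integer", "real"], {"Thingy"}))     # False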
OK, here is, I believe, the definitive answer on this subject. Viz., there is NO subtype polymorphism allowed, and, in fact, no genuine polymorphism at all. Rather, apparently polymorphic functions are obtained via the generic instantiation mechanism. See rsl3/tests/generic-list.rsl, about which more should be written here.
Is it possible to combine overloading and the ML-style single-arg functions? Let's see. Consider the following definitions:
    var x:X, y:Y;
    obj X is int,real,boolean;
    obj Y is string,boolean;
    op F1(X)->(Y);
    op F2(int,real,boolean)->(Y);
    op F3(X)->(string,boolean);
    op F4(int,real,boolean)->(string,boolean);

Are F1 through F4 all the same function type? More precisely, do all of the following calls type check?

    F1(x,y);         F2(x,y);         F3(x,y);         F4(x,y);
    F1(1,2.5,true);  F2(1,2.5,true);  F3(1,2.5,true);  F4(1,2.5,true);

Emerging rule: a function of a single arg can be called with unbundled actuals, but not vice versa. I.e., a function of multiple args cannot be called with a single arg. I.i.e., actuals will be bundled automatically when necessary, but not unbundled. Now we need to define the unbundling precisely. Also, we need to explain why this unbundling strategy has been chosen. And it better be a good explanation. I.e., this strategy should only be chosen if it avoids some actual ambiguity or if it saves a significant amount of conceptual and/or implementation difficulty.
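The emerging bundle-but-never-unbundle rule could check out like the following Python sketch; types here are just name strings and tuples, my representation, not the checker's:

    def call_checks(formals, actuals):
        # A function of a single (tuple) arg may be called with
        # unbundled actuals -- they get bundled automatically.  A
        # function of multiple args may NOT be called with a single
        # bundled actual, i.e., no automatic unbundling.
        if list(actuals) == list(formals):
            return True                           # exact match
        if len(formals) == 1 and isinstance(formals[0], tuple):
            return tuple(actuals) == formals[0]   # auto-bundle actuals
        return False

    X = ("int", "real", "boolean")
    print(call_checks([X], ["int", "real", "boolean"]))        # True: bundled
    print(call_checks(["int", "real", "boolean"], [X]))        # False: no unbundling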
    obj class Any;
    obj FullyGenericFunc is op(Any)->(Any);
They are. Use "?<" op to check for membership. But this is an old idea, now out of date. Say what in that last sentence??!? I do believe that "?<" and its ilk are alive and well.
In order to support type vars with name type equiv, the following rule seems reasonable:
    All opaque types are distinct, unequivalent to any other type.

Precisely how this rule actually affects things still needs to be worked out. See the next section for further, hopefully clarifying discussion.
OK, let's try the following simple example:
    obj Any is any;
    obj AI instance of Any is integer;
    obj AR instance of Any is real;
    op w(a:Any)->() = WriteReal(a);    (* Should not type check. *)
    op e(a1:Any, a2:Any)

    op AddRecord(gdb:GenericDB, gr:GenericRecord)->(gdb':GenericDB);
    (* *)
    obj PRecord instance of GenericRecord;
    obj SRecord instance of GenericRecord;
    var gdb:GenericDB; pr:PRecord; sr:SRecord;

OK, try this rule on for size:

    An opaque type represents, among other things, a type variable.

Let us consider carefully the ramifications of this rule, particularly in comparison to the concept of representing type vars lexically, as in ML, with some notation such as a leading or trailing question mark. The disadvantage of opaque types as type vars is that it creates sort of a special case for type checking proc calls, in that the type of an actual corresponding to a type var can be any type for which equality is defined. This is really no different than an ML type var, except that I'm still a bit uneasy about it. The advantage of opaque types as type vars is their general orthogonality. In particular, constraints can be placed by the normal operations and equations attributes rather than by some special-purpose notation. E.g.,
    op F(x:X?, y:Y?)
        where: exists G(X?,X?)->Y?;

versus

    obj X
        ops: G(X,X)->Y;
    end;

Now consider:
    obj Some;
    obj SomeAndSome is s1:Some and s2:Some;
    obj SomeList is Some*;

What exactly can be done with/to these objects, and what is the nature of the polymorphism enabled by considering obj Some to be a type var? E.g., is Selects1(SomeAndSome)->(Some) a polymorphic function for any type Some? Seems that things are OK, since SomeAndSome needs to be thrown in to any call. Let's consider some operations:

    op SomeOp(s1:Some, s2:Some)->(s1':Some, s2':Some);
    op SomeOp(s1:integer, s2:Some)->(s1':Some, s2':Some);

Both of these ops can be invoked with a pair of ints. But, let us say, that the 2nd will cover the first, since it is more specific. I think this is a viable rule.
Let's see if we can (begin to) articulate the parm binding type checking rule:
- If the function desig is a name, get the list of signatures available for that name.
- If the function desig is not a name, type check it to obtain the single signature designated.
- Determine the types of each of the actuals, and look first for an exact match, probably via mangled name lookup.
- Then look for a polymorphic match.
- Finally, look for an exact or polymorphic match via bundling.

Somewhere in there we need to handle the chameleon functions -- those overloaded based on need. The best built-in example is [...] (but see revision below). It seems that this class may in general include functions overloaded on coarity only.
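Chameleons aside, the lookup order above could be driven by a loop like this Python sketch, where poly_match and bundle are assumed helpers, not extant code:

    def resolve_call(name, actuals, signatures, poly_match, bundle):
        sigs = signatures.get(name, [])
        # 1. Exact match -- the moral equivalent of mangled-name lookup.
        for sig in sigs:
            if sig == actuals:
                return sig
        # 2. Polymorphic match.
        for sig in sigs:
            if poly_match(sig, actuals):
                return sig
        # 3. Exact or polymorphic match via bundling the actuals.
        bundled = bundle(actuals)
        for sig in sigs:
            if sig == bundled or poly_match(sig, bundled):
                return sig
        return None   # no match: report a type error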
For starters, see tests/deriving-types-from-obj-exprs.rsl for an example and a bit of discussion. Bottom line of this test file is that the type of the expr [c1,...,cn] is T* if typeof(ci) = T for all i; else the type is typeof(c1), and it could be interpreted as the former if necessary in a given context.
In general, I believe that the definition of a "chameleon" expression is one which can be of two or more types, depending on the context. From a purely functional foundation, such chameleons derive solely from chameleon functions, which are defined as functions overloaded on coarity.
Major news -- let's nuke {...} as a comment delimiter and use it instead as an "and" (tuple) constructor. This is largely in keeping with extant PLs, such as C. It also means that [...] is no longer a chameleon, which will no doubt simplify things. It also opens the door to an interim version 3 that does not allow user-defined chameleons.
Try this one on for size. In the past, we have considered hopping back and forth between modules at the top level in order to change contexts. As an alternative to this, we could say that there is exactly one top-level scope, conceptually the module Main. The effect of switching scopes can be accomplished by opening and closing files, using the Open and Close environment ops, explained shortly.
Import/export rule summary:
- Default for any module is to export all of its symbols, thereby making all of its symbols available for import.
- If a module contains an explicit export declaration, then only those symbols exported are available for import. (This is a bit odd, but seems to fit best with the "say-the-least, get-the-most-expected" principle.)
- "import x", where x is not a module, imports x unqualified if
- x is an export of exactly one opened module
- x is not defined in the importing scope
- "import x", where x is not a module, imports x qualified if
- x is an export of exactly one opened module
- x is defined in the importing scope
- (a warning is issued upon qualified import)
- "import x", where x is not a module, fails with message if x is export of no module or more than one module.
- "import m", where m is a module name, imports all of the exports of m, per the individual "x" import rules defined above.
- There is an Open function that opens a file, and imports all of its exports into the current scope.
- There is a Close function that unimports all the exports of a file.
Before any module can be imported from, it must be opened. Opening happens when a file containing one or more modules appears as a command-line arg, or when a module name is given as an argument to the Open environment function.
- Only module, object, and operation names may appear in import/export decls.
- Exports may not be third party. I.e., a module may not export any of its imports.
- Import names may be qual idents, indicating from which module an import should come. Qual ident imports follow the same import rules defined above. In particular, an import as a qual ident does not automatically need to be qualified in its usages.
- Given the two-pass nature of importing described below, imports will be processed after all defined module entities. Therefore, an entity definition will never be processed after an import, so that an error of the form "obj already def'd as import" will never occur.
Note that one of the effects of the behavior of Open is that all command-line files containing modules are opened and have their exports put in the Main scope. This means that when the environment starts up, the top-level contains all exports of all modules.
This seems to be pretty good. It means that the normal expectation of complete visibility happens automatically at the top level. At the same time, modules themselves do not automatically gain access to other modules symbols without explicitly importing those symbols.
Another effect of the preceding rules is that import at the top level is basically meaningless, or at best redundant. This is the case because command-line files are automatically opened, and opening in turn automatically imports into the current scope. To counter this, we might say that Open only makes symbols "available" for importing into the top level, rather than automatically importing them. However, this seems like a bit of a nuisance, since the (vast) majority of the time, users will want to open and then import at the same time. The only possible advantage of not importing on open is that we could avoid conflicts by creating a top-level "pool" of modules with symbols available for import. Then by selective symbol import, we could grab things into the top level. This seems far less likely to happen, particularly since we're trying to deemphasize information hiding altogether at the specification level.
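For the record, the per-symbol import rules condense to a few cases; here's a Python sketch of them (the module/scope representation is made up):

    def import_symbol(x, opened_modules, current_scope):
        # "import x": unqualified if x is exported by exactly one
        # opened module and not already defined here; qualified, with
        # a warning, if it is already defined; an error otherwise.
        exporters = [m for m, exports in opened_modules.items() if x in exports]
        if len(exporters) != 1:
            raise ImportError(f"{x} is exported by {len(exporters)} opened modules")
        m = exporters[0]
        if x in current_scope:
            print(f"warning: {x} imported qualified as {m}.{x}")
            current_scope[f"{m}.{x}"] = (m, x)
        else:
            current_scope[x] = (m, x)

    scope = {}
    import_symbol("Selection", {"Demo": {"Selection"}}, scope)
    print(scope)   # => {'Selection': ('Demo', 'Selection')}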
A low-level implementation observation is that import/export must (should) be accomplished in the second pass. Viz., the first pass will parse all modules, thereby making each of their exports available for import. The second pass will process all imports (which must lexically precede all other module decls), so that:
- there is no forward ref problem with importing
- imported symbols in a particular module will be processed at the beginning of the second pass, after which module type checking will proceed.