TypeSystem3
A type system is a set of rules that assign types to terms,
e.g., the type int can be assigned to the literal 2.
In MontiCore, the type system implementations assign SymTypeExpressions to
expressions (e.g., 2) and types (e.g., int).
This is done primarily by traversing the AST
of the expression or type,
calculating the SymTypeExpressions of its subnodes,
and combining their information into the SymTypeExpression of the current node.
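The bottom-up calculation described above can be illustrated with a small, self-contained sketch. It does not use MontiCore: the classes ToyExpr, ToyIntLiteral, ToyDoubleLiteral, ToyBooleanLiteral, ToyPlus, and ToyTypeCalculator are hypothetical stand-ins, and types are represented as plain strings instead of SymTypeExpressions. The string "Obscure" loosely mirrors the role of SymTypeObscure as a marker for typing errors.

```java
// Toy AST node types; illustrative stand-ins only,
// not MontiCore's generated AST classes.
interface ToyExpr {}

final class ToyIntLiteral implements ToyExpr {
  final int value;
  ToyIntLiteral(int value) { this.value = value; }
}

final class ToyDoubleLiteral implements ToyExpr {
  final double value;
  ToyDoubleLiteral(double value) { this.value = value; }
}

final class ToyBooleanLiteral implements ToyExpr {
  final boolean value;
  ToyBooleanLiteral(boolean value) { this.value = value; }
}

final class ToyPlus implements ToyExpr {
  final ToyExpr left;
  final ToyExpr right;
  ToyPlus(ToyExpr left, ToyExpr right) {
    this.left = left;
    this.right = right;
  }
}

final class ToyTypeCalculator {
  // marker for a typing error (assumption of this sketch)
  static final String OBSCURE = "Obscure";

  // Calculates the type of a node from the types of its subnodes.
  static String typeOf(ToyExpr expr) {
    if (expr instanceof ToyIntLiteral) return "int";
    if (expr instanceof ToyDoubleLiteral) return "double";
    if (expr instanceof ToyBooleanLiteral) return "boolean";
    if (expr instanceof ToyPlus) {
      ToyPlus plus = (ToyPlus) expr;
      // first calculate the types of the subnodes ...
      String left = typeOf(plus.left);
      String right = typeOf(plus.right);
      // ... then combine them into the type of this node
      if (!isNumeric(left) || !isNumeric(right)) return OBSCURE;
      return (left.equals("double") || right.equals("double")) ? "double" : "int";
    }
    return OBSCURE;
  }

  private static boolean isNumeric(String type) {
    return type.equals("int") || type.equals("double");
  }

  public static void main(String[] args) {
    ToyExpr sum = new ToyPlus(new ToyIntLiteral(2), new ToyIntLiteral(2));
    System.out.println(typeOf(sum)); // prints "int"
  }
}
```

In the actual implementation, this per-node logic is spread over the TypeVisitors of the individual grammars rather than collected in one method.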
Given infrastructure in MontiCore
- TypeCheck3 (offers typeOf, etc., to query the SymTypeExpressions of AST nodes)
- MapBasedTypeCheck3 (default implementation of TypeCheck3)
- Type4Ast (maps ASTNodes to SymTypeExpressions, filled by the TypeVisitors)
- SymTypeExpression (calculated by the TypeVisitors, represents types and "pseudo-types")
- ISymTypeVisitor (interface for traversal of SymTypeExpressions)
- SymTypeArray (subclass of SymTypeExpression, represents arrays, e.g., int[])
- SymTypeObscure (subclass of SymTypeExpression, pseudo-type representing typing errors)
- SymTypeOfFunction (subclass of SymTypeExpression, represents functions, e.g., int -> void)
- SymTypeOfGenerics (subclass of SymTypeExpression, represents (non-primitive) generic nominal data types, e.g., java.util.List<Person>)
- SymTypeOfIntersection (subclass of SymTypeExpression, represents intersections of types, e.g., Car & Ship)
- SymTypeOfNull (subclass of SymTypeExpression, represents the null type)
- SymTypeOfObject (subclass of SymTypeExpression, represents non-primitive non-generic nominal data types, e.g., java.lang.String)
- SymTypeOfRegEx (subclass of SymTypeExpression, represents subsets of Strings, e.g., R"gr(a|e)y")
- SymTypeOfUnion (subclass of SymTypeExpression, represents unions of types, e.g., TreeInnerNode | TreeLeaf)
- SymTypePrimitive (subclass of SymTypeExpression, represents primitive types, e.g., int)
- SymTypeVariable (subclass of SymTypeExpression, represents bound type variables, e.g., T in List<T>)
- SymTypeInferenceVariable (subclass of SymTypeExpression, represents free type variables)
- SymTypeVoid (subclass of SymTypeExpression, pseudo-type corresponding to void)
- SymTypeExpressionFactory
(factory for creating instances of the subclasses of SymTypeExpression)
- MCCollectionSymTypeFactory (factory for CollectionTypes, convenience methods)
- StreamSymTypeFactory (factory for Stream types, convenience methods)
- Functionality to work with SymTypeExpressions, Expressions
- SymTypeRelations (relations over SymTypeExpressions, e.g., isSubTypeOf, isCompatible)
- MCCollectionSymTypeRelations (relations over MCCollection SymTypeExpressions, e.g., isList)
- FunctionRelations (relations regarding functions, e.g., canBeCalledWith)
- SIUnitTypeRelations (SIUnit relations, e.g., multiply, isOfDimensionOne)
- StreamSymTypeRelations (relations over Stream SymTypeExpressions, e.g., isEventStream)
- WithinScopeBasicSymbolsResolver (resolves contained variables, functions, etc. within a given scope; unlike symbol resolving, this returns SymTypeExpressions)
- OOWithinScopeBasicSymbolsResolver (resolves using OO-specific rules)
- WithinTypeBasicSymbolsResolver (resolves contained fields, methods, etc. within a given type; unlike symbol resolving, this returns SymTypeExpressions)
- OOWithinTypeBasicSymbolsResolver (resolves using OO-specific rules)
- TypeContextCalculator (provides context information for an expression wrt. types, e.g., whether a type's private members can be accessed)
- LValueRelations (whether an expression is an L-value, e.g., a variable)
- CommonExpressionsLValueRelations (implementation of LValueRelations for languages with CommonExpressions and no further LValues)
- TypeVisitorLifting (used in TypeVisitors to provide consistent handling of, e.g., union types)
- TypeVisitorOperatorCalculator (used in TypeVisitors to provide consistent handling of operators and similar constructs)
- TypeVisitors traverse the AST and
store the calculated SymTypeExpression in the Type4Ast map
- Expressions
- AssignmentExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar AssignmentExpressions)
- BitExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar BitExpressions)
- CommonExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar CommonExpressions)
- ExpressionsBasisTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar ExpressionsBasis)
- LambdaExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar LambdaExpressions)
- OCLExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar OCLExpressions)
- OptionalOperatorsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar OptionalOperators)
- SetExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar SetExpressions)
- StreamExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar StreamExpressions)
- TupleExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar TupleExpressions)
- UglyExpressionsTypeVisitor (calculates the SymTypeExpressions for the expressions in the grammar UglyExpressions)
- Literals
- MCCommonLiteralsTypeVisitor (calculates the SymTypeExpressions for the literals in the grammar MCCommonLiterals)
- MCJavaLiteralsTypeVisitor (calculates the SymTypeExpressions for the literals in the grammar MCJavaLiterals)
- SIUnitLiteralsTypeVisitor (calculates the SymTypeExpressions for the literals in the grammar SIUnitLiterals)
- Types
- MCArrayTypesTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCArrayTypes)
- MCBasicTypesTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCBasicTypes)
- MCCollectionTypesTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCCollectionTypes)
- MCFullGenericTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCFullGenericTypes)
- MCFunctionTypesTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCFunctionTypes)
- MCSimpleGenericTypesTypeVisitor (calculates the SymTypeExpressions for the types in the grammar MCSimpleGenericTypes)
- RegExTypeTypeVisitor (calculates the SymTypeExpressions for the types in the grammar RegExType)
- SIUnitTypes4ComputingTypeVisitor (calculates the SymTypeExpressions for the types in the grammar SIUnitTypes4Computing)
- SIUnitTypes4MathTypeVisitor (calculates the SymTypeExpressions for the types in the grammar SIUnitTypes4Math)
- Generics infrastructure is documented separately!
- TypeCheck1 Adapters (adapt the TypeSystem3 to the deprecated TypeCheck1 interface, offering implementations for IDerive and ISynthesize; not compatible with generics, see the TypeCheck1 documentation)
What is the difference between BasicSymbols and SymTypeExpressions?
The type system uses the Symbols of the BasicSymbols grammar
and the handwritten SymTypeExpressions.
While they are very similar,
they differ in what they represent and in when to use them.
The symbols represent definitions,
including nominal data type definitions (e.g., in Java: class List<T>),
while the SymTypeExpressions represent a type usage
(e.g., in Java: List<String> listOfStrings; or List<T> tempList;).
There is only one type definition,
but there can be many type usages.
The SymTypeExpression knows its corresponding Symbol (if applicable):
* SymTypeOfGenerics, SymTypeOfObject, SymTypePrimitive, and SymTypeVariable
know their corresponding TypeSymbol
* SymTypeOfFunction may have a corresponding FunctionSymbol
(e.g., a named function declaration)
or not (e.g., a lambda function definition)
* Other SymTypeExpressions do not have a corresponding symbol.
A type symbol, as it defines a nominal data type, is present only once in the symbol table.
A SymTypeExpression is not stored in the symbol table
(except as an attribute of a symbol, see, e.g., VariableSymbol),
but, as far as applicable, refers to the definitions / declarations in the symbol table.
Thus, multiple identical SymTypeExpressions can be used at the same time.
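The relationship between one definition and many usages can be sketched as follows. This is a simplified model, not MontiCore code: ToyTypeSymbol and ToyTypeUsage are hypothetical names standing in for a type symbol and a SymTypeExpression, and type arguments are plain strings.

```java
import java.util.Arrays;
import java.util.List;

// One definition, e.g., "class List<T>"; exists once (hypothetical stand-in for a type symbol).
final class ToyTypeSymbol {
  final String name;
  ToyTypeSymbol(String name) { this.name = name; }
}

// One usage, e.g., "List<String>"; refers to its definition
// (hypothetical stand-in for a SymTypeExpression).
final class ToyTypeUsage {
  final ToyTypeSymbol definition;
  final List<String> typeArguments;

  ToyTypeUsage(ToyTypeSymbol definition, String... typeArguments) {
    this.definition = definition;
    this.typeArguments = Arrays.asList(typeArguments);
  }

  String print() {
    if (typeArguments.isEmpty()) {
      return definition.name;
    }
    return definition.name + "<" + String.join(",", typeArguments) + ">";
  }
}

final class ToySymbolVsUsageDemo {
  public static void main(String[] args) {
    // the single definition
    ToyTypeSymbol listSymbol = new ToyTypeSymbol("List");
    // several usages, two of them identical in content
    ToyTypeUsage first = new ToyTypeUsage(listSymbol, "String");
    ToyTypeUsage second = new ToyTypeUsage(listSymbol, "String");
    ToyTypeUsage third = new ToyTypeUsage(listSymbol, "T");

    System.out.println(first.print());                        // prints "List<String>"
    System.out.println(first != second);                      // prints "true": distinct usage objects
    System.out.println(first.definition == third.definition); // prints "true": same definition
  }
}
```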
How to use the type system implementation?
In MontiCore, the type system implementations have multiple uses.
For example:
* writing context conditions;
The CoCos reduce a set of models to those
that adhere to the typing rules of the language.
An example would be a CoCo that checks
that the condition of an if-statement is a Boolean expression.
* printing dependent on the types;
As an example, the model contains the expression f(1),
with f being a variable of function type int -> int,
and the expression is to be printed as a Java expression.
In Java, functions are not first-class citizens.
An option is to use Java's functional interfaces
and print f.apply(1).
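As a rough illustration of the first use case, the following sketch shows the shape of a context condition that rejects non-Boolean if-conditions. It is deliberately independent of MontiCore: ToyIfStatement and ToyIfConditionCoCo are hypothetical, the condition is kept as a string, and the type query is passed in as a function that stands in for a call to the type check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical AST node of an if-statement; the condition is kept as text for brevity.
final class ToyIfStatement {
  final String condition;
  ToyIfStatement(String condition) { this.condition = condition; }
}

// Hypothetical context condition: the condition of an if-statement must be Boolean.
final class ToyIfConditionCoCo {
  // stands in for querying the type system (assumption of this sketch)
  private final Function<String, String> typeOf;
  final List<String> findings = new ArrayList<>();

  ToyIfConditionCoCo(Function<String, String> typeOf) {
    this.typeOf = typeOf;
  }

  void check(ToyIfStatement node) {
    String type = typeOf.apply(node.condition);
    if (!"boolean".equals(type)) {
      findings.add("condition '" + node.condition
          + "' has type " + type + ", expected boolean");
    }
  }

  public static void main(String[] args) {
    // toy typing rule: only "x > 0" is Boolean, everything else is int
    ToyIfConditionCoCo coco =
        new ToyIfConditionCoCo(cond -> cond.equals("x > 0") ? "boolean" : "int");
    coco.check(new ToyIfStatement("x > 0"));
    coco.check(new ToyIfStatement("x + 1"));
    System.out.println(coco.findings.size()); // prints "1"
  }
}
```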
To these ends, MontiCore's type system implementations offer the following functionalities:
- Given an expression, the type of the expression is deduced
(e.g., given expression 2+2, a SymTypeExpression for int is created)
- Given a type, the SymTypeExpression of this type is constructed
(e.g., given MCType int in the model, a corresponding SymTypeExpression is created)
- Given one or more types, a relation is checked
(e.g., whether an expression of type int can be assigned to a variable of type double)
In the first two cases,
SymTypeExpressions are assigned to ASTNodes by the use of TypeVisitors.
In the third case,
the SymTypeRelations class is queried using SymTypeExpressions.
This also determines how a specific type system implementation is selected:
one selects a set of TypeVisitors
and the implementations of the type relations to use.
This is described in detail further below.
How to initialize the Type Visitors?
Types can be calculated for ASTNodes
representing either expressions (2+2)
or types (String).
This functionality is offered by the class TypeCheck3,
which uses a static delegate pattern.
This static delegate needs to be initialized.
The default (and currently only) implementation is MapBasedTypeCheck3.
First, a Type4Ast map has to be constructed to store the typing information,
so that types are not recalculated when they are queried again, e.g., by different CoCos.
After creating the map,
a traverser is created with the TypeVisitors of the language components.
The TypeVisitors are given the Type4Ast instance.
Note: Multiple type visitors, which contain different typing rules,
may be available for a given sub-grammar;
the visitor to select is to be specified by the language.
In the end, a MapBasedTypeCheck3 has to be created and set as the delegate of TypeCheck3.
Example:
// traverser of your language
// no inheritance traverser is used, as it is recommended
// to create a new traverser for each language.
MyLangTraverser traverser = MyLang.traverser();
// map to store the results
Type4Ast type4Ast = new Type4Ast();
// one of many type visitors
// check their documentation, whether further configuration is required
BitExpressionsTypeVisitor visBitExpressions = new BitExpressionsTypeVisitor();
visBitExpressions.setType4Ast(type4Ast);
traverser.add4BitExpressions(visBitExpressions);
// create the TypeCheck3 delegate
new MapBasedTypeCheck3(traverser, type4Ast)
.setThisAsDelegate();
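The idea of storing results per AST node to avoid recalculation can be sketched without MontiCore as follows. ToyTypeCache is a hypothetical class; keying by object identity and representing types as strings are assumptions of this sketch, not statements about how Type4Ast is implemented.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical node-to-type store: calculates a type once and reuses it afterwards.
final class ToyTypeCache<N> {
  // keyed by object identity, since two distinct nodes may look alike
  private final Map<N, String> types = new IdentityHashMap<>();
  private final Function<N, String> calculator;
  // counts how often the (potentially expensive) calculation actually ran
  int calculations = 0;

  ToyTypeCache(Function<N, String> calculator) {
    this.calculator = calculator;
  }

  String typeOf(N node) {
    String cached = types.get(node);
    if (cached == null) {
      cached = calculator.apply(node);
      calculations++;
      types.put(node, cached);
    }
    return cached;
  }

  public static void main(String[] args) {
    ToyTypeCache<Object> cache = new ToyTypeCache<>(node -> "int");
    Object node = new Object();
    cache.typeOf(node); // calculated
    cache.typeOf(node); // served from the map, e.g., for a second CoCo
    System.out.println(cache.calculations); // prints "1"
  }
}
```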
How to select the Type Relations?
The traverser handles the part of the type calculations that is AST-dependent.
Rules such as "Student[] is a subtype of Person[]" are not AST-dependent.
The non-AST-dependent rules of the type system are available through static methods;
to modify their behavior, the corresponding static delegate has to be replaced.
Each of the following classes contains typing rules that are AST-independent,
and their behavior should be modified if required (see also the list of given infrastructure above):
- SymTypeRelations,
- MCCollectionSymTypeRelations,
- WithinScopeBasicSymbolsResolver,
- WithinTypeBasicSymbolsResolver,
- LValueRelations (this uses the AST, but similarly has the same static delegate pattern)
The following are further classes that are unlikely to require modification for a given language.
They still use the same static delegate pattern:
* FunctionRelations,
* SIUnitTypeRelations,
* TypeContextCalculator,
* TypeVisitorLifting,
* TypeVisitorOperatorCalculator.
The implementation to use can be selected by using the init() method
of the corresponding class.
E.g., by default, TypeCheck3 will ignore any OOSymbol-specific rules while resolving
types, variables, and functions.
As such, a private method can be accessed from outside the class.
To only allow access to public methods from outside the class,
the corresponding resolvers have to be selected:
OOWithinTypeBasicSymbolsResolver.init() will select
OOWithinTypeBasicSymbolsResolver to be used instead of the default
WithinTypeBasicSymbolsResolver,
thus changing the TypeCheck to one that checks for access modifiers.
These static delegates should be initialized once with the Mill.
It is expected that (nearly) each language only ever uses one set of typing rules.
Any modification during the usage of the Mill should be avoided as much as possible.
An example initialization of static delegates:
// use OO rules to access types, fields, etc.
OOWithinTypeBasicSymbolsResolver.init();
OOWithinScopeBasicSymbolsResolver.init();
// CommonExpressions are used and no further LValues exist in the language.
CommonExpressionsLValueRelations.init();
// All other delegates will use their default implementation.
// Explicitly initializing all other delegates to their default
// will avoid issues if multiple languages are used in the same Java process.
The delegates can be reset(),
which removes the currently selected implementation.
This can be integrated into the Mill's reset() method
to further avoid issues with multiple languages in the same Java process.
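For readers unfamiliar with the static delegate pattern, the following self-contained sketch shows its general mechanics with init() and reset(). It is not the MontiCore implementation: ToyMemberResolver and ToyOOMemberResolver are hypothetical classes, the access rule is reduced to a single check, and falling back to the default implementation when no delegate is set is an assumption of this sketch.

```java
// Hypothetical resolver with static entry points that forward to a replaceable delegate.
class ToyMemberResolver {
  private static ToyMemberResolver delegate;

  // selects the default implementation
  static void init() {
    setDelegate(new ToyMemberResolver());
  }

  // removes the currently selected implementation
  static void reset() {
    delegate = null;
  }

  protected static void setDelegate(ToyMemberResolver newDelegate) {
    delegate = newDelegate;
  }

  // static entry point used by client code, e.g., CoCos
  static boolean canAccess(String modifier, boolean fromInsideType) {
    if (delegate == null) {
      init(); // assumption of this sketch: fall back to the default
    }
    return delegate.calculateCanAccess(modifier, fromInsideType);
  }

  // default rule: access modifiers are ignored
  protected boolean calculateCanAccess(String modifier, boolean fromInsideType) {
    return true;
  }
}

// Hypothetical OO variant: private members are only accessible from inside the type.
class ToyOOMemberResolver extends ToyMemberResolver {
  static void init() {
    setDelegate(new ToyOOMemberResolver());
  }

  @Override
  protected boolean calculateCanAccess(String modifier, boolean fromInsideType) {
    return fromInsideType || !"private".equals(modifier);
  }

  public static void main(String[] args) {
    ToyMemberResolver.reset();
    System.out.println(ToyMemberResolver.canAccess("private", false)); // prints "true"
    ToyOOMemberResolver.init();
    System.out.println(ToyMemberResolver.canAccess("private", false)); // prints "false"
  }
}
```

Client code keeps calling the static method of the base class; only the initialization decides which rules are applied.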
Full Instantiation
An example of instantiating a traverser can be found
here.
It is recommended to initialize the TypeCheck3 directly after the Mill.
Alternatively, the Mill's init() and reset() methods can be overridden
to add the type check's (de-)initialization to the Mill's (de-)initialization.
After initializing the TypeCheck3 delegates,
TypeCheck3 can be used to query SymTypeExpressions of expressions
via TypeCheck3.typeOf(expr),
as well as of MCTypes
via TypeCheck3.symTypeFromAST(mcType).
Note: If the language supports generics, additional steps have to be taken.
How to check relations on types?
To check relations of SymTypeExpressions,
the SymTypeExpressions are passed to the corresponding method
of SymTypeRelations or one of its subclasses.
A non-exhaustive list of relation methods:
* boolean isCompatible(SymTypeExpression assignee, SymTypeExpression assigner)
(whether an assignment is allowed in the type system)
* boolean isSubTypeOf(SymTypeExpression subType, SymTypeExpression superType)
(whether one type is considered a subtype of the other)
* SymTypeExpression normalize(SymTypeExpression type)
(converts a SymTypeExpression into normal form)
* boolean isInt(SymTypeExpression type)
(whether the type is an int or boxed version thereof)
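To give an intuition for what a compatibility relation decides, here is a minimal sketch restricted to a few primitive types. ToyPrimitiveRelations is a hypothetical class working on type names as strings; it only mirrors the argument order of isCompatible (assignee first, assigner second) and follows Java's widening order for int, long, float, and double. It makes no claim about the rules actually implemented in SymTypeRelations.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical compatibility check over a handful of primitive type names.
final class ToyPrimitiveRelations {
  // ordered from narrowest to widest, following Java's widening primitive conversions
  private static final List<String> NUMERIC =
      Arrays.asList("int", "long", "float", "double");

  // whether a value of type 'assigner' may be assigned to a variable of type 'assignee'
  static boolean isCompatible(String assignee, String assigner) {
    if (assignee.equals(assigner)) {
      return true;
    }
    int target = NUMERIC.indexOf(assignee);
    int source = NUMERIC.indexOf(assigner);
    // both must be numeric, and the source must not be wider than the target
    return target >= 0 && source >= 0 && source <= target;
  }

  public static void main(String[] args) {
    System.out.println(isCompatible("double", "int")); // prints "true"
    System.out.println(isCompatible("int", "double")); // prints "false"
  }
}
```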
It is strongly recommended to become familiar
not just with the functionality offered by the SymTypeRelations class,
but also with that of its subclasses,
as they can offer further functionality, e.g.,
boolean isList(SymTypeExpression type)
(whether the type is considered a list).
As different languages have different typing rules, the corresponding set of rules has to be selected. While this is partially done by selecting the TypeVisitors, relations between types are unrelated to the TypeVisitors and have to be initialized accordingly.
As an example, the default type relations are initialized using
SymTypeRelations.init().
The default typing relations are initialized by default,
and calling, e.g., OCLSymTypeRelations.init() of the OCL language project
changes the relations according to the rules of the OCL language.
According to the default, Java-inspired type relations,
List<Student> is not a subtype of List<Person>;
after initializing the OCL type relations,
List<Student> is calculated to be a subtype of List<Person>.
Initializing the OCL type relations does not just allow
usage of OCLSymTypeRelations,
but also changes the behavior of the methods in SymTypeRelations.
This is done to allow reuse of CoCos between languages.
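The difference between the two rule sets for List<Student> and List<Person> comes down to how type arguments are compared. The following sketch contrasts an invariant and a covariant check; ToyGenericType and ToyVariance are hypothetical, the subtype hierarchy is a hard-coded map, and only a single type argument is supported. It illustrates the two policies, not the implementation of SymTypeRelations or OCLSymTypeRelations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generic type usage with exactly one type argument, e.g., List<Student>.
final class ToyGenericType {
  final String name;
  final String argument;
  ToyGenericType(String name, String argument) {
    this.name = name;
    this.argument = argument;
  }
}

final class ToyVariance {
  // direct nominal supertypes (toy data): Student extends Person
  private static final Map<String, String> SUPER_TYPE = new HashMap<>();
  static {
    SUPER_TYPE.put("Student", "Person");
  }

  // walks up the toy hierarchy starting at 'sub'
  static boolean isNominalSubType(String sub, String sup) {
    for (String current = sub; current != null; current = SUPER_TYPE.get(current)) {
      if (current.equals(sup)) {
        return true;
      }
    }
    return false;
  }

  // Java-inspired policy: type arguments have to be identical
  static boolean isSubTypeInvariant(ToyGenericType sub, ToyGenericType sup) {
    return sub.name.equals(sup.name) && sub.argument.equals(sup.argument);
  }

  // covariant policy: type arguments may be subtypes
  static boolean isSubTypeCovariant(ToyGenericType sub, ToyGenericType sup) {
    return sub.name.equals(sup.name) && isNominalSubType(sub.argument, sup.argument);
  }

  public static void main(String[] args) {
    ToyGenericType students = new ToyGenericType("List", "Student");
    ToyGenericType persons = new ToyGenericType("List", "Person");
    System.out.println(isSubTypeInvariant(students, persons)); // prints "false"
    System.out.println(isSubTypeCovariant(students, persons)); // prints "true"
  }
}
```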