MontiCore Best Practices - Symbols, Scopes, Symboltables
MontiCore provides a number of options to design languages, access and modify the abstract syntax tree, and produce output files.
The newest MontiCore release gives powerful capabilities to define and use symbols. Symbols, scopes, and symboltables are somewhat complex to design, but powerful in their use.
Designing Symbols, Scopes and SymbolTables
How to define a Symbol Usage without a given Symbol Definition
grammar E {
A = Name@S;
symbol S = Name;
}
- If you want to use a special form of symbol that shall neither be defined inside the grammar of a language, nor shall it be imported.
- We can define symbols of kind
S
in the grammar in a grammar rule that is never reached by the parser from the start production. Through this, MontiCore generates: - symbol table infrastructure for handling
S
symbols - symbol table infrastructure for resolving these in
E
scopes, and - integration of
S
symbols with the AST ofA
. - However,
S
symbols are not automatically instantiated. This has to be described manually, e.g., by extending the symbol table creator or via providing an adapter translating a foreign symbol into anS
symbol. - This can be used, e.g., in these scenarios:
- A name of a certain kind is introduced automatically the first time it occurs in a model. If it occurs more than once, all other occurrences of the name do not introduce new symbols. (e.g., this happens with features in FDs, and works because features do not have a body.)
- A name in a language
E
refers to a named element of another language, but the language shall be decoupled fromE
. Therefore,E
introduces a symbolS
and an adapter maps other, foreign symbols toS
symbols. - Defined by: AB, BR
Symbol Definition prepared for Reuse
grammar E {
symbol Bla = "bla" Name AnotherNT;
}
ASTBla
and (c) a symbol BlaSymbol
.
* Reuse of the symbol BlaSymbol
currently only works together with a reuse
of the syntax too, i.e.
grammar F extends E {
Blubb extends Bla = "blubb" Name;
}
AnotherNT
to be included in Blubb
as well.
* To allow individual reuse of symbol BlaSymbol
we recommend to
restructure its definition into an interface that does not preclude
create syntax and only a minimal constraint on the abstract syntax:
grammar E {
symbol interface Bla = Name;
Bla2 implements Bla = "bla" Name AnotherNT;
}
grammar F extends E {
Blubb implements Bla = "blubb" Name;
}
- Please note that MontiCore allows that a nonterminal implements
multiple interfaces. However, only one of them may carry the
symbol
keyboard property, because the newly defined symbol then is also a subclass of the inherited symbol (in Java).
Loading (DeSerializing) Symbols of Unknown Symbol Kinds
Specific languages (e.g., CD
) may provide specific symbols, of specific kinds.
A symbol import of these symbols into another language L1
has to cope with
potentially unknown kinds of symbols, even though the super kind could be known.
E.g., TypeSymbol
is extended by CDTypeSymbol
providing e.g., additional
visibility information.
Upon loading an CD
-symboltable into an L1
-tool
it may be that neither AST-class CDTypeSymbol
nor superclass information about
it is available.
But, the symbols of the unknown kind should (and can) be loaded as symbols of a more abstract kind.
Loading the symbols of the unknown kind as symbols of the specific known kind is possible in multiple ways.
Options would be
1. adapt the L1
-tool to know about the new symbols, or
2. the L1
-tool has been written in such a way that new classes can be added
through appropriate class loading, or
3. the L1
-tool is configurable in handling unknown symbol kinds as explained below.
Loading Symbols as Symbols of Another Kind
Symbols of an unknown source kind (e.g., CDTypeSymbol
) may easily be loaded as
symbols of a known kind (e.g., TypeSymbol
) when the source kind provides
all mandatory attributes (i.e. those without defaults) of the symbol class.
This is especially the case if the source kind is a subclass of the known
kind.
This behavior can be configured in the global scope by calling the method putSymbolDeser(String, ISymbolDeser)
,
where the unknown source kind is encoded as string (here: CDTypeSymbol
) and is mapped to
an appropriate DeSer (here for TypeSymbol
).
For instance the call would be
putSymbolDeSer("de.monticore.cdbasis._symboltable.CDTypeSymbol", new TypeSymbolDeSer())
.
Because the global scope is a singleton, this configuration can be e.g., called in or shortly
after constructing the global scope. However, this would still encode the name of the unknown
symbol kind in the L1
-tool, although it prevents any actual dependency to the imported tools.
The method can also be called from a CLI to dynamically configure the deserialization,
e.g., the information be fed to the L1
-tool via parameters, e.g., like
java L2Tool --typeSymbol=de.monticore.cdbasis._symboltable.CDTypeSymbol
--functionSymbol=de.monticore.cdbasis._symboltable.CDMethodSymbol
Converting Stored Symbol Tables
If the unknown symbol kinds do have different attributes or some extra information
needs to be calculated in the new symbols, then either the L1
-tool needs to be adapted or
the serialized symbol table can be transformed to another
serialized symbol table where the kind information is transformed as required as an
intermediate step between the tools providing and reading the symbol tables.