Getting Started with MontiCore

This page describes the technical installation and usage of MontiCore for language developers. This page further inspects a simple example grammar and the Java classes and other artifacts generated from this grammar. After installing MontiCore as described on this page, it can be used to develop new modeling languages and generators as described in subsequent chapters.

MontiCore provides a command line interface (CLI) tool and can easily be used with Gradle. The Gradle integration enables developers to easily employ MontiCore in commonly used integrated development environments (IDEs), such as Eclipse and IntelliJ IDEA. This page contains information about an example MontiCore project and the files generated by MontiCore. It also briefly explains some key features of MontiCore.

Detailed information about all configuration options that can be used in the MontiCore CLI tool and in MontiCore Gradle projects is given in Chapter 16 of the handbook. More information about the example Automata language is available in Chapter 21 of the handbook.

Prerequisites: Installing the Java Development Kit

We start with the JDK: Please perform the following steps to install the Java Development Kit (JDK) and validate that the installation was successful:

Install a JDK with at least version 11 provided by Oracle or OpenJDK.
Make sure the environment variable JAVA_HOME points to the installed JDK, and not to the JRE, e.g., the following would be good:
- /usr/lib/jvm/java-11-openjdk on UNIX or
- C:\Program Files\Java\jdk-11.* on Windows. You will need this in order to run the Java compiler for compiling the generated Java source files.
Also make sure that the system variable PATH is set such that the Java compiler can be used from any directory. JDK installations on UNIX systems usually do this automatically. On Windows systems, the bin directory of the JDK installation needs to be appended to the PATH variable, e.g. %PATH%;%JAVA_HOME%\bin.
Test whether the setup was successful. Open a command line shell in any directory. Execute the command javac -version. If this command is recognized and the shell displays the version of the installed JDK (e.g., javac 11.0.5), then the setup was successful.
(Optional) Install Gradle version 7.6.

Now we have the prerequisites to run MontiCore from the command line (CLI) or alternatively using Gradle.

Use the MontiCore Command Line Interface

This section describes the following first steps for using MontiCore, either as a CLI tool or with Gradle:

Installation of the MontiCore distribution file.
Grammar inspection
Running the MontiCore generator
Compiling the product
Running the product, i.e. the Automata tool with an example model example/PingPong.aut.

Installation

To install MontiCore for either CLI or Gradle usage, perform the steps of the matching variant below. The CLI variant is described first, followed by the Gradle variant.

For the CLI, download the example Automata MontiCore project:

// MontiCore zip distribution source
https://www.monticore.de/download/monticore.tar.gz

Unzip the archive. The unzipped files include a directory called mc-workspace containing the executable MontiCore tool monticore.jar along with a directory src containing handwritten Automata DSL infrastructure, a directory hwc containing handwritten code that is incorporated into the generated code, and a directory example containing an example model of the Automata DSL.

// MontiCore zip distribution content in directory mc-workspace
Automata.mc4
monticore.jar
monticore-rt.jar
src/automata/AutomataTool.java
src/automata/visitors/CountStates.java
src/automata/prettyprint/PrettyPrinter.java
src/automata/cocos/AtLeastOneInitialAndFinalState.java
src/automata/cocos/StateNameStartsWithCapitalLetter.java
src/automata/cocos/TransitionSourceExists.java
hwc/automata/_ast/ASTState.java
hwc/automata/_symboltable/AutomatonSymbol.java
hwc/automata/_symboltable/AutomataSymbols2Json.java
hwc/automata/_symboltable/AutomatonSymbolDeSer.java
hwc/automata/_symboltable/AutomataGlobalScope.java
example/PingPong.aut

Download the example Automata MontiCore project for Gradle:

// MontiCore-Automaton zip distribution source
http://www.monticore.de/download/Automaton.zip

Unzip the archive. The unzipped files include a directory called automaton-master containing the Gradle build script along with the directories src/main/grammars containing the Automata.mc4 grammar, src/main/java containing both handwritten Automata DSL infrastructure and handwritten code that is incorporated into the generated code, and src/test/resources containing an example model of the Automata DSL.

// Automaton zip distribution content
build.gradle
gradle.properties
settings.gradle
src/main/grammars/Automata.mc4
src/main/java/automata/AutomataTool.java
src/main/java/automata/visitors/CountStates.java
src/main/java/automata/cocos/AtLeastOneInitialAndFinalState.java
src/main/java/automata/cocos/StateNameStartsWithCapitalLetter.java
src/main/java/automata/cocos/TransitionSourceExists.java
src/main/java/automata/_ast/ASTState.java
src/main/java/automata/_symboltable/AutomatonSymbol.java
src/main/java/automata/_symboltable/AutomatonSymbolDeSer.java
src/main/java/automata/_symboltable/AutomataGlobalScope.java
src/test/resources/automata/parser/PingPong.aut

Note: Where file locations are mentioned, this document specifies them based on the CLI project setup. With Gradle, the CLI directories src and hwc are both represented by the same directory src/main/java.

The CLI and Gradle setups both work on the same project, albeit with a different directory structure.

Inspect the Example Grammar

MontiCore is a language workbench. It supports developers in developing modular modelling languages. The core of MontiCore is its grammar modelling language (cf. Chapter 4 of the MontiCore handbook), which is used by developers for modelling context-free grammars. A MontiCore grammar defines (parts of) the abstract and concrete syntax of a language. Each grammar contains nonterminals, production rules, and may extend other grammars. At most one rule is marked as the start rule.

It is a key feature of MontiCore that it allows a grammar to reuse and extend other grammars. In an extension, all the nonterminals defined in the extended grammars can be reused or even overridden. This form of extension achieves several effects:

Language (i.e. grammar) components can be reused and integrated into larger languages composed of several components.
Individual nonterminals can be reused (like classes) from a library.
A given language can be extended by adding further alternatives inside the language.

Component grammars and grammar extensions are discussed in detail in Chapter 4 of the MontiCore handbook.

```
grammar Automata extends de.monticore.MCBasics {

  symbol scope Automaton =
    "automaton" Name "{" (State | Transition)* "}" ;

  symbol State =
    "state" Name
    (("<<" ["initial"] ">>" ) | ("<<" ["final"] ">>" ))*
    ( ("{" (State | Transition)* "}") | ";") ;

  Transition =
    from:Name "-" input:Name ">" to:Name ";" ;
}
```

Listing 2.3: The Automata grammar.

In the following, we inspect the MontiCore grammar of the Automata language. Navigate your file explorer to the unzipped mc-workspace directory. The directory contains the file Automata.mc4. This file contains the MontiCore grammar depicted in Listing 2.3. MontiCore grammars end with .mc4.

The definition of a MontiCore grammar starts with the keyword grammar, followed by the grammar's name. In this example, the grammar is called Automata. The grammar's name is optionally followed by the keyword extends and a list of grammars that are extended by the grammar. In this example, the Automata grammar extends the grammar de.monticore.MCBasics.

Tip 2.4 MontiCore Key Feature: Composition

The MontiCore language workbench makes it possible to compose language components by composing grammars and also to reuse all infrastructure, such as context conditions, symbol table infrastructure, generator parts and handwritten extensions.

In the example the Automata grammar extends the grammar de.monticore.MCBasics and thus reuses its functionality.

MontiCore comes with an extensive library of predefined language components.

Grammars can also have a package and import other grammars. If a grammar has a package, then the package declaration must be the first statement in the grammar and is of the form package QualifiedName where package is a keyword and QualifiedName is an arbitrary qualified name (e.g. de.monticore). The optional grammar imports follow the package definition. Every import is of the form import QualifiedName. The Automata example grammar file contains neither a package declaration nor imports. The grammar extended by the Automata grammar is specified by its fully qualified name.

As usual in context-free grammars, production rules have a left-hand side and a right-hand side. The left-hand side contains the possibly annotated name of a nonterminal. The left-hand side is followed by the terminal = and the right-hand side. Nonterminal names start with an upper-case letter. For instance, the Automata grammar contains the nonterminals Automaton, State, and Transition. A single nonterminal can be provided with the start keyword. Then, the nonterminal is the starting symbol of the grammar. If no nonterminal is marked with start, then the first nonterminal of the grammar becomes the starting symbol by default. In the Automata grammar, the Automaton nonterminal is the starting symbol.

The other possible keywords for nonterminals influence the generated classes for the abstract syntax tree as well as the generated symbol table infrastructure. Details can be found in Chapter 4 and Chapter 9 of the MontiCore handbook. For example, the Automaton nonterminal is marked with symbol and scope. The keyword symbol makes the MontiCore generator generate a symbol class for the nonterminal. Intuitively stated, the keyword scope instructs the MontiCore generator to construct a symbol table infrastructure that opens a scope when the production is processed. The following sections explain the effects of specifying the Automaton nonterminal with the keywords symbol and scope in more detail. Terminals are surrounded by quotation marks. The Automata grammar, for example, contains the terminals automaton, state, {, }, and ;, among others.

The right-hand sides of grammar productions consist of nonterminals, terminals, and semantic predicates, may use cardinalities (*, +, ?), and introduce alternatives via the terminal | as known from regular expressions. Details can be found in Chapter 4 of the MontiCore handbook. The right-hand side of the production defining the nonterminal Automaton, for example, uses the terminal automaton and the nonterminals Name, State, and Transition. The nonterminal Name is not defined in the grammar Automata. Thus, it must be defined in one of the extended grammars. In this case, Name is defined in the grammar MCBasics and is reused by the grammar Automata. For distinguishing different usages of nonterminals on right-hand sides, they can be named. For example, the right-hand side of the production defining the nonterminal Transition uses the Name nonterminal three times. The usages are named from, input, and to. MontiCore also supports interface and external nonterminals for introducing extension points, as described in detail in Chapter 4 of the MontiCore handbook. However, the example grammar does not use these concepts.

```
automaton PingPong {
  state NoGame <<initial>>;
  state Ping;
  state Pong <<final>>;

  NoGame - startGame > Ping;

  Ping - stopGame > NoGame;
  Pong - stopGame > NoGame;

  Ping - returnBall > Pong;
  Pong - returnBall > Ping;
}
```

Listing 2.5: A model conforming to the Automata grammar.

Listing 2.5 depicts an example model conforming to the Automata grammar in its concrete syntax. It depicts a simple game of Ping Pong. The automaton consists of three states: the initial state NoGame as well as the states Ping and Pong, which identify the side on which the ball is located. Initially, the automaton starts in the state NoGame. The game starts at the corresponding event startGame. During a run, the automaton switches states by returning the ball from one side to the other. Additionally, the game can be stopped at each stage, which returns the automaton to its initial configuration. You can find the model in the file PingPong.aut contained in the example directory of the unzipped mc-workspace directory.
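
To make the behavior described by Listing 2.5 concrete, the following plain-Java sketch replays the transitions of the PingPong model. It is written by hand for illustration only and does not use any code generated by MontiCore. The class name PingPongSketch, its methods, and the notion of acceptance (a run ends in a state marked final) are assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hand-written stand-in for illustration; MontiCore does not generate this class.
final class PingPongSketch {
  // Transitions of Listing 2.5, keyed by "source-input".
  private static final Map<String, String> TRANSITIONS = new HashMap<>();
  private static final String INITIAL = "NoGame";
  private static final Set<String> FINAL = Set.of("Pong");

  static {
    TRANSITIONS.put("NoGame-startGame", "Ping");
    TRANSITIONS.put("Ping-stopGame", "NoGame");
    TRANSITIONS.put("Pong-stopGame", "NoGame");
    TRANSITIONS.put("Ping-returnBall", "Pong");
    TRANSITIONS.put("Pong-returnBall", "Ping");
  }

  /** Returns the state reached from the initial state, or null if an input has no transition. */
  static String run(List<String> inputs) {
    String state = INITIAL;
    for (String input : inputs) {
      state = TRANSITIONS.get(state + "-" + input);
      if (state == null) {
        return null;
      }
    }
    return state;
  }

  /** Assumption of this sketch: a run is accepted if it ends in a state marked final. */
  static boolean accepts(List<String> inputs) {
    String end = run(inputs);
    return end != null && FINAL.contains(end);
  }

  public static void main(String[] args) {
    System.out.println(run(List.of("startGame", "returnBall")));
    System.out.println(accepts(List.of("startGame", "returnBall")));
    System.out.println(run(List.of("startGame", "stopGame")));
  }
}
```

Encoding the transitions as a map keeps the sketch close to the textual model: each line of the form source - input > target becomes one map entry.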

Run MontiCore

The MontiCore generator takes a MontiCore grammar as input and generates an infrastructure for processing models conforming to the grammar. When a grammar E extends another grammar G, then all the infrastructure generated for the grammar G is reused and only the extending part from E is generated.

Tip 2.6 Infrastructure Generated by MontiCore

MontiCore itself as well as the infrastructure generated by the MontiCore generator are implemented in Java. This infrastructure includes:

a parser for parsing models conforming to the grammar and transforming textual models into abstract syntax tree instances abstracting from the concrete syntax.
a symbol table infrastructure to handle the symbols introduced or used by models conforming to the grammar. The symbol table infrastructure is used for resolving dependencies between model elements that are possibly defined in different files.
a context-condition checking framework for checking well-formedness rules that cannot be captured by context-free languages.
a visitor infrastructure for traversing models or, more precisely, their abstract syntax instances. The abstract syntax of a model consists of its internal representation as an abstract syntax tree abstracting from the concrete syntax of the model (the instance of the data structure obtained from parsing) and the symbol table of the model.
a mill infrastructure for retrieving objects for language processing, such as parsers, builders for abstract syntax trees, visitors and objects for the symbol tables of the language. A mill serves as a dynamic factory, adapting to the current modeling language. The possibility to configure the mills is crucial for reusing the functionality implemented for a sublanguage (cf. Section 5.9, Section 5.10.2, and Section 11.5 for details).
a code generating framework that extends the FreeMarker template engine by various modularity-enhancing features.

To execute MontiCore with the Automata grammar as input, perform the following steps. The CLI variant is described first, followed by the Gradle variant.

For the CLI, open a command line shell and change the working directory to the unzipped directory (mc-workspace).
Execute the following command in order to generate the language infrastructure of the Automata DSL:
```
java -jar monticore.jar -g Automata.mc4 -hcp hwc/ -mp monticore-rt.jar
```
The only required argument Automata.mc4 denotes the input grammar that shall be processed by MontiCore. The processing includes the generation of the language infrastructure. The option -hcp specifies the path to a directory containing the handwritten code that is to be incorporated into the generated infrastructure. In this case, passing the argument hwc/ to the option -hcp makes MontiCore consider the handwritten code located in the directory hwc/. Providing handwritten code makes it easy to incorporate additional functionality into the generated code. For example, this enables developers to extend generated abstract syntax classes as described in detail in Section 5.10 of the MontiCore handbook. The option -mp specifies the paths to directories or archives containing grammars and Java classes that are imported by the processed grammar and the related tooling. In this case, the archive monticore-rt.jar contains the grammars and handwritten extensions of the MontiCore standard library. More information about the standard library can be found in Chapters 17-20 of the handbook.

With Gradle, open a command line shell and change the working directory to the unzipped directory (automaton-master).
Execute the following command in order to generate the language infrastructure of the Automata DSL:
```
gradle generateMCGrammars
```
In case this command fails, ensure that you are using the correct versions of Java and Gradle by running the commands java --version and gradle --version. The correct versions are listed in the prerequisites section of this page. By default, the MontiCore Gradle plugin processes all input grammars in src/main/grammars. The processing includes the generation of the language infrastructure. Handwritten code in src/main/java is incorporated into the generated infrastructure. Providing handwritten code makes it easy to incorporate additional functionality into the generated code. For example, this enables developers to extend generated abstract syntax classes as described in detail in Section 5.10 of the MontiCore handbook.

The MontiCore standard library, containing grammars and Java classes, is imported via the de.monticore:monticore-grammar dependency. More information about the standard library can be found in Chapters 17-20 of the handbook.

A more detailed description is given in the documentation of the MontiCore Gradle plugin.

Executing the command launches MontiCore, which performs the following steps:

The specified grammar is parsed and processed by MontiCore.
Java source files for the corresponding DSL infrastructure are generated into the default output directory out. This infrastructure consists of the directories
- out/automata/ containing the mill (cf. Section 5.9, Section 5.10.2, Section 11.5).
- out/automata/_ast containing the abstract syntax tree data structure (cf. Chapter 5 of the MontiCore handbook).
- out/automata/_auxiliary containing adapted mills of sublanguages, which are required for configuring the mills of sublanguages (cf. Chapter 11 of the MontiCore handbook).
- out/automata/_cocos containing the infrastructure for context conditions (cf. Chapter 10 of the MontiCore handbook).
- out/automata/_od containing the infrastructure for printing object diagrams for reports produced during processing the models.
- out/automata/_parser containing the generated parsers, which are based on ANTLR (cf. Chapter 6 of the MontiCore handbook).
- out/automata/_symboltable containing the infrastructure for the symbol table (cf. Chapter 9 of the MontiCore handbook).
- out/automata/_utils containing infrastructure for typecasting and identifying subtypes of AST nodes, symbols, and scopes.
- out/automata/_visitor containing the infrastructure for visitors (cf. Chapter 8 of the MontiCore handbook).
- out/reports/automata containing reports created during the processing of the grammar.
The output directory also contains a log file of the executed generation process with the generation time in its name.

In the following, we review the classes and interfaces generated from the Automata grammar that are relevant for language engineers in more detail. We do not review the classes and interfaces that are only internally relevant for MontiCore and are usually not intended to be used by language engineers.

Abstract Syntax Tree Data Structure

The tree data structure is generated into the directory out/automata/_ast. Details about the generation of AST classes can be found in Chapter 5 of the MontiCore handbook. For each nonterminal contained in the grammar, the MontiCore generator produces AST and corresponding builder classes. The AST classes implement the abstract syntax tree data structure.

The builder classes implement the builder pattern for constructing instances of the respective AST classes as usual. For example, the class ASTAutomaton is the AST class generated for the Automaton nonterminal (cf. Listing 2.3) and the class ASTAutomatonBuilder is the corresponding generated builder class.

Figure 2.7: Parts of the AST data structure generated for the Automata grammar.

The contents of the AST and builder classes are generated systematically from the grammar. The attributes of each AST class resemble the right-hand side of the corresponding production rule. In the following, we mainly speak of attributes, but please be aware that all attributes come fully equipped with access and modification methods, which should normally be used.

For instance, Figure 2.7 depicts parts of the generated AST infrastructure for the Automata grammar. The class ASTAutomaton contains the attributes name, states, and transitions. The AST class does not contain an attribute for the terminal automaton as it is part of every word conforming to the production of the Automaton nonterminal. The type of the attribute name is String whereas the attributes states and transitions are lists of the types of the AST classes corresponding to the used nonterminals. This is the case because exactly one Name is parsed with the right-hand side of the production of the nonterminal Automaton, whereas multiple states and transitions can be parsed.

The ASTAutomaton class further contains the attributes symbol, spannedScope, and enclosingScope. These attributes are specific to the symbol table of Automata models and are used for linking the symbol table of a model with its abstract syntax tree. Details can be found in Chapter 9 of the MontiCore handbook.

Tip 2.8 Generated Symbols and Scopes in the AST

Each AST class contains access to the enclosingScope.

When a production contains the keyword symbol, the generated AST class contains the attribute symbol (see Chapter 9 of the MontiCore handbook).

Keyword scope indicates that a nonterminal also defines a new local scope, stored in attribute spannedScope.

The parser builds the abstract syntax tree of a model and the available scope genitor creates the symbol table of the model, consisting of symbols and scopes.

The ASTAutomaton class further contains several straight-forward methods for checking different instances for equality and accessing the attributes. Similar to the ASTAutomaton class, the ASTAutomatonBuilder class contains attributes resembling the right-hand side of the corresponding production. It further contains methods for changing the values of the attributes (e.g., addState), checking whether the AST instance that would be constructed from the current builder state is valid (cf. isValid), and for building the AST instance corresponding to the builder's state (cf. build). The contents of the other AST and Builder classes are constructed analogously.
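
The interplay of an AST class and its builder can be illustrated with a strongly simplified, hand-written stand-in. The sketch below is not the generated code: the names AutomatonNode and AutomatonNodeBuilder are hypothetical, states are reduced to plain strings, and the rule used in isValid (a name must be set) is an assumption of this sketch. The generated ASTAutomaton and ASTAutomatonBuilder classes contain many more members.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demo entry point for the hand-written stand-ins below.
final class BuilderSketch {
  public static void main(String[] args) {
    AutomatonNode node = new AutomatonNodeBuilder()
        .setName("PingPong")
        .addState("NoGame")
        .addState("Ping")
        .addState("Pong")
        .build();
    System.out.println(node.getName() + " has " + node.getStates().size() + " states");
  }
}

// Stand-in for an AST class: attributes mirror the right-hand side of the production.
final class AutomatonNode {
  private final String name;
  private final List<String> states;

  AutomatonNode(String name, List<String> states) {
    this.name = name;
    this.states = Collections.unmodifiableList(new ArrayList<>(states));
  }

  String getName() { return name; }

  List<String> getStates() { return states; }
}

// Stand-in for a builder class with methods in the style of addState, isValid, and build.
final class AutomatonNodeBuilder {
  private String name;
  private final List<String> states = new ArrayList<>();

  AutomatonNodeBuilder setName(String name) {
    this.name = name;
    return this;
  }

  AutomatonNodeBuilder addState(String state) {
    states.add(state);
    return this;
  }

  /** Assumption of this sketch: a node can be built once the mandatory name is set. */
  boolean isValid() {
    return name != null;
  }

  AutomatonNode build() {
    if (!isValid()) {
      throw new IllegalStateException("name is required");
    }
    return new AutomatonNode(name, states);
  }
}
```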

Tip 2.9 Handwritten AST Class Extensions

If the generator detects that an AST class for a nonterminal is already implemented in the handwritten code, then it produces a corresponding TOP AST class instead.

This TOP mechanism allows developers to add handwritten extensions to any generated class, while reusing the generated TOP class via extension.

This gives a very close integration between handwritten and generated code that even adapts builders accordingly, while preventing the very bad habit of performing manual changes to the generated code.

Option -hcp tells the generator where to look for handwritten integrations.
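
The structure behind the TOP mechanism can be sketched in plain Java. The classes below are simplified, hand-written stand-ins: the attributes of ASTStateTOP, its modelling as an abstract class, and the method describe are assumptions made for illustration. The actual contents of the generated class and of the handwritten hwc/automata/_ast/ASTState.java are not reproduced here.

```java
// Demo entry point for the stand-ins below.
final class TopMechanismSketch {
  public static void main(String[] args) {
    ASTState state = new ASTState();
    state.setName("NoGame");
    state.setInitial(true);
    System.out.println(state.describe());
  }
}

// Simplified stand-in for the class that the generator would name ASTStateTOP
// because a handwritten ASTState exists.
abstract class ASTStateTOP {
  private String name = "";
  private boolean initial;

  public String getName() { return name; }

  public void setName(String name) { this.name = name; }

  public boolean isInitial() { return initial; }

  public void setInitial(boolean initial) { this.initial = initial; }
}

// Handwritten class: keeps the name ASTState and reuses the generated part via extension.
class ASTState extends ASTStateTOP {
  // Hypothetical handwritten addition that the generated class does not offer.
  public String describe() {
    return "state " + getName() + (isInitial() ? " <<initial>>" : "");
  }
}
```

Because the handwritten class keeps the original name, all other code can continue to refer to ASTState while picking up the handwritten additions.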

The following section presents the methods of the classes for parsing textual models (possibly stored in files) into AST class instances at runtime. For now, it suffices for you to understand that (1) MontiCore generates an extensible AST data structure that resembles the nonterminals and productions of the grammar in a straight-forward way and (2) that all models of a grammar have an AST data structure representation for internal processing.

Parser

The infrastructure is generated into the directory out/automata/_parser. Details about the generated parsers and their uses are described in Chapter 6 of the MontiCore handbook.

Figure 2.10: Parts of the class AutomataParser generated from the Automata grammar.

Parts of the generated class AutomataParser are depicted in Figure 2.10. The class implements the generated parser for the Automata grammar. Usually, developers are solely concerned with the methods parse(String) and parse_String(String). For now, it suffices if you remember that parsing textual Automata models stored in files is possible by calling the method parse(String) of an AutomataParser object with the fully qualified name of the file as input.

Tip 2.11 Methods for Parsing

The class AutomataParser contains the methods

parse(Reader r),
parse(String filename), and
parse_String(String content).

All of the methods return an object of type Optional<ASTAutomaton>. An empty Optional means that parsing failed and that errors have been reported.

For each nonterminal in the grammar, the class further contains methods for parsing a sub-model described by this nonterminal.
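
The calling convention of the parse methods (an Optional result that is empty on failure) can be illustrated without the generated parser. The following hand-written sketch handles only the Transition production and uses a regular expression instead of ANTLR. The class name TransitionParserSketch and the String array result are assumptions of this sketch; the generated AutomataParser returns AST nodes instead.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written stand-in; it only mimics the Optional-based result of the generated parser.
final class TransitionParserSketch {
  // Matches the Transition production: from "-" input ">" to ";"
  private static final Pattern TRANSITION =
      Pattern.compile("\\s*(\\w+)\\s*-\\s*(\\w+)\\s*>\\s*(\\w+)\\s*;\\s*");

  /** Returns {from, input, to}, or an empty Optional if the text is not a transition. */
  static Optional<String[]> parse_String(String content) {
    Matcher matcher = TRANSITION.matcher(content);
    if (!matcher.matches()) {
      // Report the problem and signal failure through absence of a result.
      System.err.println("Could not parse transition: " + content);
      return Optional.empty();
    }
    return Optional.of(new String[] {matcher.group(1), matcher.group(2), matcher.group(3)});
  }

  public static void main(String[] args) {
    Optional<String[]> transition = parse_String("Ping - returnBall > Pong;");
    // Callers check for presence before using the result.
    if (transition.isPresent()) {
      System.out.println(transition.get()[0] + " to " + transition.get()[2]);
    }
    System.out.println(parse_String("Ping returnBall Pong").isPresent());
  }
}
```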

Symbol Table

The infrastructure is generated into the directory out/automata/_symboltable. Details about the generated symbol table infrastructure and its use are described in Chapter 9 of the MontiCore handbook. The symbol table infrastructure is used for resolving cross-references concerning information defined in different model elements that are potentially defined in different models stored in different files.

Figure 2.12: The scope classes generated from the `Automata` grammar.

Tip 2.13 Scope Classes

For the Automata grammar, the generator produces the classes

AutomataScope,
AutomataArtifactScope, and
AutomataGlobalScope

as well as respective interfaces. The relationships between these classes and interfaces are depicted in Figure 2.12.

The singleton AutomataGlobalScope contains all AutomataArtifactScopes of all loaded Automata artifacts. AutomataScopes represent scopes spanned inside of models.

Figure 2.14: Parts of the symbol classes generated from the Automata grammar.

Figure 2.14 depicts parts of the symbol classes generated for the Automata grammar. As the nonterminal State is annotated with symbol in the Automata grammar, the generator produces the class StateSymbol. The StateSymbol class contains, among others, the attributes name, enclosingScope, and spannedScope. The attribute name stores the name of the symbol. The attributes enclosingScope and spannedScope store the enclosing and spanned scopes of the symbol. The class further contains methods for accessing and setting the attributes. For all symbol and scope classes, the MontiCore generator also produces builder classes (e.g., AutomataArtifactScopeBuilder and StateSymbolBuilder).

Tip 2.15 Extending Symbol Classes

It is possible to add further methods and attributes in two ways:

adding a symbol rule in the grammar (described in Chapter 9 of the MontiCore handbook) or
using the TOP mechanism applied to the generated symbols.

The generated class AutomataScopesGenitor is responsible for creating the scope structure of Automata artifacts and linking the scope structure with the corresponding AST nodes. For this task, it provides the method createFromAST that takes an ASTAutomaton instance as input and returns an IAutomataArtifactScope instance. The returned IAutomataArtifactScope instance can be added as a subscope to the AutomataGlobalScope instance, which is unique at runtime and managed by the mill.

Developers can create visitors for complementing the symbol table (creating symbols and filling the extensions introduced via symbol rules or the TOP mechanism) of an Automata artifact. After creating the scope structure, the visitor should be used to traverse the AST instance of the artifact for complementing the symbols and scopes. The following sections explain the generated visitor infrastructure in more detail.

```
Optional<AutomatonSymbol> resolveAutomaton(String name)
List<AutomatonSymbol> resolveAutomatonMany(String name)
Optional<StateSymbol> resolveState(String name)
List<StateSymbol> resolveStateMany(String name)
```

Listing 2.16: Different resolve methods.

For each nonterminal annotated with symbol in the grammar Automata, the scope interfaces contain a symbol-specific resolve method taking a string as input. The method can be called to resolve symbol instances by their names. The name given as input to a resolve method should be as qualified as needed to find the symbol. For instance, Listing 2.16 lists the signatures of four of the resolve methods provided by the interface IAutomataScope.
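
The idea of resolving a symbol by name through nested scopes can be sketched as follows. This is a hand-written simplification and not the generated implementation: the class ScopeSketch is hypothetical, symbols are reduced to strings, and the lookup order (local scope first, then the enclosing scopes) is an assumption of this sketch. Qualified names and symbols defined in other files are not modelled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hand-written stand-in for a scope; generated scopes implement IAutomataScope instead.
final class ScopeSketch {
  private final ScopeSketch enclosingScope; // null for the outermost scope
  private final Map<String, String> stateSymbols = new HashMap<>();

  ScopeSketch(ScopeSketch enclosingScope) {
    this.enclosingScope = enclosingScope;
  }

  void addState(String name, String symbolInfo) {
    stateSymbols.put(name, symbolInfo);
  }

  /** Looks up a state by name in this scope first and then in the enclosing scopes. */
  Optional<String> resolveState(String name) {
    if (stateSymbols.containsKey(name)) {
      return Optional.of(stateSymbols.get(name));
    }
    if (enclosingScope == null) {
      return Optional.empty();
    }
    return enclosingScope.resolveState(name);
  }

  public static void main(String[] args) {
    ScopeSketch automatonScope = new ScopeSketch(null);
    automatonScope.addState("Ping", "state Ping of automaton PingPong");
    ScopeSketch nestedScope = new ScopeSketch(automatonScope);
    System.out.println(nestedScope.resolveState("Ping").isPresent());
    System.out.println(nestedScope.resolveState("Unknown").isPresent());
  }
}
```

As with the generated resolve methods listed above, the result is an Optional, so callers have to handle the case that no symbol with the given name exists.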

For now, it suffices for you to understand that (1) MontiCore generates an extensible symbol table data structure that resembles the scope and symbol structure as specified in the grammar in a straight-forward way and (2) that all models of a grammar have a symbol table data structure representation for internal processing and (3) that symbols can be resolved from scopes via calling the resolve methods.

(De)Serialization of Symbol Tables

MontiCore also supports the serialization and deserialization of symbol tables. The (de)serialization is crucial for incremental code generation and efficient language composition via aggregation. Details about this are explained in Chapter 7 and Chapter 9 of the MontiCore handbook.

For the (de)serialization, the generator produces the class AutomataSymbols2Json. It provides the public methods store and load. The former can be used to serialize IAutomataScope instances into their string representations encoded in JSON and to persist these to a file at a location that is passed as method argument. The latter can be used to load a stored IAutomataScope into its object representation. For now, it suffices that you understand which methods to call for the (de)serialization.

Visitor

Figure 2.17: Parts of the visitor infrastructure generated from the Automata grammar

The infrastructure is generated into the directory out/automata/_visitor. Details about the generated visitor infrastructure are described in Chapter 8 of the MontiCore handbook. For each grammar, the generator systematically produces several classes and interfaces implementing the visitor infrastructure. For the Automata grammar, for example, the generator produces the interfaces AutomataTraverser, AutomataVisitor2, and AutomataHandler and the class AutomataTraverserImplementation. The relationships between these interfaces and classes are depicted in Figure 2.17.

The interfaces Traverser, Visitor2 and Handler together realize the Visitor pattern. Conceptually, the traverser is the entry point for traversing. The traverser manages visitors for the different sublanguages and realizes the default traversing strategy. Whenever an AST node is traversed, the traverser delegates the visit to the corresponding visitor implementation. If a special traversal is to be implemented that differs from the default, it is possible to add handlers to the traverser that realize the alternative traversal. For a more detailed explanation consider reading Chapter 8 of the MontiCore handbook.

Tip 2.18 Visitors

MontiCore provides the visitor pattern in a detangled and thus flexible variant.

AutomataTraverser traverses the AST. Implementations of AutomataVisitor2 contain the actual functionality, added through subclassing. Many visitors can be added to the traverser for parallel execution via the method add4Automata.

The visitors are compositional, which maximizes the reuse of visitors from sublanguages, and they can be adapted through the TOP mechanism.

For example, the handwritten class PrettyPrinter, which can be found in the directory mc-workspace/src/automata/prettyprint, implements functionality for pretty printing an Automata model, which is given by its abstract syntax tree. Listing 2.19 depicts the attributes and the constructor of the class. The PrettyPrinter class implements the AutomataHandler interface. Its constructor instantiates a printer (a helper for printing indented strings) and retrieves an AutomataTraverser object from the mill (which is explained later on). It sets the handler of the traverser to itself. This ensures that the pretty printer becomes the handler of the traverser. We will execute it in a following section.

public class PrettyPrinter implements AutomataHandler {
  private final IndentPrinter printer;
  private AutomataTraverser traverser;

  public PrettyPrinter() {
    this.printer = new IndentPrinter();
    this.traverser = AutomataMill.traverser();
    traverser.setAutomataHandler(this);
  }
  // further methods
}

Listing 2.19: Attributes and constructor of the PrettyPrinter for the Automata language.

For now, you should understand that (1) for implementing visitors it is often sufficient to implement the visitor interfaces and to add them to a traverser and (2) custom traversals can be realized by implementing handlers and adding those to the traverser.
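
The division of labor between traverser, visitor, and handler can be illustrated with a small stand-alone sketch. It does not use the generated MontiCore classes: all types below (SketchAutomaton, SketchState, SketchTraverser, SketchVisitor2, SketchHandler) and the method name add4Sketch are hypothetical stand-ins, and the real generated infrastructure is considerably richer. The sketch only shows that visitors carry the functionality, the traverser owns the default traversal, and a handler can replace that traversal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature AST; the real AST classes are generated by MontiCore.
class SketchState {
  final String name;
  SketchState(String name) { this.name = name; }
  void accept(SketchTraverser t) { t.handle(this); }
}

class SketchAutomaton {
  final List<SketchState> states = new ArrayList<>();
  void accept(SketchTraverser t) { t.handle(this); }
}

// Visitor role: contains the functionality, but no traversal logic.
interface SketchVisitor2 {
  default void visit(SketchAutomaton node) {}
  default void visit(SketchState node) {}
}

// Handler role: may replace the default traversal of a node.
interface SketchHandler {
  void handle(SketchAutomaton node, SketchTraverser traverser);
}

// Traverser role: entry point that manages visitors and the traversal strategy.
class SketchTraverser {
  private final List<SketchVisitor2> visitors = new ArrayList<>();
  private SketchHandler handler; // optional custom traversal

  void add4Sketch(SketchVisitor2 v) { visitors.add(v); }
  void setHandler(SketchHandler h) { handler = h; }

  void handle(SketchAutomaton node) {
    if (handler != null) {
      handler.handle(node, this); // the custom traversal takes over
      return;
    }
    visitors.forEach(v -> v.visit(node)); // delegate the visit to all visitors
    traverse(node);                       // default strategy: descend to children
  }

  void traverse(SketchAutomaton node) {
    for (SketchState s : node.states) {
      s.accept(this);
    }
  }

  void handle(SketchState node) {
    visitors.forEach(v -> v.visit(node));
  }
}

// A visitor that counts states.
class SketchCountStates implements SketchVisitor2 {
  int count = 0;
  @Override public void visit(SketchState node) { count++; }
}

class VisitorSketchDemo {
  static int countStates(boolean skipChildren) {
    SketchAutomaton a = new SketchAutomaton();
    a.states.add(new SketchState("Ping"));
    a.states.add(new SketchState("Pong"));

    SketchCountStates cs = new SketchCountStates();
    SketchTraverser t = new SketchTraverser();
    t.add4Sketch(cs);
    if (skipChildren) {
      // Alternative traversal: a handler that never descends into the states.
      t.setHandler((node, tr) -> { });
    }
    a.accept(t);
    return cs.count;
  }

  public static void main(String[] args) {
    System.out.println(countStates(false)); // prints 2
    System.out.println(countStates(true));  // prints 0
  }
}
```

With the default traversal the counting visitor sees both states; once the handler replaces the traversal, the same visitor sees none, although the visitor itself is unchanged.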

Context Conditions

The infrastructure is generated into the directory out/automata/_cocos. Details about the generated context condition infrastructure are described in Chapter 10 of the MontiCore handbook.

For each nonterminal of a grammar, the generator produces a context condition interface for implementing context conditions for this nonterminal. For the Automata grammar, for example, the generator produced the interface AutomataASTStateCoCo. The interface solely contains the method check(ASTState). Each class implementing the interface should represent a predicate over subtrees of abstract syntax trees starting at a node with the type corresponding to the nonterminal.

The check method should be implemented such that it reports an error or a warning if the input node does not satisfy the predicate. Thus, context conditions implement well-formedness rules that cannot be captured by context-free grammars (or that are intentionally not captured by the grammar to achieve a specific AST data structure). For producing the error or warning, the static methods error and warning of the MontiCore runtime class Log should be used.

For the Automata grammar, the generator also produced the class AutomataCocoChecker. For each nonterminal of the grammar, the class contains a method for adding context condition instances to an AutomataCocoChecker instance. For checking whether an AST node satisfies all registered context conditions, the method checkAll can be called with the AST node as input. Calling the method makes the checker traverse the abstract syntax tree and check whether each node satisfies the context conditions registered for the node. Thus, AutomataCocoChecker instances represent sets of context conditions that are required to be satisfied by abstract syntax tree instances.

For now, you should understand that (1) implementing context conditions is possible via implementing the generated CoCo interfaces and (2) context conditions can be checked via instantiating the Checker class, adding the CoCos, and calling the checkAll method.
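
The interplay of CoCo interfaces and a checker can be sketched without the generated infrastructure. Everything in the following block is a hypothetical stand-in: CocoState and CocoAutomaton replace the generated AST classes, FindingsSketch replaces the MontiCore runtime class Log by simply collecting messages, and CoCoCheckerSketch iterates over the states directly instead of using a traverser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature AST (not the generated classes).
class CocoState {
  final String name;
  CocoState(String name) { this.name = name; }
}

class CocoAutomaton {
  final List<CocoState> states = new ArrayList<>();
}

// One CoCo interface per node type, with a single check method.
interface StateCoCoSketch {
  void check(CocoState node);
}

// Stand-in for a logging facility: it only collects the reported messages.
class FindingsSketch {
  static final List<String> findings = new ArrayList<>();
  static void error(String msg) { findings.add(msg); }
}

// A predicate over state nodes that reports an error if it is violated.
class StateNameCapitalSketch implements StateCoCoSketch {
  @Override
  public void check(CocoState node) {
    if (node.name.isEmpty() || !Character.isUpperCase(node.name.charAt(0))) {
      FindingsSketch.error("State name '" + node.name
          + "' should start with a capital letter.");
    }
  }
}

// The checker represents a set of context conditions.
class CoCoCheckerSketch {
  private final List<StateCoCoSketch> stateCoCos = new ArrayList<>();

  void addCoCo(StateCoCoSketch coco) { stateCoCos.add(coco); }

  // Visit every state and apply every registered context condition to it.
  void checkAll(CocoAutomaton automaton) {
    for (CocoState s : automaton.states) {
      for (StateCoCoSketch c : stateCoCos) {
        c.check(s);
      }
    }
  }
}

class CoCoSketchDemo {
  static int run() {
    FindingsSketch.findings.clear();
    CocoAutomaton a = new CocoAutomaton();
    a.states.add(new CocoState("Ping"));
    a.states.add(new CocoState("pong")); // violates the condition

    CoCoCheckerSketch checker = new CoCoCheckerSketch();
    checker.addCoCo(new StateNameCapitalSketch());
    checker.checkAll(a);
    return FindingsSketch.findings.size();
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints 1
  }
}
```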

Mill as Factory for Builders

The mill for the Automata language is generated into the directory out/automata/. Details about the generated mill and the mill pattern in general are described in Section 11.5. The generated mill class AutomataMill is responsible for providing ready to use and correct parser, scope genitor, scope, and builder instances. The mill of each language is a singleton.

Tip 2.20 Mill Use and Automatic Initialization

A mill is a factory for builders and other commonly used functions, such as parsers or visitors. The mill was introduced to ensure compositionality of languages, while retaining reusability of functions developed for sublanguages.

Only one mill instance exists, even though in composed languages it is available under several static signatures. Let language G2 extend another language G1. Then G2Mill initializes the G1Mill appropriately, such that all the code of the sublanguage G1 can be reused in the tools developed for the language G2, even when creating new AST nodes, symbols, etc.

This initialization happens automatically, so developers do not have to take care of it themselves.

public static IAutomataGlobalScope globalScope()
public static IAutomataArtifactScope artifactScope()
public static IAutomataScope scope()
public static AutomataScopesGenitor scopesGenitor()
public static AutomataScopesGenitorDelegator scopesGenitorDelegator()
public static ASTAutomatonBuilder automatonBuilder()
public static AutomatonSymbolBuilder automatonSymbolBuilder()
public static AutomataParser parser()
public static AutomataTraverser traverser()

Listing 2.21: Some methods of the AutomataMill.

Developers should retrieve all instances of the classes and interfaces provided by the mill by using the mill. Instances of the classes and interfaces that are provided by the mill should never be instantiated manually. Otherwise, it may be the case that not all of the code implemented for the language can be reused as expected in other languages extending the language. Listing 2.21 shows some signatures of the methods of the AutomataMill.

Tip 2.22 Mill Methods

A mill provides public static methods for retrieving the instances of the parsers, scope genitors, scopes, and builders. In this respect, it acts like a factory. Because a mill is realized using the static delegator pattern (cf. Section 11.1), it can still be adapted at will.

This combines the advantage of general availability with the advantage of being able to override the functions.

For now, you should understand that (1) the methods of the mill should be used for creating ready to use and correct parser, scope genitor, scope, and builder instances and (2) how to call these methods.
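
The idea behind the static delegator pattern can be sketched in a few lines. The block below is a strongly simplified illustration and not the generated code: G1MillSketch, G2MillSketch, NodeSketch, ExtendedNodeSketch, and the method names initMe and _node are assumptions made for this example. It shows how public static methods delegate to a replaceable instance, so that initializing the mill of an extending language changes what the mill of the sublanguage produces.

```java
// Hypothetical products of the two languages.
class NodeSketch {
  String describe() { return "G1 node"; }
}

class ExtendedNodeSketch extends NodeSketch {
  @Override String describe() { return "G2 node"; }
}

// Mill of the sublanguage G1.
class G1MillSketch {
  // The single instance that all static methods delegate to.
  protected static G1MillSketch mill;

  static void init() { mill = new G1MillSketch(); }

  // Allows an extending language to install its own delegate.
  static void initMe(G1MillSketch m) { mill = m; }

  // Public static entry point: generally available, like a factory method.
  static NodeSketch node() {
    if (mill == null) {
      init();
    }
    return mill._node();
  }

  // Instance method that can be overridden by mills of extending languages.
  protected NodeSketch _node() { return new NodeSketch(); }
}

// Mill of the language G2, which extends G1.
class G2MillSketch extends G1MillSketch {
  // Initializing G2 also re-initializes the mill of G1 appropriately.
  static void init() { G1MillSketch.initMe(new G2MillSketch()); }

  @Override
  protected NodeSketch _node() { return new ExtendedNodeSketch(); }
}

class MillSketchDemo {
  static String beforeAndAfter() {
    G1MillSketch.init();
    String before = G1MillSketch.node().describe();
    G2MillSketch.init();
    // Code written against G1MillSketch now receives G2 nodes.
    String after = G1MillSketch.node().describe();
    return before + " / " + after;
  }

  public static void main(String[] args) {
    System.out.println(beforeAndAfter()); // prints "G1 node / G2 node"
  }
}
```

Code that instantiates NodeSketch manually would bypass this delegation, which is why instances should be retrieved from the mill.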

Compile the Target

Section 2.2.3 of the MontiCore handbook describes how to generate the desired Java code from a MontiCore grammar. To compile the Java classes generated for the Automata DSL, execute the following command in the mc-workspace:

Using the CLI

With Powershell on Windows

javac -cp monticore-rt.jar -sourcepath "src/;out/;hwc/" `
                                  src/automata/AutomataTool.java

With Bash on Unix

javac -cp monticore-rt.jar -sourcepath "src/:out/:hwc/" \
                                  src/automata/AutomataTool.java

With cmd on Windows

javac -cp monticore-rt.jar -sourcepath "src/;out/;hwc/" ^
                                  src/automata/AutomataTool.java

Please note, on Unix systems paths are separated using ":" (colon) instead of semicolons.

Providing the option -cp with the argument monticore-rt.jar makes the Java compiler consider the compiled MontiCore runtime classes contained in the file monticore-rt.jar.

The option -sourcepath specifies the directories containing the source files that should be considered during the compilation.

In this case, executing the command makes the Java compiler consider all generated classes located in out and all handwritten classes located in src and hwc. The last argument instructs the Java compiler to compile the class src/automata/AutomataTool.java.
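
The platform-dependent path separator mentioned above is also available from within Java as java.io.File.pathSeparator. The following minimal sketch, with the hypothetical class name SourcePathSketch, builds the -sourcepath argument for the current platform. It is not part of the MontiCore tooling and is only meant to illustrate the difference between the two separators.

```java
import java.io.File;

class SourcePathSketch {
  // Joins directories with ";" on Windows and ":" on Unix-like systems.
  static String join(String... dirs) {
    return String.join(File.pathSeparator, dirs);
  }

  public static void main(String[] args) {
    // The result depends on the platform on which the program runs.
    System.out.println(join("src/", "out/", "hwc/"));
  }
}
```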

Using Gradle

gradle compileJava

Gradle will take care of running the Java compiler.

Please note that the structure of the handwritten classes follows the package layout of the generated code, i.e., there are the following subdirectories (Java packages):

src/automata contains the top-level language realization for using the generated DSL infrastructure. In this case the class src/automata/AutomataTool.java constitutes a main class executable for processing automata models with the automata DSL.
src/automata/cocos contains infrastructure for context conditions of the automata DSL.
src/automata/prettyprint contains an exemplary use of the generated visitor infrastructure for processing parsed models for pretty printing.
src/automata/visitors contains an exemplary analysis using the visitor infrastructure. The exemplary analysis counts the states contained in the parsed automata model.
hwc/automata/_ast contains an exemplary usage of the handwritten code integration mechanism for modifying the AST for the automata DSL. Details about the integration mechanism are described in Section 5.10.
hwc/automata/_symboltable contains handwritten extensions of the generated symbol table infrastructure. Details about implementing handwritten symbol table infrastructure extensions are described in Chapter 9 of the MontiCore handbook.

Please also take care not to confuse the code of the Automata tool with the code of the final product generated by that tool, although both have a similar package structure.

We already described the contents of the directories hwc/automata/_ast and hwc/automata/_symboltable in the previous section. They contain handwritten extensions of the abstract syntax and of the symbol table infrastructure of the Automata language.

public class CountStates implements AutomataVisitor2 {
  private int count = 0;

  @Override
  public void visit(ASTState node) {
    count++;
  }

  public int getCount() {
    return count;
  }
}

Listing 2.23: The CountStates visitor implementation

The directory src/automata/visitors contains the file CountStates.java. The class is depicted in Listing 2.23. It implements a simple visitor for counting the number of states contained in an Automata model. To this effect, it implements the AutomataVisitor2 interface. It has an attribute count of type int for storing the current number of counted states. It overrides the visit method for ASTState to increase the counter whenever a state is visited.

The directory src/automata/cocos contains the context-condition implementations for the Automata language.

public class AtLeastOneInitialAndFinalState
       implements AutomataASTAutomatonCoCo {
  @Override
  public void check(ASTAutomaton automaton) {
    boolean initialState = false;
    boolean finalState = false;

    for (ASTState state : automaton.getStateList()) {
      if (state.isInitial()) {
        initialState = true;
      }
      if (state.isFinal()) {
        finalState = true;
      }
    }

    if (!initialState || !finalState) {
      // Issue error...
      Log.error("0xA0116 An automaton must have at least one initial "
              + "and one final state.",
          automaton.get_SourcePositionStart());
    }
  }
}

Listing 2.24: Context condition implementation for checking that there exists at least one initial and at least one final state.

Listing 2.24 depicts the class AtLeastOneInitialAndFinalState. The class implements a context condition for checking whether an Automata model contains at least one initial and at least one final state. To this effect, the class implements the interface AutomataASTAutomatonCoCo. The class StateNameStartsWithCapitalLetter is implemented similarly.

public class TransitionSourceExists
                 implements AutomataASTTransitionCoCo {

  @Override
  public void check(ASTTransition node) {

    IAutomataScope enclosingScope = node.getEnclosingScope();
    Optional<StateSymbol> sourceState =
        enclosingScope.resolveState(node.getFrom());

    if (!sourceState.isPresent()) {
      // Issue error...
      Log.error(
        "0xADD03 Source state of transition missing.",
         node.get_SourcePositionStart());
    }
  }
}

Listing 2.25: Context condition implementation for checking that states used in transitions exist.

Listing 2.25 presents the implementation of the class TransitionSourceExists. The class implements a context condition for checking whether the source states used in transitions are defined. To this effect, the class uses the resolving mechanisms of the symbol table. For each transition, the context condition tries to resolve the state symbol corresponding to the source state of the transition. If the resolving fails for the state, then the context condition logs an error.
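
The resolving step can be pictured with a small stand-alone sketch. The classes StateSymbolSketch and ScopeSketch below are hypothetical simplifications, not the generated symbol table classes. In particular, the lookup strategy shown here (search the local scope first, then the enclosing scope) is an assumption made for illustration, whereas the generated resolving mechanism is more elaborate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical symbol for a state definition.
class StateSymbolSketch {
  final String name;
  StateSymbolSketch(String name) { this.name = name; }
}

// Hypothetical scope that stores state symbols by name.
class ScopeSketch {
  private final Map<String, StateSymbolSketch> states = new HashMap<>();
  private final ScopeSketch enclosing; // null for the outermost scope

  ScopeSketch(ScopeSketch enclosing) { this.enclosing = enclosing; }

  void add(StateSymbolSketch symbol) { states.put(symbol.name, symbol); }

  // Look up the name locally first, then continue in the enclosing scope.
  Optional<StateSymbolSketch> resolveState(String name) {
    StateSymbolSketch local = states.get(name);
    if (local != null) {
      return Optional.of(local);
    }
    return enclosing == null ? Optional.empty() : enclosing.resolveState(name);
  }
}

class ResolveSketchDemo {
  // Mirrors the check of the context condition: does the source state exist?
  static boolean sourceExists(ScopeSketch scope, String from) {
    return scope.resolveState(from).isPresent();
  }

  static ScopeSketch pingPongScope() {
    ScopeSketch outer = new ScopeSketch(null);
    outer.add(new StateSymbolSketch("Ping"));
    ScopeSketch inner = new ScopeSketch(outer);
    inner.add(new StateSymbolSketch("Pong"));
    return inner;
  }

  public static void main(String[] args) {
    ScopeSketch scope = pingPongScope();
    System.out.println(sourceExists(scope, "Ping")); // prints true
    System.out.println(sourceExists(scope, "Pung")); // prints false
  }
}
```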

The class AutomataTool is the main class of the Automata language. It is defined in the file AutomataTool.java contained in the directory src/automata.

public class AutomataTool extends AutomataToolTOP {
  // main method missing in this listing

  public ASTAutomaton parse(String model) {
    try {
      AutomataParser parser = AutomataMill.parser();
      Optional<ASTAutomaton> optAutomaton = parser.parse(model);

      if (!parser.hasErrors() && optAutomaton.isPresent()) {
        return optAutomaton.get();
      }
      Log.error("0xEE840 Model could not be parsed.");
    }
    catch (RecognitionException | IOException e) {
      Log.error("0xEE640 Failed to parse " + model, e);
    }
    System.exit(1);
    return null;
  }
}

Listing 2.26: Methods for parsing and creating symbol tables.

Listing 2.26 presents the implementation of the method parse of the AutomataTool class which can be used for parsing Automata models. The class extends the generated abstract superclass AutomataToolTOP that provides methods to be used in the run method. One example is the method createSymbolTable that uses the global scope and genitors available in the mill to create a symbol table for Automata.

public static void main(String[] args) {
  // delegate main to instantiatable method for better integration,
  // reuse, etc.
  new AutomataTool().run(args);
}

public void run(String[] args) {

  // use normal logging (no DEBUG, TRACE)
  AutomataMill.init();
  Log.ensureInitialization();

  Options options = initOptions();
  try {
    //create CLI Parser and parse input options from commandline
    CommandLineParser cliparser = new org.apache.commons.cli.DefaultParser();
    CommandLine cmd = cliparser.parse(options, args);

    //help: when --help
    if (cmd.hasOption("h")) {
      printHelp(options);
      //do not continue, when help is printed.
      return;
    }
    //version: when --version
    else if (cmd.hasOption("v")) {
      printVersion();
      //do not continue when version is printed
      return;
    }

    Log.info("Automata DSL Tool", "AutomataTool");

    if (cmd.hasOption("i")) {
      String model = cmd.getOptionValue("i");
      final ASTAutomaton ast = parse(model);
      Log.info(model + " parsed successfully!", "AutomataTool");

      AutomataMill.globalScope().setFileExt("aut");
      IAutomataArtifactScope modelTopScope = createSymbolTable(ast);
      // can be used for resolving things in the model
      Optional<StateSymbol> aSymbol = modelTopScope.resolveState("Ping");
      if (aSymbol.isPresent()) {
        Log.info("Resolved state symbol \"Ping\"; FQN = "
            + aSymbol.get().toString(),
          "AutomataTool");
      } else {
        Log.info("This automaton does not contain a state called \"Ping\";",
          "AutomataTool");
      }
      runDefaultCoCos(ast);

      if(cmd.hasOption("s")){
        String storeLocation = cmd.getOptionValue("s");
        storeSymbols(modelTopScope, storeLocation);
      }

      // analyze the model with a visitor
      CountStates cs = new CountStates();
      AutomataTraverser traverser = AutomataMill.traverser();
      traverser.add4Automata(cs);
      ast.accept(traverser);
      Log.info("Automaton has " + cs.getCount() + " states.", "AutomataTool");
      prettyPrint(ast,"");
    }else{
      printHelp(options);
    }
  } catch (ParseException e) {
    // e.getMessage displays the incorrect input-parameters
    Log.error("0xEE752 Could not process AutomataTool parameters: " + e.getMessage());
  }
}

Listing 2.27: Main method of the AutomataTool class

The AutomataTool provides a main method, which can be called from the command line. The implementation of the method is depicted in Listing 2.27. It expects two inputs. The first, passed with the option -i, is the name of a file containing an Automata model. The second, passed with the option -s, is the name of the file in which the tool should store the symbol table of the model given as first input.

The method

parses the input model,
creates the symbol table,
resolves a state,
executes context conditions,
stores the symbol table by using the serialization,
executes the visitor for counting the states, and
pretty prints the model to the standard output.

Inspect the main method and try to understand the implementation for the executed tasks. Read the above descriptions again if necessary.

Run the Tool

The previous command compiles the handwritten and generated code including the Automata tool class AutomataTool. For running the Automata DSL tool, execute the following command:

Using the CLI

With Powershell on Windows

java -cp "src/;out/;hwc/;monticore-rt.jar" `
                    automata.AutomataTool -i example/PingPong.aut `
                    -s st/PingPong.autsym

With Bash on Unix

java -cp "src/:out/:hwc/:monticore-rt.jar" \
                    automata.AutomataTool -i example/PingPong.aut \
                    -s st/PingPong.autsym

With cmd on Windows

java -cp "src/;out/;hwc/;monticore-rt.jar" ^
                    automata.AutomataTool -i example/PingPong.aut ^
                    -s st/PingPong.autsym

Please note again, on Unix systems paths are separated using ":" (colon) instead of semicolons. Executing the command runs the Automata DSL tool.

Using the option -cp makes the Java interpreter consider the compiled classes contained in the paths specified by the argument.

The argument automata.AutomataTool makes the Java interpreter execute the main method of the class automata.AutomataTool contained in the directory src.

Using Gradle

With JUnit
Run the AutomataToolTest JUnit test.

With Powershell on Windows

java -jar target/libs/automaton-7.7.0-tool.jar `
                    -i src/test/resources/automata/parser/PingPong.aut `
                    -s st/PingPong.autsym

With Bash on Unix

java -jar target/libs/automaton-7.7.0-tool.jar \
                    -i src/test/resources/automata/parser/PingPong.aut \
                    -s st/PingPong.autsym

With cmd on Windows

java -jar target/libs/automaton-7.7.0-tool.jar ^
                    -i src/test/resources/automata/parser/PingPong.aut ^
                    -s st/PingPong.autsym

You will have to run gradle build to create the tool.jar.

Executing the command runs the Automata DSL tool. The automaton-...-tool.jar is a packaged jar file with the AutomataTool as its entry point.

The argument example/PingPong.aut is passed to the main method of the Automata DSL tool class as input. Inspect the output on the command line, which displays log messages concerning the processing of the example Automata model.

The last argument st/PingPong.autsym is also passed to the main method. It makes the tool store the serialized symbol table of the input model into the file st/PingPong.autsym.

The shipped example Automata DSL (all sources contained in mc-workspace/src and mc-workspace/hwc) can be used as a starting point for creating your own language. It can easily be altered to specify your own DSL by adjusting the grammar and the handwritten Java sources and rerunning MontiCore as described above.

Using MontiCore in Eclipse

The MontiCore plugin can be used in Eclipse. Section 2.4.1 describes the process of setting up Eclipse. Section 2.4.2 presents how to import the example project in Eclipse. Finally, Section 2.4.3 explains how the MontiCore Gradle plugin can be executed in Eclipse.

Setting up Eclipse

Before you import the example project and run MontiCore as a Gradle plugin, please make sure that a current version of the Gradle plugin is installed in Eclipse. When installing a new version of Eclipse, the Gradle plugin is installed by default. If the Gradle plugin is not yet integrated into your Eclipse installation, download the latest Eclipse version or perform the following steps to install the Eclipse plugin:

Download and install Eclipse (or use an existing one).
Open Eclipse.
Install the needed plugins.
- Help > Eclipse Marketplace...
- Type 'gradle' in the search box and click Enter.
- Install the 'Buildship Gradle Integration' plugin.
Make sure to configure Eclipse to use a JDK instead of a JRE.
- Window > Preferences > Java > Installed JREs.

Importing the Example

The shipped example Automata DSL can be used as a starting point. Once imported into Eclipse, it can easily be altered to specify your own DSL by adjusting the grammar and the handwritten Java sources and rerunning MontiCore as described in Section 2.4.3. To import the example, perform the following steps:

Download and unzip the Automata example:

http://www.monticore.de/download/Automaton.zip

Open Eclipse and select
- File > Import > Gradle (if you are required to choose a Gradle version, then choose version 7.6.4) > Existing Gradle Projects > Next.
- Click on the Browse.. button and import the directory that contains the file build.gradle from the Automata example.

Running MontiCore

To execute the MontiCore Gradle plugin, perform the following steps:

Select the Gradle Task menu (at the top or bottom, depending on your installed Eclipse version).
There select automaton > build > build (double click).

This makes Eclipse execute the MontiCore Gradle plugin as described in Section 2.3 of the MontiCore handbook. After installing and executing MontiCore in Eclipse, your workspace should look similar to Figure 2.28.

Figure 2.28: Eclipse after importing the example project and executing MontiCore

Using MontiCore in IntelliJ IDEA

The MontiCore plugin can be used in IntelliJ IDEA. Section 2.5.1 describes the process of setting up IntelliJ IDEA. Afterwards, Section 2.5.2 presents how to import the example project in IntelliJ IDEA. Finally, Section 2.5.3 explains how the MontiCore Gradle plugin can be executed in IntelliJ IDEA.

Setting up IntelliJ IDEA

For setting up IntelliJ IDEA, perform the following steps:

Download and install IntelliJ IDEA (or use your existing installation).
- Hint for Students: You get the Ultimate version of IntelliJ IDEA for free.
Open IntelliJ IDEA.

Importing the Example

The shipped example Automata DSL can be used as a starting point. Once imported into IntelliJ IDEA, it can easily be altered to specify your own DSL by adjusting the grammar and the handwritten Java sources and rerunning MontiCore as described in Section 2.5.3. For importing the example, perform the following steps:

Download and unzip the Automata example:

http://www.monticore.de/download/Automaton.zip

In the IDE select: File > Open.
Select the directory containing the build.gradle (if you are required to choose a Gradle version, then choose version 6.7.1).

Running MontiCore

To execute the MontiCore Gradle plugin, perform the following steps:

Select the Gradle Projects menu on the right.
From there select automaton > Tasks > build > build (double click).

This makes IntelliJ IDEA execute the Gradle plugin as described in Section 2.3. If you do not see the Gradle Projects menu yet, right-click on the build.gradle file and select 'Import Gradle Project'. Now the Gradle Projects menu should appear on the right side and you can follow the above-mentioned steps for the execution. After installing and executing MontiCore in IntelliJ IDEA, your workspace should look similar to Figure 2.29.

Figure 2.29: IntelliJ IDEA after importing the example project and executing MontiCore

Using MontiCore with GitPod

Installing all the prerequisites and an IDE can take some time. As an alternative, you can use Gitpod, an open-source Kubernetes application for ready-to-code developer environments. It already has all the prerequisites and an operational Web IDE similar to Microsoft's Visual Studio Code installed. You need to log in with an existing GitHub account to use it.

This link can be used to access the Gitpod project for the Automata language. First, an environment for the project with the proper Java and Gradle version will be prepared and initialized automatically. After that, you will be directed to the Web IDE. The project will be built with Gradle first, and after that it is ready-to-use. The Web IDE also has a built-in terminal which can be used to build the project via gradle build or execute other tasks.

The Web IDE can be used to change existing project files, such as the Automata grammar or the handwritten classes for the language. Simply navigate to the grammars or classes in the file explorer on the left-hand side of the IDE and edit the files. This makes experimenting with MontiCore possible. The changes will be compiled by the IDE immediately and compilation errors will be marked with red color. To run the project, execute the command gradle build in the terminal.

You will notice that the link to the Gitpod project is generated and always has the same pattern. An example for a link is https://indigo-ostrich-8psdfoer.ws-eu18.gitpod.io. After 30 minutes of non-use, Gitpod will "freeze" the environment. It can be reactivated by using the same link to access it. The environment is reactivated, and you do not even need to rebuild the project with Gradle to use the project again.
