======= Handling TLA⁺ Statements =======
Here we cover [[https://craftinginterpreters.com/statements-and-state.html|Chapter 8: Statements and State]] of book //Crafting Interpreters//.
We'll implement the ability to associate identifiers with values, then subsequently use those identifiers in expressions.
While Lox was designed with a dual REPL/file execution model in mind, TLA⁺ was not - so the behavior of our interpreter will vary quite a bit from the book!
We'll have to implement parser & interpreter //modes//.
When the user runs our TLA⁺ interpreter as a REPL, we want to support operator redefinition and the ability to see the value of simple expressions - like the Python REPL.
In contast, when the user runs our TLA⁺ interpreter against a file it should be an error for multiple operators to use the same name.
The parser also should not accept standalone expressions in the middle of the file.
====== Section 8.1: Statements ======
In [[https://craftinginterpreters.com/statements-and-state.html#statements|section 8.1]] we add a new parse tree node class hierarchy: ''Stmt''.
The book initially defines two statement types: ''Expression'', and ''Print''.
''Expression'' statements are standalone expressions, which are evaluated then have their result discarded.
''Print'' statements are identified with a keyword and print out the expression result instead of discarding it.
We will be combining these into a single odd construct!
We want to support the user typing standalone expressions like ''x + y'' in the REPL then seeing their result.
TLA⁺ does not have a print keyword, and standalone expressions are not allowed in the language generally, so the best thing to do is allow standalone expressions //only// when running in a REPL, and parse these standalone expressions as print statements.
Add this to ''GenerateAst.java'' then run it to generate our new class:
"Variadic : Token operator, List parameters"
));
defineAst(outputDir, "Stmt", Arrays.asList(
"Print : Expr expression"
));
}
===== Parsing statements =====
We then need to extend our ''Parser'' constructor to accept a parameter telling us whether we're running in a REPL, so we know whether to accept standalone expressions:
private final List tokens;
private int current = 0;
private final boolean replMode;
Parser(List tokens, boolean replMode) {
this.tokens = tokens;
this.replMode = replMode;
}
Modify ''parse()'' in the ''Parser'' class, same as in the book:
List parse() {
List statements = new ArrayList<>();
while (!isAtEnd()) {
statements.add(statement());
}
return statements;
}
Our ''statement()'' method is different from the book; if we're in REPL mode we parse & return a statement wrapped in a ''Stmt.Print'' object, and error otherwise.
We'll return to this method later to add parsing logic for operator definitions.
private Stmt statement() {
if (replMode) return new Stmt.Print(expression());
throw error(peek(), "Expected statement.");
}
===== Executing statements =====
Add ''Stmt'' to the list of interfaces implemented by the ''Interpreter'' class:
class Interpreter implements Expr.Visitor
Now add a visitor method for the ''Stmt.Print'' class, after ''evaluate()'' in the ''Interpreter'' class.
This is identical to the book except for one thing, which we will change to improve the testability of our interpreter:
@Override
public Void visitPrintStmt(Stmt.Print stmt) {
Object value = evaluate(stmt.expression);
out.println(stringify(value));
return null;
}
Instead of printing directly to ''System.out'', we add a ''[[https://docs.oracle.com/javase/8/docs/api/java/io/PrintStream.html|PrintStream]]'' field to the ''Interpreter'' class called ''out''.
''System.out'' implements the ''PrintStream'' interface, but so do other things, and if we want to write unit tests for this class it helps to be able to intercept its output.
Add a constructor for the ''Interpreter'' class along with the field; also add a ''replMode'' parameter, although don't make use of it just yet:
class Interpreter implements Expr.Visitor
Same as the book, modify the old ''interpret()'' method to accept a list of statements.
The changed method is identical to the book except for using ''TlaPlus'' instead of ''Lox'':
void interpret(List statements) {
try {
for (Stmt statement : statements) {
execute(statement);
}
} catch (RuntimeError error) {
TlaPlus.runtimeError(error);
}
}
Also add the ''execute()'' helper method to the ''Interpreter'' class:
private void execute(Stmt stmt) {
stmt.accept(this);
}
At the top of ''Interpreter.java'', import the classes we're now using:
package tla;
import java.util.Set;
import java.util.HashSet;
import java.util.List;
import java.io.PrintStream;
===== Updating the main method =====
Now let's hook everything up in our main ''TlaPlus'' class.
Parameterize the ''run()'' method with a ''replMode'' argument, propagate it to the ''Parser'' constructor, and fix the return type of ''parser.parse()'':
private static void run(String source, boolean replMode) {
Scanner scanner = new Scanner(source);
List tokens = scanner.scanTokens();
Parser parser = new Parser(tokens, replMode);
List statements = parser.parse();
// Stop if there was a syntax error.
Then replace the call to the interpreter with this:
if (hadError) return;
interpreter.interpret(statements);
}
A bit more tidying up is necessary.
We need to detect when we're in REPL vs. file mode and propagate that information to our various methods and classes.
Remove the ''final'' annotation on the ''interpreter'' field of the ''TlaPlus'' class and leave it uninitialized by default:
public class TlaPlus {
private static Interpreter interpreter;
Then modify the ''runFile()'' method as follows:
private static void runFile(String path) throws IOException {
interpreter = new Interpreter(System.out, false);
byte[] bytes = Files.readAllBytes(Paths.get(path));
run(new String(bytes, StandardCharsets.UTF_8), false);
Similarly modify the ''runPrompt'' method:
private static void runPrompt() throws IOException {
interpreter = new Interpreter(System.out, true);
InputStreamReader input = new InputStreamReader(System.in);
BufferedReader reader = new BufferedReader(input);
for (;;) {
System.out.print("> ");
String line = reader.readLine();
if (line == null) break;
run(line, true);
hadError = false;
}
}
This was a lot of work to make our REPL basically function as it did before - printing out the value of an expression - but now it sets us up to implement operator definitions!
====== Section 8.2: Global Variables ======
In [[https://craftinginterpreters.com/statements-and-state.html#global-variables|section 8.2]] we introduce global variables in the sense of top-level operator definitions.
Operator definitions in TLA⁺ take the form ''name == expr''.
They can be parameterized with arguments, but we won't be implementing that until the tutorial covering chapter 10.
For now we are simply assigning a name to a value.
Start off by generating a new class ''Stmt.OpDef'' by adding to the ''main()'' method of your ''GenerateAst'' class:
defineAst(outputDir, "Stmt", Arrays.asList(
"Print : Expr expression",
"OpDef : Token name, Expr body"
));
Also generate a ''Expr.Variable'' class for expressions referring to the operators:
defineAst(outputDir, "Expr", Arrays.asList(
"Binary : Expr left, Token operator, Expr right",
"Grouping : Expr expression",
"Literal : Object value",
"Variable : Token name",
"Unary : Token operator, Expr expr",
Then, in your ''Parser'' class, modify the ''parse()'' method to repeatedly call ''declaration()'' instead of ''statement()'':
List parse() {
List statements = new ArrayList<>();
while (!isAtEnd()) {
statements.add(declaration());
}
return statements;
}
The ''declaration()'' method is defined similarly to the book, with one line difference highlighted.
private Stmt declaration() {
try {
if (lookahead().isAtOpDefStart()) return operatorDefinition();
return statement();
} catch (ParseError error) {
synchronize();
return null;
}
}
That single line difference calls three separate functions we still need to define.
While Lox is a nice unambiguous language with semicolon-delimited statements, TLA⁺ is a wholly different beast!
In TLA⁺ it is quite difficult to know where each definition ends and another beings.
We have to resort to the somewhat drastic measure of cloning our parser then looking several tokens ahead in the stream in search of a ''=='', at which point we will know for certain we are about to parse a new definition.
This type of unbounded lookahead is inelegant and can lead to poor parser performance, but it simply has to be done.
Here's our ''lookahead()'' method, which returns a cloned parser instance; put it above ''match()'':
private Parser lookahead() {
Parser lookahead = new Parser(tokens, replMode);
lookahead.current = current;
return lookahead;
}
And here is our ''isAtOpDefStart()'' method; it is quite simple for now, since we don't yet support parameters.
Put it below ''declaration()'':
private boolean isAtOpDefStart() {
if (!match(IDENTIFIER)) return false;
return match(EQUAL_EQUAL);
}
Finally, here's the ''operatorDefinition()'' method; again put it below ''declaration()'':
private Stmt operatorDefinition() {
Token name = consume(IDENTIFIER, "Name required for operator definition.");
consume(EQUAL_EQUAL, "== required for operator definition.");
return new Stmt.OpDef(name, expression());
}
So we clone the parser, used the cloned parser to look ahead to see whether we're at the start of an operator definition, and - if so - throw away the cloned parser then use our original parser to parse the operator definition like normal.
Incidentally, we can now fill out our ''synchronize()'' error recovery method!
We want ''synchronize()'' to return when it is about to parse another declaration.
That can be done with our ''lookahead()'' and ''isAtOpDefStart()'' methods:
private void synchronize() {
advance();
while(!isAtEnd()) {
if (lookahead().isAtOpDefStart()) return;
advance();
}
}
This means a syntax error in one operator definition will not spill out and halt parsing for the entire file.
Finally, let's parse our new ''Expr.Variable'' class.
Add this logic to the ''primary()'' method of the ''Parser'' class:
return new Expr.Literal(previous().literal);
}
if (match(IDENTIFIER)) {
return new Expr.Variable(previous());
}
if (match(LEFT_PAREN)) {
====== Section 8.3: Environments ======
In [[https://craftinginterpreters.com/statements-and-state.html#environments|section 8.3]] we build the infrastructure for assigning values to variables then retrieving them.
Create a new file, ''Environment.java'':
package tla;
import java.util.HashMap;
import java.util.Map;
class Environment {
private final Map values = new HashMap<>();
}
We need to make an additional modification here, adding a parameter indicating whether we're in a REPL.
We only want to allow variable redefinition inside a REPL, and disallow it otherwise.
This requires defining a constructor:
class Environment {
private final boolean allowRedefinition;
private final Map values = new HashMap<>();
Environment(boolean allowRedefinition) {
this.allowRedefinition = allowRedefinition;
}
}
Then, our ''define()'' method looks like this:
void define(Token name, Object value) {
if (!allowRedefinition && values.containsKey(name.lexeme)) {
throw new RuntimeError(name, "Redefined definition '" + name.lexeme + "'.");
}
values.put(name.lexeme, value);
}
Note that we changed the method signature to accept a ''Token'' instead of a ''String'' for the name, so that the ''RuntimeError'' can be constructed properly.
Our ''get()'' method is unchanged from the book:
Object get(Token name) {
if (values.containsKey(name.lexeme)) {
return values.get(name.lexeme);
}
throw new RuntimeError(name,
"Undefined variable '" + name.lexeme + "'.");
}
Now add an ''Environment'' instance as a field of your ''Interpreter'' class; here's where we make use of the ''replMode'' constructor parameter we added up above:
class Interpreter implements Expr.Visitor
Now add a visitor method in the ''Interpreter'' class for the ''Expr.Variable'' class, which retrieves the value of the variable from the environment:
@Override
public Object visitVariableExpr(Expr.Variable expr) {
return environment.get(expr.name);
}
====== Section 8.4: Assignment ======
In [[https://craftinginterpreters.com/statements-and-state.html#assignment|section 8.4]] we wire up our operator definitions so they store a new value in the environment.
Add a visitor method in the ''Interpreter'' class for the ''Stmt.OpDef'' class:
@Override
public Void visitOpDefStmt(Stmt.OpDef stmt) {
Object value = evaluate(stmt.body);
environment.define(stmt.name, value);
return null;
}
And that's it!
Your interpreter should now support definitions:
> x == 2
> y == 3
> x + y
5
====== Section 8.5: Scope ======
We aren't quite done with the chapter, although we won't be using scope just yet.
In [[https://craftinginterpreters.com/statements-and-state.html#scope|section 8.5]] we add support for nested environments.
Add a field to your ''Environment'' class:
class Environment {
final Environment enclosing;
private final boolean allowRedefinition;
private final Map values = new HashMap<>();
Initialize it to ''null'' in the current constructor, then add a copy constructor:
Environment(boolean allowRedefinition) {
enclosing = null;
this.allowRedefinition = allowRedefinition;
}
Environment(Environment enclosing) {
this.enclosing = enclosing;
this.allowRedefinition = enclosing.allowRedefinition;
}
No further changes to ''Environment'' are necessary at this time; we'll define the desired semantics in later chapters.
Next up, our greatest parsing challenge yet: [[creating:jlists|conjunction & disjunction lists]]!
If your code got out of sync during this tutorial, you can find a snapshot of its expected state in [[https://github.com/tlaplus-community/tlaplus-creator/tree/main/5-statements|this repo directory]].
====== Section 8.5: Challenges ======
- Write some unit tests for your interpreter. Use a ''[[https://docs.oracle.com/javase/8/docs/api/java/io/ByteArrayOutputStream.html|ByteArrayOutputStream]]'' instance as a parameter to the ''PrintStream'' constructor, then pass your ''PrintStream'' instance into the ''Interpreter'' class to capture its output.
- The ''isAtOpDefStart()'' and ''operatorDefinition()'' methods have some duplicated logic that will only grow more involved when we add operator parameter support. Can you find a way to factor out this logic into a single method?
[[creating:evaluation|< Previous Page]] | [[creating:start#table_of_contents|Table of Contents]] | [[creating:jlists|Next Page >]]