creating:statements

This is an old revision of the document!


PHP's gd library is missing or unable to create PNG images

Handling TLA⁺ Statements

Here we cover Chapter 8: Statements and State of book Crafting Interpreters. We'll implement the ability to associate identifiers with values, then subsequently use those identifiers in expressions. While Lox was designed with a dual REPL/file execution model in mind, TLA⁺ was not - so the behavior of our interpreter will vary quite a bit from the book! We'll have to implement parser & interpreter modes. When the user runs our TLA⁺ interpreter as a REPL, we want to support operator redefinition and the ability to see the value of simple expressions - like the Python REPL. In contast, when the user runs our TLA⁺ interpreter against a file it should be an error for multiple operators to use the same name. The parser also should not accept standalone expressions in the middle of the file.

Section 8.1: Statements

In section 8.1 we add a new parse tree node class hierarchy: Stmt. The book initially defines two statement types: Expression, and Print. Expression statements are standalone expressions, which are evaluated then have their result discarded. Print statements are identified with a keyword and print out the expression result instead of discarding it. We will be combining these into a single odd construct! We want to support the user typing standalone expressions like x + y in the REPL then seeing their result. TLA⁺ does not have a print keyword, and standalone expressions are not allowed in the language generally, so the best thing to do is allow standalone expressions only when running in a REPL, and parse these standalone expressions as print statements. Add this to GenerateAst.java then run it to generate our new class:

      "Variadic : Token operator, List<Expr> parameters"
    ));
 
    defineAst(outputDir, "Stmt", Arrays.asList(
      "Print    : Expr expression"
    ));
  }

We then need to extend our Parser constructor to accept a parameter telling us whether we're running in a REPL, so we know whether to accept standalone expressions:

  private final List<Token> tokens;
  private int current = 0;
  private final boolean replMode;
 
  Parser(List<Token> tokens, boolean replMode) {
    this.tokens = tokens;
    this.replMode = replMode;
  }

Modify parse() in the Parser class, same as in the book:

  List<Stmt> parse() {
    List<Stmt> statements = new ArrayList<>();
    while (!isAtEnd()) {
      statements.add(statement());
    }
 
    return statements;
  }

Our statement() method is different from the book; if we're in REPL mode we parse & return a statement wrapped in a Stmt.Print object, and error otherwise. We'll return to this method later to add parsing logic for operator definitions.

  private Stmt statement() {
    if (replMode) return new Stmt.Print(expression());
 
    throw error(peek(), "Expected statement.");
  }

Add Stmt to the list of interfaces implemented by the Interpreter class:

class Interpreter implements Expr.Visitor<Object>,
                             Stmt.Visitor<Void> {
  void interpret(Expr expression) {

Now add a visitor method for the Stmt.Print class, after evaluate() in the Interpreter class. This is identical to the book except for one thing, which we will change to improve the testability of our interpreter:

  @Override
  public Void visitPrintStmt(Stmt.Print stmt) {
    Object value = evaluate(stmt.expression);
    out.println(stringify(value));
    return null;
  }

Instead of printing directly to System.out, we add a PrintStream field to the Interpreter class called out. System.out implements the PrintStream interface, but so do other things, and if we want to write unit tests for this class it helps to be able to intercept its output. Add a constructor for the Interpreter class along with the field; also add a replMode parameter, although don't make use of it just yet:

class Interpreter implements Expr.Visitor<Object>,
                             Stmt.Visitor<Void> {
  private final PrintStream out;
 
  public Interpreter(PrintStream out, boolean replMode) {
    this.out = out;
  }

Same as the book, modify the old interpret() method to accept a list of statements. The changed method is identical to the book except for using TlaPlus instead of Lox:

  void interpret(List<Stmt> statements) {
    try {
      for (Stmt statement : statements) {
        execute(statement);
      }
    } catch (RuntimeError error) {
      TlaPlus.runtimeError(error);
    }
  }

Also add the execute() helper method to the Interpreter class:

  private void execute(Stmt stmt) {
    stmt.accept(this);
  }

At the top of Interpreter.java, import the classes we're now using:

package tla;
 
import java.util.Set;
import java.util.HashSet;
import java.util.List;
import java.io.PrintStream;

Now let's hook everything up in our main TlaPlus class. Parameterize the run() method with a replMode argument, propagate it to the Parser constructor, and fix the return type of parser.parse():

  private static void run(String source, boolean replMode) {
    Scanner scanner = new Scanner(source);
    List<Token> tokens = scanner.scanTokens();
 
    Parser parser = new Parser(tokens, replMode);
    List<Stmt> statements = parser.parse();
 
    // Stop if there was a syntax error.

Then replace the call to the interpreter with this:

    if (hadError) return;
 
    interpreter.interpret(statements);
  }

A bit more tidying up is necessary. We need to detect when we're in REPL vs. file mode and propagate that information to our various methods and classes. Remove the final annotation on the interpreter field of the TlaPlus class and leave it uninitialized by default:

public class TlaPlus {
  private static Interpreter interpreter;

Then modify the runFile() method as follows:

  private static void runFile(String path) throws IOException {
    interpreter = new Interpreter(System.out, false);
    byte[] bytes = Files.readAllBytes(Paths.get(path));
    run(new String(bytes, StandardCharsets.UTF_8), false);

Similarly modify the runPrompt method:

  private static void runPrompt() throws IOException {
    interpreter = new Interpreter(System.out, true);
    InputStreamReader input = new InputStreamReader(System.in);
    BufferedReader reader = new BufferedReader(input);
 
    for (;;) {
      System.out.print("> ");
      String line = reader.readLine();
      if (line == null) break;
      run(line, true);
      hadError = false;
    }
  }

This was a lot of work to make our REPL basically function as it did before - printing out the value of an expression - but now it sets us up to implement operator definitions!

Section 8.2: Global Variables

In section 8.2 we introduce global variables in the sense of top-level operator definitions. Operator definitions in TLA⁺ take the form name == expr. They can be parameterized with arguments, but we won't be implementing that until the tutorial covering chapter 10. For now we are simply assigning a name to a value.

Start off by generating a new class Stmt.OpDef by adding to the main() method of your GenerateAst class:

    defineAst(outputDir, "Stmt", Arrays.asList(
      "Print    : Expr expression",
      "OpDef    : Token name, Expr body"
    ));

Also generate a Expr.Variable class for expressions referring to the operators:

    defineAst(outputDir, "Expr", Arrays.asList(
      "Binary   : Expr left, Token operator, Expr right",
      "Grouping : Expr expression",
      "Literal  : Object value",
      "Variable : Token name",
      "Unary    : Token operator, Expr expr",

Then, in your Parser class, modify the parse() method to repeatedly call declaration() instead of statement():

  List<Stmt> parse() {
    List<Stmt> statements = new ArrayList<>();
    while (!isAtEnd()) {
      statements.add(declaration());
    }
 
    return statements;
  }

The declaration() method is defined similarly to the book, with one line difference highlighted.

  private Stmt declaration() {
    try {
      if (lookahead().isAtOpDefStart()) return operatorDefinition();
 
      return statement();
    } catch (ParseError error) {
      synchronize();
      return null;
    }
  }

That single line difference calls three separate functions we still need to define. While Lox is a nice unambiguous language with semicolon-delimited statements, TLA⁺ is a wholly different beast! In TLA⁺ it is quite difficult to know where each definition ends and another beings. We have to resort to the somewhat drastic measure of cloning our parser then looking several tokens ahead in the stream in search of a ==, at which point we will know for certain we are about to parse a new definition. This type of unbounded lookahead is inelegant and can lead to poor parser performance, but it simply has to be done. Here's our lookahead() method, which returns a cloned parser instance; put it above match():

  private Parser lookahead() {
    Parser lookahead = new Parser(tokens, replMode);
    lookahead.current = current;
    return lookahead;
  }

And here is our isAtOpDefStart() method; it is quite simple for now, since we don't yet support parameters. Put it below declaration():

  private boolean isAtOpDefStart() {
    if (!match(IDENTIFIER)) return false;
 
    return match(EQUAL_EQUAL);
  }

Finally, here's the operatorDefinition() method; again put it below declaration():

  private Stmt operatorDefinition() {
    Token name = consume(IDENTIFIER, "Name required for operator definition.");
    consume(EQUAL_EQUAL, "== required for operator definition.");
    return new Stmt.OpDef(name, expression());
  }

So we clone the parser, used the cloned parser to look ahead to see whether we're at the start of an operator definition, and - if so - throw away the cloned parser then use our original parser to parse the operator definition like normal.

Incidentally, we can now fill out our synchronize() error recovery method! We want synchronize() to return when it is about to parse another declaration. That can be done with our lookahead() and isAtOpDefStart() methods:

  private void synchronize() {
    advance();
 
    while(!isAtEnd()) {
      if (lookahead().isAtOpDefStart()) return;
 
      advance();
    }
  }

This means a syntax error in one operator definition will not spill out and halt parsing for the entire file.

Finally, let's parse our new Expr.Variable class. Add this logic to the primary() method of the Parser class:

      return new Expr.Literal(previous().literal);
    }
 
    if (match(IDENTIFIER)) {
      return new Expr.Variable(previous());
    }
 
    if (match(LEFT_PAREN)) {

Section 8.3: Environments

In section 8.3 we build the infrastructure for assigning values to variables then retrieving them. Create a new file, Environment.java:

package tla;
 
import java.util.HashMap;
import java.util.Map;
 
class Environment {
  private final Map<String, Object> values = new HashMap<>();
}

We need to make an additional modification here, adding a parameter indicating whether we're in a REPL. We only want to allow variable redefinition inside a REPL, and disallow it otherwise. This requires defining a constructor:

class Environment {
  private final boolean allowRedefinition;
  private final Map<String, Object> values = new HashMap<>();
 
  Environment(boolean allowRedefinition) {
    this.allowRedefinition = allowRedefinition;
  }
}

Then, our define() method looks like this:

  void define(Token name, Object value) {
    if (!allowRedefinition && values.containsKey(name.lexeme)) {
      throw new RuntimeError(name, "Redefined definition '" + name.lexeme + "'.");
    }
 
    values.put(name.lexeme, value);
  }

Note that we changed the method signature to accept a Token instead of a String for the name, so that the RuntimeError can be constructed properly.

Our get() method is unchanged from the book:

  Object get(Token name) {
    if (values.containsKey(name.lexeme)) {
      return values.get(name.lexeme);
    }
 
    throw new RuntimeError(name,
        "Undefined variable '" + name.lexeme + "'.");
  }

Now add an Environment instance as a field of your Interpreter class; here's where we make use of the replMode constructor parameter we added up above:

class Interpreter implements Expr.Visitor<Object>,
                             Stmt.Visitor<Void> {
  private Environment environment;
  private final PrintStream out;
 
  public Interpreter(PrintStream out, boolean replMode) {
    this.environment = new Environment(replMode);
    this.out = out;
  }

Now add a visitor method in the Interpreter class for the Expr.Variable class, which retrieves the value of the variable from the environment:

  @Override
  public Object visitVariableExpr(Expr.Variable expr) {
    return environment.get(expr.name);
  }

Section 8.4: Assignment

In section 8.4 we wire up our operator definitions so they store a new value in the environment. Add a visitor method in the Interpreter class for the Stmt.OpDef class:

  @Override
  public Void visitOpDefStmt(Stmt.OpDef stmt) {
    Object value = evaluate(stmt.body);
    environment.define(stmt.name, value);
    return null;
  }

And that's it! Your interpreter should now support definitions:

> x == 2
> y == 3
> x + y
5

Section 8.5: Scope

We aren't quite done with the chapter, although we won't be using scope just yet. In section 8.5 we add support for nested environments. Add a field to your Environment class:

class Environment {
  final Environment enclosing;
  private final boolean allowRedefinition;
  private final Map<String, Object> values = new HashMap<>();

Initialize it to null in the current constructor, then add a copy constructor:

  Environment(boolean allowRedefinition) {
    enclosing = null;
    this.allowRedefinition = allowRedefinition;
  }
 
  Environment(Environment enclosing) {
    this.enclosing = enclosing;
    this.allowRedefinition = enclosing.allowRedefinition;
  }

No further changes to Environment are necessary at this time; we'll define the desired semantics in later chapters.

Next up, our greatest parsing challenge yet: conjunction & disjunction lists!

Section 8.5: Challenges

  1. Write some unit tests for your interpreter. Use a ByteArrayOutputStream instance as a parameter to the PrintStream constructor, then pass your PrintStream instance into the Interpreter class to capture its output.
  2. The isAtOpDefStart() and operatorDefinition() methods have some duplicated logic that will only grow more involved when we add operator parameter support. Can you find a way to factor out this logic into a single method?

< Previous Page | Table of Contents | Next Page >

  • creating/statements.1745090916.txt.gz
  • Last modified: 2025/04/19 19:28
  • by ahelwer