Handling Constant TLA⁺ Expressions

This tutorial page covers the next three chapters in Crafting Interpreters:

As in the book, we could build a parser for our entire minimal TLA⁺ language subset before moving on to interpreting it, but that would be boring! Instead we focus on a simple vertical slice of the language: expressions. And not just any expressions, but constant expressions - expressions that contain no variables or identifiers we would have to resolve. Just primitive literal values stuck together, not much more advanced than a simple calculator app. This gives us a skeleton on which to hang the rest of the language.

Each section in this tutorial corresponds to one or more sections of Crafting Interpreters (henceforth referred to as "the book"). First read the section of the book, then read the corresponding commentary & modifications given by this tutorial.

Chapter 5: Representing Code

Chapter 5 focuses more on concepts than code, so this section will not have many TLA⁺-specific modifications. However, since this tutorial is intended to be self-contained code-wise, all necessary code is reproduced.

Everything before Section 5.1.3: A Grammar for Lox expressions applies equally to TLA⁺. The ambiguous first-draft grammar for Lox expressions can be adapted to TLA⁺:

expression     → literal
               | unary
               | binary
               | ternary
               | variadic
               | grouping ;
 
literal        → NUMBER | "TRUE" | "FALSE" ;
grouping       → "(" expression ")" ;
unary          → (( "ENABLED" | "-" ) expression ) | ( expression "'" ) ;
binary         → expression operator expression ;
ternary        → "IF" expression "THEN" expression "ELSE" expression ;
variadic       → "{" ( expression ( "," expression )* )? "}" ;
operator       → "=" | "+" | "-" | ".." | "/\" | "\/" | "<"  | "\in" ;

There are a few interesting differences. The unary rule now captures both prefix and postfix operators, each of which accepts a single parameter. The ternary rule matches the IF/THEN/ELSE operator, with the three parameters being the predicate, the true branch, and the false branch. The operators are changed to the set of operators defined in our TLA⁺ implementation. These are all the expressions we can use without introducing the concept of identifiers referring to something else.

There is also the variadic rule matching finite set literals like {1, 2, 3} or the empty set {}. This one is so named because it's where we'll put operators accepting varying numbers of parameters. It's kind of weird to think of the finite set literal {1, 2, 3} as an operator, but it is! The only difference between {1, 2, 3} and an operator like constructSet(1, 2, 3) is syntactic sugar. This is the perspective of a language implementer. Later on we will extend the definition of variadic to include vertically-aligned conjunction & disjunction lists.

While some programmers might have an aversion to generating code, the approach taken in the book is actually very convenient - as you will discover if you take some time to prototype & play around with different class representations of the parse tree! In Section 5.2.2: Metaprogramming the trees, the main differences in the GenerateAst class reflect our modification of the Lox grammar in the previous section:

package tool;
 
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;
 
public class GenerateAst {
  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("Usage: generate_ast <output directory>");
      System.exit(64);
    }
    String outputDir = args[0];
    defineAst(outputDir, "Expr", Arrays.asList(
      "Binary   : Expr left, Token operator, Expr right",
      "Grouping : Expr expression",
      "Literal  : Object value",
      "Unary    : Token operator, Expr expr",
      "Ternary  : Token operator, Expr first, Expr second, Expr third",
      "Variadic : Token operator, List<Expr> parameters"
    ));
  }
}

In the defineAst function, we only need to modify the output code so that it uses the package tla instead of lox; add this after the main method of GenerateAst:

  private static void defineAst(
      String outputDir, String baseName, List<String> types)
      throws IOException {
    String path = outputDir + "/" + baseName + ".java";
    PrintWriter writer = new PrintWriter(path, "UTF-8");
 
    writer.println("package tla;");
    writer.println();
    writer.println("import java.util.List;");
    writer.println();
    writer.println("abstract class " + baseName + " {");
 
 
    // The AST classes.
    for (String type : types) {
      String className = type.split(":")[0].trim();
      String fields = type.split(":")[1].trim();
      defineType(writer, baseName, className, fields);
    }
    writer.println("}");
    writer.close();
  }

Finally, the defineType method - added after the defineAst method - is unchanged:

  private static void defineType(
      PrintWriter writer, String baseName,
      String className, String fieldList) {
    writer.println("  static class " + className + " extends " +
        baseName + " {");
 
    // Constructor.
    writer.println("    " + className + "(" + fieldList + ") {");
 
    // Store parameters in fields.
    String[] fields = fieldList.split(", ");
    for (String field : fields) {
      String name = field.split(" ")[1];
      writer.println("      this." + name + " = " + name + ";");
    }
 
    writer.println("    }");
 
    // Fields.
    writer.println();
    for (String field : fields) {
      writer.println("    final " + field + ";");
    }
 
    writer.println("  }");
  }

Section 5.3 introduces the Visitor pattern. No TLA⁺-specific differences are necessary when modifying GenerateAst to support it. Insert this line in defineAst():

    writer.println("abstract class " + baseName + " {");
 
    defineVisitor(writer, baseName, types);
 
    // The AST classes.

This calls the defineVisitor function which writes the visitor interface, defined as follows:

  private static void defineVisitor(
      PrintWriter writer, String baseName, List<String> types) {
    writer.println("  interface Visitor<R> {");
 
    for (String type : types) {
      String typeName = type.split(":")[0].trim();
      writer.println("    R visit" + typeName + baseName + "(" +
          typeName + " " + baseName.toLowerCase() + ");");
    }
 
    writer.println("  }");
  }

Again insert some lines in defineAst() to create the abstract accept method:

      defineType(writer, baseName, className, fields);
    }
 
    // The base accept() method.
    writer.println();
    writer.println("  abstract <R> R accept(Visitor<R> visitor);");
 
    writer.println("}");

Finally, insert some lines in defineType to add a type-specific accept method in each output class:

    writer.println("    }");
 
    // Visitor pattern.
    writer.println();
    writer.println("    @Override");
    writer.println("    <R> R accept(Visitor<R> visitor) {");
    writer.println("      return visitor.visit" +
        className + baseName + "(this);");
    writer.println("    }");
 
    // Fields.

Our generated syntax tree node types now support the visitor pattern!

Section 5.4 provides an implementation of a visitor called AstPrinter that prints out the parse tree. There are a few TLA⁺-specific modifications, starting of course with the package name:

package tla;
 
class AstPrinter implements Expr.Visitor<String> {
  String print(Expr expr) {
    return expr.accept(this);
  }
}

We also have some modifications and additions to the visit methods of the AstPrinter class, again reflecting our modified TLA⁺-specific grammar:

  @Override
  public String visitBinaryExpr(Expr.Binary expr) {
    return parenthesize(expr.operator.lexeme,
                        expr.left, expr.right);
  }
 
  @Override
  public String visitGroupingExpr(Expr.Grouping expr) {
    return parenthesize("group", expr.expression);
  }
 
  @Override
  public String visitLiteralExpr(Expr.Literal expr) {
    if (expr.value == null) return "nil";
    return expr.value.toString();
  }
 
  @Override
  public String visitUnaryExpr(Expr.Unary expr) {
    return parenthesize(expr.operator.lexeme, expr.expr);
  }
 
  @Override
  public String visitTernaryExpr(Expr.Ternary expr) {
    return parenthesize(expr.operator.lexeme, expr.first,
                        expr.second, expr.third);
  }
 
  @Override
  public String visitVariadicExpr(Expr.Variadic expr) {
    return parenthesize(expr.operator.lexeme,
                        expr.parameters.toArray(Expr[]::new));
  }

The parenthesize method is unchanged from the book and should be inserted after the visit methods:

  private String parenthesize(String name, Expr... exprs) {
    StringBuilder builder = new StringBuilder();
 
    builder.append("(").append(name);
    for (Expr expr : exprs) {
      builder.append(" ");
      builder.append(expr.accept(this));
    }
    builder.append(")");
 
    return builder.toString();
  }

It isn't necessary to define a main method in the AstPrinter class, but if you'd like to try it out you are free to copy the one given in the book.

Chapter 6: Parsing Expressions

In Chapter 6, we finally build a parse tree out of our tokens!

First, as in section 6.1 of the book, we have to disambiguate our grammar. The way that precedence works in TLA⁺ is different from Lox, and indeed different from most other languages! One side-quote from this section of the book reads:

While not common these days, some languages specify that certain pairs of operators have no relative precedence. That makes it a syntax error to mix those operators in an expression without using explicit grouping.
Likewise, some operators are *non-associative*. That means it’s an error to use that operator more than once in a sequence. For example, Perl’s range operator isn’t associative, so a .. b is OK, but a .. b .. c is an error.

TLA⁺ has both of these features! Instead of operators occupying a slot in some hierarchy of precedence, each operator has a precedence range. When the precedence ranges of two operators overlap it is a parse error. For example, the ENABLED prefix operator has precedence range 4-15, and the prime operator has precedence range 15-15; thus, the following expression has a precedence conflict and should not parse:

ENABLED x'

Users must add parentheses to indicate their desired grouping in the expression. Similarly, some operators like = are not associative so a = b = c should be a parse error. Both of these factors combine to make our operator parsing code quite a bit different from that given in the book. Worry not, it can still be made terse and understandable!
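To make the overlap rule concrete, here is a small standalone sketch. The conflicts() helper is hypothetical and not part of the parser we build (our parser enforces ranges implicitly through the level at which it recurses), but it captures the rule: two operators conflict exactly when their precedence ranges overlap.

```java
// Hypothetical sketch of a precedence-conflict check; our actual parser
// enforces precedence ranges through its recursion levels instead.
class PrecedenceCheck {
  // Two closed ranges [lo1, hi1] and [lo2, hi2] overlap exactly when
  // each starts at or below the other's end.
  static boolean conflicts(int lo1, int hi1, int lo2, int hi2) {
    return lo1 <= hi2 && lo2 <= hi1;
  }

  public static void main(String[] args) {
    // ENABLED (4-15) vs. prime (15-15): overlap, so ENABLED x' is an error.
    System.out.println(conflicts(4, 15, 15, 15)); // true
    // Prefix minus (12-12) vs. prime (15-15): no overlap, so -x' parses.
    System.out.println(conflicts(12, 12, 15, 15)); // false
  }
}
```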

Section 6.2 is where we start writing our parser in the recursive descent style. Recursive descent can seem a bit magical even if you're used to reasoning about recursive functions! Unfortunately TLA⁺ requires us to push this magic even further. We'll take it step by step.

We start with the same basic Parser.java file as in the book, renaming lox to tla in the package name as usual. We also add an import for ArrayList, which we will later make use of when parsing variadic operators:

package tla;
 
import java.util.List;
import java.util.ArrayList;
 
import static tla.TokenType.*;
 
class Parser {
  private final List<Token> tokens;
  private int current = 0;
 
  Parser(List<Token> tokens) {
    this.tokens = tokens;
  }
}

Now we hit our first major difference. In Lox, precedence is given by a small hierarchy of named rules like equality, comparison, term, etc. TLA⁺ is more complicated than that. The full language has around 100 operators spanning precedences 1 to 15! If we wanted to match the book's style we'd have to write a tower of recursive functions like:

private Expr operatorExpressionPrec1() { ... }
private Expr operatorExpressionPrec2() { ... }
private Expr operatorExpressionPrec3() { ... }
...
private Expr operatorExpressionPrec15() { ... }

where each operatorExpressionPrecN() function parses all the prefix, infix, and postfix operators of precedence N, and calls operatorExpressionPrecN+1(). Life is too short for this. Instead, we'll adopt a technique alluded to in the text:

If you wanted to do some clever Java 8, you could create a helper method for parsing a left-associative series of binary operators given a list of token types, and an operand method handle to simplify this redundant code.

Here's the skeleton of our operator parsing function; the trick is to make the precedence a parameter to the function instead of a component of the name. Add this code after the Parser() constructor:

  private Expr expression() {
    return operatorExpression(1);
  }
 
  private Expr operatorExpression(int prec) {
    if (prec == 16) return primary();
 
    Expr expr = operatorExpression(prec + 1);
 
    return expr;
  }

Before filling out operatorExpression(), we'll add the helper methods; these form an incredibly well-designed parsing API and are unchanged from the book:

  private boolean match(TokenType... types) {
    for (TokenType type : types) {
      if (check(type)) {
        advance();
        return true;
      }
    }
 
    return false;
  }
 
  private boolean check(TokenType type) {
    if (isAtEnd()) return false;
    return peek().type == type;
  }
 
  private Token advance() {
    if (!isAtEnd()) current++;
    return previous();
  }
 
  private boolean isAtEnd() {
    return peek().type == EOF;
  }
 
  private Token peek() {
    return tokens.get(current);
  }
 
  private Token previous() {
    return tokens.get(current - 1);
  }

Now we have to define a table of operators with their details. For this, create a new file Operator.java containing a class recording operator fix type, token type, associativity, and precedence range:

package tla;
 
enum Fix {
  PREFIX, INFIX, POSTFIX
}
 
class Operator {
  final Fix fix;
  final TokenType token;
  final boolean assoc;
  final int lowPrec;
  final int highPrec;
 
  public Operator(Fix fix, TokenType token, boolean assoc,
                  int lowPrec, int highPrec) {
    this.fix = fix;
    this.token = token;
    this.assoc = assoc;
    this.lowPrec = lowPrec;
    this.highPrec = highPrec;
  }
}

For convenience, import the Fix enum values in Parser.java so they can be referenced directly:

package tla;
 
import java.util.List;
import java.util.ArrayList;
 
import static tla.TokenType.*;
import static tla.Fix.*;
 
class Parser {

You can find operator attributes on page 271 of Specifying Systems by Leslie Lamport, or this TLA⁺ tools source file. We use a small subset of these operators. Record their attributes in a table in Parser.java, below operatorExpression():

  private static final Operator[] operators = new Operator[] {
    new Operator(PREFIX,  NOT,        true,   4,  4 ),
    new Operator(PREFIX,  ENABLED,    false,  4,  15),
    new Operator(PREFIX,  MINUS,      true,   12, 12),
    new Operator(INFIX,   AND,        true,   3,  3 ),
    new Operator(INFIX,   OR,         true,   3,  3 ),
    new Operator(INFIX,   IN,         false,  5,  5 ),
    new Operator(INFIX,   EQUAL,      false,  5,  5 ),
    new Operator(INFIX,   LESS_THAN,  false,  5,  5 ),
    new Operator(INFIX,   DOT_DOT,    false,  9,  9 ),
    new Operator(INFIX,   PLUS,       true,   10, 10),
    new Operator(INFIX,   MINUS,      true,   11, 11),
    new Operator(POSTFIX, PRIME,      false,  15, 15),
  };

Here's something a bit odd. In TLA⁺, the infix minus operator (subtraction) has higher precedence at 11-11 than the infix plus operator (addition) at 10-10! In grade school you probably learned acronyms like PEMDAS or BEDMAS to remember the order of operations in arithmetic. Really, you learned parsing rules! Now when writing our own parsing algorithm we come to understand that the order of these operations is not inscribed in the bedrock of mathematical truth but instead is a simple convention of mathematical notation. TLA⁺ subverts this by parsing the expression a + b - c as a + (b - c) instead of the PEMDAS-style (a + b) - c. While this design decision is unusual, it is unlikely to cause any problems.

We need one more helper method before we can start working on operatorExpression(): a superpowered match() method specific to operators, which will try to match any operators of a given fix type & (low) precedence. Add this code above match():

  private Operator matchOp(Fix fix, int prec) {
    for (Operator op : operators) {
      if (op.fix == fix && op.lowPrec == prec) {
        if (match(op.token)) return op;
      }
    }
 
    return null;
  }

Okay, we're all set for the main event! Here is how we modify operatorExpression() to parse infix operators. You can see a strong resemblance to the infix operator parsing code given in the book (new lines highlighted):

  private Expr operatorExpression(int prec) {
    if (prec == 16) return primary();
 
    Operator op;
 
    Expr expr = operatorExpression(prec + 1);
    while ((op = matchOp(INFIX, prec)) != null) {
      Token operator = previous();
      Expr right = operatorExpression(op.highPrec + 1);
      expr = new Expr.Binary(expr, operator, right);
    }
 
    return expr;
  }

We use Java's combined conditional-assignment atop the while loop to make our code more terse, both checking whether any operators were matched and getting details of the matched operator if so. The interior of the while loop is largely identical to infix parsing logic from the book, except for when we recurse to a higher precedence level: in the book the code recurses one level above the matched operator's precedence, but here we go one level above the upper bound of the matched operator's precedence. It takes a bit of thinking to understand why this works for parsing expressions according to the precedence rules we want, but it does. Take a minute to ponder it. If you still don't get it, wait until we have a fully-functional expression parser then play around with this line to see how it changes the behavior.
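If you'd like a sandbox for that pondering, here is a self-contained toy version of this loop (illustrative only; it handles just + at precedence 10-10 and - at 11-11 over single digits, with precedence 12 playing the role of primary()). It shows how recursing one above the matched operator's upper bound makes 1+2-3 parse as (+ 1 (- 2 3)):

```java
// Toy precedence-climbing parser mirroring operatorExpression().
class MiniPrec {
  static String input;
  static int pos;

  static String parse(int prec) {
    // Precedence 12 plays the role of primary(): consume one digit.
    if (prec == 12) return String.valueOf(input.charAt(pos++));
    String expr = parse(prec + 1);
    while (pos < input.length()) {
      char c = input.charAt(pos);
      int lo = c == '+' ? 10 : c == '-' ? 11 : -1;
      if (lo != prec) break; // no operator at this precedence level
      pos++;
      // Recurse one above the operator's (high) precedence bound.
      String right = parse(lo + 1);
      expr = "(" + c + " " + expr + " " + right + ")";
    }
    return expr;
  }

  public static void main(String[] args) {
    input = "1+2-3";
    pos = 0;
    System.out.println(parse(10)); // (+ 1 (- 2 3))
  }
}
```

Since - has the higher precedence, the right operand of + swallows 2-3 before + can continue; 1-2-3, by contrast, groups left-associatively as (- (- 1 2) 3).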

Our code implements precedence ranges, but assumes all infix operators are associative. We need to modify the loop to return immediately if the infix operator is not associative:

    while ((op = matchOp(INFIX, prec)) != null) {
      Token operator = previous();
      Expr right = operatorExpression(op.highPrec + 1);
      expr = new Expr.Binary(expr, operator, right);
      if (!op.assoc) return expr;
    }

And that's how we parse infix operators! That was the most complicated case, so let's move on to prefix operators. Again our code resembles the prefix operator parsing logic given in the book. Add this snippet near the top of operatorExpression():

  private Expr operatorExpression(int prec) {
    if (prec == 16) return primary();
 
    Operator op;
    if ((op = matchOp(PREFIX, prec)) != null) {
      Token opToken = previous();
      Expr expr = operatorExpression(
                    op.assoc ? op.lowPrec : op.highPrec + 1);
      return new Expr.Unary(opToken, expr);
    }
 
    Expr expr = operatorExpression(prec + 1);

Of note is the expression op.assoc ? op.lowPrec : op.highPrec + 1 controlling the precedence level at which we recurse. To understand this expression, it is helpful to consider an example. The negative prefix operator - is associative, so we should be able to parse the expression --1 as -(-1). In order to do that, after consuming the first - we have to recurse into operatorExpression() at the same precedence level. Then we will consume the second -, then again recurse and ultimately consume 1 in our yet-to-be-defined primary() method. In contrast, the prefix operator ENABLED is not associative and has range 4-15. In that case we want to reject the expression ENABLED ENABLED TRUE, so we recurse at a level higher than ENABLED's upper precedence bound. Thus the second ENABLED will not be matched and we will report a parse error using methods defined in section 6.3.

All that remains is to parse postfix operators. These are not covered in the book, but they pretty much resemble infix operators without matching an expression on the right-hand side of the operator. Add the highlighted code below the INFIX logic in operatorExpression():

    }
 
    while ((op = matchOp(POSTFIX, prec)) != null) {
      Token opToken = previous();
      expr = new Expr.Unary(opToken, expr);
      if (!op.assoc) break;
    }
 
    return expr;
  }

And that completes our operator parsing logic! All that remains is our primary() method. It is fairly similar to that given in the book, with some omissions since our minimal TLA⁺ subset does not contain strings or null values. Add this code after operatorExpression():

  private Expr primary() {
    if (match(FALSE)) return new Expr.Literal(false);
    if (match(TRUE)) return new Expr.Literal(true);
 
    if (match(NUMBER)) {
      return new Expr.Literal(previous().literal);
    }
 
    if (match(LEFT_PAREN)) {
      Expr expr = expression();
      consume(RIGHT_PAREN, "Expect ')' after expression.");
      return new Expr.Grouping(expr);
    }
  }

We have two final additions to primary(): the IF/THEN/ELSE operator and the set construction operator. Here's how we parse IF/THEN/ELSE; add the highlighted code to the bottom of primary():

      return new Expr.Grouping(expr);
    }
 
    if (match(IF)) {
      Token operator = previous();
      Expr condition = expression();
      consume(THEN, "'THEN' required after 'IF' expression.");
      Expr yes = expression();
      consume(ELSE, "'ELSE' required after 'THEN' expression.");
      Expr no = expression();
      return new Expr.Ternary(operator, condition, yes, no);
    }
  }

And here's how we parse the set construction operator. Again add the highlighted code to the bottom of primary(). We also finish the method with a throw, as in the book, so that reaching the end of primary() without matching anything reports a parse error instead of falling off the end of the method:

      return new Expr.Ternary(operator, condition, yes, no);
    }
 
    if (match(LEFT_BRACE)) {
      Token operator = previous();
      List<Expr> elements = new ArrayList<Expr>();
      if (RIGHT_BRACE != peek().type) {
        do {
          elements.add(expression());
        } while (match(COMMA));
      }
      consume(RIGHT_BRACE, "'}' is required to terminate finite set literal.");
      return new Expr.Variadic(operator, elements);
    }
 
    throw error(peek(), "Expect expression.");
  }

This will handle expressions like {1,2,3} and (crucially) {}.

The consume() and error() methods are defined in the next section, along with error reporting generally.

We don't need to modify most of the error reporting code given in Section 6.3. Here it is; add the consume() method after match() in the Parser class:

  private Token consume(TokenType type, String message) {
    if (check(type)) return advance();
 
    throw error(peek(), message);
  }

Then add the error() method after previous():

  private ParseError error(Token token, String message) {
    TlaPlus.error(token, message);
    return new ParseError();
  }

In the TlaPlus class, add the error() method after report():

  static void error(Token token, String message) {
    if (token.type == TokenType.EOF) {
      report(token.line, " at end", message);
    } else {
      report(token.line, " at '" + token.lexeme + "'", message);
    }
  }

Add the ParseError class definition at the top of the Parser class:

class Parser {
  private static class ParseError extends RuntimeException {}
 
  private final List<Token> tokens;

One piece of code we do need to modify is the synchronize() method. TLA⁺ doesn't have statements separated by semicolons, so our best choice of synchronization point is the EQUAL_EQUAL symbol denoting an operator definition; add this in the Parser class after error():

  private void synchronize() {
    advance();
 
    while(!isAtEnd()) {
      if (previous().type == EQUAL_EQUAL) return;
 
      advance();
    }
  }

Fully-featured TLA⁺ parsers generally look ahead to identify the start of a new operator definition (not trivial!) and insert a synthetic zero-width token for the parser to consume. For an example, see SANY's very complicated belchDEF() method. We won't be doing that here, but it's a good technique to know about; looking ahead then inserting synthetic tokens to disambiguate syntax is a common method when parsing complicated & ambiguous languages like TLA⁺.

In section 6.4 we finally arrive at the point where you can parse TLA⁺ expressions in your REPL. In the run() method of the TlaPlus class, delete the code printing out the scanned tokens (or not, it can be informative to keep around) and replace it with this:

    List<Token> tokens = scanner.scanTokens();
 
    Parser parser = new Parser(tokens);
    Expr expression = parser.parse();
 
    // Stop if there was a syntax error.
    if (hadError) return;
 
    System.out.println(new AstPrinter().print(expression));
  }
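The parse() method called here hasn't appeared in our snippets yet; it is unchanged from section 6.4 of the book. Add it to the Parser class after the constructor:

  Expr parse() {
    try {
      return expression();
    } catch (ParseError error) {
      return null;
    }
  }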

Fire up the interpreter and parse a TLA⁺ expression!

> {1 + 2, IF TRUE THEN 3 ELSE 4, {}}
({ (+ 1 2) (IF true 3 4) ({))

Chapter 7: Evaluating Expressions

Now that we can parse expressions, let's evaluate them! Here we follow the material in Chapter 7 of Crafting Interpreters. It's an exciting chapter, so let's jump in.

In section 7.1 we define a mapping from language values to internal Java values. Our mapping for TLA⁺ is fairly similar to Lox, although we use Integer instead of Double for number values and also have a set type. TLA⁺ thankfully lacks nil types.

TLA⁺ type        Java representation
---------------  ---------------------
Any TLA⁺ value   Object
Boolean          Boolean
number           Integer
set              java.util.Set<Object>

Java does a huge amount of work for us with the capabilities of the Object type, as we'll see.

Section 7.2 is where we start laying down interpreter code. Create a new file Interpreter.java with the highlighted lines changed or added compared with the book:

package tla;
 
import java.util.Set;
import java.util.HashSet;
 
class Interpreter implements Expr.Visitor<Object> {
}

If your IDE supports it, get it to automatically add all necessary stub methods to implement the Expr.Visitor<Object> interface; we'll be filling these in, defining the interpretation of every expression type!

Let's start with defining the value of literals. Our code is completely unchanged from the book; add this visitLiteralExpr() method in the Interpreter class:

  @Override
  public Object visitLiteralExpr(Expr.Literal expr) {
    return expr.value;
  }

Our code for evaluating grouping (parentheses) is similarly unchanged from the book:

  @Override
  public Object visitGroupingExpr(Expr.Grouping expr) {
    return evaluate(expr.expression);
  }

This uses the evaluate() helper method; add it to the Interpreter class:

  private Object evaluate(Expr expr) {
    return expr.accept(this);
  }

Our first real difference is in the visitUnaryExpr() method. Recall that since our Expr.Unary type represents both prefix and suffix operators, we don't call its parameter right as in the book. Here's how we define the negative prefix operator, casting the parameter to an int instead of a double (highlighted lines differ from book):

  @Override
  public Object visitUnaryExpr(Expr.Unary expr) {
    Object operand = evaluate(expr.expr);
 
    switch (expr.operator.type) {
      case MINUS:
        return -(int)operand;
    }
 
    // Unreachable.
    return null;
  }

Note that the negative prefix operator is not actually a built-in operator of TLA⁺. It is defined in the Integers standard module, and TLA⁺ users need to import that module to access it. However, because our minimal TLA⁺ language subset lacks module importing, for convenience we just define some common arithmetic operators as built-in. You can find more information about the built-in TLA⁺ operators in chapter 16 of Specifying Systems by Leslie Lamport, or this SANY source file.

Our logical-not prefix operator is denoted by the NOT token type instead of BANG as in Lox. We also have no use for the isTruthy() helper method, since TLA⁺ is quite strict: only Boolean values can be given to Boolean operators! Add this code to the visitUnaryExpr() method:

    switch (expr.operator.type) {
      case NOT:
        return !(boolean)operand;
      case MINUS:

We still have two more unary operators to define: ENABLED, and the prime operator. Both are trivial in this domain but will become much more complicated later on. For constant expressions, ENABLED is true if and only if the expression itself is true. Add the highlighted code to the visitUnaryExpr() method:

    switch (expr.operator.type) {
      case ENABLED:
        return (boolean)operand;
      case NOT:

Priming a constant expression has no effect on the value of that expression, so just return the operand's value:

    switch (expr.operator.type) {
      case PRIME:
        return operand;
      case ENABLED:

Unlike Lox, addition in TLA⁺ is only defined between two numbers (at least in its standard Integers module definition). Here's how we define our basic binary arithmetic & comparison operators; add a visitBinaryExpr() method in the Interpreter class:

  @Override
  public Object visitBinaryExpr(Expr.Binary expr) {
    Object left = evaluate(expr.left);
    Object right = evaluate(expr.right); 
 
    switch (expr.operator.type) {
      case MINUS:
        return (int)left - (int)right;
      case PLUS:
        return (int)left + (int)right;
      case LESS_THAN:
        return (int)left < (int)right;
      case EQUAL:
        return left.equals(right);
    }
 
    // Unreachable.
    return null;
  }

Note that since we don't have to worry about null values, we don't need the isEqual() helper method from the book and can use Java's Object.equals() method directly. However, this actually gives us different equality semantics from TLC. In TLC, evaluating things like 123 = TRUE results in a runtime error instead of evaluating to false as will be the case here. In the interest of simplicity we are fine with keeping these looser semantics. See the challenges at the end of this tutorial page if you're interested in more closely matching the semantics of TLC.
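You can see these looser semantics in plain Java; this standalone snippet (illustrative only, not part of the interpreter) mirrors what the EQUAL case above does with the values of 123 = TRUE:

```java
// Cross-type equality via Object.equals(): false, never a runtime error.
class EqualityDemo {
  public static void main(String[] args) {
    Object left = 123;   // value of the expression 123
    Object right = true; // value of the expression TRUE
    System.out.println(left.equals(right)); // false; TLC would error here
  }
}
```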

Let's take our first foray into sets by defining the set membership operator in visitBinaryExpr():

    switch (expr.operator.type) {
      case IN:
        return ((Set<?>)right).contains(left);
      case MINUS:

The question mark is some unusual Java syntax you probably haven't seen before, called a wildcard. It has to do with covariance, which you can read about at the end of the chapter in the book. Think of it as casting the right operand to some kind of set, so we can access the Set.contains() method. It is tempting to cast it to Set<Object> (as we only ever create Set<Object> instances) but this produces a compiler warning.
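As a standalone illustration (hypothetical class name) of why the wildcard suffices here: Set.contains() accepts an Object, so the unknown element type of Set<?> never gets in the way:

```java
import java.util.HashSet;
import java.util.Set;

// Casting to Set<?> compiles without unchecked-cast warnings, and
// contains(Object) is still callable on the wildcard type.
class WildcardDemo {
  public static void main(String[] args) {
    Object right = new HashSet<Object>(Set.of(1, 2, 3));
    Object left = 2;
    System.out.println(((Set<?>) right).contains(left)); // true
  }
}
```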

Similar to equality, we get looser semantics here than with TLC. TLC will emit a runtime error when evaluating the expression TRUE \in {1, 2, 3}, while our interpreter will simply evaluate it to false. Again this is acceptable.

Now for something more complicated. Here's the set construction helper operator .., also defined in visitBinaryExpr(); this constructs a set containing all numbers between the left and right parameters, inclusive:

    switch (expr.operator.type) {
      case DOT_DOT:
        Set<Object> set = new HashSet<Object>();
        int lower = (int)left;
        int higher = (int)right;
        if (lower <= higher) {
          for (int i = lower; i <= higher; i++) {
            set.add(i);
          }
        }
        return set;
      case IN:

So expressions like 1 .. 3 produce the set {1, 2, 3} while 3 .. 2 produces the empty set {}. We use Set<Object>, so our sets can hold any value you can imagine including nested sets. It's incredible how much we get for free here; Java sets even implement value-based de-duplication and equality comparison for us for arbitrarily-nested sets!
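Here is a standalone demonstration (illustrative only) of that free value-based equality for nested sets:

```java
import java.util.HashSet;
import java.util.Set;

// HashSet compares by contents, recursively, so nested sets behave
// like TLA⁺ set values: {{1, 2}} equals {{2, 1}}.
class SetDemo {
  public static void main(String[] args) {
    Set<Object> a = new HashSet<>();
    a.add(new HashSet<Object>(Set.of(1, 2)));

    Set<Object> b = new HashSet<>();
    b.add(new HashSet<Object>(Set.of(2, 1))); // same elements, other order

    System.out.println(a.equals(b)); // true
    System.out.println(a.contains(new HashSet<Object>(Set.of(1, 2)))); // true
  }
}
```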

The only binary operators remaining are logical AND & OR, and they differ in a subtle but important way. We actually want AND to implement short-circuit evaluation, so an expression like FALSE /\ TRUE evaluates to false without even looking at the right hand operand. Add the highlighted lines near the top of the visitBinaryExpr() method:

  public Object visitBinaryExpr(Expr.Binary expr) {
    Object left = evaluate(expr.left);
    switch (expr.operator.type) {
      case AND:
        if (!(boolean)left) return false;
        Object right = evaluate(expr.right);
        return (boolean)right;
      default:
        break;
    }

In an odd contrast, our OR binary operator is not short-circuited. We add it in with the rest of the operators in visitBinaryExpr():

    Object right = evaluate(expr.right);
    switch (expr.operator.type) {
      case OR:
        return (boolean)left || (boolean)right;
      case DOT_DOT:

Why are these different? It has to do with how they are used in TLA⁺ formulas. Actions are usually defined as a conjunction of expressions serving as guards, where later expressions could cause a runtime error if evaluated in some cases. For example, consider a variable x that can either be a sentinel value None or a set. People often write expressions like x /= None /\ 1 \in x. It would be a runtime error to evaluate 1 \in x when x = None, so it is useful for conjunction to act as a guard that stops expression evaluation if the first operand is not true. In contrast, disjunction is used in TLA⁺ to express nondeterminism, so both operands of the disjunction need to be evaluated to see whether either yields a valid alternative next state. You can read this thread on the TLA⁺ mailing list for more information.
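Java's && gives exactly the guard behavior our conjunction needs; this standalone sketch (illustrative only) mirrors the x /= None /\ 1 \in x pattern:

```java
import java.util.Set;

// && short-circuits, so the right-hand side never runs when the guard
// is false, avoiding what would otherwise be a runtime error.
class GuardDemo {
  public static void main(String[] args) {
    Object x = null; // plays the role of x = None
    boolean guarded = (x != null) && ((Set<?>) x).contains(1);
    System.out.println(guarded); // false, and no NullPointerException
  }
}
```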

Our IF/THEN/ELSE and set constructor operators are all that remain. Add this visitTernaryExpr() method to the Interpreter class:

  @Override
  public Object visitTernaryExpr(Expr.Ternary expr) {
    switch (expr.operator.type) {
      case IF:
        Object conditional = evaluate(expr.first);
        return (boolean)conditional ?
            evaluate(expr.second) : evaluate(expr.third);
      default:
        // Unreachable.
        return null;
    }
  }

IF/THEN/ELSE has a straightforward translation to Java's ?/: mixfix operator. Note our definition is also short-circuiting.

Finally, here's how we define the set construction operator; add this visitVariadicExpr() method in the Interpreter class:

  @Override
  public Object visitVariadicExpr(Expr.Variadic expr) {
    switch (expr.operator.type) {
      case LEFT_BRACE:
        Set<Object> set = new HashSet<Object>();
        for (Expr parameter : expr.parameters) {
          set.add(evaluate(parameter));
        }
        return set;
      default:
        // Unreachable.
        return null;
    }
  }

Challenges

Here are some optional challenges to flesh out your TLA⁺ interpreter, roughly ranked from simplest to most difficult. You should save a copy of your code before attempting these.

  1. TLC evaluates cross-type equality comparison as a runtime error. For example, 123 = TRUE evaluates to false in our interpreter but will throw an exception in TLC. Modify your interpreter to match the behavior of TLC.
  2. TLC requires set elements to all be of the same type. Trying to construct the set {1, 2, TRUE} results in a TLC runtime error. Similarly, TLC only allows nested sets if all elements are similarly nested and of the same type: {1, {2}} is not allowed. Modify your interpreter to match the behavior of TLC. Use the TLC REPL to check which set constructions are allowed or disallowed. Checking whether an element of the wrong type is part of a set using the \in operator also should result in a runtime error.

Last modified: 2025/04/14 17:08 by ahelwer