creating:expressions

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
creating:expressions [2025/04/14 17:08] – Wrote evaluation code ahelwercreating:expressions [2025/05/13 15:58] (current) – Fixed lookahead in set literal parsing ahelwer
Line 1: Line 1:
-======= Handling Constant TLA⁺ Expressions =======+======= Parsing Constant TLA⁺ Expressions =======
  
-This tutorial page covers the next three chapters in //Crafting Interpreters//:+This tutorial page covers the next two chapters in //Crafting Interpreters//:
   - [[https://craftinginterpreters.com/representing-code.html|Chapter 5: Representing Code]]   - [[https://craftinginterpreters.com/representing-code.html|Chapter 5: Representing Code]]
   - [[https://craftinginterpreters.com/parsing-expressions.html|Chapter 6: Parsing Expressions]]   - [[https://craftinginterpreters.com/parsing-expressions.html|Chapter 6: Parsing Expressions]]
-  - [[https://craftinginterpreters.com/evaluating-expressions.html|Chapter 7: Evaluating Expressions]] 
  
 Same as the book, we //could// build a parser for our entire minimal TLA⁺ language subset before moving on to interpreting it, but that would be boring! Same as the book, we //could// build a parser for our entire minimal TLA⁺ language subset before moving on to interpreting it, but that would be boring!
Line 38: Line 37:
 ternary        → "IF" expression "THEN" expression "ELSE" expression; ternary        → "IF" expression "THEN" expression "ELSE" expression;
 variadic       → "{" ( expression ( "," expression )* )? "}" variadic       → "{" ( expression ( "," expression )* )? "}"
-operator       → "=" | "+" | "-" | ".." | "/\" | "\/" | "<"  | "\in" ;+operator       → "=" | "+" | "-" | ".." | "<"  | "\in" ;
 </code> </code>
 There are a few interesting differences. There are a few interesting differences.
Line 51: Line 50:
 The only difference between ''{1, 2, 3}'' and an operator like ''constructSet(1, 2, 3)'' is syntactic sugar. The only difference between ''{1, 2, 3}'' and an operator like ''constructSet(1, 2, 3)'' is syntactic sugar.
 This is the perspective of a language implementer. This is the perspective of a language implementer.
-Later on we will extend the definition of ''variadic'' to include vertically-aligned conjunction & disjunction lists.+ 
 +Note that we //could// also include the conjunction ''/\'' and disjunction ''\/'' infix operators here but they have odd parsing interactions with vertically-aligned conjunction & disjunction lists, so we will handle them in a later chapter dedicated to that topic.
  
 ===== Section 5.2: Implementing Syntax Trees ===== ===== Section 5.2: Implementing Syntax Trees =====
Line 398: Line 398:
  
 Now we have to define a table of operators with their details. Now we have to define a table of operators with their details.
-For this, create a new file ''Operator.java'' containing a class recording operator fix type, token type, associativity, and precedence range:+First use the handy Java 17 [[https://openjdk.org/jeps/395|records]] feature to quickly define a new ''Operator'' dataclass; this will hold the attributes of the operators we want to parse. 
 +Put it at the top of the ''Parser'' class:
  
-<code java> +<code java [highlight_lines_extra="2,3,4"]
-package tla;+class Parser { 
 +  private static enum Fix { PREFIX, INFIX, POSTFIX } 
 +  private static record Operator(Fix fix, TokenType token, 
 +      boolean assoc, int lowPrec, int highPrec) {}
  
-enum Fix { +  private final List<Token> tokens;
-  PREFIX, INFIX, POSTFIX +
-+
- +
-class Operator { +
-  final Fix fix; +
-  final TokenType token; +
-  final boolean assoc; +
-  final int lowPrec; +
-  final int highPrec; +
- +
-  public Operator(Fix fix, TokenType token, boolean assoc, +
-                  int lowPrec, int highPrec) { +
-    this.fix = fix; +
-    this.token = token; +
-    this.assoc = assoc; +
-    this.lowPrec = lowPrec; +
-    this.highPrec = highPrec; +
-  } +
-}+
 </code> </code>
- 
-For convenience, import the ''Fix'' enum values in ''Parser.java'' so they can be referenced directly: 
- 
-<code java [highlight_lines_extra="7"]> 
-package tla; 
- 
-import java.util.List; 
-import java.util.ArrayList; 
- 
-import static tla.TokenType.*; 
-import static tla.Fix.*; 
- 
-class Parser { 
-</code> 
- 
  
 You can find operator attributes on page 271 of //[[https://lamport.azurewebsites.net/tla/book.html|Specifying Systems]]// by Leslie Lamport, or [[https://github.com/tlaplus/tlaplus/blob/13e5a39b5368a6da4906b8ed1c2c1114d2e7de15/tlatools/org.lamport.tlatools/src/tla2sany/parser/Operators.java#L130-L234|this TLA⁺ tools source file]]. You can find operator attributes on page 271 of //[[https://lamport.azurewebsites.net/tla/book.html|Specifying Systems]]// by Leslie Lamport, or [[https://github.com/tlaplus/tlaplus/blob/13e5a39b5368a6da4906b8ed1c2c1114d2e7de15/tlatools/org.lamport.tlatools/src/tla2sany/parser/Operators.java#L130-L234|this TLA⁺ tools source file]].
Line 446: Line 416:
 <code java> <code java>
   private static final Operator[] operators = new Operator[] {   private static final Operator[] operators = new Operator[] {
-    new Operator(PREFIX,  NOT,        true,   4,  4 ), +    new Operator(Fix.PREFIX,  NOT,        true,   4,  4 ), 
-    new Operator(PREFIX,  ENABLED,    false,  4,  15), +    new Operator(Fix.PREFIX,  ENABLED,    false,  4,  15), 
-    new Operator(PREFIX,  MINUS,      true,   12, 12), +    new Operator(Fix.PREFIX,  MINUS,      true,   12, 12), 
-    new Operator(INFIX,   AND,        true,   3,  3 ), +    new Operator(Fix.INFIX,   IN,         false,  5,  5 ), 
-    new Operator(INFIX,   OR,         true,   3,  3 ), +    new Operator(Fix.INFIX,   EQUAL,      false,  5,  5 ), 
-    new Operator(INFIX,   IN,         false,  5,  5 ), +    new Operator(Fix.INFIX,   LESS_THAN,  false,  5,  5 ), 
-    new Operator(INFIX,   EQUAL,      false,  5,  5 ), +    new Operator(Fix.INFIX,   DOT_DOT,    false,  9,  9 ), 
-    new Operator(INFIX,   LESS_THAN,  false,  5,  5 ), +    new Operator(Fix.INFIX,   PLUS,       true,   10, 10), 
-    new Operator(INFIX,   DOT_DOT,    false,  9,  9 ), +    new Operator(Fix.INFIX,   MINUS,      true,   11, 11), 
-    new Operator(INFIX,   PLUS,       true,   10, 10), +    new Operator(Fix.POSTFIX, PRIME,      false,  15, 15),
-    new Operator(INFIX,   MINUS,      true,   11, 11), +
-    new Operator(POSTFIX, PRIME,      false,  15, 15),+
   };   };
 </code> </code>
Line 493: Line 461:
  
     Expr expr = operatorExpression(prec + 1);     Expr expr = operatorExpression(prec + 1);
-    while ((op = matchOp(INFIX, prec)) != null) {+    while ((op = matchOp(Fix.INFIX, prec)) != null) {
       Token operator = previous();       Token operator = previous();
       Expr right = operatorExpression(op.highPrec + 1);       Expr right = operatorExpression(op.highPrec + 1);
Line 512: Line 480:
 We need to modify the loop to return immediately if the infix operator is not associative: We need to modify the loop to return immediately if the infix operator is not associative:
 <code java [highlight_lines_extra="5"]> <code java [highlight_lines_extra="5"]>
-    while ((op = matchOp(INFIX, prec)) != null) {+    while ((op = matchOp(Fix.INFIX, prec)) != null) {
       Token operator = previous();       Token operator = previous();
       Expr right = operatorExpression(op.highPrec + 1);       Expr right = operatorExpression(op.highPrec + 1);
Line 530: Line 498:
  
     Operator op;     Operator op;
-    if ((op = matchOp(PREFIX, prec)) != null) {+    if ((op = matchOp(Fix.PREFIX, prec)) != null) {
       Token opToken = previous();       Token opToken = previous();
       Expr expr = operatorExpression(       Expr expr = operatorExpression(
Line 556: Line 524:
     }     }
  
-    while ((op = matchOp(POSTFIX, prec)) != null) {+    while ((op = matchOp(Fix.POSTFIX, prec)) != null) {
       Token opToken = previous();       Token opToken = previous();
       expr = new Expr.Unary(opToken, expr);       expr = new Expr.Unary(opToken, expr);
Line 616: Line 584:
     if (match(LEFT_BRACE)) {     if (match(LEFT_BRACE)) {
       Token operator = previous();       Token operator = previous();
-      List<Expr> elements = new ArrayList<Expr>(); +      List<Expr> elements = new ArrayList<>(); 
-      if (RIGHT_BRACE != peek().type) {+      if (!check(RIGHT_BRACE)) {
         do {         do {
           elements.add(expression());           elements.add(expression());
Line 668: Line 636:
 Add the ''ParseError'' class definition at the top of the ''Parser'' class: Add the ''ParseError'' class definition at the top of the ''Parser'' class:
  
-<code java [highlight_lines_extra="2"]>+<code java [highlight_lines_extra="5"]>
 class Parser { class Parser {
 +  private static enum Fix { PREFIX, INFIX, POSTFIX }
 +  private static record Operator(Fix fix, TokenType token,
 +      boolean assoc, int lowPrec, int highPrec) {}
   private static class ParseError extends RuntimeException {}   private static class ParseError extends RuntimeException {}
  
Line 717: Line 688:
 ({ (+ 1 2) (IF true 3 4) ({)) ({ (+ 1 2) (IF true 3 4) ({))
 </code> </code>
- +If you got out of syncyou can find snapshot of the expected state of the code in [[https://github.com/tlaplus/devkit/tree/main/3-expressions|this repo directory]]. 
-====== Chapter 7: Evaluating Expressions ====== +Next tutorial: [[creating:evaluation|Evaluating Constant TLA⁺ Expressions]]!
- +
-Now that we can parse expressions, let's evaluate them! +
-Here we follow the material in [[https://craftinginterpreters.com/evaluating-expressions.html|Chapter 7]] of //Crafting Interpreters//+
-It's an exciting chapterso let's jump in. +
- +
-===== Section 7.1: Representing Values ===== +
- +
-In [[https://craftinginterpreters.com/evaluating-expressions.html#representing-values|section 7.1]] we define mapping from language values to internal Java values. +
-Our mapping for TLA⁺ is fairly similar to Lox, although we use ''Integer'' instead of ''Double'' for number values and also have a set type. +
-TLA⁺ thankfully lacks ''nil'' types. +
- +
-^ TLA⁺ type      ^ Java representation       ^ +
-| Any TLA⁺ value | ''Object''                | +
-| Boolean        | ''Boolean''               | +
-| number         | ''Integer''               | +
-| set            | ''java.util.Set<Object>''+
- +
-Java does a huge amount of work for us with the capabilities of the ''Object'' type, as we'll see. +
- +
-===== Section 7.2: Representing Values ===== +
- +
-[[https://craftinginterpreters.com/evaluating-expressions.html#evaluating-expressions|Section 7.2]] is where we start laying down interpreter code+
-Create a new file ''Interpreter.java'' with the highlighted lines changed or added compared with the book: +
- +
-<code java [highlight_lines_extra="1,3,4"]> +
-package tla; +
- +
-import java.util.Set; +
-import java.util.HashSet; +
- +
-class Interpreter implements Expr.Visitor<Object>+
-+
-</code> +
- +
-If your IDE supports it, get it to automatically add all necessary stub methods to implement the ''Expr.Visitor<Object>'' interface; we'll be filling these in, defining the interpretation of every expression type! +
- +
-Let's start with defining the value of literals. +
-Our code is completely unchanged from the book; add this ''visitLiteralExpr()'' method in the ''Interpreter'' class: +
- +
-<code java> +
-  @Override +
-  public Object visitLiteralExpr(Expr.Literal expr) { +
-    return expr.value; +
-  } +
-</code> +
- +
-Our code for evaluating grouping (parentheses) is similarly unchanged from the book: +
- +
-<code java> +
-  @Override +
-  public Object visitGroupingExpr(Expr.Grouping expr) { +
-    return evaluate(expr.expression); +
-  } +
-</code> +
- +
-This uses the ''evaluate()'' helper method; add it to the ''Interpreter'' class: +
- +
-<code java> +
-  private Object evaluate(Expr expr) { +
-    return expr.accept(this); +
-  } +
-</code> +
- +
-==== Evaluating Unary Operators ==== +
- +
-Our first real difference is in the ''visitUnaryExpr()'' method. +
-Recall that since our ''Expr.Unary'' type represents both prefix and suffix operators, we don't call its parameter ''right'' as in the book. +
-Here's how we define the negative prefix operator, casting the parameter to an ''int'' instead of a ''double'' (highlighted lines differ from book): +
- +
-<code java [highlight_lines_extra="3,7"]> +
-  @Override +
-  public Object visitUnaryExpr(Expr.Unary expr) { +
-    Object operand = evaluate(expr.expr); +
- +
-    switch (expr.operator.type) { +
-      case MINUS: +
-        return -(int)operand; +
-    } +
- +
-    // Unreachable. +
-    return null; +
-  } +
-</code> +
- +
-Note that the negative prefix operator is not actually a built-in operator of TLA⁺. +
-It is defined in the ''Integers'' standard module, and TLA⁺ users need to import that module to access it. +
-However, because our minimal TLA⁺ language subset lacks module importing, for convenience we just define some common arithmetic operators as built-in. +
-You can find more information about the built-in TLA⁺ operators in chapter 16 of //[[https://lamport.azurewebsites.net/tla/book.html|Specifying Systems]]// by Leslie Lamport, or [[https://github.com/tlaplus/tlaplus/blob/13e5a39b5368a6da4906b8ed1c2c1114d2e7de15/tlatools/org.lamport.tlatools/src/tla2sany/semantic/BuiltInOperators.java#L80-L153|this SANY source file]]. +
- +
-Our logical-not prefix operator is denoted by the ''NOT'' token type instead of ''BANG'' as in Lox. +
-We also have no use for the ''isTruthy()'' helper method, since TLA⁺ is quite strict: only Boolean values can be given to Boolean operators! +
-Add this code to the ''visitUnaryExpr()'' method: +
- +
-<code java [highlight_lines_extra="2,3"]> +
-    switch (expr.operator.type) { +
-      case NOT: +
-        return !(boolean)operand; +
-      case MINUS: +
-</code> +
- +
-We still have two more unary operators to define: ''ENABLED'', and the prime operator. +
-Both are trivial in this domain but will become much more complicated later on. +
-For constant expressions, ''ENABLED'' is true if and only if the expression itself is true. +
-Add the highlighted code to the ''visitUnaryExpr()'' method: +
- +
-<code java [highlight_lines_extra="2,3"]> +
-    switch (expr.operator.type) { +
-      case ENABLED: +
-        return (boolean)operand; +
-      case NOT: +
-</code> +
- +
-Priming a constant expression has no effect on the value of that expression, so just return the operand's value: +
- +
-<code java [highlight_lines_extra="2,3"]> +
-    switch (expr.operator.type) { +
-      case PRIME: +
-        return operand; +
-      case ENABLED: +
-</code> +
- +
-==== Evaluating Binary Operators ==== +
- +
-Unlike Lox, addition in TLA⁺ is only defined between two numbers (at least in its standard ''Integers'' module definition). +
-Here's how we define our basic binary arithmetic & comparison operators; add a ''visitBinaryExpr()'' method in the ''Interpreter'' class: +
- +
-<code java [highlight_lines_extra="8,9,10,11,12,13,14"]> +
-  @Override +
-  public Object visitBinaryExpr(Expr.Binary expr) { +
-    Object left = evaluate(expr.left); +
-    Object right = evaluate(expr.right);  +
- +
-    switch (expr.operator.type) { +
-      case MINUS: +
-        return (int)left - (int)right; +
-      case PLUS: +
-        return (int)left + (int)right; +
-      case LESS_THAN: +
-        return (int)left < (int)right; +
-      case EQUAL: +
-        return left.equals(right); +
-    } +
- +
-    // Unreachable. +
-    return null; +
-  } +
-</code> +
- +
-Note that since we don't have to worry about ''null'' values, we don't need the ''isEqual()'' helper method from the book and can use Java's ''Object.equals()'' method directly. +
-However, this actually gives us different equality semantics from TLC. +
-In TLC, evaluating things like ''123 = TRUE'' results in a runtime error instead of evaluating to ''false'' as will be the case here. +
-In the interest of simplicity we are fine with keeping these looser semantics. +
-See the challenges at the end of this tutorial page if you're interested in more closely matching the semantics of TLC. +
- +
-Let's take our first foray into sets by defining the set membership operator in ''visitBinaryExpr()'': +
- +
-<code java [highlight_lines_extra="2,3"]> +
-    switch (expr.operator.type) { +
-      case IN: +
-        return ((Set<?>)right).contains(left); +
-      case MINUS: +
-</code> +
- +
-The question mark is some unusual Java syntax you probably haven't seen before, called a [[https://docs.oracle.com/javase/tutorial/java/generics/wildcards.html|wildcard]]. +
-It has to do with //covariance//, which you can read about at the end of the chapter in the book. +
-Think of it as casting the right operand to //some kind// of set, so we can access the ''Set.contains()'' method. +
-It is tempting to cast it to ''Set<Object>'' (as we only ever create ''Set<Object>'' instances) but this produces a compiler warning. +
- +
-Similar to equality, we get looser semantics here than with TLC. +
-TLC will emit a runtime error when evaluating the expression ''TRUE \in {1, 2, 3}'', while our interpreter will simple evaluate it to ''false''+
-Again this is acceptable. +
- +
-Now for something more complicated. +
-Here's the set construction helper operator ''..'', also defined in ''visitBinaryExpr()''; this constructs a set containing all numbers between the left and right parameters, inclusive: +
- +
-<code java [highlight_lines_extra="2,3,4,5,6,7,8,9,10,11"]> +
-    switch (expr.operator.type) { +
-      case DOT_DOT: +
-        Set<Object> set = new HashSet<Object>(); +
-        int lower = (int)left; +
-        int higher = (int)right; +
-        if (lower <= higher) { +
-          for (int i = lower; i <= higher; i++) { +
-            set.add(i); +
-          } +
-        } +
-        return set; +
-      case IN: +
-</code> +
- +
-So expressions like ''1 .. 3'' produce the set ''{1, 2, 3}'' while ''3 .. 2'' produces the empty set ''{}''+
-We use ''Set<Object>'', so our sets can hold any value you can imagine including nested sets. +
-It's incredible how much we get for free here; Java sets even implement value-based de-duplication and equality comparison for us for arbitrarily-nested sets! +
- +
-The only binary operators remaining are logical ''AND'' & ''OR'', and they differ in a subtle but important way. +
-We actually want ''AND'' to implement //short-circuit evaluation//, so an expression like ''FALSE /\ TRUE'' evaluates to ''false'' without even looking at the right hand operand. +
-Add the highlighted lines near the top of the ''visitBinaryExpr()'' method: +
- +
-<code java [highlight_lines_extra="3,4,5,6,7,8,9,10"]> +
-  public Object visitBinaryExpr(Expr.Binary expr) { +
-    Object left = evaluate(expr.left); +
-    switch (expr.operator.type) { +
-      case AND: +
-        if (!(boolean)left) return false; +
-        Object right = evaluate(expr.right); +
-        return (boolean)right; +
-      default: +
-        break; +
-    } +
- +
-</code> +
- +
-In an odd contrast, our ''OR'' binary operator is //not// short-circuited. +
-We add it in with the rest of the operators in ''visitBinaryExpr()'': +
- +
-<code java [highlight_lines_extra="3,4"]> +
-    Object right = evaluate(expr.right); +
-    switch (expr.operator.type) { +
-      case OR: +
-        return (boolean)left || (boolean)right; +
-      case DOT_DOT: +
-</code> +
- +
-Why are these different? +
-It has to do with how they are used in TLA⁺ formulas. +
-Actions are usually defined as a conjunction of expressions serving as guards, where later expressions could result in a runtime error if evaluated in some cases. +
-For example, consider a variable ''x'' that can either be a sentinel value ''None'' or a set. +
-People often write expressions like ''x /= None /\ 1 \in x''+
-It would be a runtime error to evaluate ''1 \in x'' if ''x = None''+
-It is thus useful to use conjunction as a guard stopping expression interpretation if the first operand is not true. +
-In contrast, disjunction is used in TLA⁺ to express nondeterminism and so both branches of the disjunct need to be evaluated to see whether their expressions satisfy alternative next states. +
-You can read [[https://groups.google.com/g/tlaplus/c/U6tOJ4dsjVM/m/1zXWHrZbBwAJ|this]] thread on the TLA⁺ mailing list for more information. +
- +
-==== Evaluating Other Operators ==== +
- +
-Our ''IF''/''THEN''/''ELSE'' and set constructor operators are all that remain. +
-Add this ''visitTernaryExpr()'' method to the ''Interpreter'' class: +
- +
-<code java> +
-  @Override +
-  public Object visitTernaryExpr(Ternary expr) { +
-    switch (expr.operator.type) { +
-      case IF: +
-        Object conditional = evaluate(expr.first); +
-        return (boolean)conditional ? +
-            evaluate(expr.second) : evaluate(expr.third); +
-      default: +
-        // Unreachable. +
-        return null; +
-    } +
-  } +
-</code> +
- +
-''IF''/''THEN''/''ELSE'' has a straightforward translation to Java's ''?''/'':'' mixfix operator. +
-Note our definition is also short-circuiting. +
- +
-Finally, here's how we define the set construction operator; add this ''visitVariadicExpr()'' method in the ''Interpreter'' class: +
- +
-<code java> +
-  @Override +
-  public Object visitVariadicExpr(Expr.Variadic expr) { +
-    switch (expr.operator.type) { +
-      case LEFT_BRACE: +
-        Set<Object> set = new HashSet<Object>(); +
-        for (Expr parameter : expr.parameters) { +
-          set.add(evaluate(parameter)); +
-        } +
-        return set; +
-      default: +
-        // Unreachable. +
-        return null; +
-    } +
-  } +
-</code> +
- +
-===== Section 7.3: Runtime Errors =====+
  
 ====== Challenges ====== ====== Challenges ======
  
 Here are some optional challenges to flesh out your TLA⁺ interpreter, roughly ranked from simplest to most difficult. Here are some optional challenges to flesh out your TLA⁺ interpreter, roughly ranked from simplest to most difficult.
-You should save a copy of your code before attempting these. 
-  - TLC evaluates cross-type equality comparison as a runtime error. For example, ''123 = TRUE'' evaluates to ''false'' in our interpreter but will throw an exception in TLC. Modify your interpreter to match the behavior of TLC. 
-  - TLC requires set elements to all be of the same type. Trying to construct the set ''{1, 2, TRUE}'' results in a TLC runtime error. Similarly, TLC only allows nested sets if //all// elements are similarly nested and of the same type: ''{1, {2}}'' is not allowed. Modify your interpreter to match the behavior of TLC. Use the TLC REPL to check which set constructions are allowed or disallowed. Checking whether an element of the wrong type is part of a set using the ''\in'' operator also should result in a runtime error. 
  
-[[https://docs.tlapl.us/creating:scanning|< Previous Page]] | [[https://docs.tlapl.us/creating:statements|Next Page >]]+[[creating:scanning|< Previous Page]] | [[creating:start#table_of_contents|Table of Contents]] | [[creating:evaluation|Next Page >]]
  
  • creating/expressions.1744650517.txt.gz
  • Last modified: 2025/04/14 17:08
  • by ahelwer