creating:operators

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
creating:operators [2025/05/20 18:31] – Parameterized operator evaluation ahelwercreating:operators [2025/06/13 21:25] (current) – Removed PrintStream parameter from Interpreter constructor ahelwer
Line 1: Line 1:
-======= Operators Parameters =======+======= Functions, Operators, and Parameters =======
  
 We now rejoin //Crafting Interpreters// for [[https://craftinginterpreters.com/functions.html|Chapter 10]]. We now rejoin //Crafting Interpreters// for [[https://craftinginterpreters.com/functions.html|Chapter 10]].
Line 101: Line 101:
  
 In [[https://craftinginterpreters.com/functions.html#interpreting-function-calls|this section]] we learn how to interpret our shiny new call syntax. In [[https://craftinginterpreters.com/functions.html#interpreting-function-calls|this section]] we learn how to interpret our shiny new call syntax.
 +First, add some imports to the top of ''Interpreter.java'':
 +<code java [highlight_lines_extra="3,5,6"]>
 +import java.util.Set;
 +import java.util.HashSet;
 +import java.util.ArrayList;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.HashMap;
 +</code>
  
-First let's implement the ''Interpreter'' class visitor method for ''Expr.FnApply'':+Implement the ''Interpreter'' class visitor method for ''Expr.FnApply'':
 <code java> <code java>
   @Override   @Override
Line 232: Line 241:
 We've already implemented all our native TLA⁺ functions in the //[[creating:evaluation|Evaluating Constant TLA⁺ Expressions]]// chapter, but we will follow along to add a crucial field to the ''Interpreter'' class: a global environment. We've already implemented all our native TLA⁺ functions in the //[[creating:evaluation|Evaluating Constant TLA⁺ Expressions]]// chapter, but we will follow along to add a crucial field to the ''Interpreter'' class: a global environment.
 Due to the ''replMode'' parameter we have to put the initialization logic in the constructor (new code highlighted): Due to the ''replMode'' parameter we have to put the initialization logic in the constructor (new code highlighted):
-<code java [highlight_lines_extra="3,8,9"]>+<code java [highlight_lines_extra="3,7,8"]>
 class Interpreter implements Expr.Visitor<Object>, class Interpreter implements Expr.Visitor<Object>,
                              Stmt.Visitor<Void> {                              Stmt.Visitor<Void> {
   final Environment globals;   final Environment globals;
   private Environment environment;   private Environment environment;
-  private final PrintStream out; 
  
-  public Interpreter(PrintStream out, boolean replMode) {+  public Interpreter(boolean replMode) {
     this.globals = new Environment(replMode);     this.globals = new Environment(replMode);
     this.environment = this.globals;     this.environment = this.globals;
-    this.out = out; 
   }   }
 </code> </code>
Line 431: Line 438:
     return null;     return null;
   }   }
 +</code>
 +
 +Additional TLA⁺-specific validation is necessary here.
 +While we allow redefining operators //themselves// in REPL mode, TLA⁺ does not allow shadowing existing operator names with operator parameter names.
 +So, if an operator of name ''x'' exists, and the user tries to define a new operator ''op(x) == ...'', we should raise a runtime error.
 +We should also disallow operator definitions with duplicate parameter names like ''op(x, x) == ...''.
 +Define a new helper for this near the bottom of the ''Interpreter'' class:
 +<code java>
 +  private void checkNotDefined(List<Token> names) {
 +    for (Token name : names) {
 +      if (environment.isDefined(name)) {
 +        throw new RuntimeError(name, "Identifier already in use.");
 +      }
 +    }
 +
 +    for (int i = 0; i < names.size() - 1; i++) {
 +      for (int j = i + 1; j < names.size(); j++) {
 +        if (names.get(i).lexeme.equals(names.get(j).lexeme)) {
 +          throw new RuntimeError(names.get(i), "Identifier used twice in same list.");
 +        }
 +      }
 +    }
 +  }
 +</code>
 +
 +This helper requires a new method in the ''Environment'' class:
 +<code java>
 +  boolean isDefined(Token name) {
 +    return values.containsKey(name.lexeme)
 +        || (enclosing != null && enclosing.isDefined(name));
 +  }
 +</code>
 +
 +Add a ''checkNotDefined()'' call to ''visitOpDefStmt()'', along with another check that the user is not trying to define an operator like ''x(x) == ...'':
 +<code java [highlight_lines_extra="2,3,4,5,6,7"]>
 +  public Void visitOpDefStmt(Stmt.OpDef stmt) {
 +    checkNotDefined(stmt.params);
 +    for (Token param : stmt.params) {
 +      if (param.lexeme.equals(stmt.name.lexeme)) {
 +        throw new RuntimeError(param, "Identifier already in use.");
 +      }
 +    }
 +
 +    TlaOperator op = new TlaOperator(stmt);
 </code> </code>
  
 And that takes care of operators! And that takes care of operators!
 +Now to handle functions.
 +We still need to implement our ''visitQuantFnExpr()'' method in the ''Interpreter'' class.
 +Similar to other operator evaluation methods like ''visitBinaryExpr()'' we evaluate what we can (here, the set to be quantified over) then branch on the operator type.
 +We also ensure the function parameter names are not shadowing an existing identifier:
 +<code java>
 +  @Override
 +  public Object visitQuantFnExpr(Expr.QuantFn expr) {
 +    checkNotDefined(expr.params);
 +    Object set = evaluate(expr.set);
 +    checkSetOperand(expr.op, set);
 +    switch (expr.op.type) {
 +      case ALL_MAP_TO: {
 +
 +      } case FOR_ALL: {
 +
 +      } case EXISTS: {
 +
 +      } default: {
 +        // Unreachable.
 +        return null;
 +      }
 +    }
 +  }
 +</code>
 +
 +Thinking about it, the code to implement these operators is fairly tricky.
 +We want to generate a series of bindings of set elements to identifiers.
 +Let's fill out the ''FOR_ALL'' case to get a sense of how we want this to look; note that like our conjunction implementation in the previous chapter, universal quantification is short-circuiting and returns false as soon as it encounters a counterexample to the body predicate:
 +<code java [highlight_lines_extra="2,3,4,5,6,7"]>
 +      } case FOR_ALL: {
 +        for (Environment binding : bindings) {
 +          Object result = executeBlock(expr.body, binding);
 +          checkBooleanOperand(expr.op, result);
 +          if (!(Boolean)result) return false;
 +        }
 +        return true;
 +      } case EXISTS: {
 +</code>
 +This is nice, concise code.
 +We iterate over a series of bindings - in the form of ''Environment'' instances - which enumerate all possible bindings of set values to identifiers.
 +But what is ''bindings''?
 +This is one of the more algorithmically tricky parts of this entire tutorial series, but it is also very fun!
 +We are going to encapsulate the binding generation logic in a new class file, ''BindingGenerator.java'', where the class implements both the ''Iterator<Environment>'' and ''Iterable<Environment>'' standard Java interfaces:
 +<code java>
 +package tla;
 +
 +import java.util.ArrayList;
 +import java.util.Iterator;
 +import java.util.List;
 +import java.util.Set;
 +
 +class BindingGenerator implements Iterator<Environment>,
 +                                  Iterable<Environment> {
 +  private final List<Token> vars;
 +  private final List<Object> set;
 +  private final Environment parent;
 +
 +  BindingGenerator(List<Token> vars, Set<?> set, Environment parent) {
 +    this.vars = vars;
 +    this.set = new ArrayList<>(set);
 +    this.parent = parent;
 +  }
 +
 +  @Override
 +  public Iterator<Environment> iterator() {
 +    return this;
 +  }
 +}
 +</code>
 +Implementing the ''Iterator<>'' and ''Iterable<>'' interfaces allows us to treat ''BindingGenerator'' instances as things that can be iterated over in Java's nice [[https://docs.oracle.com/javase/8/docs/technotes/guides/language/foreach.html|for-each loop]] syntax.
 +So that's what ''bindings'' is!
 +A ''BindingGenerator'' instance!
 +
 +Before getting too far into the weeds with the ''BindingGenerator'' iterator algorithm, let's just assume it works and fill out the rest of ''visitQuantFnExpr()''.
 +Construct a ''BindingGenerator'' instance atop the ''switch'' statement:
 +<code java [highlight_lines_extra="4"]>
 +  public Object visitQuantFnExpr(Expr.QuantFn expr) {
 +    Object set = evaluate(expr.set);
 +    checkSetOperand(expr.op, set);
 +    BindingGenerator bindings = new BindingGenerator(expr.params, (Set<?>)set, environment);
 +    switch (expr.op.type) {
 +</code>
 +One important thing to note is that unlike in ''TlaOperator'', we give ''BindingGenerator'' the //current// environment instead of the root global environment; this is because if we are evaluating an ''Expr.QuantFn'' instance inside an operator, the body expression should be able to access definitions from the operator's environment, like:
 +<code haskell>
 +op(x) == \A y \in 0 .. 2 : y < x 
 +</code>
 +Here's how we implement the ''EXISTS'' case; unlike universal quantification, existential quantification is not short-circuiting in TLA⁺ (for reasons that will become clear in the next chapter):
 +<code java [highlight_lines_extra="2,3,4,5,6,7,8"]>
 +      } case EXISTS: {
 +        boolean result = false;
 +        for (Environment binding : bindings) {
 +          Object junctResult = executeBlock(expr.body, binding);
 +          checkBooleanOperand(expr.op, junctResult);
 +          result |= (boolean)junctResult;
 +        }
 +        return result;
 +      } default: {
 +</code>
 +Finally, here's how we implement the ''ALL_MAP_TO'' case; we want to construct a ''Map<Object, Object>'' instance recording each value the single parameter is mapped to:
 +<code java [highlight_lines_extra="2,3,4,5,6,7"]>
 +      case ALL_MAP_TO: {
 +        Token param = expr.params.get(0);
 +        Map<Object, Object> function = new HashMap<>();
 +        for (Environment binding : bindings) {
 +          function.put(binding.get(param), executeBlock(expr.body, binding));
 +        }
 +        return function;
 +      } case FOR_ALL: {
 +</code>
 +
 +====== Set Enumeration ======
 +
 +Now that we've finished all that, let's get to the real difficult part: implementing an enumeration algorithm in ''BindingGenerator''.
 +A moment of consideration is in order.
 +If you had to bind a set of variables to every combination of values from a set, how would you do it?
 +To computer scientists, the connection between enumerating every possible combination of values and incrementing numbers in an arbitrary base is readily made.
 +For example, imagine you had a set of three values ''{a, b, c}'' that you wanted to bind to two identifiers ''x'' and ''y''.
 +This is isomorphic to iterating through all possible values of a two-digit number in base three.
 +The number of digits corresponds to the number of variables to bind, and the base corresponds to the number of elements in the set.
 +
 +This is confusing, but we will walk through it step by step.
 +First we must assign an arbitrary order to the set so we can map single digits in base three to an element of the set:
 +^ Digit ^ Element ^
 +| 0     | a       |
 +| 1     | b       |
 +| 2     | c       |
 +
 +Using that mapping, here's a table showing how counting up through all two-digit base three numbers decomposes into binding all possible combinations of values to ''x'' and ''y'':
 +^ Number (base 3) ^ As set elements ^ x (0th-significant digit) ^ y (1st-significant digit) ^
 +| 00              | aa              | a                         | a                         |
 +| 01              | ab              | b                         | a                         |
 +| 02              | ac              | c                         | a                         |
 +| 10              | ba              | a                         | b                         |
 +| 11              | bb              | b                         | b                         |
 +| 12              | bc              | c                         | b                         |
 +| 20              | ca              | a                         | c                         |
 +| 21              | cb              | b                         | c                         |
 +| 22              | cc              | c                         | c                         |
 +
 +As you can see, we generated every possible binding of ''{a, b, c}'' to ''x'' and ''y''!
 +Each identifier is assigned a particular digit in the number, and their value in a binding is given by that digit's value in the enumeration.
 +
 +We're almost there, but not quite; Java doesn't have the built-in ability to iterate through numbers in a particular base.
 +We will need to increment a regular ''int'' counter and //convert// it into the desired base, then extract the value of each digit.
 +Actually the well-known [[https://en.wikipedia.org/wiki/Positional_notation#Base_conversion|base conversion]] algorithm lets us do both at once!
 +It is quite elegant; given a number ''n'' and a desired base ''b'', the value of the least-significant digit can be found by calculating ''n % b'' (''%'' meaning mod, giving the division remainder).
 +Then, assign ''n' = n / b'' (using integer division, so throw away the remainder) and repeat for as many digits as desired.
 +Confused? Here's how it looks as a table for converting the enumeration value 6 to a binding from our previous base 3 example:
 +^ Iteration ^ ''n'' ^ ''n % 3'' ^ ''n / 3'' ^
 +| 0         | 6     | 0         | 2         |
 +| 1         | 2     | 2         | 0         |
 +
 +Note that the value of ''n / 3'' always becomes the value of ''n'' in the next row.
 +Reading the ''n % 3'' column from bottom to top we see "20", which is 6 in base 3.
 +Consulting our tables, we see "20" should map to ''ca''.
 +''x'' gets the 0th (least significant) digit, so it should get ''a''.
 +''y'' gets the 1st-significant digit, so it should get ''c''.
 +
 +The iteration number actually corresponds to digit significance.
 +So, in iteration 0, ''x'' should get the value of the ''n % 3'' column, which we see is 0 and thus ''a'' as expected.
 +In iteration 1, ''y'' should get the value of the ''n % 3'' column, which we see is 2 and thus ''c'' as expected.
 +If this is still too abstract, try some other values to convince yourself that our algorithm successfully calculates the binding!
 +
 +Now to implement it.
 +First, add another field to the ''BindingGenerator'' class:
 +<code java [highlight_lines_extra="6"]>
 +class BindingGenerator implements Iterator<Environment>,
 +                                  Iterable<Environment> {
 +  private final List<Token> vars;
 +  private final List<Object> set;
 +  private final Environment parent;
 +  private int enumerationIndex = 0;
 +</code>
 +
 +Then, implement ''hasNext()''.
 +This is required by the ''Iterator<>'' interface and returns true if there are additional un-enumerated combinations.
 +In our case, we test that ''enumerationIndex'' is strictly less than the number of elements in the set raised to the power of the number of identifiers.
 +For ''n'' digits in base ''b'', there are ''bⁿ'' possible values to iterate through.
 +<code java>
 +  @Override
 +  public boolean hasNext() {
 +    return enumerationIndex < Math.pow(set.size(), vars.size());
 +  }
 +</code>
 +
 +Now for the main event.
 +Here's the strikingly simple code implementing the change-of-base algorithm in the ''next()'' method required by the ''Iterator<>'' interface:
 +<code java>
 +  @Override
 +  public Environment next() {
 +    int current = enumerationIndex++;
 +    Environment bindings = new Environment(parent);
 +    for (Token var : vars) {
 +      bindings.define(var, set.get(current % set.size()));
 +      current /= set.size();
 +    }
 +
 +    return bindings;
 +  }
 +</code>
 +
 +Note that ''enumerationIndex++'' retrieves the current value of ''enumerationIndex'' to store in ''current'' and then increments the value of ''enumerationIndex'' itself.
 +A nice little terse code trick.
 +
 +And we're done!
 +You can now run your interpreter and enjoy the full power of functions & operators with parameters, along with the new universal & existential quantification expressions!
 +See the current expected state of your source code [[https://github.com/tlaplus/devkit/tree/main/7-operators|here]].
 +On to the next chapter: [[creating:actions|Variables, States, and Actions]]!
 +
 +====== Challenges ======
 +
 +  - In the full TLA⁺ language, quantified functions can be even more complicated; for example, you can write ''\A x, y \in 0 .. 2, z \in 0 .. 5 : P(x, y, z)''. Modify the definition of ''Expr.QuantFn'' and your parsing code to handle this syntax.
 +  - Modify the change-of-base algorithm to handle enumerating multiple sets of varying cardinality, as required to interpret the expressions parsed in the first challenge.
 +  - Our enumeration algorithm is a "depth-first" enumeration: all elements of one set will be enumerated before enumerating a second element of any of the other sets. Come up with an enumeration algorithm that enumerates sets in breadth-first order instead, so the enumeration progresses evenly for each set.
 +  - Often, sets in TLA⁺ models are very large. Instead of immediately evaluating sets constructed with the ''..'' infix operator and holding them in memory, write a class that lazily generates set elements as requested. Integrate this lazy evaluation with the breadth-first binding enumeration from the previous challenge.
 +
 +[[creating:jlists|< Previous Page]] | [[creating:start#table_of_contents|Table of Contents]] | [[creating:actions|Next Page >]]
 +
  • creating/operators.1747765882.txt.gz
  • Last modified: 2025/05/20 18:31
  • by ahelwer