Table of Contents

Functions, Operators, and Parameters

We now rejoin Crafting Interpreters for Chapter 10. The chapter is titled Functions, but TLA⁺ actually has two related concepts to handle here: functions, and operators. It's worth taking some time to conceptually delineate them.

Functions in TLA⁺ are what in other languages are called dictionaries, maps, or associative arrays. They are values, which we'll implement in the interpreter as instances of Map<Object, Object>. They have a defined domain, and attempting to apply a function to a value outside of its domain results in a runtime error.

In contrast, TLA⁺ operators are more similar to macros: they don't have a defined domain, and the body of a TLA⁺ operator is whatever results from replacing parameter references with whatever expression was provided for them. If that replacement process results in a totally nonsensical expression, well, that's the user's fault and we raise a runtime error.

We previously implemented the ability to directly bind values to identifiers, which in this chapter will be recognized as zero-parameter operators. Functions have not yet been defined in any form, but will be later in this tutorial.

Section 10.1: Function Calls

Section 10.1 takes us through adding calling syntax to the parser. We have two separate calling syntaxes to support: function application, which use square brackets like f[x], and operator calls, which use parentheses like op(x). Let's go over function application first.

Similar to the book, we add a new expression type in the GenerateAst class main() method, which we'll call FnApply:

    defineAst(outputDir, "Expr", Arrays.asList(
      "Binary   : Expr left, Token operator, Expr right",
      "FnApply  : Expr fn, Token bracket, Expr argument",
      "Grouping : Expr expression",

For simplicity we restrict function application to a single parameter. While the full TLA⁺ language does support constructing & calling functions with multiple parameters (underneath, bundled into a tuple), this feature is easily replicated by nesting single-parameter functions within functions - which we will support.

To support operator calls, we actually augment our existing Expr.Variable class with a list of arguments:

    defineAst(outputDir, "Expr", Arrays.asList(
      "Binary   : Expr left, Token operator, Expr right",
      "FnApply  : Expr fn, Token bracket, Expr argument",
      "Grouping : Expr expression",
      "Literal  : Object value",
      "Variable : Token name, List<Expr> arguments",
      "Unary    : Token operator, Expr expr",

While functions in TLA⁺ are values that can be passed around and constructed, operators have to be directly mentioned by name. That is why FnApply has an Expr instance to derive the function to apply, while operators use Variable which has a Token instance to record the operator name. Operators can accept multiple arguments.

To parse function application we need to splice a method into our recursive descent precedence chain. At the top of operatorExpression() in the Parser class, replace the call to primary() with a call to call():

  private Expr operatorExpression(int prec) {
    if (prec == 16) return call();
 
    Operator op;

Then, define the call() method similar to the book (differences highlighted):

  private Expr call() {
    Expr expr = primary();
 
    while (match(LEFT_BRACKET)) {
      Expr argument = expression();
      consume(RIGHT_BRACKET, "Require ']' to conclude function call");
      expr = new Expr.FnApply(expr, previous(), argument);
    }
 
    return expr;
  }

Our parsing task is simpler so we don't need a separate finishCall() method as suggested by the book. You'll note that parsing function application is strikingly similar to parsing associative postfix operators.

For operator calling, we augment our handling of identifiers in primary():

      return new Expr.Literal(previous().literal);
    }
 
    if (match(IDENTIFIER)) {
      Token identifier = previous();
      List<Expr> arguments = new ArrayList<>();
      if (match(LEFT_PAREN)) {
        do {
          arguments.add(expression());
        } while (match(COMMA));
        consume(RIGHT_PAREN, "Require ')' to conclude operator call");
      }
 
      return new Expr.Variable(identifier, arguments);
    }
 
    if (match(LEFT_PAREN)) {

This do/while loop is very similar to our existing set literal parsing code for handling comma-separated expressions, so it should be familiar to you.

Subsection 10.1.2: Interpreting function calls

In this section we learn how to interpret our shiny new call syntax. First, add some imports to the top of Interpreter.java:

import java.util.Set;
import java.util.HashSet;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.HashMap;
import java.io.PrintStream;

Implement the Interpreter class visitor method for Expr.FnApply:

  @Override
  public Object visitFnApplyExpr(Expr.FnApply expr) {
    Object callee = evaluate(expr.fn);
    Object argument = evaluate(expr.argument);
    Map<?, ?> function = (Map<?, ?>)callee;
 
    return function.get(argument);
  }

Now we can apply functions, although we can't yet define them. Next up is operators. Here we completely rewrite our visitVariableExpr() method in the Interpreter class to resemble the visitCallExpr() method from the book:

  @Override
  public Object visitVariableExpr(Expr.Variable expr) {
    Object callee = environment.get(expr.name);
 
    List<Object> arguments = new ArrayList<>();
    for (Expr argument : expr.arguments) {
      arguments.add(evaluate(argument));
    }
 
    TlaCallable operator = (TlaCallable)callee;
    return operator.call(this, arguments);
  }

We write our own version of the LoxCallable interface from the book, which we'll call TlaCallable:

package tla;
 
import java.util.List;
 
interface TlaCallable {
  Object call(Interpreter interpreter, List<Object> arguments);
}

We can now call operators, although cannot yet define operators to call!

Subsection 10.1.3: Call type errors

In visitFnApply(), we should validate that our callee is actually a function:

  @Override
  public Object visitFnApplyExpr(Expr.FnApply expr) {
    Object callee = evaluate(expr.fn);
    checkFunctionOperand(expr.bracket, callee);
    Object argument = evaluate(expr.argument);
    Map<?, ?> function = (Map<?, ?>)callee;

This requires another validation helper, checkFunctionOperand(). Put it down with the rest of the validation helpers:

  private void checkFunctionOperand(Token operator, Object operand) {
    if (operand instanceof Map<?,?>) return;
    throw new RuntimeError(operator, "Operand must be a function.");
  }

We should also check that the function is actually defined on the given argument, and provide a useful error message if not:

    Object argument = evaluate(expr.argument);
    Map<?, ?> function = (Map<?, ?>)callee;
    if (!function.containsKey(argument)) {
      throw new RuntimeError(expr.bracket,
          "Cannot apply function to element outside domain: "
          + argument.toString());
    }
 
    return function.get(argument);
  }

In visitVariableExpr(), similar to the book we need to check whether callee is an instance of TlaCallable. However, unlike the book this is not an error! It is simply an identifier bound to a concrete value in the current environment, and we should immediately return that value:

  @Override
  public Object visitVariableExpr(Expr.Variable expr) {
    Object callee = environment.get(expr.name);
 
    if (!(callee instanceof TlaCallable)) {
      return callee;
    }
 
    List<Object> arguments = new ArrayList<>();

This duality of identifiers either being operators to be evaluated or already-bound values is a bit confusing at times, but we can handle it. If the user tries to provide arguments to one of these concrete values then we should raise an error:

    if (!(callee instanceof TlaCallable)) {
      if (!expr.arguments.isEmpty()) {
        throw new RuntimeError(expr.name,
            "Cannot give arguments to non-operator identifier.");
      }
 
      return callee;
    }

Subsection 10.1.4: Checking arity

If callee is an operator, we need to check arity - that the number of arguments provided is as expected:

    TlaCallable operator = (TlaCallable)callee;
    if (arguments.size() != operator.arity()) {
      throw new RuntimeError(expr.name, "Expected " +
          operator.arity() + " arguments but got " +
          arguments.size() + ".");
    }
 
    return operator.call(this, arguments);

Which requires adding a new method to the TlaCallable interface:

interface TlaCallable {
  int arity();
  Object call(Interpreter interpreter, List<Object> arguments);

For functions, we don't need to check arity because the single-argument restriction is enforced at the parser level.

Section 10.2: Native Functions

In Section 10.2 we learn how to implement primitive/native functions. We've already implemented all our native TLA⁺ functions in the Evaluating Constant TLA⁺ Expressions chapter, but we will follow along to add a crucial field to the Interpreter class: a global environment. Due to the replMode parameter we have to put the initialization logic in the constructor (new code highlighted):

class Interpreter implements Expr.Visitor<Object>,
                             Stmt.Visitor<Void> {
  final Environment globals;
  private Environment environment;
  private final PrintStream out;
 
  public Interpreter(PrintStream out, boolean replMode) {
    this.globals = new Environment(replMode);
    this.environment = this.globals;
    this.out = out;
  }

Note that the globals field has package visibility, not class visibility. The use of this field will become clear when implementing the TlaCallable.call() interface method.

Section 10.3: Function Declarations

We can call functions & operators, now let's define them! Operators take less work so they come first. In the Parser class, we have our existing isAtOpDefStart() and operatorDefinition() methods which we will now extend to accept parameters. We want to parse operator definitions like op(x, y, z) == expr and also op == expr. op() == expr is invalid syntax in TLA⁺. We will use the now-familiar do/while method of consuming one or more comma-separated values. Recall that isAtOpDefStart() is a lookahead method to determine whether the upcoming token sequence is an operator definition, while operatorDefinition() actually parses that same token sequence into an AST node.

  private boolean isAtOpDefStart() {
    if (!match(IDENTIFIER)) return false;
    if (match(LEFT_PAREN)) {
      do {
        if (!match(IDENTIFIER)) return false;
      } while (match(COMMA));
      if (!match(RIGHT_PAREN)) return false;
    }
 
    return match(EQUAL_EQUAL);
  }
  private Stmt operatorDefinition() {
    Token name = consume(IDENTIFIER, "Name required for operator definition.");
    List<Token> params = new ArrayList<>();
    if (match(LEFT_PAREN)) {
      do {
        params.add(consume(IDENTIFIER, "Identifier required as operator parameter."));
      } while (match(COMMA));
      consume(RIGHT_PAREN, "')' required to terminate operator parameters.");
    }
 
    consume(EQUAL_EQUAL, "'==' required for operator definition.");
    return new Stmt.OpDef(name, params, expression());
  }

Our changes to operatorDefinition() require modifying the definition of Stmt.OpDef in the GenerateAst class to accept parameters:

    defineAst(outputDir, "Stmt", Arrays.asList(
      "Print    : Expr expression",
      "OpDef    : Token name, List<Token> params, Expr body"
    ));

We can now parse operators with parameters! On to functions. This requires defining a new expression type in the GenerateAST class:

    defineAst(outputDir, "Expr", Arrays.asList(
      "Binary   : Expr left, Token operator, Expr right",
      "QuantFn  : Token op, List<Token> params, Expr set, Expr body",
      "FnApply  : Expr fn, Token bracket, Expr argument",

This looks a bit abstract. What does QuantFn mean? And why do we allow multiple parameters when our functions only take one? This is because we'll be bundling multiple closely-related language constructs into this single AST node type: function constructors, universal quantifiers, and existential quantifiers.

In TLA⁺, functions are constructed with the |-> operator. So you can write [x \in 0 .. 2 |-> x + 1] to define a function mapping the values 0, 1, and 2 to 1, 2, and 3 respectively. TLA⁺ also has very useful universal & existential quantification syntax. You can write \A x \in 0 .. 2 : x < 3 to assert that all elements of the set {0, 1, 2} are less than three, or \E x, y \in 0 .. 2 : x + y = 4 to assert that there are two elements of the set {0, 1, 2} that together sum to four. We can see the commonality between these; they all have:

  1. A token identifying the operator (|->, \A, or \E)
  2. A set to draw elements from (0 .. 2)
  3. Some number of identifiers to which to bind values from the set (x, y)
  4. An expression body in which the identifiers can be used

So we can contain all of them in the QuantFn expression node, which means something like "quantifier function".

Here's how we parse the function constructor syntax. Add this to the primary() method in the Parser class, below the finite set literal code:

    if (match(LEFT_BRACKET)) {
      List<Token> params = new ArrayList<>();
      params.add(consume(IDENTIFIER, "Identifier required in function constructor."));
      consume(IN, "'\\in' required in function constructor.");
      Expr set = expression();
      Token op = consume(ALL_MAP_TO, "'|->' required in function constructor.");
      Expr expr = expression();
      consume(RIGHT_BRACKET, "']' required to conclude function constructor.");
      return new Expr.QuantFn(op, params, set, expr);
    }

While the Expr.QuantFn constructor accepts a list of parameters, we only parse one.

The parser code for universal & existential quantification looks similar, but we allow multiple parameters; put this in primary() just below the function constructor parsing logic:

    if (match(FOR_ALL, EXISTS)) {
      Token op = previous();
      List<Token> params = new ArrayList<>();
      do {
        params.add(consume(IDENTIFIER, "Identifier required in quantifier."));
      } while (match(COMMA));
      consume(IN, "'\\in' required in quantifier.");
      Expr set = expression();
      consume(COLON, "':' required in quantifier.");
      Expr expr = expression();
      return new Expr.QuantFn(op, params, set, expr);
    }

And that concludes function parsing! We can now define both operators and functions with parameters; on to evaluation.

Section 10.4: Function Objects

In Section 10.4 we write a class implementing the TlaCallable interface and hook our function & operator definitions up in the interpreter. Create a new file called TlaOperator.java:

package tla;
 
import java.util.List;
 
class TlaOperator implements TlaCallable {
  private final Stmt.OpDef declaration;
  TlaOperator(Stmt.OpDef declaration) {
    this.declaration = declaration;
  }
}

The implementation of the call() method is exactly as defined in the book:

  @Override
  public Object call(Interpreter interpreter,
                     List<Object> arguments) {
    Environment environment = new Environment(interpreter.globals);
    for (int i = 0; i < declaration.params.size(); i++) {
      environment.define(declaration.params.get(i).lexeme,
          arguments.get(i));
    }
 
    interpreter.executeBlock(declaration.body, environment);
    return null;
  }

Here we see the usage of interpreter.globals; we define the argument values within a new Environment instance, but the parent of that Environment is the root global instance instead of the current instance. This is so nested operator calls do not propagate their argument values to the nested callee.

This method calls a nonexistent executeBlock() method which we will define in Interpreter:

  Object executeBlock(Expr expr, Environment environment) {
    Environment previous = this.environment;
    try {
      this.environment = environment;
      return evaluate(expr);
    } finally {
      this.environment = previous;
    }
  }

In the book, executeBlock() was defined back in Chapter 8 and looked quite a bit different. Since TLA⁺ is not an imperative language with code blocks in the traditional sense, we define executeBlock() over expressions instead of lists of statements. Wrapping the evaluation in a try/finally ensures that if a runtime exception is thrown our environment is not left in an invalid state.

Back in TlaOperator, we still need to implement the arity() interface method:

  @Override
  public int arity() {
    return declaration.params.size();
  }

For ease of use we'll also override toString():

  @Override
  public String toString() {
    return "<fn " + declaration.name.lexeme + ">";
  }

There is only one more thing to do and then parameterized operators will be fully functional (operational?): modify visitOpDefStmt() in Interpreter so that it defines new TlaOperator instances instead of evaluating the statement body directly:

  @Override
  public Void visitOpDefStmt(Stmt.OpDef stmt) {
    TlaOperator op = new TlaOperator(stmt);
    environment.define(stmt.name, op);
    return null;
  }

And that takes care of operators! Now to handle functions. We still need to implement our visitQuantFnExpr() method in the Interpreter; similar to other operator evaluation methods like visitBinaryExpr() we evaluate what we can (here, the set to be quantified over) then branch on the operator type:

  @Override
  public Object visitQuantFnExpr(Expr.QuantFn expr) {
    Object set = evaluate(expr.set);
    checkSetOperand(expr.op, set);
    switch (expr.op.type) {
      case ALL_MAP_TO: {
 
      } case FOR_ALL: {
 
      } case EXISTS: {
 
      } default: {
        // Unreachable.
        return null;
      }
    }
  }

Thinking about it, the code to implement these operators is fairly tricky. We want to generate a series of bindings of set elements to identifiers. Let's fill out the FOR_ALL case to get a sense of how we want this to look; note that like our conjunction implementation in the previous chapter, universal quantification is short-circuiting and returns false as soon as it encounters a counterexample to the body predicate:

      } case FOR_ALL: {
        for (Environment binding : bindings) {
          Object result = executeBlock(expr.body, binding);
          checkBooleanOperand(expr.op, result);
          if (!(Boolean)result) return false;
        }
        return true;
      } case EXISTS: {

This is nice, concise code. We iterate over a series of bindings - in the form of Environment instances - which enumerate all possible bindings of set values to identifiers. But what is bindings? This is one of the more algorithmically tricky parts of this entire tutorial series, but it is also very fun! We are going to encapsulate the binding generation logic in a new class file, BindingGenerator.java, where the class implements both the Iterator<Environment> and Iterable<Environment> standard Java interfaces:

package tla;
 
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
 
class BindingGenerator implements Iterator<Environment>,
                                  Iterable<Environment> {
  private final List<Token> vars;
  private final List<Object> set;
  private final Environment parent;
 
  BindingGenerator(List<Token> vars, Set<?> set, Environment parent) {
    this.vars = vars;
    this.set = new ArrayList<>(set);
    this.parent = parent;
  }
 
  @Override
  public Iterator<Environment> iterator() {
    return this;
  }
}

Implementing the Iterator<> and Iterable<> interfaces allows us to treat BindingGenerator instances as things that can be iterated over in Java's nice for-each loop syntax. So that's what bindings is! A BindingGenerator instance!

Before getting too far into the weeds with the BindingGenerator iterator algorithm, let's just assume it works and fill out the rest of visitQuantFnExpr(). Construct a BindingGenerator instance atop the switch statement:

  public Object visitQuantFnExpr(Expr.QuantFn expr) {
    Object set = evaluate(expr.set);
    checkSetOperand(expr.op, set);
    BindingGenerator bindings = new BindingGenerator(expr.params, (Set<?>)set, environment);
    switch (expr.op.type) {

One important thing to note is that unlike in TlaOperator, we give BindingGenerator the current environment instead of the root global environment; this is because if we are evaluating an Expr.QuantFn instance inside an operator, the body expression should be able to access definitions from the operator's environment, like:

op(x) == \A y \in 0 .. 2 : y < x 

Here's how we implement the EXISTS case; unlike universal quantification, existential quantification is not short-circuiting in TLA⁺ (for reasons that will become clear in the next chapter):

      } case EXISTS: {
        boolean result = false;
        for (Environment binding : bindings) {
          Object junctResult = executeBlock(expr.body, binding);
          checkBooleanOperand(expr.op, junctResult);
          result |= (Boolean)junctResult;
        }
        return result;
      } default: {

Finally, here's how we implement the ALL_MAP_TO case; we want to construct a Map<Object, Object> instance recording each value the single parameter is mapped to:

      case ALL_MAP_TO: {
        Token param = expr.params.get(0);
        Map<Object, Object> function = new HashMap<>();
        for (Environment binding : bindings) {
          function.put(binding.get(param), executeBlock(expr.body, binding));
        }
        return function;
      } case FOR_ALL: {

Set Enumeration

Now that we've finished all that, let's get to the real difficult part: implementing an enumeration algorithm in BindingGenerator. A moment of consideration is in order. If you had to bind a set of variables to every combination of values from a set, how would you do it? To computer scientists, the connection between enumerating every possible combination of values and incrementing numbers in an arbitrary base is readily made. For example, imagine you had a set of three values {a, b, c} that you wanted to bind to two identifiers x and y. This is isomorphic to iterating through all possible values of a two-digit number in base three. The number of digits corresponds to the number of variables to bind, and the base corresponds to the number of elements in the set.

This is confusing, but we will walk through it step by step. First we must assign an arbitrary order to the set so we can map single digits in base three to an element of the set:

Digit Element
0 a
1 b
2 c

Using that mapping, here's a table showing how counting up through all two-digit base three numbers decomposes into binding all possible combinations of values to x and y:

Number (base 3) As set elements x (0th-significant digit) y (1st-significant digit)
00 aa a a
01 ab b a
02 ac c a
10 ba a b
11 bb b b
12 bc c b
20 ca a c
21 cb b c
22 cc c c

As you can see, we generated every possible binding of {a, b, c} to x and y! Each identifier is assigned a particular digit in the number, and their value in a binding is given by that digit's value in the enumeration.

We're almost there, but not quite; Java doesn't have the built-in ability to iterate through numbers in a particular base. We will need to increment a regular int counter and convert it into the desired base, then extract the value of each digit. Actually the well-known base conversion algorithm lets us do both at once! It is quite elegant; given a number n and a desired base b, the value of the least-significant digit can be found by calculating n % b (% meaning mod, giving the division remainder). Then, assign n' = n / b (using integer division, so throw away the remainder) and repeat for as many digits as desired. Confused? Here's how it looks as a table for converting the enumeration value 6 to a binding from our previous base 3 example:

Iteration n n % 3 n / 3
0 6 0 2
1 2 2 0

Note that the value of n / 3 always becomes the value of n in the next row. Reading the n % 3 column from bottom to top we see "20", which is 6 in base 3. Consulting our tables, we see "20" should map to ca. x gets the 0th (least significant) digit, so it should get a. y gets the 1st-significant digit, so it should get c.

The iteration number actually corresponds to digit significance. So, in iteration 0, x should get the value of the n % 3 column, which we see is 0 and thus a as expected. In iteration 1, y should get the value of the n % 3 column, which we see is 2 and thus c as expected. If this is still too abstract, try some other values to convince yourself that our algorithm successfully calculates the binding!

Now to implement it. First, add another field to the BindingGenerator class:

class BindingGenerator implements Iterator<Environment>,
                                  Iterable<Environment> {
  private final List<Token> vars;
  private final List<Object> set;
  private final Environment parent;
  private int enumerationIndex = 0;

Then, implement hasNext(). This is required by the Iterator<> interface and returns true if there are additional un-enumerated combinations. In our case, we test that enumerationIndex is strictly less than the number of elements in the set raised to the power of the number of identifiers. For n digits in base b, there are bⁿ possible values to iterate through.

  @Override
  public boolean hasNext() {
    return enumerationIndex < Math.pow(set.size(), vars.size());
  }

Now for the main event. Here's the strikingly simple code implementing the change-of-base algorithm in the next() method required by the Iterator<> interface:

  @Override
  public Environment next() {
    int current = enumerationIndex++;
    Environment bindings = new Environment(parent);
    for (Token var : vars) {
      bindings.define(var, set.get(current % set.size()));
      current /= set.size();
    }
 
    return bindings;
  }

Note that enumerationIndex++ retrieves the current value of enumerationIndex to store in current and then increments the value of enumerationIndex itself. A nice little terse code trick.

And we're done! You can now run your interpreter and enjoy the full power of functions & operators with parameters, along with the new universal & existential quantification expressions! See the current expected state of your source code here. On to the next chapter: Variables, States, and Actions!

Challenges

  1. In the full TLA⁺ language, quantified functions can be even more complicated; for example, you can write \A x, y \in 0 .. 2, z \in 0 .. 5 : P(x, y, z). Modify the definition of Expr.QuantFn and your parsing code to handle this syntax.
  2. Modify the change-of-base algorithm to handle enumerating multiple sets of varying cardinality, as required to interpret the expressions parsed in the first challenge.
  3. Our enumeration algorithm is a "depth-first" enumeration: all elements of one set will be enumerated before enumerating a second element of any of the other sets. Come up with an enumeration algorithm that enumerates sets in breadth-first order instead, so the enumeration progresses evenly for each set.
  4. Often, sets in TLA⁺ models are very large. Instead of immediately evaluating sets constructed with the .. infix operator and holding them in memory, write a class that lazily generates set elements as requested. Integrate this lazy evaluation with the breadth-first binding enumeration from the previous challenge.

< Previous Page | Table of Contents | Next Page >