Operators & Parameters
We now rejoin Crafting Interpreters for Chapter 10. The chapter is titled Functions, but TLA⁺ actually has two related concepts to handle here: functions, and operators. It's worth taking some time to conceptually delineate them.
Functions in TLA⁺ are what in other languages are called dictionaries, maps, or associative arrays.
They are values, which we'll implement in the interpreter as instances of Map<Object, Object>.
They have a defined domain, and attempting to apply a function to a value outside of its domain results in a runtime error.
In contrast, TLA⁺ operators are more similar to macros: they don't have a defined domain, and the body of a TLA⁺ operator is whatever results from replacing parameter references with whatever expression was provided for them. If that replacement process results in a totally nonsensical expression, well, that's the user's fault and we raise a runtime error.
We previously implemented the ability to directly bind values to identifiers, which in this chapter will be recognized as zero-parameter operators. Functions have not yet been defined in any form, but will be later in this tutorial.
Section 10.1: Function Calls
Section 10.1 takes us through adding calling syntax to the parser.
We have two separate calling syntaxes to support: function application, which uses square brackets like f[x], and operator calls, which use parentheses like op(x).
Let's go over function application first.
Similar to the book, we add a new expression type in the GenerateAst class main() method, which we'll call FnApply:
```java
defineAst(outputDir, "Expr", Arrays.asList(
  "Binary   : Expr left, Token operator, Expr right",
  "FnApply  : Expr fn, Token bracket, Expr argument",
  "Grouping : Expr expression",
```
For simplicity we restrict function application to a single parameter. While the full TLA⁺ language does support constructing & calling functions with multiple parameters (underneath, bundled into a tuple), this feature is easily replicated by nesting single-parameter functions within functions - which we will support.
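To see how nesting replicates multi-parameter functions, here is a standalone sketch (illustrative only, not interpreter code; the names are made up) in which a two-argument function add[x][y] = x + y is represented as a map from the first argument to another map, so applying it is two single-argument applications in a row:

```java
import java.util.HashMap;
import java.util.Map;

public class NestedFunctions {
  // Build add[x][y] = x + y over domain {0, 1, 2} x {0, 1, 2}
  // by nesting single-parameter functions (maps) inside each other.
  static Map<Object, Object> makeAdd() {
    Map<Object, Object> add = new HashMap<>();
    for (int x = 0; x <= 2; x++) {
      Map<Object, Object> inner = new HashMap<>();
      for (int y = 0; y <= 2; y++) {
        inner.put(y, x + y);
      }
      add.put(x, inner);
    }
    return add;
  }

  public static void main(String[] args) {
    Map<Object, Object> add = makeAdd();
    // add[1][2]: apply the outer function, then the inner one.
    Map<?, ?> partial = (Map<?, ?>) add.get(1);
    System.out.println(partial.get(2)); // 3
  }
}
```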
To support operator calls, we actually augment our existing Expr.Variable class with a list of arguments:
```java
defineAst(outputDir, "Expr", Arrays.asList(
  "Binary   : Expr left, Token operator, Expr right",
  "FnApply  : Expr fn, Token bracket, Expr argument",
  "Grouping : Expr expression",
  "Literal  : Object value",
  "Variable : Token name, List<Expr> arguments",
  "Unary    : Token operator, Expr expr",
```
While functions in TLA⁺ are values that can be passed around and constructed, operators have to be directly mentioned by name.
That is why FnApply has an Expr instance to derive the function to apply, while operators use Variable, which has a Token instance to record the operator name.
Operators can accept multiple arguments.
To parse function application we need to splice a method into our recursive descent precedence chain.
At the top of operatorExpression() in the Parser class, replace the call to primary() with a call to call():
```java
private Expr operatorExpression(int prec) {
  if (prec == 16) return call();

  Operator op;
```
Then, define the call() method similar to the book (differences highlighted):
```java
private Expr call() {
  Expr expr = primary();
  while (match(LEFT_BRACKET)) {
    Expr argument = expression();
    consume(RIGHT_BRACKET, "Require ']' to conclude function call");
    expr = new Expr.FnApply(expr, previous(), argument);
  }
  return expr;
}
```
Our parsing task is simpler, so we don't need a separate finishCall() method as suggested by the book.
You'll note that parsing function application is strikingly similar to parsing associative postfix operators.
For operator calling, we augment our handling of identifiers in primary():
```java
    return new Expr.Literal(previous().literal);
  }

  if (match(IDENTIFIER)) {
    Token identifier = previous();
    List<Expr> arguments = new ArrayList<>();
    if (match(LEFT_PAREN)) {
      do {
        arguments.add(expression());
      } while (match(COMMA));
      consume(RIGHT_PAREN, "Require ')' to conclude operator call");
    }
    return new Expr.Variable(identifier, arguments);
  }

  if (match(LEFT_PAREN)) {
```
This do/while loop is very similar to our existing set literal parsing code for handling comma-separated expressions, so it should be familiar to you.
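The one-or-more comma-separated pattern, seen in isolation, looks like this standalone sketch over a pre-split list of token strings (parseItems is a hypothetical stand-in, not one of our parser methods):

```java
import java.util.ArrayList;
import java.util.List;

public class CommaParse {
  // Consume one or more comma-separated items: item ("," item)*
  // from a pre-split token list; returns the parsed items.
  static List<String> parseItems(List<String> tokens) {
    List<String> items = new ArrayList<>();
    int pos = 0;
    do {
      items.add(tokens.get(pos++)); // "parse" one item
    } while (pos < tokens.size() && tokens.get(pos++).equals(","));
    return items;
  }

  public static void main(String[] args) {
    System.out.println(parseItems(List.of("a", ",", "b", ",", "c")));
  }
}
```

Note the do/while shape guarantees at least one item is consumed before any comma is checked, exactly as in our parser.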
Subsection 10.1.2: Interpreting function calls
In this section we learn how to interpret our shiny new call syntax.
First let's implement the Interpreter class visitor method for Expr.FnApply:
```java
@Override
public Object visitFnApplyExpr(Expr.FnApply expr) {
  Object callee = evaluate(expr.fn);
  Object argument = evaluate(expr.argument);
  Map<?, ?> function = (Map<?, ?>)callee;
  return function.get(argument);
}
```
Now we can apply functions, although we can't yet define them.
Next up is operators.
Here we completely rewrite our visitVariableExpr() method in the Interpreter class to resemble the visitCallExpr() method from the book:
```java
@Override
public Object visitVariableExpr(Expr.Variable expr) {
  Object callee = environment.get(expr.name);

  List<Object> arguments = new ArrayList<>();
  for (Expr argument : expr.arguments) {
    arguments.add(evaluate(argument));
  }

  TlaCallable operator = (TlaCallable)callee;
  return operator.call(this, arguments);
}
```
We write our own version of the LoxCallable interface from the book, which we'll call TlaCallable:
```java
package tla;

import java.util.List;

interface TlaCallable {
  Object call(Interpreter interpreter, List<Object> arguments);
}
```
We can now call operators, although we cannot yet define any operators to call!
Subsection 10.1.3: Call type errors
In visitFnApplyExpr(), we should validate that our callee is actually a function:
```java
@Override
public Object visitFnApplyExpr(Expr.FnApply expr) {
  Object callee = evaluate(expr.fn);
  checkFunctionOperand(expr.bracket, callee);
  Object argument = evaluate(expr.argument);
  Map<?, ?> function = (Map<?, ?>)callee;
```
This requires another validation helper, checkFunctionOperand(). Put it down with the rest of the validation helpers:
```java
private void checkFunctionOperand(Token operator, Object operand) {
  if (operand instanceof Map<?,?>) return;
  throw new RuntimeError(operator, "Operand must be a function.");
}
```
We should also check that the function is actually defined on the given argument, and provide a useful error message if not:
```java
  Object argument = evaluate(expr.argument);
  Map<?, ?> function = (Map<?, ?>)callee;
  if (!function.containsKey(argument)) {
    throw new RuntimeError(expr.bracket,
        "Cannot apply function to element outside domain: " + argument.toString());
  }
  return function.get(argument);
}
```
In visitVariableExpr(), similar to the book we need to check whether callee is an instance of TlaCallable. However, unlike the book, this is not an error! It is simply an identifier bound to a concrete value in the current environment, and we should immediately return that value:
```java
@Override
public Object visitVariableExpr(Expr.Variable expr) {
  Object callee = environment.get(expr.name);
  if (!(callee instanceof TlaCallable)) {
    return callee;
  }

  List<Object> arguments = new ArrayList<>();
```
This duality of identifiers either being operators to be evaluated or already-bound values is a bit confusing at times, but we can handle it. If the user tries to provide arguments to one of these concrete values then we should raise an error:
```java
  if (!(callee instanceof TlaCallable)) {
    if (!expr.arguments.isEmpty()) {
      throw new RuntimeError(expr.name,
          "Cannot give arguments to non-operator identifier.");
    }
    return callee;
  }
```
Subsection 10.1.4: Checking arity
If callee is an operator, we need to check arity: that the number of arguments provided is as expected:
```java
  TlaCallable operator = (TlaCallable)callee;
  if (arguments.size() != operator.arity()) {
    throw new RuntimeError(expr.name, "Expected " +
        operator.arity() + " arguments but got " +
        arguments.size() + ".");
  }
  return operator.call(this, arguments);
```
This requires adding a new method to the TlaCallable interface:
```java
interface TlaCallable {
  int arity();
  Object call(Interpreter interpreter, List<Object> arguments);
}
```
For functions, we don't need to check arity because the single-argument restriction is enforced at the parser level.
Section 10.2: Native Functions
In Section 10.2 we learn how to implement primitive/native functions.
We've already implemented all our native TLA⁺ functions in the Evaluating Constant TLA⁺ Expressions chapter, but we will follow along to add a crucial field to the Interpreter class: a global environment.
Due to the replMode parameter we have to put the initialization logic in the constructor (new code highlighted):
```java
class Interpreter implements Expr.Visitor<Object>,
                             Stmt.Visitor<Void> {
  final Environment globals;
  private Environment environment;
  private final PrintStream out;

  public Interpreter(PrintStream out, boolean replMode) {
    this.globals = new Environment(replMode);
    this.environment = this.globals;
    this.out = out;
  }
```
Note that the globals field has package-private visibility rather than private visibility. The use of this field will become clear when implementing the TlaCallable.call() interface method.
Section 10.3: Function Declarations
We can call functions & operators, now let's define them!
Operators take less work so they come first.
In the Parser class, we have our existing isAtOpDefStart() and operatorDefinition() methods, which we will now extend to accept parameters. We want to parse operator definitions like op(x, y, z) == expr and also op == expr.
Note that op() == expr is invalid syntax in TLA⁺.
We will use the now-familiar do/while method of consuming one or more comma-separated values.
Recall that isAtOpDefStart() is a lookahead method to determine whether the upcoming token sequence is an operator definition, while operatorDefinition() actually parses that same token sequence into an AST node.
```java
private boolean isAtOpDefStart() {
  if (!match(IDENTIFIER)) return false;

  if (match(LEFT_PAREN)) {
    do {
      if (!match(IDENTIFIER)) return false;
    } while (match(COMMA));
    if (!match(RIGHT_PAREN)) return false;
  }

  return match(EQUAL_EQUAL);
}
```
```java
private Stmt operatorDefinition() {
  Token name = consume(IDENTIFIER, "Name required for operator definition.");
  List<Token> params = new ArrayList<>();
  if (match(LEFT_PAREN)) {
    do {
      params.add(consume(IDENTIFIER, "Identifier required as operator parameter."));
    } while (match(COMMA));
    consume(RIGHT_PAREN, "')' required to terminate operator parameters.");
  }

  consume(EQUAL_EQUAL, "'==' required for operator definition.");
  return new Stmt.OpDef(name, params, expression());
}
```
Our changes to operatorDefinition() require modifying the definition of Stmt.OpDef in the GenerateAst class to accept parameters:
```java
defineAst(outputDir, "Stmt", Arrays.asList(
  "Print : Expr expression",
  "OpDef : Token name, List<Token> params, Expr body"
));
```
We can now parse operators with parameters!
On to functions.
This requires defining a new expression type in the GenerateAst class:
```java
defineAst(outputDir, "Expr", Arrays.asList(
  "Binary   : Expr left, Token operator, Expr right",
  "QuantFn  : Token op, List<Token> params, Expr set, Expr body",
  "FnApply  : Expr fn, Token bracket, Expr argument",
```
This looks a bit abstract.
What does QuantFn mean? And why do we allow multiple parameters when our functions only take one?
This is because we'll be bundling multiple closely-related language constructs into this single AST node type: function constructors, universal quantifiers, and existential quantifiers.
In TLA⁺, functions are constructed with the |-> operator, so you can write [x \in 0 .. 2 |-> x + 1] to define a function mapping the values 0, 1, and 2 to 1, 2, and 3 respectively.
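Under our chosen Map representation, evaluating [x \in 0 .. 2 |-> x + 1] should yield a value equivalent to what this standalone sketch builds by hand (illustrative code, not part of the interpreter):

```java
import java.util.HashMap;
import java.util.Map;

public class FnConstructor {
  // The value of [x \in 0 .. 2 |-> x + 1], built directly.
  static Map<Object, Object> example() {
    Map<Object, Object> fn = new HashMap<>();
    for (int x = 0; x <= 2; x++) {
      fn.put(x, x + 1); // each domain element maps to x + 1
    }
    return fn;
  }

  public static void main(String[] args) {
    Map<Object, Object> fn = example();
    System.out.println(fn.get(2));         // fn[2] is 3
    System.out.println(fn.containsKey(5)); // false: 5 is outside the domain
  }
}
```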
TLA⁺ also has very useful universal & existential quantification syntax.
You can write \A x \in 0 .. 2 : x < 3 to assert that all elements of the set {0, 1, 2} are less than three, or \E x, y \in 0 .. 2 : x + y = 4 to assert that there are two elements of the set {0, 1, 2} that together sum to four.
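For intuition, these quantifiers behave much like Java's stream operations allMatch and anyMatch; here's a conceptual analogy (the helper names are made up for illustration, and this is not interpreter code):

```java
import java.util.Set;

public class Quantifiers {
  // \A x \in S : x < 3  -- every element satisfies the predicate.
  static boolean forAllLessThanThree(Set<Integer> s) {
    return s.stream().allMatch(x -> x < 3);
  }

  // \E x, y \in S : x + y = 4  -- some pair of elements (x and y may
  // be the same element) sums to four.
  static boolean existsPairSummingToFour(Set<Integer> s) {
    return s.stream().anyMatch(x -> s.stream().anyMatch(y -> x + y == 4));
  }

  public static void main(String[] args) {
    Set<Integer> s = Set.of(0, 1, 2);
    System.out.println(forAllLessThanThree(s));     // true
    System.out.println(existsPairSummingToFour(s)); // true: 2 + 2 = 4
  }
}
```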
We can see the commonality between these; they all have:
- A token identifying the operator (|->, \A, or \E)
- A set to draw elements from (0 .. 2)
- Some number of identifiers to which to bind values from the set (x, y)
- An expression body in which the identifiers can be used
So we can contain all of them in the QuantFn expression node, which means something like "quantifier function".
Here's how we parse the function constructor syntax.
Add this to the primary() method in the Parser class, below the finite set literal code:
```java
if (match(LEFT_BRACKET)) {
  List<Token> params = new ArrayList<>();
  params.add(consume(IDENTIFIER, "Identifier required in function constructor."));
  consume(IN, "'\\in' required in function constructor.");
  Expr set = expression();
  Token op = consume(ALL_MAP_TO, "'|->' required in function constructor.");
  Expr expr = expression();
  consume(RIGHT_BRACKET, "']' required to conclude function constructor.");
  return new Expr.QuantFn(op, params, set, expr);
}
```
While the Expr.QuantFn constructor accepts a list of parameters, we only parse one. The parser code for universal & existential quantification looks similar, but we allow multiple parameters; put this in primary() just below the function constructor parsing logic:
```java
if (match(FOR_ALL, EXISTS)) {
  Token op = previous();
  List<Token> params = new ArrayList<>();
  do {
    params.add(consume(IDENTIFIER, "Identifier required in quantifier."));
  } while (match(COMMA));
  consume(IN, "'\\in' required in quantifier.");
  Expr set = expression();
  consume(COLON, "':' required in quantifier.");
  Expr expr = expression();
  return new Expr.QuantFn(op, params, set, expr);
}
```
And that concludes function parsing! We can now define both operators and functions with parameters; on to evaluation.
Section 10.4: Function Objects
In Section 10.4 we write a class implementing the TlaCallable interface and hook our function & operator definitions up in the interpreter. Create a new file called TlaOperator.java:
```java
package tla;

import java.util.List;

class TlaOperator implements TlaCallable {
  private final Stmt.OpDef declaration;

  TlaOperator(Stmt.OpDef declaration) {
    this.declaration = declaration;
  }
}
```
The implementation of the call() method closely follows the book, except that we return the value of the evaluated body (since a TLA⁺ operator body is an expression, not a block of statements):
```java
@Override
public Object call(Interpreter interpreter, List<Object> arguments) {
  Environment environment = new Environment(interpreter.globals);
  for (int i = 0; i < declaration.params.size(); i++) {
    environment.define(declaration.params.get(i), arguments.get(i));
  }

  return interpreter.executeBlock(declaration.body, environment);
}
```
Here we see the usage of interpreter.globals; we define the argument values within a new Environment instance, but the parent of that Environment is the root global instance instead of the current instance. This is so nested operator calls do not propagate their argument values to the nested callee.
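Here is a toy illustration of why the parent must be the globals, using a stripped-down environment chain (the Env class and names here are hypothetical, not our real Environment class). If op2's call environment were parented to op1's, op2's body could accidentally see op1's parameter x:

```java
import java.util.HashMap;
import java.util.Map;

public class Scoping {
  // Minimal environment chain: a lookup checks locally, then the parent.
  static class Env {
    final Env parent;
    final Map<String, Object> values = new HashMap<>();
    Env(Env parent) { this.parent = parent; }
    boolean has(String name) {
      return values.containsKey(name) || (parent != null && parent.has(name));
    }
  }

  // Simulate op1(x) calling op2(y): both call environments are
  // parented to the globals, so op2's body cannot see op1's x.
  static boolean paramLeaksIntoCallee() {
    Env globals = new Env(null);

    Env op1Call = new Env(globals);  // binds op1's parameter
    op1Call.values.put("x", 1);

    Env op2Call = new Env(globals);  // parent is globals, NOT op1Call
    op2Call.values.put("y", 2);

    return op2Call.has("x");
  }

  public static void main(String[] args) {
    System.out.println(paramLeaksIntoCallee()); // false
  }
}
```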
This method calls a nonexistent executeBlock() method, which we will define in Interpreter:
```java
Object executeBlock(Expr expr, Environment environment) {
  Environment previous = this.environment;
  try {
    this.environment = environment;
    return evaluate(expr);
  } finally {
    this.environment = previous;
  }
}
```
In the book, executeBlock() was defined back in Chapter 8 and looked quite a bit different. Since TLA⁺ is not an imperative language with code blocks in the traditional sense, we define executeBlock() over expressions instead of lists of statements.
Wrapping the evaluation in a try/finally ensures that if a runtime exception is thrown our environment is not left in an invalid state.
Back in TlaOperator, we still need to implement the arity() interface method:
```java
@Override
public int arity() {
  return declaration.params.size();
}
```
For ease of use we'll also override toString():
```java
@Override
public String toString() {
  return "<fn " + declaration.name.lexeme + ">";
}
```
There is only one more thing to do and then parameterized operators will be fully functional (operational?): modify visitOpDefStmt() in Interpreter so that it defines new TlaOperator instances instead of evaluating the statement body directly:
```java
@Override
public Void visitOpDefStmt(Stmt.OpDef stmt) {
  TlaOperator op = new TlaOperator(stmt);
  environment.define(stmt.name, op);
  return null;
}
```
And that takes care of operators!
Now to handle functions.
We still need to implement our visitQuantFnExpr() method in the Interpreter; similar to other operator evaluation methods like visitBinaryExpr(), we evaluate what we can (here, the set to be quantified over) then branch on the operator type:
```java
@Override
public Object visitQuantFnExpr(Expr.QuantFn expr) {
  Object set = evaluate(expr.set);
  checkSetOperand(expr.op, set);
  switch (expr.op.type) {
    case ALL_MAP_TO: {
    }
    case FOR_ALL: {
    }
    case EXISTS: {
    }
    default: {
      // Unreachable.
      return null;
    }
  }
}
```
Thinking about it, the code to implement these operators is fairly tricky.
We want to generate a series of bindings of set elements to identifiers.
Let's fill out the FOR_ALL case to get a sense of how we want this to look; note that like our conjunction implementation in the previous chapter, universal quantification is short-circuiting and returns false as soon as it encounters a counterexample to the body predicate:
```java
    }
    case FOR_ALL: {
      for (Environment binding : bindings) {
        Object result = executeBlock(expr.body, binding);
        checkBooleanOperand(expr.op, result);
        if (!(Boolean)result) return false;
      }
      return true;
    }
    case EXISTS: {
```
This is nice, concise code.
We iterate over a series of bindings, in the form of Environment instances, which enumerate all possible bindings of set values to identifiers. But what is bindings?
This is one of the more algorithmically tricky parts of this entire tutorial series, but it is also very fun!
We are going to encapsulate the binding generation logic in a new class file, BindingGenerator.java, where the class implements both the Iterator<Environment> and Iterable<Environment> standard Java interfaces:
```java
package tla;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class BindingGenerator implements Iterator<Environment>,
                                  Iterable<Environment> {
  private final List<Token> vars;
  private final List<Object> set;
  private final Environment parent;

  BindingGenerator(List<Token> vars, Set<?> set, Environment parent) {
    this.vars = vars;
    this.set = new ArrayList<>(set);
    this.parent = parent;
  }

  @Override
  public Iterator<Environment> iterator() {
    return this;
  }
}
```
Implementing the Iterator<> and Iterable<> interfaces allows us to treat BindingGenerator instances as things that can be iterated over in Java's nice for-each loop syntax. So that's what bindings is: a BindingGenerator instance!
Before getting too far into the weeds with the BindingGenerator iterator algorithm, let's just assume it works and fill out the rest of visitQuantFnExpr().
Construct a BindingGenerator instance atop the switch statement:
```java
public Object visitQuantFnExpr(Expr.QuantFn expr) {
  Object set = evaluate(expr.set);
  checkSetOperand(expr.op, set);
  BindingGenerator bindings =
      new BindingGenerator(expr.params, (Set<?>)set, environment);
  switch (expr.op.type) {
```
One important thing to note is that unlike in TlaOperator, we give BindingGenerator the current environment instead of the root global environment; this is because if we are evaluating an Expr.QuantFn instance inside an operator, the body expression should be able to access definitions from the operator's environment, like:
```
op(x) == \A y \in 0 .. 2 : y < x
```
Here's how we implement the EXISTS case; unlike universal quantification, existential quantification is not short-circuiting in TLA⁺ (for reasons that will become clear in the next chapter):
```java
    }
    case EXISTS: {
      boolean result = false;
      for (Environment binding : bindings) {
        Object junctResult = executeBlock(expr.body, binding);
        checkBooleanOperand(expr.op, junctResult);
        result |= (Boolean)junctResult;
      }
      return result;
    }
    default: {
```
Finally, here's how we implement the ALL_MAP_TO case; we want to construct a Map<Object, Object> instance recording each value the single parameter is mapped to:
```java
    case ALL_MAP_TO: {
      Token param = expr.params.get(0);
      Map<Object, Object> function = new HashMap<>();
      for (Environment binding : bindings) {
        function.put(binding.get(param), executeBlock(expr.body, binding));
      }
      return function;
    }
    case FOR_ALL: {
```
Now that we've finished all that, let's get to the really difficult part: implementing an enumeration algorithm in BindingGenerator.
A moment of consideration is in order.
If you had to bind a set of variables to every combination of values from a set, how would you do it?
To computer scientists, the connection between enumerating every possible combination of values and incrementing numbers in an arbitrary base is readily made.
For example, imagine you had a set of three values {a, b, c} that you wanted to bind to two identifiers x and y. This is isomorphic to iterating through all possible values of a two-digit number in base three.
Confusing? Here's a table illustrating the connection:
| Number (base 3) | As set elements | x (0th-significant digit) | y (1st-significant digit) |
|---|---|---|---|
| 00 | aa | a | a |
| 01 | ab | b | a |
| 02 | ac | c | a |
| 10 | ba | a | b |
| 11 | bb | b | b |
| 12 | bc | c | b |
| 20 | ca | a | c |
| 21 | cb | b | c |
| 22 | cc | c | c |
As you can see, by counting up through all two-digit base-three numbers, we generate every possible binding of {a, b, c} to x and y!
Each identifier is assigned a particular digit in the number, and their value in a binding is given by that digit's value in the enumeration.
So, for a set S and list of identifiers I it suffices to enumerate all |I|-digit numbers of base |S|, where | .. | means the number of elements in the collection.
We're almost there, but not quite; Java doesn't have the built-in ability to iterate through numbers in a particular base.
We will need to increment a regular int counter and convert it into the desired base, then extract the value of each digit.
Actually the well-known base conversion algorithm lets us do both at once!
It is quite elegant; given a number n and a desired base |S|, the value of the least-significant digit can be found by calculating n % |S| (% meaning mod, giving the division remainder). Then, assign n' = n / |S| (using integer division) and repeat for as many digits as desired (after some finite number of iterations, the digits all become zero).
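As a standalone sketch (a hypothetical helper, not interpreter code), the digit-extraction loop looks like this:

```java
public class BaseDigits {
  // Extract `count` digits of n in the given base,
  // least-significant digit first.
  static int[] digits(int n, int base, int count) {
    int[] result = new int[count];
    for (int i = 0; i < count; i++) {
      result[i] = n % base; // current least-significant digit
      n /= base;            // shift right by one digit
    }
    return result;
  }

  public static void main(String[] args) {
    int[] d = digits(6, 3, 2);
    System.out.println(d[0] + " " + d[1]); // 0 2  (6 is "20" in base 3)
  }
}
```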
Confused? Here's how it looks as a table for converting the enumeration value 6 to a binding from our previous base 3 example:
| Iteration | n | n % 3 | n / 3 |
|---|---|---|---|
| 0 | 6 | 0 | 2 |
| 1 | 2 | 2 | 0 |
Note that the value of n / 3 always becomes the value of n in the next row. Reading the n % 3 column from bottom to top we see "20", which is 6 in base 3.
Now associate each value in the set with a fixed value between 0 and |S| - 1. In our case, 0 maps to a, 1 maps to b, and 2 maps to c.
The base-3 digit is identified by the iteration number. The value of n % 3 in the zeroth iteration gives the 0th-significant digit, from which we derive the value of x: here, 0, which maps to a. The value of n % 3 in the first iteration gives the 1st-significant digit, from which we derive the value of y: here, 2, which maps to c. So, enumeration value 6 binds value a to identifier x and value c to identifier y.
You should consult the original table and try some other values to convince yourself that our algorithm successfully calculates the binding.
Now to implement it!
First, add another field to the BindingGenerator class:
```java
class BindingGenerator implements Iterator<Environment>,
                                  Iterable<Environment> {
  private final List<Token> vars;
  private final List<Object> set;
  private final Environment parent;
  private int enumerationIndex = 0;
```
Then, implement hasNext(). This is required by the Iterator<> interface and returns true if there are additional un-enumerated combinations. In our case, we test that enumerationIndex is strictly less than |S|^|I|:
```java
@Override
public boolean hasNext() {
  return enumerationIndex < Math.pow(set.size(), vars.size());
}
```
Now for the main event.
Here's the strikingly simple code implementing the change-of-base algorithm in the next() method required by the Iterator<> interface:
```java
@Override
public Environment next() {
  int current = enumerationIndex++;
  Environment bindings = new Environment(parent);
  for (Token var : vars) {
    bindings.define(var, set.get(current % set.size()));
    current /= set.size();
  }

  return bindings;
}
```
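To convince ourselves this counting scheme really does enumerate every combination, here is a standalone sketch of the same algorithm over plain strings (independent of our Environment and Token classes; names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

public class Enumerate {
  // Enumerate every binding of numVars identifiers to elements of set,
  // by counting through all |set|^numVars numbers in base |set|.
  // Each binding is rendered as a string of its chosen elements.
  static List<String> bindings(List<String> set, int numVars) {
    List<String> result = new ArrayList<>();
    int total = (int) Math.pow(set.size(), numVars);
    for (int n = 0; n < total; n++) {
      int current = n;
      StringBuilder binding = new StringBuilder();
      for (int v = 0; v < numVars; v++) {
        binding.append(set.get(current % set.size())); // digit v's value
        current /= set.size();
      }
      result.add(binding.toString());
    }
    return result;
  }

  public static void main(String[] args) {
    // Two identifiers over {a, b, c}: 3^2 = 9 distinct bindings.
    System.out.println(bindings(List.of("a", "b", "c"), 2));
  }
}
```

Every one of the nine combinations appears exactly once, mirroring the table from earlier in the chapter.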
And we're done! You can now run your interpreter and enjoy the full power of functions & operators with parameters, along with the new universal & existential quantification expressions!