======= Functions, Operators, and Parameters ======= We now rejoin //Crafting Interpreters// for [[https://craftinginterpreters.com/functions.html|Chapter 10]]. The chapter is titled //Functions//, but TLA⁺ actually has two related concepts to handle here: functions, and operators. It's worth taking some time to conceptually delineate them. //Functions// in TLA⁺ are what in other languages are called dictionaries, maps, or associative arrays. They are //values//, which we'll implement in the interpreter as instances of ''Map''. They have a defined domain, and attempting to apply a function to a value outside of its domain results in a runtime error. In contrast, TLA⁺ //operators// are more similar to macros: they don't have a defined domain, and the body of a TLA⁺ operator is whatever results from replacing parameter references with whatever expression was provided for them. If that replacement process results in a totally nonsensical expression, well, that's the user's fault and we raise a runtime error. We [[creating:statements|previously]] implemented the ability to directly bind values to identifiers, which in this chapter will be recognized as zero-parameter operators. Functions have not yet been defined in any form, but will be later in this tutorial. ====== Section 10.1: Function Calls ====== [[https://craftinginterpreters.com/functions.html#function-calls|Section 10.1]] takes us through adding calling syntax to the parser. We have two separate calling syntaxes to support: function application, which use square brackets like ''f[x]'', and operator calls, which use parentheses like ''op(x)''. Let's go over function application first. Similar to the book, we add a new expression type in the ''GenerateAst'' class ''main()'' method, which we'll call ''FnApply'': defineAst(outputDir, "Expr", Arrays.asList( "Binary : Expr left, Token operator, Expr right", "FnApply : Expr fn, Token bracket, Expr argument", "Grouping : Expr expression", For simplicity we restrict function application to a single parameter. While the full TLA⁺ language does support constructing & calling functions with multiple parameters (underneath, bundled into a tuple), this feature is easily replicated by nesting single-parameter functions within functions - which we will support. To support operator calls, we actually augment our existing ''Expr.Variable'' class with a list of arguments: defineAst(outputDir, "Expr", Arrays.asList( "Binary : Expr left, Token operator, Expr right", "FnApply : Expr fn, Token bracket, Expr argument", "Grouping : Expr expression", "Literal : Object value", "Variable : Token name, List arguments", "Unary : Token operator, Expr expr", While functions in TLA⁺ are values that can be passed around and constructed, operators have to be directly mentioned by name. That is why ''FnApply'' has an ''Expr'' instance to derive the function to apply, while operators use ''Variable'' which has a ''Token'' instance to record the operator name. Operators can accept multiple arguments. To parse function application we need to splice a method into our recursive descent precedence chain. At the top of ''operatorExpression()'' in the ''Parser'' class, replace the call to ''primary()'' with a call to ''call()'': private Expr operatorExpression(int prec) { if (prec == 16) return call(); Operator op; Then, define the ''call()'' method similar to the book (differences highlighted): private Expr call() { Expr expr = primary(); while (match(LEFT_BRACKET)) { Expr argument = expression(); consume(RIGHT_BRACKET, "Require ']' to conclude function call"); expr = new Expr.FnApply(expr, previous(), argument); } return expr; } Our parsing task is simpler so we don't need a separate ''finishCall()'' method as suggested by the book. You'll note that parsing function application is strikingly similar to parsing associative postfix operators. For operator calling, we augment our handling of identifiers in ''primary()'': return new Expr.Literal(previous().literal); } if (match(IDENTIFIER)) { Token identifier = previous(); List arguments = new ArrayList<>(); if (match(LEFT_PAREN)) { do { arguments.add(expression()); } while (match(COMMA)); consume(RIGHT_PAREN, "Require ')' to conclude operator call"); } return new Expr.Variable(identifier, arguments); } if (match(LEFT_PAREN)) { This ''do''/''while'' loop is very similar to our existing set literal parsing code for handling comma-separated expressions, so it should be familiar to you. ===== Subsection 10.1.2: Interpreting function calls ===== In [[https://craftinginterpreters.com/functions.html#interpreting-function-calls|this section]] we learn how to interpret our shiny new call syntax. First, add some imports to the top of ''Interpreter.java'': import java.util.Set; import java.util.HashSet; import java.util.ArrayList; import java.util.List; import java.util.Map; import java.util.HashMap; import java.io.PrintStream; Implement the ''Interpreter'' class visitor method for ''Expr.FnApply'': @Override public Object visitFnApplyExpr(Expr.FnApply expr) { Object callee = evaluate(expr.fn); Object argument = evaluate(expr.argument); Map function = (Map)callee; return function.get(argument); } Now we can apply functions, although we can't yet define them. Next up is operators. Here we completely rewrite our ''visitVariableExpr()'' method in the ''Interpreter'' class to resemble the ''visitCallExpr()'' method from the book: @Override public Object visitVariableExpr(Expr.Variable expr) { Object callee = environment.get(expr.name); List arguments = new ArrayList<>(); for (Expr argument : expr.arguments) { arguments.add(evaluate(argument)); } TlaCallable operator = (TlaCallable)callee; return operator.call(this, arguments); } We write our own version of the ''LoxCallable'' interface from the book, which we'll call ''TlaCallable'': package tla; import java.util.List; interface TlaCallable { Object call(Interpreter interpreter, List arguments); } We can now call operators, although cannot yet define operators to call! ===== Subsection 10.1.3: Call type errors ===== In ''visitFnApply()'', we should validate that our ''callee'' is actually a function: @Override public Object visitFnApplyExpr(Expr.FnApply expr) { Object callee = evaluate(expr.fn); checkFunctionOperand(expr.bracket, callee); Object argument = evaluate(expr.argument); Map function = (Map)callee; This requires another validation helper, ''checkFunctionOperand()''. Put it down with the rest of the validation helpers: private void checkFunctionOperand(Token operator, Object operand) { if (operand instanceof Map) return; throw new RuntimeError(operator, "Operand must be a function."); } We should also check that the function is actually defined on the given argument, and provide a useful error message if not: Object argument = evaluate(expr.argument); Map function = (Map)callee; if (!function.containsKey(argument)) { throw new RuntimeError(expr.bracket, "Cannot apply function to element outside domain: " + argument.toString()); } return function.get(argument); } In ''visitVariableExpr()'', similar to the book we need to check whether ''callee'' is an instance of ''TlaCallable''. However, unlike the book this is not an error! It is simply an identifier bound to a concrete value in the current ''environment'', and we should immediately return that value: @Override public Object visitVariableExpr(Expr.Variable expr) { Object callee = environment.get(expr.name); if (!(callee instanceof TlaCallable)) { return callee; } List arguments = new ArrayList<>(); This duality of identifiers either being operators to be evaluated or already-bound values is a bit confusing at times, but we can handle it. If the user tries to provide arguments to one of these concrete values then we should raise an error: if (!(callee instanceof TlaCallable)) { if (!expr.arguments.isEmpty()) { throw new RuntimeError(expr.name, "Cannot give arguments to non-operator identifier."); } return callee; } ===== Subsection 10.1.4: Checking arity ===== If ''callee'' //is// an operator, we need to [[https://craftinginterpreters.com/functions.html#checking-arity|check arity]] - that the number of arguments provided is as expected: TlaCallable operator = (TlaCallable)callee; if (arguments.size() != operator.arity()) { throw new RuntimeError(expr.name, "Expected " + operator.arity() + " arguments but got " + arguments.size() + "."); } return operator.call(this, arguments); Which requires adding a new method to the ''TlaCallable'' interface: interface TlaCallable { int arity(); Object call(Interpreter interpreter, List arguments); For functions, we don't need to check arity because the single-argument restriction is enforced at the parser level. ====== Section 10.2: Native Functions ====== In [[https://craftinginterpreters.com/functions.html#native-functions|Section 10.2]] we learn how to implement primitive/native functions. We've already implemented all our native TLA⁺ functions in the //[[creating:evaluation|Evaluating Constant TLA⁺ Expressions]]// chapter, but we will follow along to add a crucial field to the ''Interpreter'' class: a global environment. Due to the ''replMode'' parameter we have to put the initialization logic in the constructor (new code highlighted): class Interpreter implements Expr.Visitor, Stmt.Visitor { final Environment globals; private Environment environment; private final PrintStream out; public Interpreter(PrintStream out, boolean replMode) { this.globals = new Environment(replMode); this.environment = this.globals; this.out = out; } Note that the ''globals'' field has package visibility, not class visibility. The use of this field will become clear when implementing the ''TlaCallable.call()'' interface method. ====== Section 10.3: Function Declarations ====== We can call functions & operators, now let's [[https://craftinginterpreters.com/functions.html#function-declarations|define them]]! Operators take less work so they come first. In the ''Parser'' class, we have our existing ''isAtOpDefStart()'' and ''operatorDefinition()'' methods which we will now extend to accept parameters. We want to parse operator definitions like ''op(x, y, z) == expr'' and also ''op == expr''. ''op() == expr'' is invalid syntax in TLA⁺. We will use the now-familiar ''do''/''while'' method of consuming one or more comma-separated values. Recall that ''isAtOpDefStart()'' is a lookahead method to determine whether the upcoming token sequence is an operator definition, while ''operatorDefinition()'' actually parses that same token sequence into an AST node. private boolean isAtOpDefStart() { if (!match(IDENTIFIER)) return false; if (match(LEFT_PAREN)) { do { if (!match(IDENTIFIER)) return false; } while (match(COMMA)); if (!match(RIGHT_PAREN)) return false; } return match(EQUAL_EQUAL); } private Stmt operatorDefinition() { Token name = consume(IDENTIFIER, "Name required for operator definition."); List params = new ArrayList<>(); if (match(LEFT_PAREN)) { do { params.add(consume(IDENTIFIER, "Identifier required as operator parameter.")); } while (match(COMMA)); consume(RIGHT_PAREN, "')' required to terminate operator parameters."); } consume(EQUAL_EQUAL, "'==' required for operator definition."); return new Stmt.OpDef(name, params, expression()); } Our changes to ''operatorDefinition()'' require modifying the definition of ''Stmt.OpDef'' in the ''GenerateAst'' class to accept parameters: defineAst(outputDir, "Stmt", Arrays.asList( "Print : Expr expression", "OpDef : Token name, List params, Expr body" )); We can now parse operators with parameters! On to functions. This requires defining a new expression type in the ''GenerateAST'' class: defineAst(outputDir, "Expr", Arrays.asList( "Binary : Expr left, Token operator, Expr right", "QuantFn : Token op, List params, Expr set, Expr body", "FnApply : Expr fn, Token bracket, Expr argument", This looks a bit abstract. What does ''QuantFn'' mean? And why do we allow multiple parameters when our functions only take one? This is because we'll be bundling multiple closely-related language constructs into this single AST node type: function constructors, universal quantifiers, and existential quantifiers. In TLA⁺, functions are constructed with the ''|->'' operator. So you can write ''[x \in 0 .. 2 |-> x + 1]'' to define a function mapping the values 0, 1, and 2 to 1, 2, and 3 respectively. TLA⁺ also has very useful universal & existential quantification syntax. You can write ''\A x \in 0 .. 2 : x < 3'' to assert that all elements of the set ''{0, 1, 2}'' are less than three, or ''\E x, y \in 0 .. 2 : x + y = 4'' to assert that there are two elements of the set ''{0, 1, 2}'' that together sum to four. We can see the commonality between these; they all have: - A token identifying the operator (''|->'', ''\A'', or ''\E'') - A set to draw elements from (''0 .. 2'') - Some number of identifiers to which to bind values from the set (''x'', ''y'') - An expression body in which the identifiers can be used So we can contain all of them in the ''QuantFn'' expression node, which means something like "quantifier function". Here's how we parse the function constructor syntax. Add this to the ''primary()'' method in the ''Parser'' class, below the finite set literal code: if (match(LEFT_BRACKET)) { List params = new ArrayList<>(); params.add(consume(IDENTIFIER, "Identifier required in function constructor.")); consume(IN, "'\\in' required in function constructor."); Expr set = expression(); Token op = consume(ALL_MAP_TO, "'|->' required in function constructor."); Expr expr = expression(); consume(RIGHT_BRACKET, "']' required to conclude function constructor."); return new Expr.QuantFn(op, params, set, expr); } While the ''Expr.QuantFn'' constructor accepts a list of parameters, we only parse one. The parser code for universal & existential quantification looks similar, but we allow multiple parameters; put this in ''primary()'' just below the function constructor parsing logic: if (match(FOR_ALL, EXISTS)) { Token op = previous(); List params = new ArrayList<>(); do { params.add(consume(IDENTIFIER, "Identifier required in quantifier.")); } while (match(COMMA)); consume(IN, "'\\in' required in quantifier."); Expr set = expression(); consume(COLON, "':' required in quantifier."); Expr expr = expression(); return new Expr.QuantFn(op, params, set, expr); } And that concludes function parsing! We can now define both operators and functions with parameters; on to evaluation. ====== Section 10.4: Function Objects ====== In [[https://craftinginterpreters.com/functions.html#function-objects|Section 10.4]] we write a class implementing the ''TlaCallable'' interface and hook our function & operator definitions up in the interpreter. Create a new file called ''TlaOperator.java'': package tla; import java.util.List; class TlaOperator implements TlaCallable { private final Stmt.OpDef declaration; TlaOperator(Stmt.OpDef declaration) { this.declaration = declaration; } } The implementation of the ''call()'' method is exactly as defined in the book: @Override public Object call(Interpreter interpreter, List arguments) { Environment environment = new Environment(interpreter.globals); for (int i = 0; i < declaration.params.size(); i++) { environment.define(declaration.params.get(i).lexeme, arguments.get(i)); } interpreter.executeBlock(declaration.body, environment); return null; } Here we see the usage of ''interpreter.globals''; we define the argument values within a new ''Environment'' instance, but the parent of that ''Environment'' is the root global instance instead of the current instance. This is so nested operator calls do not propagate their argument values to the nested callee. This method calls a nonexistent ''executeBlock()'' method which we will define in ''Interpreter'': Object executeBlock(Expr expr, Environment environment) { Environment previous = this.environment; try { this.environment = environment; return evaluate(expr); } finally { this.environment = previous; } } In the book, ''executeBlock()'' was defined back in [[https://craftinginterpreters.com/statements-and-state.html#block-syntax-and-semantics|Chapter 8]] and looked quite a bit different. Since TLA⁺ is not an imperative language with code blocks in the traditional sense, we define ''executeBlock()'' over expressions instead of lists of statements. Wrapping the evaluation in a ''try''/''finally'' ensures that if a runtime exception is thrown our environment is not left in an invalid state. Back in ''TlaOperator'', we still need to implement the ''arity()'' interface method: @Override public int arity() { return declaration.params.size(); } For ease of use we'll also override ''toString()'': @Override public String toString() { return ""; } There is only one more thing to do and then parameterized operators will be fully functional (operational?): modify ''visitOpDefStmt()'' in ''Interpreter'' so that it defines new ''TlaOperator'' instances instead of evaluating the statement body directly: @Override public Void visitOpDefStmt(Stmt.OpDef stmt) { TlaOperator op = new TlaOperator(stmt); environment.define(stmt.name, op); return null; } And that takes care of operators! Now to handle functions. We still need to implement our ''visitQuantFnExpr()'' method in the ''Interpreter''; similar to other operator evaluation methods like ''visitBinaryExpr()'' we evaluate what we can (here, the set to be quantified over) then branch on the operator type: @Override public Object visitQuantFnExpr(Expr.QuantFn expr) { Object set = evaluate(expr.set); checkSetOperand(expr.op, set); switch (expr.op.type) { case ALL_MAP_TO: { } case FOR_ALL: { } case EXISTS: { } default: { // Unreachable. return null; } } } Thinking about it, the code to implement these operators is fairly tricky. We want to generate a series of bindings of set elements to identifiers. Let's fill out the ''FOR_ALL'' case to get a sense of how we want this to look; note that like our conjunction implementation in the previous chapter, universal quantification is short-circuiting and returns false as soon as it encounters a counterexample to the body predicate: } case FOR_ALL: { for (Environment binding : bindings) { Object result = executeBlock(expr.body, binding); checkBooleanOperand(expr.op, result); if (!(Boolean)result) return false; } return true; } case EXISTS: { This is nice, concise code. We iterate over a series of bindings - in the form of ''Environment'' instances - which enumerate all possible bindings of set values to identifiers. But what is ''bindings''? This is one of the more algorithmically tricky parts of this entire tutorial series, but it is also very fun! We are going to encapsulate the binding generation logic in a new class file, ''BindingGenerator.java'', where the class implements both the ''Iterator'' and ''Iterable'' standard Java interfaces: package tla; import java.util.ArrayList; import java.util.Iterator; import java.util.List; import java.util.Set; class BindingGenerator implements Iterator, Iterable { private final List vars; private final List set; private final Environment parent; BindingGenerator(List vars, Set set, Environment parent) { this.vars = vars; this.set = new ArrayList<>(set); this.parent = parent; } @Override public Iterator iterator() { return this; } } Implementing the ''Iterator<>'' and ''Iterable<>'' interfaces allows us to treat ''BindingGenerator'' instances as things that can be iterated over in Java's nice [[https://docs.oracle.com/javase/8/docs/technotes/guides/language/foreach.html|for-each loop]] syntax. So that's what ''bindings'' is! A ''BindingGenerator'' instance! Before getting too far into the weeds with the ''BindingGenerator'' iterator algorithm, let's just assume it works and fill out the rest of ''visitQuantFnExpr()''. Construct a ''BindingGenerator'' instance atop the ''switch'' statement: public Object visitQuantFnExpr(Expr.QuantFn expr) { Object set = evaluate(expr.set); checkSetOperand(expr.op, set); BindingGenerator bindings = new BindingGenerator(expr.params, (Set)set, environment); switch (expr.op.type) { One important thing to note is that unlike in ''TlaOperator'', we give ''BindingGenerator'' the //current// environment instead of the root global environment; this is because if we are evaluating an ''Expr.QuantFn'' instance inside an operator, the body expression should be able to access definitions from the operator's environment, like: op(x) == \A y \in 0 .. 2 : y < x Here's how we implement the ''EXISTS'' case; unlike universal quantification, existential quantification is not short-circuiting in TLA⁺ (for reasons that will become clear in the next chapter): } case EXISTS: { boolean result = false; for (Environment binding : bindings) { Object junctResult = executeBlock(expr.body, binding); checkBooleanOperand(expr.op, junctResult); result |= (Boolean)junctResult; } return result; } default: { Finally, here's how we implement the ''ALL_MAP_TO'' case; we want to construct a ''Map'' instance recording each value the single parameter is mapped to: case ALL_MAP_TO: { Token param = expr.params.get(0); Map function = new HashMap<>(); for (Environment binding : bindings) { function.put(binding.get(param), executeBlock(expr.body, binding)); } return function; } case FOR_ALL: { ====== Set Enumeration ====== Now that we've finished all that, let's get to the real difficult part: implementing an enumeration algorithm in ''BindingGenerator''. A moment of consideration is in order. If you had to bind a set of variables to every combination of values from a set, how would you do it? To computer scientists, the connection between enumerating every possible combination of values and incrementing numbers in an arbitrary base is readily made. For example, imagine you had a set of three values ''{a, b, c}'' that you wanted to bind to two identifiers ''x'' and ''y''. This is isomorphic to iterating through all possible values of a two-digit number in base three. The number of digits corresponds to the number of variables to bind, and the base corresponds to the number of elements in the set. This is confusing, but we will walk through it step by step. First we must assign an arbitrary order to the set so we can map single digits in base three to an element of the set: ^ Digit ^ Element ^ | 0 | a | | 1 | b | | 2 | c | Using that mapping, here's a table showing how counting up through all two-digit base three numbers decomposes into binding all possible combinations of values to ''x'' and ''y'': ^ Number (base 3) ^ As set elements ^ x (0th-significant digit) ^ y (1st-significant digit) ^ | 00 | aa | a | a | | 01 | ab | b | a | | 02 | ac | c | a | | 10 | ba | a | b | | 11 | bb | b | b | | 12 | bc | c | b | | 20 | ca | a | c | | 21 | cb | b | c | | 22 | cc | c | c | As you can see, we generated every possible binding of ''{a, b, c}'' to ''x'' and ''y''! Each identifier is assigned a particular digit in the number, and their value in a binding is given by that digit's value in the enumeration. We're almost there, but not quite; Java doesn't have the built-in ability to iterate through numbers in a particular base. We will need to increment a regular ''int'' counter and //convert// it into the desired base, then extract the value of each digit. Actually the well-known [[https://en.wikipedia.org/wiki/Positional_notation#Base_conversion|base conversion]] algorithm lets us do both at once! It is quite elegant; given a number ''n'' and a desired base ''b'', the value of the least-significant digit can be found by calculating ''n % b'' (''%'' meaning mod, giving the division remainder). Then, assign ''n' = n / b'' (using integer division, so throw away the remainder) and repeat for as many digits as desired. Confused? Here's how it looks as a table for converting the enumeration value 6 to a binding from our previous base 3 example: ^ Iteration ^ ''n'' ^ ''n % 3'' ^ ''n / 3'' ^ | 0 | 6 | 0 | 2 | | 1 | 2 | 2 | 0 | Note that the value of ''n / 3'' always becomes the value of ''n'' in the next row. Reading the ''n % 3'' column from bottom to top we see "20", which is 6 in base 3. Consulting our tables, we see "20" should map to ''ca''. ''x'' gets the 0th (least significant) digit, so it should get ''a''. ''y'' gets the 1st-significant digit, so it should get ''c''. The iteration number actually corresponds to digit significance. So, in iteration 0, ''x'' should get the value of the ''n % 3'' column, which we see is 0 and thus ''a'' as expected. In iteration 1, ''y'' should get the value of the ''n % 3'' column, which we see is 2 and thus ''c'' as expected. If this is still too abstract, try some other values to convince yourself that our algorithm successfully calculates the binding! Now to implement it. First, add another field to the ''BindingGenerator'' class: class BindingGenerator implements Iterator, Iterable { private final List vars; private final List set; private final Environment parent; private int enumerationIndex = 0; Then, implement ''hasNext()''. This is required by the ''Iterator<>'' interface and returns true if there are additional un-enumerated combinations. In our case, we test that ''enumerationIndex'' is strictly less than the number of elements in the set raised to the power of the number of identifiers. For ''n'' digits in base ''b'', there are ''bⁿ'' possible values to iterate through. @Override public boolean hasNext() { return enumerationIndex < Math.pow(set.size(), vars.size()); } Now for the main event. Here's the strikingly simple code implementing the change-of-base algorithm in the ''next()'' method required by the ''Iterator<>'' interface: @Override public Environment next() { int current = enumerationIndex++; Environment bindings = new Environment(parent); for (Token var : vars) { bindings.define(var, set.get(current % set.size())); current /= set.size(); } return bindings; } Note that ''enumerationIndex++'' retrieves the current value of ''enumerationIndex'' to store in ''current'' and then increments the value of ''enumerationIndex'' itself. A nice little terse code trick. And we're done! You can now run your interpreter and enjoy the full power of functions & operators with parameters, along with the new universal & existential quantification expressions! See the current expected state of your source code [[https://github.com/tlaplus/devkit/tree/main/7-operators|here]]. On to the next chapter: [[creating:actions|Variables, States, and Actions]]! ====== Challenges ====== - In the full TLA⁺ language, quantified functions can be even more complicated; for example, you can write ''\A x, y \in 0 .. 2, z \in 0 .. 5 : P(x, y, z)''. Modify the definition of ''Expr.QuantFn'' and your parsing code to handle this syntax. - Modify the change-of-base algorithm to handle enumerating multiple sets of varying cardinality, as required to interpret the expressions parsed in the first challenge. - Our enumeration algorithm is a "depth-first" enumeration: all elements of one set will be enumerated before enumerating a second element of any of the other sets. Come up with an enumeration algorithm that enumerates sets in breadth-first order instead, so the enumeration progresses evenly for each set. - Often, sets in TLA⁺ models are very large. Instead of immediately evaluating sets constructed with the ''..'' infix operator and holding them in memory, write a class that lazily generates set elements as requested. Integrate this lazy evaluation with the breadth-first binding enumeration from the previous challenge. [[creating:jlists|< Previous Page]] | [[creating:start#table_of_contents|Table of Contents]] | [[creating:actions|Next Page >]]