creating:syntax

This is an old revision of the document!


PHP's gd library is missing or unable to create PNG images

Parsing TLA⁺ Syntax

TLA⁺ is a large, complicated, and ambiguous language that takes a huge amount of work to parse correctly & completely. This is assisted by a comprehensive test corpus exercising the language's nooks & crannies, but here we will make life easy on ourselves. This tutorial only uses a minimal subset of TLA⁺ syntax, just enough to handle the classic DieHard spec. While that may seem limiting, this tutorial tries to focus on the difficult & interesting parts of parsing the language instead of more mundane drudgework like handling all hundred-some user-definable operator symbols. You are encouraged to extend this minimal core as you wish; language tooling is best developed incrementally! Slowly filling in the details of this rough language sketch has a satisfying meditative aspect.

Here is what our minimal language subset includes:

  • The ---- MODULE Name ---- header and ==== footer
  • Named parameterized operator definitions like op == ... and op(x, y, z) == ...
  • Comments, both single-line like \* comment text and nestable multi-line like (* comment text *)
  • Declaration of VARIABLES
  • The ENABLED prefix operator
  • Finite set literals, like {1, 2, 3, 4, 5}
  • Some infix operators: \in, .., =, #, +, -, <, /\, and \/
  • The variable-priming suffix operator
  • Vertically-aligned conjunction & disjunction lists
  • The IF/THEN/ELSE construct
  • Natural numbers like 0, 5, 10, etc.
  • Boolean values TRUE and FALSE

Notably, the following will not be covered in this tutorial:

  • PlusCal
  • Declaring CONSTANTS
  • Nesting modules or importing other modules through EXTENDS or INSTANCE
  • Set map & set filter syntax, like {e \in S : P(e)}
  • Universal & existential quantification, like \E e \in S : P(e)
  • The CHOOSE operator
  • Functions, like [x \in Nat |-> 2*x] or the function EXCEPT syntax
  • Sequences, like <<1, 2, 3, 4, 5>>
  • The LET/IN construct
  • Temporal- or action-level operators like [], <>, [A]_v, or WF_v(A)
  • Higher-order operator parameters like op(F(_), x) == ...
  • The TLA⁺ proof language

As outlined above, you are free to add these missing features (or others) as you wish.

Read the first three chapters of free online textbook Crafting Interpreters by Robert Nystrom. We will be closely following the material in this book, modifying it to our uses. The first two chapters are a nice introduction and overview of language implementation generally. Chapter three specifies a toy language called Lox, to be used as an object of study for the remainder of the book. Our minimal TLA⁺ subset has some similarity to Lox with regard to expressions involving integers and booleans, but also differences - you can skip the section on closures (unless you want to implement higher-order operator parameters yourself) and the section on classes. What's important is that it's similar enough for all the fundamentals to still apply!

  • creating/syntax.1742853688.txt.gz
  • Last modified: 2025/03/24 22:01
  • by ahelwer