creating:jlists

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
creating:jlists [2025/04/27 21:44] – Finished how to handle infix op ambiguity ahelwercreating:jlists [2025/04/27 21:57] (current) – Finished first draft ahelwer
Line 188: Line 188:
 Now we write a critical method. Now we write a critical method.
 In ''JListContext'', define ''isNewBullet()'' to return true if the given token could act as the start of a new junct in the current jlist: In ''JListContext'', define ''isNewBullet()'' to return true if the given token could act as the start of a new junct in the current jlist:
-<code haskell>+<code java>
   public boolean isNewBullet(Token op) {   public boolean isNewBullet(Token op) {
-    JListInfo current = this.stack.peekFirst();+    JListInfo current = stack.peekFirst();
     return current != null     return current != null
         && current.type == op.type         && current.type == op.type
Line 233: Line 233:
   /\ 1   /\ 1
   /\ 2   /\ 2
-  /\ 3 
 </code> </code>
 The parser will: The parser will:
Line 246: Line 245:
 If the call to ''jlists.isNewBullet()'' were not present, ''matchOp()'' would have matched the next ''/\'' as just another infix operator token. If the call to ''jlists.isNewBullet()'' were not present, ''matchOp()'' would have matched the next ''/\'' as just another infix operator token.
 But we pre-empted it! But we pre-empted it!
-So now ''/\'' and ''\/'' tokens will only be parsed as infix operator tokens when they would not be the start of a new junct in a currently-active jlist.+So now ''/\'' and ''\/'' tokens will only be parsed as infix operator tokens when they are not new juncts in a currently-active jlist.
  
 ====== Termination ====== ====== Termination ======
Line 258: Line 257:
   /\ 3   /\ 3
 </code> </code>
-Our code parses this as though it's:+Our code parses this as though it'actually:
 <code haskell> <code haskell>
 op == op ==
Line 284: Line 283:
 <code java> <code java>
   public boolean isAboveCurrent(Token tok) {   public boolean isAboveCurrent(Token tok) {
-    JListInfo current = this.stack.peekFirst();+    JListInfo current = stack.peekFirst();
     return current == null || current.column < tok.column;     return current == null || current.column < tok.column;
   }   }
Line 321: Line 320:
  
 Talk of parsing errors nicely segues us onto the topic of error recovery. Talk of parsing errors nicely segues us onto the topic of error recovery.
 +Recall that on error, we call ''synchronize()'' in the ''Parser'' class and desperately churn through tokens looking for the start of an operator definition.
 +Jlists complicate this a bit!
 +What happens if an error occurs while parsing a jlist and we enter ''synchronize()'' carrying around a bunch of state in the jlists stack?
 +Well, nonsensical things happen.
 +To fix this we just wipe out our jlist stack at the top of ''synchronize()'':
 +<code java [highlight_lines_extra="2"]>
 +  private void synchronize() {
 +    jlists.dump();
 +    advance();
 +
 +    while (!isAtEnd()) {
 +</code>
 +
 +This calls a new helper we'll define in ''JListContext'':
 +<code java>
 +  public void dump() {
 +    stack.clear();
 +  }
 +</code>
  
 +Done.
 +You've successfully parsed vertically-aligned conjunction & disjunction lists in TLA⁺!
 +This puts you in rarified air.
 +Only a handful of people in the world possess this knowledge, and now you are among them.
 +If your code got out of sync during this tutorial, you can find its expected state [[https://github.com/tlaplus/devkit/tree/main/6-jlists|here]].
 +Continue on the [[creating:operators|next page]], where we learn to parse operators with parameters!
  
 ====== Challenges ====== ====== Challenges ======
Line 330: Line 354:
   - It's tempting to summarize this chapter as us solving the jlist parsing problem by making jlists have higher precedence than infix operators, but that is not quite the case. Think carefully about what precedence means; is there a difference between what might be called //lexical// precedence - where one interpretation of a token takes higher precedence than another - and parsing precedence? Did we make use of that here? What are some ways that parsers can deal with the problem of the same token having multiple possible meanings?   - It's tempting to summarize this chapter as us solving the jlist parsing problem by making jlists have higher precedence than infix operators, but that is not quite the case. Think carefully about what precedence means; is there a difference between what might be called //lexical// precedence - where one interpretation of a token takes higher precedence than another - and parsing precedence? Did we make use of that here? What are some ways that parsers can deal with the problem of the same token having multiple possible meanings?
   - Jlists are not the only context-sensitive language construct in TLA⁺. Nested proof steps are another. Take some time to read the [[https://lamport.azurewebsites.net/tla/tla2-guide.pdf|TLA⁺ 2 language spec (pdf)]] and think about what makes proof syntax context-sensitive and how you might parse it if you had to.   - Jlists are not the only context-sensitive language construct in TLA⁺. Nested proof steps are another. Take some time to read the [[https://lamport.azurewebsites.net/tla/tla2-guide.pdf|TLA⁺ 2 language spec (pdf)]] and think about what makes proof syntax context-sensitive and how you might parse it if you had to.
 +  - Write unit tests for your jlist parsing code. Think of every weird jlist case you can. Look at [[https://github.com/tlaplus/rfcs/blob/2a772d9dd11acec5d7dedf30abfab91a49de48b8/language_standard/tests/tlaplus_syntax/conjlist.txt|this standard test file]] for inspiration.
   - If you are familiar with the [[https://youtu.be/IycOPFmEQk8|pumping lemma for context-free languages]], use it to prove that TLA⁺ with jlists is not context-free.   - If you are familiar with the [[https://youtu.be/IycOPFmEQk8|pumping lemma for context-free languages]], use it to prove that TLA⁺ with jlists is not context-free.
   - Most courses in formal languages skip directly from context-free grammars to Turing machines, but this misses a number of automata of intermediate power. See whether it is possible to use [[https://cs.stackexchange.com/a/170282/110483|context-free grammars with storage]] to concisely express a formal grammar for TLA⁺ that includes jlists. The viability of this is unknown; share your results with [[https://groups.google.com/g/tlaplus|the mailing list]] if accomplished!   - Most courses in formal languages skip directly from context-free grammars to Turing machines, but this misses a number of automata of intermediate power. See whether it is possible to use [[https://cs.stackexchange.com/a/170282/110483|context-free grammars with storage]] to concisely express a formal grammar for TLA⁺ that includes jlists. The viability of this is unknown; share your results with [[https://groups.google.com/g/tlaplus|the mailing list]] if accomplished!
  • creating/jlists.1745790245.txt.gz
  • Last modified: 2025/04/27 21:44
  • by ahelwer