Cheatsheet for the Lark parsing toolkit

Lark Options

parser="earley"
Earley - default
parser="lalr"
LALR(1)
debug=True
Enable debug prints
lexer="standard"
Revert to simple lexer (renamed "basic" in Lark 1.0)
ambiguity='explicit'
Return all derivations for Earley
start="foo"
Set starting rule
cache=True
Enable grammar caching
transformer=...
Apply transformer to tree (for LALR)
propagate_positions
Fill tree instances with line number information
maybe_placeholders
[x] yields None in the tree when x is not matched
keep_all_tokens=True
Don't remove unnamed terminals
postlex
Provide a wrapper for the lexer
tree_class
Provide an alternative for Tree
regex=True
Use the regex module

Tree Reference

tree.data
Rule name
tree.children
Rule matches
tree.meta
Positional information, if enabled
print(tree.pretty())
tree.iter_subtrees()
Iterate all subtrees
tree.find_data("foo")
Find by rule
tree.find_pred(...)
Find by predicate
tree1 == tree2

Token Reference

token.type
Terminal name
token.value
Matched text
token.pos_in_stream
Index in source text (renamed token.start_pos in Lark 1.0)
token.line
token.column
token.end_line
token.end_column
token.end_pos
len(token)
Tokens inherit from str, so all string operations are valid (such as token.upper()).
 

Grammar Definitions

rule: ...
Define a rule
TERM: ...
Define a terminal
rule.n: ...
Rule with priority n
TERM.n: ...
Terminal with priority n
// text
Comment
%ignore ...
Ignore terminal in input
%import ...
Import a terminal or rule from another grammar file
%declare TERM
Declare a terminal without a pattern (used for postlex)
t{p1, p2}: ...
Define template
rule: t{foo, bar}
Use template
Rules consist of values, other rules and terminals.
Terminals only consist of values and other terminals.

Grammar Patterns

foo bar
Match sequence
(foo bar)
Group together (for operations)
foo | bar
Match one or the other
foo?
Match 0 or 1 instances
[foo bar]
Match 0 or 1 instances
foo*
Match 0 or more instances
foo+
Match 1 or more instances
foo~3
Match exactly 3 instances
foo~3..5
Match between 3 and 5 instances

Terminal Atoms

"string"
String to match
"string"i
Case-insensitive string
/regexp/
Regular Expression
/re/imslux
Regular Expression with flags
"a".."z"
Literal range

Tree Shaping

rule: "foo" BAR
"foo" will be filtered out
!rule: "foo" BAR
"foo" will be kept
rule: /foo/ BAR
/foo/ will be kept
_TERM
Filter out this terminal
_rule
Always inline this rule
?rule: ...
Inline if the rule matched exactly 1 child
foo bar -> new_name
Rename this derivation
Each rule is a branch (node) in the resulting tree, and its children are its matches, in the order of matching.
Terminals (tokens) are always values in the tree, never branches.
Inlining rules means removing their branch and replacing it with their children.

Examples

// Define template for comma-separated list
cs_list{item}: item ("," item)*

// Use template to make a list of numbers
number_list: cs_list{ number }

// Example of a terminal for a Python comment
PY_COMMENT: /#[^\n]*/

// Example of a terminal for C comment
C_COMMENT: "/*" /.*?/s "*/"
       
 
