Cheatsheet for the Lark parsing toolkit

Lark Options

parser="earley"
Earley - default
parser="lalr"
LALR(1)
debug=True
Enable debug prints
lexer="basic"
Revert to the simple lexer (the option was named "standard" before Lark 1.0)
ambiguity='explicit'
Return all derivations for Earley
start="foo"
Set starting rule
cache=True
Enable grammar caching
transformer=...
Apply transformer to tree (for LALR)
propagate_positions
Fill tree instances with line number information
maybe_placeholders
[] returns None when not matched
keep_all_tokens=True
Don't remove unnamed terminals
postlex
Provide a wrapper for the lexer
tree_class
Provide an alternative for Tree
regex=True
Use the regex module

Tree Reference

tree.data
Rule name
tree.children
Rule matches
tree.meta
Positional information, if enabled
print(tree.pretty())
tree.iter_subtrees()
Iterate all subtrees
tree.find_data("foo")
Find by rule
tree.find_pred(...)
Find by predicate
tree1 == tree2

Token Reference

token.type
Terminal name
token.value
Matched text
token.start_pos
Index in source text (named pos_in_stream before Lark 1.0)
token.line
token.column
token.end_line
token.end_column
token.end_pos
len(token)
Tokens inherit from str, so all string operations are valid (such as token.upper()).
 

Grammar Definitions

rule: ...
Define a rule
TERM: ...
Define a terminal
rule.n: ...
Rule with priority n
TERM.n: ...
Terminal with priority n
// text
Comment
%ignore ...
Ignore terminal in input
%import ...
Import terminals or rules from another grammar file
%declare TERM
Declare a terminal without a pattern (used for postlex)
t{p1, p2}: ...
Define template
rule: t{foo, bar}
Use template
Rules consist of values, other rules and terminals.
Terminals only consist of values and other terminals.

Grammar Patterns

foo bar
Match sequence
(foo bar)
Group together (for operations)
foo | bar
Match one or the other
foo?
Match 0 or 1 instances
[foo bar]
Match 0 or 1 instances
foo*
Match 0 or more instances
foo+
Match 1 or more instances
foo~3
Match exactly 3 instances
foo~3..5
Match between 3 and 5 instances

Terminal Atoms

"string"
String to match
"string"i
Case-insensitive string
/regexp/
Regular Expression
/re/imslux
Regular Expression with flags
"a".."z"
Literal range

Tree Shaping

rule: "foo" BAR
"foo" will be filtered out
!rule: "foo" BAR
"foo" will be kept
rule: /foo/ BAR
/foo/ will be kept
_TERM
Filter out this terminal
_rule
Always inline this rule
?rule: ...
Inline if the rule matched exactly one child
foo bar -> new_name
Rename this derivation
Each rule is a branch (node) in the resulting tree, and its children are its matches, in the order of matching.
Terminals (tokens) are always values in the tree, never branches.
Inlining rules means removing their branch and replacing it with their children.

Examples

// Define template for comma-separated list
cs_list{item}: item ("," item)*

// Use template to make a list of numbers
number_list: cs_list{ number }

// Example of a terminal for a Python comment
PY_COMMENT: /#[^\n]*/

// Example of a terminal for C comment
C_COMMENT: "/*" /.*?/s "*/"
       
