Show Menu
Cheatography

Cheatsheet for the Lark parsing toolkit

Lark Options

parser­="ea­rle­y"
Earley - default
parser­="la­lr"
LALR(1)
debug=True
Enable debug prints
lexer="standard"
Revert to simple lexer
ambiguity='explicit'
Return all deriva­tions for Earley
start=­"­foo­"
Set starting rule
cache=True
Enable grammar caching
transf­orm­er=...
Apply transf­ormer to tree (for LALR)
propagate_positions
Fill tree instances with line number inform­ation
maybe_placeholders
[]
returns
None
when not matched
keep_a­ll_­tok­ens­=True
Don't remove unnamed terminals
postlex
Provide a wrapper for the lexer
tree_class
Provide an altern­ative for
Tree
regex=True
Use the
regex
module

Tree Reference

tree.data
Rule name
tree.c­hildren
Rule matches
tree.meta
Positional inform­ation, if enabled
print(­tre­e.p­ret­ty())
tree.i­ter­_su­btr­ees()
Iterate all subtrees
tree.find_data("foo")
Find by rule
tree.f­ind­_pr­ed(...)
Find by predicate
tree1 == tree2

Token Reference

token.type
Terminal name
token.v­alue
Matched text
token.p­os­_in­_stream
Index in source text
token.line
token.c­olumn
token.e­nd­_line
token.end_column
token.e­nd_pos
len(token)
Tokens inherit from
str
, so all string operations are valid (such as
token.u­pper()
).
 

Grammar Defini­tions

rule: ...
Define a rule
TERM: ...
Define a terminal
rule.n: ...
Rule with priority n
TERM.n: ...
Terminal with priority n
// text
Comment
%ignore ...
Ignore terminal in input
%import ...
Import terminal from file
%declare TERM
Declare a terminal without a pattern (used for postlex)
t{p1, p2}: ...
Define template
rule: t{foo, bar}
Use template
Rules consist of values, other rules and terminals.
Terminals only consist of values and other terminals.

Grammar Patterns

foo bar
Match sequence
(foo bar)
Group together (for operat­ions)
foo | bar
Match one or the other
foo?
Match 0 or 1 instances
[foo bar]
Match 0 or 1 instances
foo*
Match 0 or more instances
foo+
Match 1 or more instances
foo~3
Match exactly 3 instances
foo~3..5
Match between 3 to 5 instances

Terminal Atoms

"­str­ing­"
String to match
"­str­ing­"i
Case-i­nse­nsitive string
/regexp/
Regular Expression
/re/imslux
Regular Expression with flags
"­a".."­z­"
Literal range

Tree Shaping

rule: "­foo­" BAR
"­foo­" will be filtered out
!rule: "­foo­" BAR
"­foo­" will be kept
rule: /foo/ BAR
/foo/ will be kept
_TERM
Filter out this terminal
_rule
Always inline this rule
?rule: ...
Inline if matched 1 child
foo bar -> new_name
Rename this derivation
Rules are a branch (node) in the resulting tree, and its children are its matches, in the order of matching.
Terminals (tokens) are always values in the tree, never branches.
Inlining rules means removing their branch and replacing it with their children.

Examples

// Define template for comma-separated list
cs_list{item}: item ("," item)*

// Use template to make a list of numbers
number_list: cs_list{ number }

// Example of a terminal for a Python comment
PY_COMMENT: /#[^\n]*/

// Example of a terminal for C comment
C_COMMENT: "/" /.?/s "*/"
       
 

Comments

No comments yet. Add yours below!

Add a Comment

Your Comment

Please enter your name.

    Please enter your email address

      Please enter your Comment.

          Related Cheat Sheets

            Python 3 Cheat Sheet by Finxter