
perl regexp Cheat Sheet (DRAFT) by

This is a draft cheat sheet. It is a work in progress and is not finished yet.

20. Regular expression special variables

$1, $2, $3
hold the backreferences
$+
holds the last (highest-numbered) backreference
$&
(dollar ampersand) holds the entire regex match
$'
(dollar followed by an apostrophe or single quote) holds the part of the string after (to the right of) the regex match
$`
(dollar backtick) holds the part of the string before (to the left of) the regex match
Using $&, $` and $' is not recommended when performance matters on Perl versions before 5.20: once any of them appears anywhere in the program, Perl slows down every regex match in the entire script. Perl 5.20 and later remove most of this penalty.
All these variables are read-only and dynamically scoped; they keep their values until the next successful regex match.
 
$string = "This is the geek stuff article for perl learner";

# The * quantifiers are required; without them the pattern cannot match "geek".
$string =~ /the (g.*) stuff(.*) /;

print "Matched String=>$&\nBefore Match=>$`\nAfter Match=>$'\nLast Paren=>$+\nFirst Paren=>$1\n";
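As an alternative to $&, $` and $', Perl 5.10 and later offer the /p modifier, which saves the same three pieces for a single match only. The sketch below reuses the sample string from above; the variable names $before, $matched and $after are illustrative, not part of the original sheet.

```perl
use strict;
use warnings;

my $string = "This is the geek stuff article for perl learner";
my ($before, $matched, $after);

# /p (Perl 5.10+) keeps copies of the match pieces for this match only.
if ($string =~ /geek stuff/p) {
    $before  = ${^PREMATCH};    # like $`
    $matched = ${^MATCH};       # like $&
    $after   = ${^POSTMATCH};   # like $'
    print "before=[$before] match=[$matched] after=[$after]\n";
}
```

On older Perls this keeps the cost confined to the patterns that actually ask for it.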

Debugging regexp

use re 'taint';
# Contents of $match are tainted if $dirty was also tainted.
($match) = ($dirty =~ /^(.*)$/s);

# Allow code interpolation:
use re 'eval';
$pat = '(?{ $var = 1 })'; # embedded code execution
/alpha${pat}omega/; # won't fail unless under -T
# and $pat is tainted

use re 'debug'; # like "perl -Dr"
/^(.*)$/s; # output debugging info during
# compile time and run time

use re 'debugcolor'; # same as 'debug',
# but with colored output
 

6 Regular Expressions

($var =~ /re/), ($var !~ /re/)
matches / does not match
m/pattern/igmsoxc
matching pattern
qr/pattern/imsox
store regex in variable
s/pattern/replacement/igmsoxe
search and replace
Modifiers:
i case-insensitive
m multiline (^ and $ match at line boundaries)
o compile once
g global
c keep current position on a failed /g match
x extended
s as single line (. matches \n)
e evaluate replacement
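A minimal sketch of the operators and a few of the modifiers listed above. The sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "Apple pie, apple tart";

# qr// compiles a pattern once and stores it in a variable.
my $re = qr/apple/i;

# m//g in list context returns every match.
my @found = ($text =~ /$re/g);

# s///ge evaluates the replacement as Perl code for each match.
(my $upper = $text) =~ s/($re)/uc $1/ge;

# !~ is true when the pattern does not match.
my $no_cherry = ($text !~ /cherry/) ? 1 : 0;

print "@found\n$upper\n$no_cherry\n";
```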
Syntax:
\
escape
.
any single char
^
start of line
$
end of line
*, *?
0 or more times (greedy / nongreedy)
+, +?
1 or more times (greedy / nongreedy)
?, ??
0 or 1 times (greedy / nongreedy)
\b, \B
word boundary (between \w and \W) / anywhere except a word boundary
\A
string start (even with /m)
\Z
string end (or before a final \n)
\z
absolute string end
\G
continue from previous m//g
[...]
character set
(...)
group, capture to $1, $2
(?:...)
group without capturing
{n,m} , {n,m}?
at least n times, at most m times
{n,} , {n,}?
at least n times
{n} , {n}?
exactly n times
|
or
\1, \2
text from nth group ($1, ...)
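Several of the syntax entries above are easier to see side by side. The following is a small sketch with invented sample strings, showing greedy versus non-greedy quantifiers, a \1 backreference, \Z versus \z, and a bounded repeat inside a non-capturing group.

```perl
use strict;
use warnings;

my $html = "<b>one</b> and <b>two</b>";

# Greedy .+ runs to the last </b>; non-greedy .+? stops at the first.
my ($greedy) = $html =~ m{<b>(.+)</b>};
my ($lazy)   = $html =~ m{<b>(.+?)</b>};

# \1 must repeat whatever group 1 captured (here: a doubled word).
my ($dup) = "this is is a test" =~ /\b(\w+) \1\b/;

# \Z tolerates one trailing newline; \z does not.
my $line    = "foo\n";
my $big_z   = ($line =~ /foo\Z/) ? 1 : 0;
my $small_z = ($line =~ /foo\z/) ? 1 : 0;

# {n,m} bounds the repeat count; (?:...) groups without capturing.
my ($ip_like) = "host 10.0.12.7 up" =~ /((?:\d{1,3}\.){3}\d{1,3})/;

print "$greedy\n$lazy\n$dup\n$big_z $small_z\n$ip_like\n";
```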
Escape Sequences:
\a alarm (beep)
\e escape
\f formfeed
\n newline
\r carriage return
\t tab
\cx control-x
\l lowercase next char
\L lowercase until \E
\U uppercase until \E
\Q disable metachars until \E
\E end case modifications
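The quoting and case escapes are the ones most often needed when a pattern is built from a variable. A short sketch, assuming a hypothetical literal string that happens to contain metacharacters:

```perl
use strict;
use warnings;

my $literal = "a.b*c";              # contains regex metacharacters
my $input   = "xxa.b*cxx axbbc";

# Without \Q the dot and star keep their regex meaning.
my @loose  = $input =~ /$literal/g;
# \Q ... \E quotes them, so only the literal text matches.
my @quoted = $input =~ /\Q$literal\E/g;

# Case escapes also work in ordinary double-quoted strings.
my $name  = "perl Regexp";
my $upper = "\U$name\E!";
my $lower = "\L$name\E";
my $first = "\lPERL";

print "@loose | @quoted | $upper | $lower | $first\n";
```

The unquoted pattern matches "axbbc" rather than the intended literal, which is the usual reason to reach for \Q.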
Character Classes:
[amy]
'a', 'm', or 'y'
[f-j.-]
range f-j, dot, and dash
[^f-j]
everything except range f-j
\d, \D
digit [0-9] / non-digit
\w, \W
word char [a-zA-Z0-9_] / non-word char
\s, \S
whitespace [ \t\n\r\f] / non-whitespace
\C
match a byte (deprecated; removed in Perl 5.24)
\pP, \PP
match p-named Unicode property / non-p-named Unicode property
\p{...}, \P{...}
match long-named Unicode property / non-named Unicode property
\X
match extended Unicode grapheme cluster
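To illustrate how the bracketed sets and the Unicode property classes differ, here is a minimal sketch. The sample strings are invented; \x{e9} stands for a lowercase e with an acute accent.

```perl
use strict;
use warnings;

my $s = "caf\x{e9} 42";   # \x{e9} is e-acute

my ($ascii_word)   = $s =~ /^([a-zA-Z]+)/;     # stops before the accent
my ($unicode_word) = $s =~ /^(\p{L}+)/;        # \p{L}: any Unicode letter
my ($digits)       = $s =~ /(\d+)/;
my ($set)          = "g.-k" =~ /^([f-j.-]+)/;  # f-j, dot, dash
my ($negated)      = "figs" =~ /([^f-j]+)/;    # first run outside f-j

printf "%s %d %s %s %s\n",
    $ascii_word, length($unicode_word), $digits, $set, $negated;
```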
Posix:
[:alnum:]
alphanumeric
[:alpha:]
alphabetic
[:ascii:]
any ASCII char
[:blank:]
whitespace [ \t]
[:cntrl:]
control characters
[:digit:]
digits
[:graph:]
alphanum + punctuation
[:lower:]
lowercase chars
[:print:]
alphanum, punct, space
[:punct:]
punctuation
[:space:]
whitespace [\s\ck]
[:upper:]
uppercase chars
[:word:]
alphanum + '_'
[:xdigit:]
hex digit
[:^digit:]
non-digit
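One detail the table does not show is that POSIX classes only work inside a bracketed character class, so the usable form is [[:alpha:]] rather than [:alpha:] alone. A short sketch with an invented sample string:

```perl
use strict;
use warnings;

my $s = "abc 123, DEF!";

# POSIX classes are only valid inside a bracketed class: [[:alpha:]].
my @alpha     = $s =~ /([[:alpha:]]+)/g;
my @upper     = $s =~ /([[:upper:]]+)/g;
my @non_digit = $s =~ /([[:^digit:]]+)/g;

# Several classes can share one bracket.
(my $stripped = $s) =~ s/[[:punct:][:space:]]+//g;

print "@alpha | @upper | $stripped\n";
```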
Extended Constructs
(?#text)
comment
(?imxs-imsx:...)
enable or disable option
(?=...), (?!...)
positive / negative look-ahead
(?<=...), (?<!...)
positive / negative look-behind
(?>...)
prohibit backtracking
(?{ code })
embedded code
(??{ code })
dynamic regex
(?(cond)yes|no)
condition corresponding to captured parentheses
(?(cond)yes)
condition corresponding to look-around
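The extended constructs are terse in table form, so the following sketch exercises look-around, an inline modifier, an atomic group and a conditional. All inputs and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# Look-ahead and look-behind assert context without consuming it.
(my $grouped = "1234567") =~ s/(?<=\d)(?=(?:\d{3})+\b)/,/g;
my ($eur) = "100USD, 200EUR" =~ /(?<!\d)(\d+)(?=EUR)/;

# (?i:...) turns on case-insensitivity for part of a pattern.
my $inline = ("hello WORLD" =~ /hello (?i:world)/) ? 1 : 0;

# (?>...) never gives back what it matched, so the first test cannot succeed.
my $atomic = ("aaab" =~ /(?>a+)ab/) ? 1 : 0;
my $normal = ("aaab" =~ /a+ab/)     ? 1 : 0;

# (?(1)...) requires a closing paren only if group 1 matched an opening one.
my $paren_re = qr/^(\()?\d+(?(1)\))$/;
my @ok = grep { $_ =~ $paren_re } "(12)", "12", "(12", "12)";

print "$grouped $eur $inline $atomic $normal @ok\n";
```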
Variables
$&
entire matched string
$`
everything prior to matched string
$'
everything after matched string
$1, $2 ...
n-th captured expression
$+
last parenthesis pattern match
$^N
most recently closed capture group
$^R
result of last (?{...})
@-, @+
offsets of starts / ends of groups
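The offset arrays @- and @+ can stand in for the capture variables when positions are needed. A minimal sketch, assuming a simple key=value input chosen for illustration:

```perl
use strict;
use warnings;

my $s = "key=value";
my ($k, $v, $span, $last);

if ($s =~ /(\w+)=(\w+)/) {
    # $-[n] and $+[n] are the start and end offsets of group n (0 = whole match).
    $k    = substr($s, $-[1], $+[1] - $-[1]);   # same text as $1
    $v    = substr($s, $-[2], $+[2] - $-[2]);   # same text as $2
    $span = join "-", $-[0], $+[0];             # offsets of the whole match
    $last = $+;                                 # highest-numbered group that matched
}

print "$k $v $span $last\n";
```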
 

REGEX METACHARS

^
string begin
$
str. end (before \n)
+
one or more
*
zero or more
?
zero or one
{3,7}
repeat in range
()
capture
(?:)
no capture
[]
character class
|
alternation
\b
word boundary
\z
string end

REGEX MODIFIERS

/i
case insensitive
/m
line-based ^ and $
/s
. includes \n
/x
ignore whitespace
/g
global
\Q
quote (disable) pattern metacharacters till \E
\E
end either case modification or quoted section, think vi
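The effect of each modifier is easiest to check on a multi-line string. The sketch below uses invented sample text and a hypothetical $date_re pattern to show /m, /g, /s, /i and /x in turn.

```perl
use strict;
use warnings;

my $text = "First line\nsecond LINE\n";

# /m lets ^ match after every newline; /g collects all matches.
my @starts = $text =~ /^(\w+)/mg;

# /s lets . cross a newline.
my $plain  = ("a\nb" =~ /a.b/)  ? 1 : 0;
my $single = ("a\nb" =~ /a.b/s) ? 1 : 0;

# /i ignores case; the "() =" idiom counts the /g matches.
my $lines = () = $text =~ /line/gi;

# /x ignores whitespace and allows comments inside the pattern.
my $date_re = qr/
    (\d{4}) -    # year
    (\d{2}) -    # month
    (\d{2})      # day
/x;
my ($y, $m, $d) = "2024-01-31" =~ $date_re;

print "@starts | $plain $single | $lines | $y $m $d\n";
```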

REGEX CHARCLASSES

.
[^\n]
\s
[\x20\f\t\r\n]
\w
[A-Za-z0-9_]
\d
[0-9]
\S, \W and \D
negate