Edmund's Perl Quick Reference

Regular Expressions

Each character matches itself, unless it is one of the special characters + ? . * ^ $ ( ) [ ] { } | \. The special meaning of these characters can be escaped using a \.

. matches an arbitrary character, but not a newline unless it is a single-line match (see m//s).
(...) groups a series of pattern elements to a single element.
^ matches the beginning of the target. In multiline mode (see m//m) also matches after every newline character.
$ matches the end of the target, or just before a newline at the very end. In multiline mode also matches before every newline character.
[ ... ] denotes a class of characters to match. [^ ... ] negates the class.
( ... | ... | ... ) matches one of the alternatives.
(?# TEXT ) Comment.
(?: REGEXP ) Like (REGEXP) but does not make back-references.
(?= REGEXP ) Zero width positive look-ahead assertion.
(?! REGEXP ) Zero width negative look-ahead assertion.
(? MODIFIER ) Embedded pattern-match modifier. MODIFIER can be one or more of i, m, s, or x.
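A minimal sketch of the grouping and assertion forms above. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "price: 100 USD, weight: 20 kg";

# (?= ... ) positive look-ahead: a number only if " USD" follows.
# The look-ahead is zero width, so " USD" is not part of the match.
my ($usd) = $text =~ /(\d+)(?= USD)/;            # "100"

# (?! ... ) negative look-ahead: a whole number NOT followed by " USD".
my ($other) = $text =~ /\b(\d+)\b(?! USD)/;      # "20"

# (?: ... ) groups without capturing, so the unit is the first capture.
my ($unit) = $text =~ /(?:\d+) (kg|USD)/;        # "USD"

# (?i) switches on case-insensitive matching from inside the pattern.
my $ci = "PERL" =~ /(?i)perl/ ? 1 : 0;           # 1

print "$usd $other $unit $ci\n";
```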
Quantified subpatterns match as many times as possible. When followed with a ? they match the minimum number of times. These are the quantifiers:
+ matches the preceding pattern element one or more times.
? matches zero or one times.
* matches zero or more times.
{N,M} denotes the minimum N and maximum M match count. {N} means exactly N times; {N,} means at least N times.
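The difference between maximal and minimal matching is easiest to see side by side. The following is an illustrative sketch with made-up sample strings.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs to the last ">" in the string.
my ($greedy) = $html =~ /<(.+)>/;     # "b>bold</b> and <i>italic</i"

# Minimal: .+? stops at the first ">" it can.
my ($lazy) = $html =~ /<(.+?)>/;      # "b"

# {N,M} is greedy too; {N,M}? takes the minimum.
my ($run)     = "aaaaa" =~ /(a{2,3})/;    # "aaa"
my ($min_run) = "aaaaa" =~ /(a{2,3}?)/;   # "aa"

# {N} means exactly N repetitions.
my $date_ok = "2024-01-15" =~ /^\d{4}-\d{2}-\d{2}$/ ? 1 : 0;   # 1

print "$greedy\n$lazy\n$run $min_run $date_ok\n";
```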
A \ escapes any special meaning of the following character if non-alphanumeric, but it turns most alphanumeric characters into something special:
\w matches alphanumeric, including _, \W matches non-alphanumeric.
\s matches whitespace, \S matches non-whitespace.
\d matches numeric, \D matches non-numeric.
\A matches the beginning of the string, \Z matches the end.
\b matches word boundaries, \B matches non-boundaries.
\G matches where the previous m//g search left off.
\n, \r, \f, \t etc. have their usual meaning.
\w, \s and \d may be used within character classes, \b denotes backspace in this context.
Back-references:
\1 ... \9 refer to matched subexpressions, grouped with (), inside the match.
\10 and up can also be used if the pattern matches that many subexpressions.
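As a sketch of back-references used inside the match itself (sample strings are illustrative only):

```perl
use strict;
use warnings;

# \1 must match the same text that the first group matched,
# which makes it a simple way to find a doubled word.
my $line = "this is is a test";
my ($dup) = $line =~ /\b(\w+)\s+\1\b/;           # "is"

# The closing quote must be the same character as the opening one.
my $src = q{say "it's" now};
my ($quote, $body) = $src =~ /(["'])(.*?)\1/;    # '"' and "it's"

print "$dup\n$quote $body\n";
```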

See also $1 ... $9, $+, $&, $`, and $' in section Special Variables.

With modifier x, whitespace can be used in the patterns for readability purposes.
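A small sketch of a pattern laid out with the x modifier. The key = value input format is an assumption chosen for the example. Note that comments inside the pattern should avoid $, @ and the delimiter character, since the pattern text is still interpolated.

```perl
use strict;
use warnings;

# With x, unescaped whitespace and "#" comments in the pattern are ignored.
my $re = qr/
    ^ \s*
    ( [A-Za-z_] \w* )    # first group: an identifier
    \s* = \s*
    ( \d+ )              # second group: an integer value
    \s* $
/x;

my ($name, $value) = "  retries = 3 " =~ $re;    # "retries", "3"

print "$name=$value\n";
```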

Search and Replace Functions

[ EXPR =~ ] [ m ] /PATTERN/ [ g ] [ i ] [ m ] [ o ] [ s ] [ x ]
Searches EXPR (default: $_) for a pattern. If you prepend an m you can use almost any pair of delimiters instead of the slashes. If used in list context, a list is returned consisting of the subexpressions matched by the parentheses in the pattern, i.e., ($1,$2,$3,...).
Optional modifiers: g matches as many times as possible; i searches in a case-insensitive manner; o interpolates variables only once; m treats the string as multiple lines; s treats the string as a single line; x allows for regular expression extensions.
If PATTERN is empty, the most recent pattern from a previous match or replacement is used.
With g the match can be used as an iterator in scalar context.
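The return value of a match depends on context. A minimal sketch, with sample data invented for the example:

```perl
use strict;
use warnings;

# List context: the match returns the captured subexpressions.
my $stamp = "2024-01-15";
my ($year, $mon, $day) = $stamp =~ /^(\d+)-(\d+)-(\d+)$/;

# List context with g: the captures of every match, flattened.
my $csv = "a=1, b=22, c=333";
my @all = $csv =~ /(\w)=(\d+)/g;     # ("a","1","b","22","c","333")

# Scalar context with g: each evaluation resumes after the last match,
# so the match works as an iterator in a while loop.
my %pairs;
while ($csv =~ /(\w)=(\d+)/g) {
    $pairs{$1} = $2;
}

# A leading m allows other delimiters, handy when the pattern contains "/".
my @parts = "/usr/local/bin" =~ m{([^/]+)}g;     # ("usr","local","bin")

print "$year $mon $day\n@all\n@parts\n";
```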
?PATTERN?
This is just like the /PATTERN/ search, except that it matches only once between calls to the reset operator.
[ $VAR =~ ] s/PATTERN/REPLACEMENT/ [ e ] [ g ] [ i ] [ m ] [ o ] [ s ] [ x ]
Searches a string for a pattern, and if found, replaces that pattern with the replacement text. It returns the number of substitutions made, if any; if no substitutions are made, it returns false.
Optional modifiers: g replaces all occurrences of the pattern; e evaluates the replacement string as a Perl expression; for any other modifiers, see /PATTERN/ matching. Almost any delimiter may replace the slashes; if single quotes are used, no interpretation is done on the strings between the delimiters, otherwise the strings are interpolated as if inside double quotes.
If bracketing delimiters are used, PATTERN and REPLACEMENT may have their own delimiters, e.g., s(foo)[bar]. If PATTERN is empty, the most recent pattern from a previous match or replacement is used.
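A few substitutions as a sketch, covering the return value, the g and e modifiers, and bracketing delimiters. All strings are illustrative.

```perl
use strict;
use warnings;

# g replaces every occurrence; the return value is the number of substitutions.
my $s = "cat bat rat";
my $n = ($s =~ s/at/og/g);           # $n is 3, $s is "cog bog rog"

# e evaluates the replacement as a Perl expression.
my $t = "2 apples and 10 pears";
$t =~ s/(\d+)/$1 * 2/ge;             # "4 apples and 20 pears"

# No match: the string is untouched and the result is false.
my $u = "hello";
my $none = ($u =~ s/xyz/abc/);

# Bracketing delimiters: pattern and replacement each get their own pair.
my $p = "/usr/bin";
$p =~ s{/usr}[/opt];                 # "/opt/bin"

print "$n $s\n$t\n$p\n";
```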
[ $VAR =~ ] tr/SEARCHLIST/REPLACEMENTLIST/ [ c ] [ d ] [ s ]
Translates all occurrences of the characters found in the search list into the corresponding characters in the replacement list. It returns the number of characters replaced. y may be used instead of tr.
Optional modifiers: c complements the SEARCHLIST; d deletes all characters found in SEARCHLIST that do not have a corresponding character in REPLACEMENTLIST; s squeezes all sequences of characters that are translated into the same target character into one occurrence of this character.
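The translation modifiers can be sketched as follows. The sample strings are made up; the (my $copy = ...) =~ tr/// form is used so that the originals stay unchanged.

```perl
use strict;
use warnings;

my $str = "Hello, World";

# Plain translation: map lower case to upper case on a copy.
(my $upper = $str) =~ tr/a-z/A-Z/;               # "HELLO, WORLD"

# Empty replacement list without d: nothing changes, but matches are counted.
my $vowels = ($str =~ tr/aeiou//);               # 3

# c complements the search list, d deletes: drop everything that is not a digit.
(my $digits = "tel: +1 (555) 010-9999") =~ tr/0-9//cd;   # "15550109999"

# s squeezes each run of translated characters into one.
(my $squeezed = "a   b    c") =~ tr/ //s;        # "a b c"

print "$upper\n$vowels\n$digits\n$squeezed\n";
```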
pos SCALAR
Returns the position where the last m//g search left off for SCALAR. May be assigned to.
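A short sketch of pos together with \G, stepping through a string in scalar context. The input string is illustrative.

```perl
use strict;
use warnings;

my $in = "aaabbbccc";

# Each m//g in scalar context records where it stopped.
my $ok_a    = $in =~ /a+/g;
my $after_a = pos($in);              # 3

# \G anchors the next match at that recorded position.
my $ok_b    = $in =~ /\Gb+/g;
my $after_b = pos($in);              # 6

# pos may be assigned to: skip one character before continuing.
pos($in) = 7;
my $ok_c = $in =~ /\G(c+)/g;
my $tail = $1;                       # "cc"
my $end  = pos($in);                 # 9

print "$after_a $after_b $tail $end\n";
```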
study [ $VAR ]
Studies the scalar variable $VAR (default: $_) in anticipation of performing many pattern matches on its contents before the variable is next modified.

File Test Operators

These unary operators take one argument, either a filename or a filehandle, and test the associated file to see if something is true about it. If the argument is omitted, they test $_ (except for -t, which tests STDIN). If the special argument _ (underscore) is passed, they use the information from the preceding test or stat call.

-r -w -x  File is readable/writable/executable by effective uid/gid.
-R -W -X File is readable/writable/executable by real uid/gid.
-o -O File is owned by effective/real uid.
-e -z File exists/has zero size.
-s File exists and has non-zero size. Returns the size.
-f -d File is a plain file/a directory.
-l -S -p File is a symbolic link/a socket/a named pipe (FIFO).
-b -c File is a block/character special file.
-u -g -k File has setuid/setgid/sticky bit set.
-t Tests if filehandle (STDIN by default) is opened to a tty.
-T -B File is a text/non-text (binary) file. -T and -B return true on a null file, or a file at EOF when testing a filehandle.
-M -A -C File modification / access / inode-change time. Measured in days. Value returned reflects the file age at the time the script started. See also $^T in the section Special Variables.
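A minimal sketch of a few file tests, run against a temporary file created with the core File::Temp module so that the example is self-contained. The file contents and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file (removed automatically at exit).
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;                         # keep "\n" as a single byte everywhere
print {$fh} "hello\n";
close $fh;

my $exists  = -e $path ? 1 : 0;      # 1
# The special argument _ reuses the information from the preceding test.
my $is_file = -f _ ? 1 : 0;          # 1
my $is_dir  = -d _ ? 1 : 0;          # 0
my $size    = -s _;                  # 6 (bytes)

# Age in days relative to script start; a file written after the
# script started has an age of zero or slightly below.
my $age_days = -M $path;

print "exists=$exists file=$is_file dir=$is_dir size=$size\n";
```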