Edmund J. Sutcliffe
Thoughtful Solutions, Creatively Implemented and Communicated
Edmund's Perl Quick Reference
Regular Expressions
Each character matches itself, unless it is one of the special
characters + ? . * ^ $ ( ) [ ] { } | \. The
special meaning of these characters can be escaped using a
\.
| . |
matches an arbitrary character,
but not a newline unless it is a single-line match (see
m//s). |
| (...) |
groups a series of pattern elements to a
single element. |
| ^ |
matches the beginning of the target.
In multiline mode (see m//m) also matches after
every newline character. |
| $ |
matches the end of the line. In multiline
mode also matches before every newline character. |
| [ ... ] |
denotes a class of characters to match. [^
... ] negates the class. |
| ( ... | ... | ... ) |
matches one of the alternatives. |
| (?# TEXT ) |
Comment. |
| (?: REGEXP ) |
Like (REGEXP)
but does not make back-references. |
| (?= REGEXP ) |
Zero width positive look-ahead assertion. |
| (?! REGEXP ) |
Zero width negative look-ahead assertion. |
| (? MODIFIER ) |
Embedded pattern-match modifier. MODIFIER
can be one or more of i, m, s, or
x. |
| Quantified subpatterns match as
many times as possible. When followed with a ?
they match the minimum number of times. These are the
quantifiers: |
| + |
matches the preceding pattern element one
or more times. |
| ? |
matches zero or one times. |
| * |
matches zero or more times. |
| {N,M} |
denotes the minimum N and maximum M match
count. {N} means exactly
N times; {N,} means at
least N times. |
| A \ escapes any
special meaning of the following character if non-alphanumeric,
but it turns most alphanumeric characters into something
special: |
| \w |
matches alphanumeric, including _,
\W matches non-alphanumeric. |
| \s |
matches whitespace, \S matches
non-whitespace. |
| \d |
matches numeric, \D matches
non-numeric. |
| \A |
matches the beginning of the string, \Z
matches the end. |
| \b |
matches word boundaries, \B
matches non-boundaries. |
| \G |
matches where the previous m//g search
left off. |
| \n, \r, \f,
\t |
etc. have their usual meaning. |
| \w, \s and
\d |
may be used within character classes, \b
denotes backspace in this context. |
| Back-references: |
| \1 ... \9 |
refer to matched subexpressions, grouped
with (), inside the match. |
| \10 |
and up can also be used if the pattern matches
that many subexpressions. |
See also $1 ... $9,
$+, $&, $`, and
$' in section Special Variables.
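As a small illustration of the elements listed above, the following sketch contrasts greedy and minimal quantifiers and shows a back-reference and a look-ahead. The sample strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = 'say "one" and "two"';

# Greedy: .* runs on to the last quote in the string.
my ($greedy) = $text =~ /"(.*)"/;

# Minimal: .*? stops at the first closing quote.
my ($lazy) = $text =~ /"(.*?)"/;

# Back-reference: \1 must repeat whatever group 1 matched.
my ($doubled) = 'this is is a test' =~ /\b(\w+) \1\b/;

# Look-ahead: replace "foo" only when "baz" follows; "baz" is not consumed.
(my $copy = 'foobar foobaz') =~ s/foo(?=baz)/FOO/;

print "$greedy\n$lazy\n$doubled\n$copy\n";
```

Here $greedy holds one" and "two, $lazy holds one, $doubled holds is, and $copy becomes foobar FOObaz.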
With modifier x, whitespace can be used in the patterns
for readability purposes.
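A minimal sketch of the x modifier follows. The date pattern and the variable names are hypothetical; qr// is used only to store the compiled pattern in a variable.

```perl
use strict;
use warnings;

# With x, whitespace in the pattern is ignored and # starts a comment.
my $date_re = qr/
    ^ (\d{4})    # year
    - (\d{2})    # month
    - (\d{2})    # day
    $
/x;

my ($year, $month, $day) = '2024-03-15' =~ $date_re;

print "$year $month $day\n";
```

Without x the same pattern would have to be written as /^(\d{4})-(\d{2})-(\d{2})$/.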
Search and Replace Functions
- [ EXPR =~
] [ m ] /PATTERN/
[ g ] [ i ] [ m ] [ o ] [ s
] [ x ]
- Searches EXPR (default: $_)
for a pattern. If you prepend an m you can use almost
any pair of delimiters instead of the slashes. If used in
list context, a list is returned consisting of the subexpressions
matched by the parentheses in the pattern, i.e., ($1,$2,$3,...).
Optional modifiers: g matches as many times as possible;
i searches in a case-insensitive manner; o
interpolates variables only once. m treats the string
as multiple lines; s treats the string as a single
line; x allows for regular expression extensions.
If PATTERN is empty, the most recent pattern from a previous
match or replacement is used.
With g the match can be used as an iterator in scalar
context.
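The behaviour described above can be sketched as follows. The sample line and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $line = 'host=alpha port=8080';

# List context: the match returns the captured subexpressions.
my ($host, $port) = $line =~ /host=(\w+)\s+port=(\d+)/;

# List context with g: returns the captures from every match.
my @keys = $line =~ /(\w+)=/g;

# Scalar context with g: each evaluation resumes after the previous match.
my @values;
while ($line =~ /=(\w+)/g) {
    push @values, $1;
}

# Other delimiters after an explicit m, combined with the i modifier.
my $has_host = $line =~ m{HOST=}i ? 1 : 0;

print "$host $port | @keys | @values | $has_host\n";
```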
- ?PATTERN?
- This is just like the /PATTERN/
search, except that it matches only once between calls to
the reset operator.
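A minimal sketch of the once-only match, assuming a current Perl: newer releases (5.22 and later) accept only the m?PATTERN? spelling, so the example uses that form. The helper first_a and the word list are hypothetical.

```perl
use strict;
use warnings;

my @words = ('alpha', 'apple', 'banana', 'avocado');

# Collects the words accepted by a once-only pattern.
sub first_a {
    my @hits;
    for my $word (@_) {
        # m?...? succeeds at most once until reset is called.
        push @hits, $word if $word =~ m?^a?;
    }
    return @hits;
}

my @once  = first_a(@words);   # the pattern fires for the first match only
my @spent = first_a(@words);   # already used up, so nothing matches

reset;                         # re-arm the once-only patterns in this package

my @again = first_a(@words);

print scalar(@once), ' ', scalar(@spent), ' ', scalar(@again), "\n";
```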
- [ $VAR =~
] s/PATTERN/REPLACEMENT/
[ e ] [ g ] [ i ] [ m ] [ o
] [ s ] [ x ]
- Searches a string for a pattern,
and if found, replaces that pattern with the replacement
text. It returns the number of substitutions made, if any;
if no substitutions are made, it returns false.
Optional modifiers: g replaces all occurrences of
the pattern; e evaluates the replacement string as
a Perl expression; for any other modifiers, see /PATTERN/
matching. Almost any delimiter may replace the slashes;
if single quotes are used, no interpretation is done on
the strings between the delimiters, otherwise the strings
are interpolated as if inside double quotes.
If bracketing delimiters are used, PATTERN and REPLACEMENT
may have their own delimiters, e.g., s(foo)[bar].
If PATTERN is empty, the most recent pattern from a previous
match or replacement is used.
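The following sketch illustrates the return value, the g and e modifiers, and bracketing delimiters. All strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = 'cat hat cat';

# Without g only the first occurrence is replaced.
(my $first = $text) =~ s/cat/dog/;

# With g every occurrence is replaced; the return value is the count.
my $all   = $text;
my $count = ($all =~ s/cat/dog/g);

# With e the replacement is evaluated as a Perl expression.
(my $scaled = 'a=2 b=5') =~ s/(\d+)/$1 * 10/ge;

# Bracketing delimiters: pattern and replacement each get their own pair.
(my $bracketed = $text) =~ s(hat)[cap];

# No match: the result is false and the string is left unchanged.
my $changed = ($all =~ s/bird/fish/) ? 1 : 0;

print "$first | $all ($count) | $scaled | $bracketed | $changed\n";
```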
- [ $VAR =~
] tr/SEARCHLIST/REPLACEMENTLIST/
[ c ] [ d ] [ s ]
- Replaces every occurrence
of a character found in the search list with the corresponding
character in the replacement list. It returns the number
of characters replaced. y may be used instead of
tr.
Optional modifiers: c complements the SEARCHLIST;
d deletes all characters found in SEARCHLIST that
do not have a corresponding character in REPLACEMENTLIST;
s squeezes all sequences of characters that are translated
into the same target character into one occurrence of this
character.
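A short sketch of tr and its modifiers, using made-up strings. It relies on two details not spelled out above: ranges such as a-z may be used in the lists, and an empty REPLACEMENTLIST without d leaves the string unchanged, so tr can be used to count characters.

```perl
use strict;
use warnings;

# Plain translation with ranges.
(my $upper = 'hello world') =~ tr/a-z/A-Z/;

# Counting: an empty replacement list leaves the string as it was.
my $dna     = 'AACGTTTA';
my $a_count = ($dna =~ tr/A//);

# d deletes characters that have no counterpart in the replacement list.
(my $no_vowels = 'hello world') =~ tr/aeiou//d;

# s squeezes runs translated to the same character into one.
(my $squeezed = 'aaabbbccc') =~ tr/a-c//s;

# c complements the search list: everything that is not a digit becomes #.
(my $masked = 'tel: 555-1234') =~ tr/0-9/#/c;

print "$upper | $a_count | $no_vowels | $squeezed | $masked\n";
```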
- pos SCALAR
- Returns the position where
the last m//g search left off for SCALAR. May be
assigned to.
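A minimal sketch of pos together with \G, on a made-up string. Each match is evaluated in scalar context so that the position is remembered between matches.

```perl
use strict;
use warnings;

my $input = 'aaabbbccc';

# Scalar context with g: the match remembers where it stopped.
my $ok_a    = $input =~ /a+/g;
my $after_a = pos($input);

# \G anchors the next match at pos($input).
my $ok_b    = $input =~ /\G(b+)/g;
my $bs      = $1;
my $after_b = pos($input);

# Assigning to pos moves the starting point for the next m//g search.
pos($input) = 0;
my $restart = $input =~ /\G(a+)/g ? $1 : '';

print "$after_a $bs $after_b $restart\n";
```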
- study [ $VAR ]
- Studies the scalar variable
$VAR (default: $_) in anticipation of performing many
pattern matches on its contents before the variable is next
modified.
File Test Operators
These unary operators take one argument, either
a filename or a filehandle, and test the associated file
to see if something is true about it. If the argument is
omitted, they test $_ (except for -t,
which tests STDIN). If the special argument
_ (underscore) is passed, they use the information
from the preceding test or stat call.
| -r -w -x |
File is readable/writable/executable
by effective uid/gid. |
| -R -W -X |
File is readable/writable/executable
by real uid/gid. |
| -o -O |
File is owned by effective/real
uid. |
| -e -z |
File exists/has zero size. |
| -s |
File exists and has non-zero
size. Returns the size. |
| -f -d |
File is a plain file/a directory. |
| -l -S -p |
File is a symbolic link/a socket/a
named pipe (FIFO). |
| -b -c |
File is a block/character special
file. |
| -u -g -k |
File has setuid/setgid/sticky
bit set. |
| -t |
Tests if filehandle (STDIN
by default) is opened to a tty. |
| -T -B |
File is a text/non-text (binary)
file. -T and -B return
true on an empty file, or on a file at EOF when testing
a filehandle. |
| -M -A -C |
File modification / access /
inode-change time. Measured in days. Value returned
reflects the file age at the time the script started.
See also $^T in the section Special Variables. |
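The following sketch applies a few of these operators to a scratch file. The file name sample.txt and the use of File::Temp for a temporary directory are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that is removed when the script exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";

open my $fh, '>', $file or die "cannot create $file: $!";
print {$fh} 'hello';
close $fh or die "cannot close $file: $!";

my $exists   = -e $file ? 1 : 0;
my $size     = -s _;                # _ reuses the information from the -e test
my $is_plain = -f _ ? 1 : 0;
my $is_dir   = -d $dir ? 1 : 0;
my $missing  = -e "$dir/no-such-file" ? 1 : 0;
my $age_days = -M $file;            # age in days, relative to script start

print "$exists $size $is_plain $is_dir $missing\n";
```

Because the file is created after the script starts, $age_days is zero or slightly negative here.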