=head1 NAME
perlvar - Perl predefined variables
=head1 DESCRIPTION
=head2 Predefined Names
The following names have special meaning to Perl. Most
punctuation names have reasonable mnemonics, or analogs in the
shells. Nevertheless, if you wish to use long variable names,
you need only say
use English;
at the top of your program. This aliases all the short names to the long
names in the current package. Some even have medium names, generally
borrowed from B<awk>. In general, it's best to use the
use English '-no_match_vars';
invocation if you don't need $PREMATCH, $MATCH, or $POSTMATCH, as it avoids
a certain performance hit with the use of regular expressions. See
L<English>.
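As a minimal sketch of the aliasing described above, the following compares two long names with their punctuation equivalents. It assumes only the core English module; the variable names C<$same_pid> and C<$same_name> are illustrative.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# $PROCESS_ID is the long name for $$, and $PROGRAM_NAME for $0.
my $same_pid  = ($PROCESS_ID == $$);
my $same_name = ($PROGRAM_NAME eq $0);

print "long and short names agree\n" if $same_pid && $same_name;
```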
Variables that depend on the currently selected filehandle may be set by
calling an appropriate object method on the IO::Handle object, although
this is less efficient than using the regular built-in variables. (Summary
lines below for this contain the word HANDLE.) First you must say
use IO::Handle;
after which you may use either
method HANDLE EXPR
or more safely,
HANDLE->method(EXPR)
Each method returns the old value of the IO::Handle attribute.
The methods each take an optional EXPR, which, if supplied, specifies the
new value for the IO::Handle attribute in question. If not supplied,
most methods do nothing to the current value--except for
autoflush(), which will assume a 1 for you, just to be different.
Because loading in the IO::Handle class is an expensive operation, you should
learn how to use the regular built-in variables.
A few of these variables are considered "read-only". This means that if
you try to assign to this variable, either directly or indirectly through
a reference, you'll raise a run-time exception.
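A short sketch of what such a run-time exception looks like, using C<$1> as the read-only variable and trapping the error with eval. The variable name C<$error> is illustrative.

```perl
use strict;
use warnings;

my $error = '';
if ('abc' =~ /(b)/) {
    # $1 is read-only: the assignment dies and eval traps the exception.
    eval { $1 = 'x'; 1 } or $error = $@;
}
print "caught: $error";
```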
You should be very careful when modifying the default values of most
special variables described in this document. In most cases you want
to localize these variables before changing them, since if you don't,
the change may affect other modules which rely on the default values
of the special variables that you have changed. This is one of the
correct ways to read the whole file at once:
open my $fh, "<", "foo" or die $!;
local $/; # enable localized slurp mode
my $content = <$fh>;
close $fh;
But the following code is quite bad:
open my $fh, "<", "foo" or die $!;
undef $/; # enable slurp mode
my $content = <$fh>;
close $fh;
since some other module may want to read data from some file in the
default "line mode". Once the code above has been
executed, the global value of C<$/> is changed for any other code
running inside the same Perl interpreter.
Usually when a variable is localized you want to make sure that this
change affects the shortest scope possible. So unless you are already
inside some short C<{}> block, you should create one yourself. For
example:
my $content = '';
open my $fh, "<", "foo" or die $!;
{
    local $/;
    $content = <$fh>;
}
close $fh;
Here is an example of how your own code can break:
for (1..5){
    nasty_break();
    print "$_ ";
}
sub nasty_break {
    $_ = 5;
    # do something with $_
}
You probably expect this code to print:
1 2 3 4 5
but instead you get:
5 5 5 5 5
Why? Because nasty_break() modifies C<$_> without localizing it
first. The fix is to add local():
local $_ = 5;
It's easy to notice the problem in such a short example, but in more
complicated code you are looking for trouble if you don't localize
changes to the special variables.
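The difference can be checked side by side. This is a sketch only: the helper names C<clobber> and C<polite> are hypothetical, and the results are collected into arrays so they can be inspected after the loops.

```perl
use strict;
use warnings;

# Hypothetical helpers: one overwrites the global $_, the other localizes it.
sub clobber { $_ = 5; return }
sub polite  { local $_ = 5; return }

my (@clobbered, @preserved);
for (1 .. 3) { clobber(); push @clobbered, $_ }
for (1 .. 3) { polite();  push @preserved, $_ }

print "@clobbered\n";    # 5 5 5
print "@preserved\n";    # 1 2 3
```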
The following list is ordered by scalar variables first, then the
arrays, then the hashes.
=over 8
=item $ARG
=item $_
X<$_> X<$ARG>
The default input and pattern-searching space. The following pairs are
equivalent:
while (<>) {...} # equivalent only in while!
while (defined($_ = <>)) {...}
/^Subject:/
$_ =~ /^Subject:/
tr/a-z/A-Z/
$_ =~ tr/a-z/A-Z/
chomp
chomp($_)
Here are the places where Perl will assume $_ even if you
don't use it:
=over 3
=item *
The following functions:
abs, alarm, chomp, chop, chr, chroot, cos, defined, eval, exp, glob,
hex, int, lc, lcfirst, length, log, lstat, mkdir, oct, ord, pos, print,
quotemeta, readlink, readpipe, ref, require, reverse (in scalar context only),
rmdir, sin, split (on its second argument), sqrt, stat, study, uc, ucfirst,
unlink, unpack.
=item *
All file tests (C<-f>, C<-d>) except for C<-t>, which defaults to STDIN.
See L<perlfunc/-X>
=item *
The pattern matching operations C<m//>, C<s///> and C<tr///>
(aka C<y///>)
when used without an C<=~> operator.
=item *
The default iterator variable in a C<foreach> loop if no other
variable is supplied.
=item *
The implicit iterator variable in the grep() and map() functions.
=item *
The implicit variable of given().
=item *
The default place to put an input record when a C<< <FH> >>
operation's result is tested by itself as the sole criterion of a C<while>
test. Outside a C<while> test, this will not happen.
=back
As C<$_> is a global variable, this may lead in some cases to unwanted
side-effects. As of perl 5.9.1, you can now use a lexical version of
C<$_> by declaring it in a file or in a block with C<my>. Moreover,
declaring C<our $_> restores the global C<$_> in the current scope.
(Mnemonic: underline is understood in certain operations.)
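A few of the implicit uses listed above, shown together as a minimal sketch. The sample data and array names are illustrative.

```perl
use strict;
use warnings;

my @words = ('Apple', 'banana', 'fig');

my @lengths = map  { length } @words;    # length($_)
my @shouted = map  { uc }     @words;    # uc($_)
my @matches = grep { /an/ }   @words;    # $_ =~ /an/

print "@lengths\n";    # 5 6 3
print "@shouted\n";    # APPLE BANANA FIG
print "@matches\n";    # banana
```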
=back
=over 8
=item $a
=item $b
X<$a> X<$b>
Special package variables when using sort(), see L<perlfunc/sort>.
Because of this specialness $a and $b don't need to be declared
(using C<use vars>, or our()) even when using the C<strict 'vars'> pragma.
Don't lexicalize them with C<my $a> or C<my $b> if you want to be
able to use them in the sort() comparison block or function.
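For instance, the following sketch uses C<$a> and C<$b> under C<use strict> without declaring them. The sample lists are illustrative.

```perl
use strict;
use warnings;

# $a and $b need no declaration inside the comparison block.
my @numeric = sort { $a <=> $b } (10, 9, 100);
my @by_len  = sort { length($a) <=> length($b) or $a cmp $b }
              qw(pear fig banana kiwi);

print "@numeric\n";    # 9 10 100
print "@by_len\n";     # fig kiwi pear banana
```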
=back
=over 8
=item $<I<digits>>
X<$1> X<$2> X<$3>
Contains the subpattern from the corresponding set of capturing
parentheses from the last pattern match, not counting patterns
matched in nested blocks that have been exited already. (Mnemonic:
like \digits.) These variables are all read-only and dynamically
scoped to the current BLOCK.
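A sketch of both points: reading the numbered captures after a match, and the way a match in a nested block that has already been exited does not disturb the outer captures. The strings and variable names are illustrative.

```perl
use strict;
use warnings;

my ($year, $month, $day);
if ('2009-08-22' =~ /^(\d{4})-(\d\d)-(\d\d)$/) {
    ($year, $month, $day) = ($1, $2, $3);
}

my $outer;
if ('outer text' =~ /(out)/) {
    {
        # This match sets $1 only until the nested block is exited.
        'inner text' =~ /(inn)/ or die 'unexpected match failure';
    }
    $outer = $1;    # 'out' again
}

print "$year/$month/$day $outer\n";    # 2009/08/22 out
```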
=item $MATCH
=item $&
X<$&> X<$MATCH>
The string matched by the last successful pattern match (not counting
any matches hidden within a BLOCK or eval() enclosed by the current
BLOCK). (Mnemonic: like & in some editors.) This variable is read-only
and dynamically scoped to the current BLOCK.
The use of this variable anywhere in a program imposes a considerable
performance penalty on all regular expression matches. See L</BUGS>.
See L</@-> for a replacement.
=item ${^MATCH}
X<${^MATCH}>
This is similar to C<$&> (C<$MATCH>) except that it does not incur the
performance penalty associated with that variable, and is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.
=item $PREMATCH
=item C<$`>
X<$`> X<$PREMATCH>
The string preceding whatever was matched by the last successful
pattern match (not counting any matches hidden within a BLOCK or eval
enclosed by the current BLOCK). (Mnemonic: C<`> often precedes a quoted
string.) This variable is read-only.
The use of this variable anywhere in a program imposes a considerable
performance penalty on all regular expression matches. See L</BUGS>.
See L</@-> for a replacement.
=item ${^PREMATCH}
X<${^PREMATCH}>
This is similar to C<$`> ($PREMATCH) except that it does not incur the
performance penalty associated with that variable, and is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.
=item $POSTMATCH
=item C<$'>
X<$'> X<$POSTMATCH>
The string following whatever was matched by the last successful
pattern match (not counting any matches hidden within a BLOCK or eval()
enclosed by the current BLOCK). (Mnemonic: C<'> often follows a quoted
string.) Example:
local $_ = 'abcdefghi';
/def/;
print "$`:$&:$'\n"; # prints abc:def:ghi
This variable is read-only and dynamically scoped to the current BLOCK.
The use of this variable anywhere in a program imposes a considerable
performance penalty on all regular expression matches. See L</BUGS>.
See L</@-> for a replacement.
=item ${^POSTMATCH}
X<${^POSTMATCH}>
This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
performance penalty associated with that variable, and is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.
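A minimal sketch of the three C<${^...}> variables used together. It assumes a perl that supports the C</p> modifier (5.10 or later); the variable names are illustrative.

```perl
use strict;
use warnings;

my ($pre, $match, $post);
if ('hello world' =~ /o w/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "$pre|$match|$post\n";    # hell|o w|orld
```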
=item $LAST_PAREN_MATCH
=item $+
X<$+> X<$LAST_PAREN_MATCH>
The text matched by the last bracket of the last successful search pattern.
This is useful if you don't know which one of a set of alternative patterns
matched. For example:
/Version: (.*)|Revision: (.*)/ && ($rev = $+);
(Mnemonic: be positive and forward looking.)
This variable is read-only and dynamically scoped to the current BLOCK.
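The alternation above can be exercised against both kinds of input. This is a sketch with made-up sample lines; C<@revs> is an illustrative name.

```perl
use strict;
use warnings;

my @revs;
for my $line ('Version: 1.2', 'Revision: 42') {
    # Only one of the two groups takes part in each match; $+ picks it.
    if ($line =~ /Version: (.*)|Revision: (.*)/) {
        push @revs, $+;
    }
}

print "@revs\n";    # 1.2 42
```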
=item $LAST_SUBMATCH_RESULT
=item $^N
X<$^N>
The text matched by the used group most-recently closed (i.e. the group
with the rightmost closing parenthesis) of the last successful search
pattern. (Mnemonic: the (possibly) Nested parenthesis that most
recently closed.)
This is primarily used inside C<(?{...})> blocks for examining text
recently matched. For example, to effectively capture text to a variable
(in addition to C<$1>, C<$2>, etc.), replace C<(...)> with
(?:(...)(?{ $var = $^N }))
Setting and then using C<$var> in this way relieves you from having to
worry about exactly which numbered set of parentheses it is.
This variable is dynamically scoped to the current BLOCK.
=item @LAST_MATCH_END
=item @+
X<@+> X<@LAST_MATCH_END>
This array holds the offsets of the ends of the last successful
submatches in the currently active dynamic scope. C<$+[0]> is
the offset into the string of the end of the entire match. This
is the same value as what the C<pos> function returns when called
on the variable that was matched against. The I<n>th element
of this array holds the offset of the I<n>th submatch, so
C<$+[1]> is the offset past where $1 ends, C<$+[2]> the offset
past where $2 ends, and so on. You can use C<$#+> to determine
how many subgroups were in the last successful match. See the
examples given for the C<@-> variable.
=item %LAST_PAREN_MATCH
=item %+
X<%+>
Similar to C<@+>, the C<%+> hash allows access to the named capture
buffers, should they exist, in the last successful match in the
currently active dynamic scope.
For example, C<$+{foo}> is equivalent to C<$1> after the following match:
'foo' =~ /(?<foo>foo)/;
The keys of the C<%+> hash list only the names of buffers that have
captured (and that are thus associated to defined values).
The underlying behaviour of C<%+> is provided by the
L<Tie::Hash::NamedCapture> module.
B<Note:> C<%-> and C<%+> are tied views into a common internal hash
associated with the last successful regular expression. Therefore mixing
iterative access to them via C<each> may have unpredictable results.
Likewise, if the last successful match changes, then the results may be
surprising.
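A minimal sketch of reading named captures through C<%+>. The buffer names C<year> and C<month> and the sample string are illustrative; the values are copied out immediately so that a later match cannot change them.

```perl
use strict;
use warnings;

my ($year, $month, @names);
if ('2009-08' =~ /(?<year>\d{4})-(?<month>\d\d)/) {
    ($year, $month) = ($+{year}, $+{month});
    @names = sort keys %+;    # only buffers that captured something
}

print "$year $month (@names)\n";    # 2009 08 (month year)
```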
=item HANDLE->input_line_number(EXPR)
=item $INPUT_LINE_NUMBER
=item $NR
=item $.
X<$.> X<$NR> X<$INPUT_LINE_NUMBER> X<line number>
Current line number for the last filehandle accessed.
Each filehandle in Perl counts the number of lines that have been read
from it. (Depending on the value of C<$/>, Perl's idea of what
constitutes a line may not match yours.) When a line is read from a
filehandle (via readline() or C<< <> >>), or when tell() or seek() is
called on it, C<$.> becomes an alias to the line counter for that
filehandle.
You can adjust the counter by assigning to C<$.>, but this will not
actually move the seek pointer. I<Localizing C<$.> will not localize
the filehandle's line count>. Instead, it will localize perl's notion
of which filehandle C<$.> is currently aliased to.
C<$.> is reset when the filehandle is closed, but B<not> when an open
filehandle is reopened without an intervening close(). For more
details, see L<perlop/"IE<sol>O Operators">. Because C<< <> >> never does
an explicit close, line numbers increase across ARGV files (but see
examples in L<perlfunc/eof>).
You can also use C<< HANDLE->input_line_number(EXPR) >> to access the
line counter for a given filehandle without having to worry about
which handle you last accessed.
(Mnemonic: many programs use "." to mean the current line number.)
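The sketch below reads from an in-memory filehandle (an assumption made only to keep the example self-contained) and records C<$.> for each line, then asks the handle itself for its counter. The names C<$text>, C<@seen> and C<$count> are illustrative.

```perl
use strict;
use warnings;
use IO::Handle;

my $text = "first\nsecond\nthird\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @seen;
while (my $line = <$fh>) {
    chomp $line;
    push @seen, "$.:$line";    # $. is the counter of the handle just read
}

# Ask the handle directly, without relying on which handle was read last.
my $count = $fh->input_line_number;
close $fh;

print "@seen ($count lines)\n";    # 1:first 2:second 3:third (3 lines)
```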
=item IO::Handle->input_record_separator(EXPR)
=item $INPUT_RECORD_SEPARATOR
=item $RS
=item $/
X<$/> X<$RS> X<$INPUT_RECORD_SEPARATOR>
The input record separator, newline by default. This
influences Perl's idea of what a "line" is. Works like B<awk>'s RS
variable, including treating empty lines as a terminator if set to
the null string. (An empty line cannot contain any spaces
or tabs.) You may set it to a multi-character string to match a
multi-character terminator, or to C<undef> to read through the end
of file. Setting it to C<"\n\n"> means something slightly
different than setting to C<"">, if the file contains consecutive
empty lines. Setting to C<""> will treat two or more consecutive
empty lines as a single empty line. Setting to C<"\n\n"> will
blindly assume that the next input character belongs to the next
paragraph, even if it's a newline. (Mnemonic: / delimits
line boundaries when quoting poetry.)
local $/; # enable "slurp" mode
local $_ = <FH>; # whole file now here
s/\n[ \t]+/ /g;
Remember: the value of C<$/> is a string, not a regex. B<awk> has to be
better for something. :-)
Setting C<$/> to a reference to an integer, scalar containing an integer, or
scalar that's convertible to an integer will attempt to read records
instead of lines, with the maximum record size being the referenced
integer. So this:
local $/ = \32768; # or \"32768", or \$var_containing_32768
open my $fh, "<", $myfile or die $!;
local $_ = <$fh>;
will read a record of no more than 32768 bytes from C<$fh>. If you're
not reading from a record-oriented file (or your OS doesn't have
record-oriented files), then you'll likely get a full chunk of data
with every read. If a record is larger than the record size you've
set, you'll get the record back in pieces. Trying to set the record
size to zero or less will cause reading in the (rest of the) whole file.
On VMS, record reads are done with the equivalent of C<sysread>,
so it's best not to mix record and non-record reads on the same
file. (This is unlikely to be a problem, because any file you'd
want to read in record mode is probably unusable in line mode.)
Non-VMS systems do normal I/O, so it's safe to mix record and
non-record reads of a file.
See also L<perlport/"Newlines">. Also see C<$.>.
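A sketch contrasting two of the settings described above: paragraph mode (C<"">) and fixed-size record reads (a reference to an integer). It reads from an in-memory filehandle purely to stay self-contained; the data and the record size of 8 are illustrative.

```perl
use strict;
use warnings;

my $data = "para one, line 1\npara one, line 2\n\n\n\npara two\n";

my @paragraphs;
{
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    local $/ = '';    # paragraph mode: runs of empty lines end a record
    @paragraphs = <$fh>;
    close $fh;
}

my @chunks;
{
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    local $/ = \8;    # records of at most 8 bytes
    @chunks = <$fh>;
    close $fh;
}

printf "%d paragraphs, %d chunks\n", scalar @paragraphs, scalar @chunks;
```

Both assignments are localized inside a bare block, so the default line mode is back in effect as soon as each block ends.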
=item HANDLE->autoflush(EXPR)
=item $OUTPUT_AUTOFLUSH
=item $|
X<$|> X<autoflush> X<flush> X<$OUTPUT_AUTOFLUSH>
If set to nonzero, forces a flush right away and after every write
or print on the currently selected output channel. Default is 0
(regardless of whether the channel is really buffered by the
system or not; C<$|> tells you only whether you've asked Perl
explicitly to flush after each write). STDOUT will
typically be line buffered if output is to the terminal and block
buffered otherwise. Setting this variable is useful primarily when
you are outputting to a pipe or socket, such as when you are running
a Perl program under B<rsh> and want to see the output as it's
happening. This has no effect on input buffering. See L<perlfunc/getc>
for that. See L<perlfunc/select> on how to select the output channel.
See also L<IO::Handle>. (Mnemonic: when you want your pipes to be piping hot.)
=item IO::Handle->output_field_separator EXPR
=item $OUTPUT_FIELD_SEPARATOR
=item $OFS
=item $,
X<$,> X<$OFS> X<$OUTPUT_FIELD_SEPARATOR>
The output field separator for the print operator. If defined, this
value is printed between each of print's arguments. Default is C<undef>.
(Mnemonic: what is printed when there is a "," in your print statement.)
=item IO::Handle->output_record_separator EXPR
=item $OUTPUT_RECORD_SEPARATOR
=item $ORS
=item $\
X<$\> X<$ORS> X<$OUTPUT_RECORD_SEPARATOR>
The output record separator for the print operator. If defined, this
value is printed after the last of print's arguments. Default is C<undef>.
(Mnemonic: you set C<$\> instead of adding "\n" at the end of the print.
Also, it's just like C<$/>, but it's what you get "back" from Perl.)
=item $LIST_SEPARATOR
=item $"
X<$"> X<$LIST_SEPARATOR>
This is like C<$,> except that it applies to array and slice values
interpolated into a double-quoted string (or similar interpreted
string). Default is a space. (Mnemonic: obvious, I think.)
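The three separators C<$,>, C<$\> and C<$"> can be seen together in this sketch. Output is printed to an in-memory filehandle so that it can be inspected afterwards; that choice, and the names C<$buffer> and C<$joined>, are illustrative.

```perl
use strict;
use warnings;

my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory file: $!";
{
    local $, = '-';       # printed between print's arguments
    local $\ = "!\n";     # printed after the last argument
    print {$out} 'a', 'b', 'c';
}
close $out;

my @items = (1, 2, 3);
my $joined;
{
    local $" = ', ';      # used when an array is interpolated
    $joined = "@items";
}

print $buffer;            # a-b-c!
print "$joined\n";        # 1, 2, 3
```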
=item $SUBSCRIPT_SEPARATOR
=item $SUBSEP
=item $;
X<$;> X<$SUBSEP> X<SUBSCRIPT_SEPARATOR>
The subscript separator for multidimensional array emulation. If you
refer to a hash element as
$foo{$a,$b,$c}
it really means
$foo{join($;, $a, $b, $c)}
But don't put
@foo{$a,$b,$c} # a slice--note the @
which means
($foo{$a},$foo{$b},$foo{$c})
Default is "\034", the same as SUBSEP in B<awk>. If your
keys contain binary data there might not be any safe value for C<$;>.
(Mnemonic: comma (the syntactic subscript separator) is a
semi-semicolon. Yeah, I know, it's pretty lame, but C<$,> is already
taken for something more important.)
Consider using "real" multidimensional arrays as described
in L<perllol>.
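A small sketch of the emulation, assuming C<$;> still has its default value of C<"\034">. The hash name C<%grid> and the coordinates are illustrative.

```perl
use strict;
use warnings;

my %grid;
$grid{2, 3} = 'x';                      # stored under join($;, 2, 3)

my ($key)  = keys %grid;
my $same   = ($key eq join($;, 2, 3));
my @coords = split /\034/, $key;        # split on the default separator

print "same key\n" if $same;
print "@coords\n";                      # 2 3
```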
=item HANDLE->format_page_number(EXPR)
=item $FORMAT_PAGE_NUMBER
=item $%
X<$%> X<$FORMAT_PAGE_NUMBER>
The current page number of the currently selected output channel.
Used with formats.
(Mnemonic: % is page number in B<nroff>.)
=item HANDLE->format_lines_per_page(EXPR)
=item $FORMAT_LINES_PER_PAGE
=item $=
X<$=> X<$FORMAT_LINES_PER_PAGE>
The current page length (printable lines) of the currently selected
output channel. Default is 60.
Used with formats.
(Mnemonic: = has horizontal lines.)
=item HANDLE->format_lines_left(EXPR)
=item $FORMAT_LINES_LEFT
=item $-
X<$-> X<$FORMAT_LINES_LEFT>
The number of lines left on the page of the currently selected output
channel.
Used with formats.
(Mnemonic: lines_on_page - lines_printed.)
=item @LAST_MATCH_START
=item @-
X<@-> X<@LAST_MATCH_START>
C<$-[0]> is the offset of the start of the last successful match.
C<$-[>I<n>C<]> is the offset of the start of the substring matched by
I<n>-th subpattern, or undef if the subpattern did not match.
Thus after a match against $_, $& coincides with C<substr $_, $-[0],
$+[0] - $-[0]>. Similarly, $I<n> coincides with C<substr $_, $-[n],
$+[n] - $-[n]> if C<$-[n]> is defined, and $+ coincides with
C<substr $_, $-[$#-], $+[$#-] - $-[$#-]>. One can use C<$#-> to find the last
matched subgroup in the last successful match. Contrast with
C<$#+>, the number of subgroups in the regular expression. Compare
with C<@+>.
This array holds the offsets of the beginnings of the last
successful submatches in the currently active dynamic scope.
C<$-[0]> is the offset into the string of the beginning of the
entire match. The I<n>th element of this array holds the offset
of the I<n>th submatch, so C<$-[1]> is the offset where $1
begins, C<$-[2]> the offset where $2 begins, and so on.
After a match against some variable $var:
=over 5
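=item C<$`> is the same as C<substr($var, 0, $-[0])>
=item C<$&> is the same as C<substr($var, $-[0], $+[0] - $-[0])>
=item C<$'> is the same as C<substr($var, $+[0])>
=item C<$1> is the same as C<substr($var, $-[1], $+[1] - $-[1])>
=item C<$2> is the same as C<substr($var, $-[2], $+[2] - $-[2])>
=item C<$3> is the same as C<substr($var, $-[3], $+[3] - $-[3])>
=back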
=back
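These equivalences can be checked directly. The sketch below rebuilds the pieces of a match from C<@-> and C<@+> without touching C<$`>, C<$&> or C<$'>; the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $var = 'abcdefghi';
my ($pre, $match, $post, $group);
if ($var =~ /d(e)f/) {
    $pre   = substr($var, 0, $-[0]);                 # like $`
    $match = substr($var, $-[0], $+[0] - $-[0]);     # like $&
    $post  = substr($var, $+[0]);                    # like $'
    $group = substr($var, $-[1], $+[1] - $-[1]);     # like $1
}

print "$pre:$match:$post ($group)\n";    # abc:def:ghi (e)
```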
=item %-
X<%->
Similar to C<%+>, this variable allows access to the named capture buffers
in the last successful match in the currently active dynamic scope. To
each capture buffer name found in the regular expression, it associates a
reference to an array containing the list of values captured by all
buffers with that name (should there be several of them), in the order
where they appear.
Here's an example:
if ('1234' =~ /(?<A>1)(?<B>2)(?<A>3)(?<B>4)/) {
    foreach my $bufname (sort keys %-) {
        my $ary = $-{$bufname};
        foreach my $idx (0..$#$ary) {
            print "\$-{$bufname}[$idx] : ",
                  (defined($ary->[$idx]) ? "'$ary->[$idx]'" : "undef"),
                  "\n";
        }
    }
}
would print out:
$-{A}[0] : '1'
$-{A}[1] : '3'
$-{B}[0] : '2'
$-{B}[1] : '4'
The keys of the C<%-> hash correspond to all buffer names found in
the regular expression.
The behaviour of C<%-> is implemented via the
L<Tie::Hash::NamedCapture> module.
B<Note:> C<%-> and C<%+> are tied views into a common internal hash
associated with the last successful regular expression. Therefore mixing
iterative access to them via C<each> may have unpredictable results.
Likewise, if the last successful match changes, then the results may be
surprising.
=item HANDLE->format_name(EXPR)
=item $FORMAT_NAME
=item $~
X<$~> X<$FORMAT_NAME>
The name of the current report format for the currently selected output
channel. Default is the name of the filehandle. (Mnemonic: brother to
C<$^>.)
=item HANDLE->format_top_name(EXPR)
=item $FORMAT_TOP_NAME
=item $^
X<$^> X<$FORMAT_TOP_NAME>
The name of the current top-of-page format for the currently selected
output channel. Default is the name of the filehandle with _TOP
appended. (Mnemonic: points to top of page.)
=item IO::Handle->format_line_break_characters EXPR
=item $FORMAT_LINE_BREAK_CHARACTERS
=item $:
X<$:> X<FORMAT_LINE_BREAK_CHARACTERS>
The current set of characters after which a string may be broken to
fill continuation fields (starting with ^) in a format. Default is
S<" \n-">, to break on whitespace or hyphens. (Mnemonic: a "colon" in
poetry is a part of a line.)
=item IO::Handle->format_formfeed EXPR
=item $FORMAT_FORMFEED
=item $^L
X<$^L> X<$FORMAT_FORMFEED>
What formats output as a form feed. Default is \f.
=item $ACCUMULATOR
=item $^A
X<$^A> X<$ACCUMULATOR>
The current value of the write() accumulator for format() lines. A format
contains formline() calls that put their result into C<$^A>. After
calling its format, write() prints out the contents of C<$^A> and empties it.
So you never really see the contents of C<$^A> unless you call
formline() yourself and then look at it. See L<perlform> and
L<perlfunc/formline()>.
=item $CHILD_ERROR
=item $?
X<$?> X<$CHILD_ERROR>
The status returned by the last pipe close, backtick (C<``>) command,
successful call to wait() or waitpid(), or from the system()
operator. This is just the 16-bit status word returned by the
traditional Unix wait() system call (or else is made up to look like it). Thus, the
exit value of the subprocess is really (C<<< $? >> 8 >>>), and
C<$? & 127> gives which signal, if any, the process died from, and
C<$? & 128> reports whether there was a core dump. (Mnemonic:
similar to B<sh> and B<ksh>.)
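A sketch of decoding the status word after system(). It assumes a Unix-like platform and runs the current perl binary (C<$^X>) as the child so that no external program is needed; the exit value 3 is arbitrary.

```perl
use strict;
use warnings;

# Run a child process that exits with status 3.
system($^X, '-e', 'exit 3');

my $exit_code = $? >> 8;     # exit value of the subprocess
my $signal    = $? & 127;    # signal that killed it, if any
my $core      = $? & 128;    # nonzero if there was a core dump

print "exit=$exit_code signal=$signal\n";    # exit=3 signal=0
```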
Additionally, if the C<h_errno> variable is supported in C, its value
is returned via $? if any C<gethost*()> function fails.
If you have installed a signal handler for C<SIGCHLD>, the
value of C<$?> will usually be wrong outside that handler.
Inside an C<END> subroutine C<$?> contains the value that is going to be
given to C<exit()>. You can modify C<$?> in an C<END> subroutine to
change the exit status of your program. For example:
END {
    $? = 1 if $? == 255; # die would make it 255
}
Under VMS, the pragma C