=head1 NAME
perldata - Perl data types
=head1 DESCRIPTION
=head2 Variable names
Perl has three built-in data types: scalars, arrays of scalars, and
associative arrays of scalars, known as "hashes". A scalar is a
single string (of any size, limited only by the available memory),
number, or a reference to something (which will be discussed
in L<perlref>). Normal arrays are ordered lists of scalars indexed
by number, starting with 0. Hashes are unordered collections of scalar
values indexed by their associated string key.
Values are usually referred to by name, or through a named reference.
The first character of the name tells you to what sort of data
structure it refers. The rest of the name tells you the particular
value to which it refers. Usually this name is a single I<identifier>,
that is, a string beginning with a letter or underscore, and
containing letters, underscores, and digits. In some cases, it may
be a chain of identifiers, separated by C<::> (or by the slightly
archaic C<'>); all but the last are interpreted as names of packages,
to locate the namespace in which to look up the final identifier
(see L<perlmod/Packages> for details). A simple identifier can also be
replaced by an expression that produces a reference to the value at
runtime. This is described in more detail below and in L<perlref>.
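As a minimal sketch of both forms, the following uses a hypothetical package named C<Inventory> (invented for this illustration) to show a package-qualified name and a reference standing in for the identifier:

```perl
use strict;
use warnings;

# Hypothetical package, used only for this illustration.
package Inventory;
our $count = 3;

package main;

# A chain of identifiers: "Inventory" names the package, "count" the variable.
my $qualified = $Inventory::count;

# An expression that yields a reference can stand in for the identifier.
my $ref     = \$Inventory::count;
my $via_ref = ${$ref};    # same value, reached through the reference
```

Both C<$qualified> and C<$via_ref> end up holding the value stored in the package variable.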
Perl also has its own built-in variables whose names don't follow
these rules. They have strange names so they don't accidentally
collide with one of your normal variables. Strings that match
parenthesized parts of a regular expression are saved under names
containing only digits after the C<$> (see L<perlvar> and L<perlre>).
In addition, several special variables that provide windows into
the inner workings of Perl have names containing punctuation characters
and control characters. These are documented in L<perlvar>.
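For instance, the digit-named variables can be read right after a successful match. This is only a sketch; the date string and the variable names C<$year> and C<$month> are assumptions made for the example:

```perl
use strict;
use warnings;

my $date = "2024-02-29";    # sample input invented for this example
my ($year, $month);

# Each parenthesized group is saved in $1, $2, ... in order of appearance.
if ($date =~ /^(\d{4})-(\d{2})/) {
    ($year, $month) = ($1, $2);
}
```

Copying C<$1> and C<$2> into named variables straight away is a common precaution, since a later successful match overwrites them.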
Scalar values are always named with '$', even when referring to a
scalar that is part of an array or a hash. The '$' symbol works
semantically like the English word "the" in that it indicates a
single value is expected.

    $days               # the simple scalar value "days"
    $days[28]           # the 29th element of array @days
    $days{'Feb'}        # the 'Feb' value from hash %days
    $#days              # the last index of array @days

Entire arrays (and slices of arrays and hashes) are denoted by '@',
which works much like the word "these" or "those" does in English,
in that it indicates multiple values are expected.

    @days               # ($days[0], $days[1],... $days[n])
    @days[3,4,5]        # same as ($days[3],$days[4],$days[5])
    @days{'a','c'}      # same as ($days{'a'},$days{'c'})

Entire hashes are denoted by '%':

    %days               # (key1, val1, key2, val2 ...)

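The notations above can be tried together in one short script. The contents of C<@days> and C<%days> are sample data invented for this sketch, as are the names of the variables that receive the results:

```perl
use strict;
use warnings;

# Sample data invented for this example.
my @days = qw(Mon Tue Wed Thu Fri Sat Sun);
my %days = (Feb => 28, Mar => 31, Apr => 30);

my $third      = $days[2];               # one element, so '$'
my $feb        = $days{'Feb'};           # one hash value, so '$'
my $last_index = $#days;                 # last index of @days
my @midweek    = @days[1, 2, 3];         # array slice, so '@'
my @lengths    = @days{'Feb', 'Mar'};    # hash slice, also '@'
my @flattened  = %days;                  # key/value pairs, in no particular order
```

Note that the sigil follows what the expression yields (one value or several), not the type of the variable being indexed.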
In addition, subroutines are named with an initial '&', though this
is optional when unambiguous, just as the word "do" is often redundant
in English. Symbol table entries can be named with an initial '*',
but you don't really care about that yet (if ever :-).
Every variable type has its own namespace, as do several
non-variable identifiers. This means that you can, without fear
of conflict, use the same name for a scalar variable, an array, or
a hash--or, for that matter, for a filehandle, a directory handle, a
subroutine name, a format name, or a label. This means that $foo
and @foo are two different variables. It also means that C<$foo[1]>
is a part of @foo, not a part of $foo. This may seem a bit weird,
but that's okay, because it is weird.
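A short sketch of the separate namespaces, using the name C<foo> from the text with values invented for the example:

```perl
use strict;
use warnings;

# Three unrelated variables that happen to share the name "foo".
my $foo = "scalar";
my @foo = ("zero", "one");
my %foo = (key => "value");

my $element = $foo[1];     # element of @foo; $foo is not involved
my $value   = $foo{key};   # value from %foo; again unrelated to $foo
```

The subscript decides which variable is meant: square brackets select the array, curly braces select the hash.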
Because variable references always start with '$', '@', or '%', the
"reserved" words aren't in fact reserved with respect to variable
names. They I<are> reserved with respect to labels and filehandles,
however, which don't have an initial special character. You can't
have a filehandle named "log", for instance. Hint: you could say
C<open(LOG,'logfile')> rather than C<open(log,'logfile')>. Using
uppercase filehandles also improves readability and protects you
from conflict with future reserved words. Case I<is> significant--"FOO",
"Foo", and "foo" are all different names. Names that start with a
letter or underscore may also contain digits and underscores.
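The following sketch illustrates an uppercase filehandle and case-significant names. To stay self-contained it writes to an in-memory buffer with the three-argument form of C<open>, which is an assumption of this example rather than something the text above prescribes; the variable names are likewise invented:

```perl
use strict;
use warnings;

# Write to an in-memory buffer so the sketch needs no file on disk.
my $buffer = '';
open(LOG, '>', \$buffer) or die "cannot open in-memory log: $!";
print LOG "started\n";
close(LOG);

# Case is significant: these are two unrelated variables.
my $total = 1;
my $Total = 2;
```

Because C<LOG> is uppercase, it cannot be mistaken for the built-in C<log> function.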
It is possible to replace such an alphanumeric name with an expression
that returns a reference to the appropriate type. For a description
of this, see L<perlref>.
Names that start with a digit may contain only more digits. Names
that do not start with a letter, underscore, digit or a caret (i.e.
a control character) are limited to one character, e.g., C<$%> or
C<$$>. (Most of these one-character names have a predefined
significance to Perl. For instance, C<$$> is the current process
id.)
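As a small illustration, the sketch below reads one punctuation-named and one digit-named variable; C<$pid> and C<$program> are names chosen for this example:

```perl
use strict;
use warnings;

my $pid     = $$;    # process ID of the running interpreter
my $program = $0;    # name of the program being run
```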
=head2 Context
The interpretation of operations and values in Perl sometimes depends
on the requirements of the context around the operation or value.
There are two major contexts: list and scalar. Certain operations
return list values in contexts wanting a list, and scalar values
otherwise. If this is true of an operation, it will be mentioned in
the documentation for that operation. In other words, Perl overloads
certain operations based on whether the expected return value is
singular or plural. Some words in English work this way, like "fish"
and "sheep".
In a reciprocal fashion, an operation provides either a scalar or a
list context to each of its arguments. For example, if you say

    int( <STDIN> )

the integer operation provides scalar context for the <>
operator, which responds by reading one line from STDIN and passing it
back to the integer operation, which will then find the integer value
of that line and return that. If, on the other hand, you say

    sort( <STDIN> )

then the sort operation provides list context for <>, which
will proceed to read every line available up to the end of file, and
pass that list of lines back to the sort routine, which will then
sort those lines and return them as a list to whatever the context
of the sort was.
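The same behavior can be observed without typing at a terminal. This sketch substitutes an in-memory filehandle, C<$fh>, for STDIN; that substitution and the sample input are assumptions made so the example is self-contained:

```perl
use strict;
use warnings;

# In-memory stand-in for STDIN (an assumption of this sketch).
my $input = "42.7\ncherry\nbanana\n";
open(my $fh, '<', \$input) or die "cannot open in-memory input: $!";

my $number = int(<$fh>);     # scalar context: reads only the first line
my @rest   = sort(<$fh>);    # list context: reads every remaining line
close($fh);
```

The first read consumes a single line, so the second read, in list context, picks up where it left off and returns everything up to end of file.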
Assignment is a little bit special in that it uses its left argument
to determine the context for the right argument. Assignment to a
scalar evaluates the right-hand side in scalar context, while
assignment to an array or hash evaluates the right-hand side in list
context. Assignment to a list (or slice, which is just a list
anyway) also evaluates the right-hand side in list context.
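A brief sketch of how the left-hand side changes what the same right-hand side produces. The data and variable names are invented for this example, and it relies on the fact that an array evaluated in scalar context yields its number of elements:

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed);

my $count   = @days;    # scalar assignment: @days yields its length
my ($first) = @days;    # list assignment: takes the first element
my @copy    = @days;    # array assignment: copies every element
my %index   = (Mon => 0, Tue => 1, Wed => 2);    # hash assignment: list of pairs
```

The parentheses around C<$first> are what turn the second assignment into a list assignment; without them it would behave like the first.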
When you use the C