=head1 NAME
perl595delta - what is new for perl v5.9.5
=head1 DESCRIPTION
This document describes differences between the 5.9.4 and the 5.9.5
development releases. See L<perl590delta>, L<perl591delta>,
L<perl592delta>, L<perl593delta> and L<perl594delta> for the differences
between 5.8.0 and 5.9.4.
=head1 Incompatible Changes
=head2 Tainting and printf
When perl is run under taint mode, C<printf()> and C<sprintf()> will now
reject any tainted format argument. (Rafael Garcia-Suarez)
=head2 undef and signal handlers
Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now
equivalent to setting it to C<'DEFAULT'>. (Rafael)
=head2 strictures and array/hash dereferencing in defined()
C<defined @$foo> and C<defined %$bar> are now subject to C<strict 'refs'>
(that is, C<$foo> and C<$bar> shall be proper references there.)
(Nicholas Clark)
(However, C<defined(@foo)> and C<defined(%bar)> are discouraged constructs
anyway.)
=head2 C<(?p{})> has been removed
The regular expression construct C<(?p{})>, which was deprecated in perl
5.8, has been removed. Use C<(??{})> instead. (Rafael)
=head2 Pseudo-hashes have been removed
Support for pseudo-hashes has been removed from Perl 5.9. (The C<fields>
pragma remains, but uses an alternate implementation.)
=head2 Removal of the bytecode compiler and of perlcc
C<perlcc>, the byteloader and the supporting modules (B::C, B::CC,
B::Bytecode, etc.) are no longer distributed with the perl sources. Those
experimental tools have never worked reliably, and, due to the lack of
volunteers to keep them in line with the perl interpreter developments, it
was decided to remove them instead of shipping a broken version of those.
The last version of those modules can be found with perl 5.9.4.
However, the B compiler framework remains supported in the perl core, as
do the more useful modules built on it (among others, B::Deparse and
B::Concise).
=head2 Removal of the JPL
The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.
=head2 Recursive inheritance detected earlier
Perl will now immediately throw an exception if you modify any package's
C<@ISA> in such a way that it would cause recursive inheritance.
Previously, the exception would not occur until Perl attempted to make
use of the recursive inheritance while resolving a method or doing a
C<< $foo->isa($bar) >> lookup.
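The earlier detection can be observed with a minimal sketch like the one below. The package names C<Loop::A> and C<Loop::B> are hypothetical and exist only for this illustration; the exact wording of the exception is not asserted beyond the phrase "Recursive inheritance".

```perl
use strict;
use warnings;
no warnings 'once';    # the package variables below are mentioned only once

# Hypothetical packages, used only to build an inheritance cycle.
my $ok = eval {
    @Loop::A::ISA = ('Loop::B');
    @Loop::B::ISA = ('Loop::A');    # closes the cycle; expected to die here
    1;
};
my $error = $ok ? '' : $@;

print $ok ? "no exception\n" : "assignment died: $error";
```

No method call or C<isa> lookup is needed to trigger the exception; the assignment to C<@ISA> is enough.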
=head1 Core Enhancements
=head2 Regular expressions
=over 4
=item Recursive Patterns
It is now possible to write recursive patterns without using the C<(??{})>
construct. This new way is more efficient, and in many cases easier to
read.
Each capturing parenthesis can now be treated as an independent pattern
that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for
"parenthesis number"). For example, the following pattern will match
nested balanced angle brackets:
    /
     ^                      # start of line
     (                      # start capture buffer 1
        <                   # match an opening angle bracket
        (?:                 # match one of:
            (?>             # don't backtrack over the inside of this group
                [^<>]+      # one or more non angle brackets
            )               # end non backtracking group
        |                   # ... or ...
            (?1)            # recurse to bracket 1 and try it again
        )*                  # 0 or more times.
        >                   # match a closing angle bracket
     )                      # end capture buffer one
     $                      # end of line
    /x
Note, users experienced with PCRE will find that the Perl implementation
of this feature differs from the PCRE one in that it is possible to
backtrack into a recursed pattern, whereas in PCRE the recursion is
atomic or "possessive" in nature. (Yves Orton)
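As a rough illustration, the balanced angle bracket pattern described in this item can be compiled once and tried against a few strings. The variable names and test strings are assumptions chosen for this sketch.

```perl
use strict;
use warnings;

# Same structure as the pattern discussed in this item, compiled with qr//x.
my $balanced = qr/
    ^
    (                   # capture buffer 1
        <
        (?:
            (?> [^<>]+ )    # run of non-bracket characters, no backtracking
        |
            (?1)            # recurse into capture buffer 1
        )*
        >
    )
    $
/x;

# 1 for a match, 0 otherwise, for each hypothetical input.
my @results = map { $_ =~ $balanced ? 1 : 0 } '<a<b>c>', '<a<b>c', '<<>>';

print "@results\n";
```

The second string is missing its final closing bracket, so only the first and third inputs match.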
=item Named Capture Buffers
It is now possible to name capturing parentheses in a pattern and refer to
the captured contents by name. The naming syntax is C<< (?<NAME>....) >>.
It's possible to backreference to a named buffer with the C<< \k<NAME> >>
syntax. In code, the new magical hashes C<%+> and C<%-> can be used to
access the contents of the capture buffers.
Thus, to replace all doubled chars, one could write
    s/(?<letter>.)\k<letter>/$+{letter}/g
Only buffers with defined contents will be "visible" in the C<%+> hash, so
it's possible to do something like
    foreach my $name (keys %+) {
        print "content of buffer '$name' is $+{$name}\n";
    }
The C<%-> hash is more complete: each of its values is an array ref
holding the contents of every capture buffer with that name, should there
be several of them.
C<%+> and C<%-> are implemented as tied hashes through the new module
C<Tie::Hash::NamedCapture>.
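The following sketch puts the pieces above together: a named capture with a C<< \k<NAME> >> backreference in a substitution, and the difference between C<%+> and C<%-> when two buffers share a name. The buffer name C<part> and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Collapse doubled characters using a named capture and \k<NAME>.
(my $collapsed = 'aabbcd') =~ s/(?<letter>.)\k<letter>/$+{letter}/g;

# Two buffers share the name "part": %+ exposes the leftmost defined one,
# while %- holds an array ref with the contents of every buffer so named.
my ($first, @all);
if ('ab' =~ /(?<part>a)(?<part>b)/) {
    $first = $+{part};
    @all   = @{ $-{part} };
}

print "collapsed=$collapsed first=$first all=@all\n";
```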
Users exposed to the .NET regex engine will find that the perl
implementation differs in that the numerical ordering of the buffers
is sequential, and not "unnamed first, then named". Thus in the pattern
    /(A)(?<B>B)(C)(?<D>D)/
$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', rather
than the $1 'A', $2 'C', $3 'B', $4 'D' ordering that a .NET programmer
would expect. This is considered a feature. :-) (Yves Orton)
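A small sketch of the sequential numbering, assuming the subject string C<'ABCD'>: named buffers take their place in the numeric sequence and remain reachable by name.

```perl
use strict;
use warnings;

my (@groups, $named_b);
if ('ABCD' =~ /(A)(?<B>B)(C)(?<D>D)/) {
    @groups  = ($1, $2, $3, $4);    # numbered strictly left to right
    $named_b = $+{B};               # the same buffer as $2, by name
}

print "@groups / $named_b\n";
```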
=item Possessive Quantifiers
Perl now supports the "possessive quantifier" syntax of the "atomic match"
pattern. Basically a possessive quantifier matches as much as it can and never
gives any back. Thus it can be used to control backtracking. The syntax is
similar to non-greedy matching, except instead of using a '?' as the modifier
the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal
quantifiers. (Yves Orton)
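To make the "never gives any back" behaviour concrete, here is a minimal comparison of a greedy and a possessive quantifier on the same (hypothetical) subject string.

```perl
use strict;
use warnings;

my $subject = 'aaa';

# Greedy: a+ first takes all three characters, then gives one back so that
# the trailing "a" can match.
my $greedy = $subject =~ /^a+a$/ ? 1 : 0;

# Possessive: a++ takes all three characters and refuses to give any back,
# so the trailing "a" has nothing left to match.
my $possessive = $subject =~ /^a++a$/ ? 1 : 0;

print "greedy=$greedy possessive=$possessive\n";
```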
=item Backtracking control verbs
The regex engine now supports a number of special-purpose backtrack
control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)
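As one illustration of how two of these verbs combine, the sketch below uses (*SKIP) followed by (*FAIL) to discard quoted sections and collect only the words outside them. The sample text is hypothetical, and this is just one common way to use the verbs, not a summary of all of them.

```perl
use strict;
use warnings;

my $text = 'foo "bar baz" qux';

# First alternative: consume a quoted section, then (*SKIP)(*FAIL) rejects
# it and prevents the engine from retrying anywhere inside that section.
# Second alternative: an ordinary word.
my @words = $text =~ /"[^"]*"(*SKIP)(*FAIL)|\w+/g;

print "@words\n";
```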
=item Relative backreferences
A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a
safer form of back-reference notation as well as allowing relative
backreferences. This should make it easier to generate and embed patterns
that contain backreferences. See L<perlre>. (Yves Orton)
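The benefit for embedded patterns can be sketched as follows. Both fragments try to match a doubled word character, but only the one written with a relative backreference keeps working once it is interpolated after another capture group. The variable names are assumptions made for this example.

```perl
use strict;
use warnings;

my $relative = qr/(\w)\g{-1}/;    # refers to the group just before it
my $absolute = qr/(\w)\1/;        # always refers to group 1 of the whole pattern

# Embedding shifts the fragment's group to number 2. \g{-1} still points at
# it, whereas \1 now points at the new leading (x) group.
my $relative_ok = 'xaa' =~ /(x)$relative/ ? 1 : 0;
my $absolute_ok = 'xaa' =~ /(x)$absolute/ ? 1 : 0;

print "relative=$relative_ok absolute=$absolute_ok\n";
```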
=item C<\K> escape
The functionality of Jeff Pinyan's module Regexp::Keep has been added to
the core. In regular expressions you can now use the special escape C<\K>
to get something like a floating-length positive lookbehind. It is
also useful in substitutions like:
    s/(foo)bar/$1/g
that can now be converted to
    s/foo\Kbar//g
which is much more efficient. (Yves Orton)
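A quick check that the two substitutions produce the same result, using a hypothetical input string:

```perl
use strict;
use warnings;

# Capture-and-reinsert form.
(my $with_capture = 'foobar foobaz') =~ s/(foo)bar/$1/g;

# \K form: everything matched before \K is kept, only "bar" is replaced.
(my $with_keep = 'foobar foobaz') =~ s/foo\Kbar//g;

print "$with_capture | $with_keep\n";
```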
=item Vertical and horizontal whitespace, and linebreak
Regular expressions now recognize the C<\v> and C<\h> escapes, which match
vertical and horizontal whitespace, respectively. C<\V> and C<\H>
match their complements.
C<\R> matches a generic linebreak, that is, vertical whitespace, plus
the multi-character sequence C<"\x0D\x0A">.
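For instance, C<\R> makes it easy to split text with mixed line endings, and C<\h> to split on runs of spaces and tabs. The sample strings below are made up for this sketch.

```perl
use strict;
use warnings;

# \R treats "\r\n" as a single linebreak, and also matches a lone "\n" or "\r".
my @lines = split /\R/, "one\r\ntwo\nthree\rfour";

# \h matches horizontal whitespace such as spaces and tabs.
my @fields = split /\h+/, "a \t b";

print scalar(@lines), " lines, ", scalar(@fields), " fields\n";
```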
=back
=head2 The C<_> prototype
A new prototype character has been added. C<_> is equivalent to C<$> (it
denotes a scalar), but defaults to C<$_> if the corresponding argument
isn't supplied. Due to the optional nature of the argument, you can only
use it at the end of a prototype, or before a semicolon.
This has a small incompatible consequence: the prototype() function has
been adjusted to return C<_> for some built-ins in appropriate cases (for
example, C<prototype('CORE::rmdir')>). (Rafael)
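A minimal sketch of a user-defined function with the C<_> prototype. The function name C<shout> is hypothetical.

```perl
use strict;
use warnings;

# With the "_" prototype, a missing argument is replaced by $_.
sub shout(_) { return uc $_[0] }

my $explicit = shout('hello');

my $implicit;
for ('world') {
    $implicit = shout();    # no argument given, so $_ is passed
}

print "$explicit $implicit\n";
```

The sub must be declared before the calls are compiled for the prototype to take effect.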
=head2 UNITCHECK blocks
C<UNITCHECK>, a new special code block, has been introduced, in addition to
C<BEGIN>, C<CHECK>, C<INIT> and C<END>.
C<CHECK> and C<INIT> blocks, while useful for some specialized purposes,
are always executed at the transition between the compilation and the
execution of the main program, and thus are useless whenever code is
loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed
just after the unit which defined them has been compiled. See L<perlmod>
for more information. (Alex Gough)
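The difference matters for code compiled at runtime. In the sketch below a string eval acts as the compilation unit; the array name C<@order> is an assumption made for this illustration.

```perl
use strict;
use warnings;

our @order;

# The string eval is its own compilation unit: its UNITCHECK block runs as
# soon as the eval'd code has been compiled, before any of it is executed.
my $ok = eval q{
    push @order, 'body';
    UNITCHECK { push @order, 'UNITCHECK' }
    1;
};
die $@ unless $ok;

print "@order\n";
```

Although the C<UNITCHECK> block appears after the C<push> in the source, it is recorded first.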
=head2 readpipe() is now overridable
The built-in function readpipe() is now overridable. Overriding it
also overrides its operator counterpart, C<qx//> (a.k.a. C<``>).
Moreover, it now defaults to C<$_> if no argument is provided. (Rafael)
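A hedged sketch of such an override, installed globally through C<CORE::GLOBAL>. The replacement sub here never runs a command; it only reports what it was asked to run, and the command string is a placeholder.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Install the override at compile time so that the qx// below is
    # compiled as a call to it instead of the built-in.
    *CORE::GLOBAL::readpipe = sub {
        my ($command) = @_;
        return "intercepted: $command";
    };
}

my $output = qx{echo hello};    # handled by the override, no shell is run

print "$output\n";
```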
=head2 default argument for readline()
readline() now defaults to C<*ARGV> if no argument is provided. (Rafael)
=head2 UCD 5.0.0
The copy of the Unicode Character Database included in Perl 5.9 has
been updated to version 5.0.0.
=head2 Smart match
The smart match operator (C<~~>) is now available by default (you don't
need to enable it with C