233 Cards in this Set
- Front
- Back
CPAN (abbreviation)
Who created Perl? |
Comprehensive Perl Archive Network
Larry Wall |
|
Perl Hello World example
|
print "Hello, world!\n";
Create a file, enter the line above, save it with a .plx extension, and run perl <filename> |
|
To make a Perl script executable on a Linux system, run
|
chmod a+x <filename>
or chmod 755 <filename>, then type ./<filename> to run it. The ./ indicates that the command is in the current directory |
|
Hello world in perl 5.10 or later
|
use 5.010;
say "Hello World!";
Type perl -v to see the version of Perl currently installed. use 5.010 asks for at least that version, so the line works unchanged on any later release |
|
What's a bareword?
|
A bareword is a series of characters outside of a string that Perl doesn't recognize
|
|
Single- vs Double-Quoted Strings
|
No processing is done within a single-quoted string, except on \\ and \'. Variable names are not evaluated in single-quoted strings; in double-quoted strings they are evaluated.
This kind of string processing is called interpolation, and we say that single-quoted strings are not interpolated.
#!/usr/bin/perl
#quotes.plx
use warnings;
print '\tThis is a single quoted string.\n';
print "\tThis is a double quoted string.\n";
>perl quotes.plx
\tThis is a single quoted string.\n        This is a double quoted string.
> |
|
dec, oct, bin, hex
|
use warnings;
print 255, "\n"; print 0377, "\n"; print 0b11111111, "\n"; print 0xFF, "\n"; ------------------- 255 255 255 255 |
|
Evaluate
#!/usr/bin/perl
#badnum.plx
use warnings;
print 255, "\n";
print 0378, "\n";
print 0b11111112, "\n";
print 0xFG, "\n"; |
Illegal octal digit '8' at line 5
Illegal binary digit '2' at line 6 Bareword found where operator expected at line 7 |
|
Use Perl to print It's as easy as that
Use Perl to print "Stop," he cried. Use Perl to print '"Hi," said Jack. "Have you read Slashdot today?"' |
print "It's as easy as that";
print '"Stop," he cried.'; print "'\"Hi,\" said Jack. \"Have you read Slashdot today?\"'\n"; Remember that \n is interpolated only inside double quotes; after a single-quoted string it has to go in a separate double-quoted string |
|
How to escape \ in string
|
\\
use warnings; print "C:\\WINNT\\Profiles\\\n"; print 'C:\WINNT\Profiles\\', "\n"; |
|
For very large integers you may split up the thousands with commas, like this: 10,000,000
We can also do this in Perl, but with an underscore (_) instead. Sample code... |
code:
#!/usr/bin/perl
#number3.plx
use warnings;
print 25_000_000, " ", -4, "\n";
result:
perl number3.plx
25000000 -4 |
|
perldoc -f allows you to see information about a particular function
|
perldoc -f print
print FILEHANDLE LIST print LIST print Prints a string or a comma-separated list of strings. Returns TRUE if successful |
|
perldoc -q
|
allows you to search the Perl FAQ
perldoc -q reverse Found in /usr/lib/perl5/5.6.0/pod/perlfaq4.pod How do I reverse a string? Use reverse() in scalar context, as documented in the reverse entry in the perlfunc manpage $reversed = reverse $string; |
|
exponentiation operator
|
**
raises one number to the power of another
#!/usr/bin/perl
#arithop7.plx
use warnings;
print 2**4, " ", 3**5, " ", -2**4, "\n";
-------
Result 16 243 -16
-2**4 is treated by the compiler like -(2**4), because ** binds more tightly than the unary minus |
|
Instead of &&, || and ! we can also use
|
and, or, not (there is also xor, which has no symbolic equivalent)
#!/usr/bin/perl
#bool.plx
use warnings;
print "Test one: ", 6 > 3 && 3 > 4, "\n";
print "Test two: ", 6 > 3 and 3 > 4, "\n";
>perl bool.plx
Test one:
Test two: 1>
What is the problem? The lower precedence of and compared to &&. What actually happens is: print("Test two: ", 6 > 3) and 3 > 4, "\n"; |
|
lazy evaluation (def)
|
As soon as Perl knows the answer to the question, it stops evaluating
If you ask whether X and Y are both true and Perl finds that X isn't, it doesn't need to look at Y; it no longer matters whether Y is true or false. 4 >= 2 and print "Four is more than or equal to two\n"; If the first test is true, Perl has to check the other side, so the message is printed. If the first value is false, there is no need to check the second part and the message doesn't get printed |
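A minimal sketch of lazy evaluation, assuming a hypothetical helper named operand (not part of the card set) that records its name before returning a truth value. The log shows which operands Perl actually looked at.

```perl
use strict;
use warnings;

my @evaluated;   # records which operands were actually evaluated

# Hypothetical helper: logs its name, then returns the given truth value.
sub operand {
    my ($name, $value) = @_;
    push @evaluated, $name;
    return $value;
}

# X is false, so && already knows the answer and never evaluates Y.
my $both = operand('X', 0) && operand('Y', 1);

# P is true, so || already knows the answer and never evaluates Q.
my $either = operand('P', 1) || operand('Q', 0);

print "Evaluated: @evaluated\n";   # Evaluated: X P
```

Only X and P appear in the log, because the right-hand operand was skipped both times.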
|
What happens if we try and join a number to a string?
#!/usr/bin/perl
#string.plx
use warnings;
print "Four sevens are " . 4*7 . "\n";
What is the result? |
> perl string.plx
Four sevens are 28 |
|
What does this print?
#!/usr/bin/perl
#string.plx
print "Go!"x3;
print "Ba"."na"x4; |
>perl string.plx
Go!Go!Go!Banananana |
|
What does this print?
#!/usr/bin/perl
#string.plx
print "Ba"."na"x4*3, "\n";
print "Ba"."na"x(4*3), "\n"; |
>perl string.plx
Ba0 Banananananananananananana Why is the first one Ba0? The repetition happens first, giving us "nananana". Then comes the multiplication: what is "nananana" times three? Since there are no digits in "nananana", it is converted to zero |
|
What will this print?
#!/usr/bin/perl
#str2num.plx
use warnings;
print "12 monkeys" + 0, "\n";
print "Eleven to fly" + 0, "\n";
print "UB40" + 0, "\n";
print "-20 10" + 0, "\n";
print "0x30" + 0, "\n"; |
perl str2num.plx
12 0 0 -20 0 > (with warnings on, Perl also reports each argument that isn't numeric) 1) "12 monkeys": Perl stopped after 12 2) "Eleven to fly": English words don't get converted to numbers 3) "UB40": as it starts with UB, Perl can't convert it 4) "-20 10": Perl stopped after the first number 5) "0x30": Perl doesn't convert binary, hex or octal to decimal when they are presented as strings |
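When a string really does hold a hex, octal, or binary number, Perl's built-in hex() and oct() functions can do the conversion that + 0 does not. A small sketch (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $from_hex    = hex("0x30");          # 48
my $from_octal  = oct("0377");          # 255
my $from_binary = oct("0b11111111");    # 255
my $also_hex    = oct("0x30");          # 48; oct() understands the 0x prefix too

print "$from_hex $from_octal $from_binary $also_hex\n";   # 48 255 255 48
```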
|
Print ascii code of a char
|
#!/usr/bin/perl
#ascii.plx use warnings; print "A # has ASCII value ", ord("#"), "\n"; print "A * has ASCII value ", ord("*"), "\n"; >perl ascii.plx A # has ASCII value 35 A * has ASCII value 42 |
|
Comparing strings
lt - less than gt - greater than eq - equal cmp - compare |
#!/usr/bin/perl
#strcomp1.plx
use warnings;
print "Which came first, the chicken or the egg? ";
print "chicken" cmp "egg", "\n";
print "Are dogs greater than cats? ";
print "dog" gt "cat", "\n";
print "Is ^ less than + ?", "\n";
print "^" lt "+", "\n";
Which came first, the chicken or the egg? -1
Are dogs greater than cats? 1
Is ^ less than + ?

> |
|
Precedence table in Perl (highest precedence first)
|
->
**
! ~ \ (and unary + and -)
=~ !~
* / % x
+ - .
<< >>
< > <= >= lt gt le ge
== != <=> eq ne cmp
&
| ^
&&
||
.. ...
?:
, =>
not
and
or xor |
|
Types of variables in Perl
|
scalars, whose names start with $
arrays (lists), whose names start with @; hashes, whose names start with % |
|
Directive use strict;
With use strict we ask Perl to be strict about our variable use; it checks whether we have declared all our variables |
#!/usr/bin/perl
#scope.plx
use strict;
use warnings;
$record = 4;
print "We are at record ", $record, "\n";
{
my $record;
$record = 7;
print "Inside the block, we are at record ", $record, "\n";
}
print "We are still at record ", $record, "\n";
---------------
Global symbol "$record" requires explicit package name.
The global $record is not declared, so compilation fails. |
|
Lexical variables are constrained to the enclosing block and all blocks inside it. If they are not inside a block, they are constrained to the current file.
To tell Perl that a variable is lexical, we say 'my $variable;' |
#!/usr/bin/perl
#scope.plx
use warnings;
$record = 4;
print "We are at record ", $record, "\n";
{
my $record;
$record = 7;
print "Inside the block, we are at record ", $record, "\n";
}
print "We are still at record ", $record, "\n";
>perl scope.plx
We are at record 4
Inside the block, we are at record 7
We are still at record 4 |
|
Autoincrement operator and string
#!/usr/bin/perl #auto.plx use warnings; $a = "A9"; print ++$a, "\n"; $a = "bz"; print ++$a, "\n"; $a = "Zz"; print ++$a, "\n"; $a = "z9"; print ++$a, "\n"; $a = "9z"; print ++$a, "\n"; |
perl auto.plx
B0 ca AAa aa0 10 > |
|
What does the following code yield?
#!/usr/bin/perl
#auto.plx
$a = 4;
$b = 10;
print "Our variables are ", $a, " and ", $b, "\n";
$b = $a++;
print "After incrementing, we have ", $a, " and ", $b, "\n";
$b = ++$a * 2;
print "Now, we have ", $a, " and ", $b, "\n";
$a = --$b + 4;
print "Finally, we have ", $a, " and ", $b, "\n"; |
>perl auto.plx
Our variables are 4 and 10 After incrementing, we have 5 and 4 Now, we have 6 and 12 Finally, we have 15 and 11 |
|
How to update and install perl on Ubuntu
comment in perl |
sudo apt-get update && sudo apt-get install perl
print "Test" # This is a comment |
|
You can concatenate, or join, string values with the ___ operator
|
You can concatenate, or join, string values with the . operator
"hello" . "world" # same as "helloworld" "hello" . ' ' . "world" # same as 'hello world' 'hello world' . "\n" # same as "hello world\n" |
|
What will 5 x 4.8 produce?
|
"fred" x 3 is really "fredfredfred"
5 x 4.8 is really "5" x 4, which is "5555". The string repetition operator wants a string for a left operand, so the number 5 is converted to the string "5". The right operand is truncated to the integer 4, so x copies the string four times, yielding the four-character string 5555 |
|
If an operator expects a number (like + does), Perl will see the value as a number.
"12" * "3" gives the value "Z" . 5 * 7 gives the value |
If an operator expects a string (like . does), Perl will see the value as a string
"12" * "3" gives the value 36 Trailing nonnumber stuff and leading whitespace are discarded, so "12fred34" * " 3" will also give 36 without any complaints "Z" . 5 * 7 gives the value "Z35" |
|
How to turn on Perl warnings
|
With Perl 5.6 and later, you can turn on warnings with a pragma
#!/usr/bin/perl
use warnings;
Alternatively, use the -w option on the command line, which turns on warnings everywhere in your program:
$ perl -w my_program |
|
If you get a warning message you don’t understand, you can get a longer description of the problem with the diagnostics pragma
|
#!/usr/bin/perl
use diagnostics;
Alternatively, use Perl's command-line option -M to load the pragma only when needed, instead of editing the source code each time to enable and disable diagnostics:
$ perl -Mdiagnostics ./my_program |
|
scalar variable holds a single scalar value, as you’d expect. Scalar variable names begin with a
|
dollar sign, called the sigil, followed by a Perl identifier: a letter or underscore, and then possibly more letters, digits, or underscores
|
|
binary assignment operator
|
Expressions like $fred = $fred + 5 occur frequently enough that Perl has a shorthand for the operation of altering a variable—the binary assignment operator
$fred = $fred + 5; $fred += 5; $barney = $barney * 3; $barney *= 3; |
|
Sometimes you want to create strings with characters that may not appear on your keyboard, such as é, å, α, or א. How you get these characters into your program
|
$alef = chr( 0x05D0 );
$alpha = chr( hex('03B1') ); $omega = chr( 0x03C9 ); |
|
if Control Structure
|
if ($name gt 'fred') {
print "'$name' comes after 'fred' in sorted order.\n"; } if ($name gt 'fred') { print "'$name' comes after 'fred' in sorted order.\n"; } else { print "'$name' does not come after 'fred'.\n"; print "Maybe it's the same string, in fact.\n"; } |
|
how to get a value from the keyboard into a Perl program.
|
simplest way: use the line-input operator, <STDIN>
Perl reads the next complete text line from standard input (up to the first newline). Reading stops at end-of-file, which you signal from the keyboard with Ctrl+D on Unix/Linux or Ctrl+Z on DOS/Windows |
|
chomp Operator
|
The variable has to hold a string, and if the string ends in a newline
character, chomp() removes the newline. That’s (nearly) all it does. For example: $text = "a line of text\n"; # Or the same thing from <STDIN> chomp($text); # Gets rid of the newline character |
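One detail behind the "(nearly)" above: chomp also returns the number of characters it removed. A short sketch, using a fixed string in place of keyboard input so that it runs unattended:

```perl
use strict;
use warnings;

my $text = "a line of text\n";

my $removed_first  = chomp($text);   # removes the newline and returns 1
my $removed_second = chomp($text);   # nothing left to remove, returns 0

print "Text is now '$text'\n";
print "Removed $removed_first, then $removed_second\n";   # Removed 1, then 0
```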
|
undef Value
|
What happens if you use a scalar variable before you give it a value? Nothing serious, and definitely nothing fatal. Variables have the special undef value before they are first assigned
|
|
To tell whether a value is undef and not the empty string, use the defined function, which returns false for undef, and true for everything else
|
$madonna = <STDIN>;
if ( defined($madonna) ) { print "The input was $madonna"; } else { print "No input available!\n"; } |
|
For the array of rocks, the last element index is
If you have three elements in the array, the valid negative indices are –1 |
$#rocks
-1 the last element -2 the middle element -3 the first element |
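A quick sketch of these index rules on a three-element array (the rock names are borrowed from other cards in this set):

```perl
use strict;
use warnings;

my @rocks = qw( bedrock slate lava );

my $last_index = $#rocks;           # 2, the index of the last element
my $last       = $rocks[-1];        # 'lava', same as $rocks[$#rocks]
my $middle     = $rocks[-2];        # 'slate'
my $first      = $rocks[-3];        # 'bedrock'

print "Last index: ", $last_index, "\n";
print "From the end: $last, $middle, $first\n";
```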
|
list literal def
|
is a list of comma-separated values enclosed in parentheses. These values form the elements of the list.
For example: (1, 2, 3) # list of three values 1, 2, and 3 (1, 2, 3,) # the same three values (the trailing comma is ignored) ("fred", 4.5) # two values, "fred" and 4.5 ( ) # empty list - zero elements (1..100) # list of 100 integers |
|
range operator def
|
(1..5) # same as (1, 2, 3, 4, 5)
(1.7..5.7) # same thing; both values are truncated (5..1) # empty list; .. only counts "uphill" (0, 2..6, 10, 12) # same as (0, 2, 3, 4, 5, 6, 10, 12) ($m..$n) # range determined by current values of $m and $n (0..$#rocks) # the indices of the rocks array from the previous section |
|
qw Shortcut
|
The qw shortcut makes it easy to generate lists of simple words without typing a lot of extra quote marks:
qw( fred barney betty wilma dino ) # same as ("fred", "barney", "betty", "wilma", "dino") |
|
In much the same way as you can assign scalar values to variables, you can assign list values to variables:
|
($fred, $barney, $dino) = ("flintstone", "rubble", undef);
|
|
swap those values using list
|
($fred, $barney) = ($barney, $fred);
|
|
you have too many variables, the extras get the value undef:
($fred, $barney) = qw< flintstone rubble slate granite >; ($wilma, $dino) = qw[flintstone]; |
# two ignored items
# $dino gets undef |
|
when you wish to refer to an entire array Perl has a simpler notation. Just use the
at sign |
@ before the name of the array
You can read this as “all of the,” so @rocks is “all of the rocks.” |
|
pop operator takes the last element off of an array and returns it:
@array = 5..9; $fred = pop(@array); |
$fred = pop(@array); # $fred gets 9, @array now has (5, 6, 7, 8)
$barney = pop @array; # $barney gets 8, @array now has (5, 6, 7) pop @array; # @array now has (5, 6). (The 7 is discarded.) |
|
The converse operation is push, which adds an element (or a list of elements) to the end of an array:
|
push(@array, 0); # @array now has (5, 6, 0)
push @array, 8; # @array now has (5, 6, 0, 8) push @array, 1..10; # @array now has those 10 new elements @others = qw/ 9 0 2 1 0 /; push @array, @others; # @array now has those five new elements (19 total) |
|
shift and unshift Operators
|
The unshift and shift operators perform the corresponding push and pop actions on the "start" of the array
@array = qw# dino fred barney #; $m = shift(@array); # $m gets "dino", @array now has ("fred", "barney") $n = shift @array; # $n gets "fred", @array now has ("barney") shift @array; # @array is now empty $o = shift @array; # $o gets undef, @array is still empty unshift(@array, 5); # @array now has the one-element list (5) unshift @array, 4; # @array now has (4, 5) @others = 1..3; unshift @array, @others; # @array now has (1, 2, 3, 4, 5) |
|
splice operator with 1 parameter
|
@array = qw( pebbles dino fred barney betty );
@removed = splice @array, 2; # remove fred and everything after
# @removed is qw(fred barney betty)
# @array is qw(pebbles dino) |
|
splice operator with 2 parameter
|
@array = qw( pebbles dino fred barney betty );
@removed = splice @array, 1, 2; # remove dino, fred # @removed is qw(dino fred) # @array is qw(pebbles barney betty) |
|
splice operator with 3 parameter
|
@array = qw( pebbles dino fred barney betty );
@removed = splice @array, 1, 2, qw(wilma); # remove dino, fred # @removed is qw(dino fred) # @array is qw(pebbles wilma # barney betty) |
|
You don’t have to remove any elements. If you specify a length of 0, you remove no elements but still insert the “replacement” list:
|
@array = qw( pebbles dino fred barney betty );
@removed = splice @array, 1, 0, qw(wilma); # remove nothing # @removed is qw() # @array is qw(pebbles wilma dino # fred barney betty) |
|
If you forget that arrays interpolate like this, you’ll be surprised when you put an email address into a double-quoted string:
$email = "fred@bedrock.edu"; |
# WRONG! Tries to interpolate @bedrock
To get around this problem, you either escape the @ in a double-quoted string or use a single-quoted string: $email = "fred\@bedrock.edu"; # Correct $email = 'fred@bedrock.edu'; # Another way to do that |
|
#! - explain
|
#!/usr/bin/perl
If the very first two characters on the first line of a text file are #!, then what follows is the name of the program that actually executes the rest of the file. In this case, the program is stored in the file /usr/bin/perl. |
|
Internally Perl computes with what type of numbers
|
double-precision floating-point values
|
|
If you want to use Unicode literally in your program you need to add the utf8 pragma
|
use utf8;
|
|
What will this print?
print "The answer is "; print 6 * 7; print ".\n"; |
The three prints produce a single line of output:
The answer is 42. |
|
Print dollar sign
|
print "\$"
|
|
Operators for comparing strings
|
lt
le eq ge gt ne |
|
Boolean in Perl
|
Perl doesn't have a separate Boolean data type
If the value is a number, 0 means false; all other numbers mean true. If the value is a string, the empty string '' means false (as does the string '0'); all other strings mean true. If the value is another kind of scalar than a number or a string, Perl converts it to a number or a string first |
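These rules can be checked with a small sketch. The helper is_true is a hypothetical name introduced here for illustration; it simply reports how Perl treats a value in a Boolean test.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if Perl treats the value as true, else 0.
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

print "0     -> ", is_true(0),     "\n";   # 0: the number zero is false
print "42    -> ", is_true(42),    "\n";   # 1: any other number is true
print "''    -> ", is_true(''),    "\n";   # 0: the empty string is false
print "'0'   -> ", is_true('0'),   "\n";   # 0: the string '0' is false as well
print "'00'  -> ", is_true('00'),  "\n";   # 1: any other string is true
print "'abc' -> ", is_true('abc'), "\n";   # 1
print "undef -> ", is_true(undef), "\n";   # 0: undef is false
```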
|
Small program that reads a line from STDIN, checks whether it is just a newline, and prints "That was just a blank line!" or echoes the input
|
$line = <STDIN>;
if ($line eq "\n") { print "That was just a blank line!\n"; } else { print "That line of input was: $line"; } |
|
Read the text without the newline character
|
chomp($text = <STDIN>);
|
|
list (def)
array (def) |
A list is an ordered collection of scalars
An array is a variable that contains a list |
|
What happens when the subscript indicates an element that would be beyond the end of the array
|
the corresponding value will be undef.
$blank = $fred[ 142_857 ]; # unused array element gives undef $blanc = $mel; # unused scalar $mel also gives undef If you store into an array element that is beyond the end of the array, the array is automatically extended as needed $rocks[0] = 'bedrock'; # One element... $rocks[1] = 'slate'; # another... $rocks[2] = 'lava'; # and another... $rocks[3] = 'crushed rock'; # and another... $rocks[99] = 'schist'; # now there are 95 undef elements |
|
qw with different delimiters
|
qw! fred barney betty wilma dino !
qw/ fred barney betty wilma dino / qw# fred barney betty wilma dino # If the opening delimiter is one of those “left” characters, the corresponding “right” character is the proper closing delimiter: qw( fred barney betty wilma dino ) qw{ fred barney betty wilma dino } qw[ fred barney betty wilma dino ] qw< fred barney betty wilma dino > |
|
qw - If you need to include the closing delimiter within the string as one of the characters
|
If you need to include the closing delimiter within the string as one of the characters you can still include the character using the backslash:
qw! yahoo\! google ask msn ! # include yahoo! as an element |
|
qw - two consecutive backslashes contribute one single backslash to the item
|
qw( This as a \\ real backslash );
|
|
List Assignment
|
($fred, $barney, $dino) = ("flintstone", "rubble", undef);
What happens if the number of variables (on the left side of the equals sign) isn't the same as the number of values? Extra values are ignored |
|
swap two variables using a list
|
($fred, $barney) = ($barney, $fred); # swap those values
($betty[0], $betty[1]) = ($betty[1], $betty[0]); |
|
could build up an array of strings with a line of code like this
|
($rocks[0], $rocks[1], $rocks[2], $rocks[3]) = qw/talc mica feldspar quartz/;
|
|
at ___________ before the name of the array (and no index brackets after it) to refer to the entire array at once
|
@
@rocks = qw/ bedrock slate lava /; @tiny = ( ); # the empty list @giant = 1..1e5; # a list with 100,000 elements @stuff = (@giant, undef, @giant); # a list with 200,001 elements $dino = "granite"; @quarry = (@rocks, "crushed rock", @tiny, $dino); |
|
The value of an array variable that has not yet been assigned is
array is copied to another array with |
( ), the empty list
= - assignment @copy = @quarry; # copy a list from one array to another |
|
If the array is empty, pop leaves it alone (since there is no element to remove) and returns
|
undef
|
|
foreach Control Structure
|
foreach $rock (qw/ bedrock slate lava /) {
print "One rock is $rock.\n"; # Prints names of three rocks } |
|
If you omit the control variable from the beginning of the foreach loop, Perl uses its favorite default variable,
|
$_
foreach (1..10) { # Uses $_ by default print "I can count to $_!\n"; } |
|
Calling just print; after
setting $_ = "Yabba dabba doo\n" |
$_ = "Yabba dabba doo\n";
print; # prints $_ by default # this will print "Yabba dabba doo" |
|
reverse operator takes a list of values
|
returns the list in the opposite order.
|
|
sort Operator
|
takes a list of values and sorts them in the internal character ordering
@rocks = qw/ bedrock slate rubble granite /; @sorted = sort(@rocks); # gets bedrock, granite, rubble, slate @back = reverse sort @rocks; # these go from slate to bedrock @rocks = sort @rocks; # puts sorted result back into @rocks @numbers = sort 97..102; # gets 100, 101, 102, 97, 98, 99 |
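The last example above sorts 97..102 as strings, which is why 100 comes before 97. If numeric order is wanted, one common approach (not shown in the cards) is to give sort a comparison block that uses the numeric <=> operator:

```perl
use strict;
use warnings;

my @as_strings = sort 97..102;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } 97..102;   # numeric comparison

print "@as_strings\n";   # 100 101 102 97 98 99
print "@as_numbers\n";   # 97 98 99 100 101 102
```

$a and $b are the special variables sort uses for the two items being compared, so they need no declaration here.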
|
Does this sort array rocks
sort @rocks; Does this reverse array fred reverse @fred; |
sort @rocks; # WRONG, doesn't modify @rocks
@rocks = sort @rocks; # Now the rock collection is in order reverse @fred; # WRONG - doesn't change @fred @fred = reverse @fred; # that's better |
|
each Operator
|
Every time that you call each on an array, it returns two values for the next element in the array—the index of the value and the value itself
use 5.012; @rocks = qw/ bedrock slate rubble granite /; while( my( $index, $value ) = each @rocks ) { say "$index: $value"; } |
|
As Perl is parsing your expressions, it always expects either a scalar value or a list value. What Perl expects is called the context of the expression:
|
42 + something # The something must be a scalar
sort something # The something must be a list |
|
Explain context for "name" of an array
|
In a list context, it gives the list of elements. But in a scalar context, it returns the number of elements in the array:
@people = qw( fred barney betty ); @sorted = sort @people; # list context: barney, betty, fred $number = 42 + @people; # scalar context: 42 + 3 gives 45 |
|
assignment (to a scalar or a list) causes different contexts
|
@list = @people; # a list of three people
$n = @people; # the number 3 |
|
result
@backwards = reverse qw/ yabba dabba doo /; vs $backwards = reverse qw/ yabba dabba doo /; |
@backwards = reverse qw/ yabba dabba doo /;
# gives doo, dabba, yabba $backwards = reverse qw/ yabba dabba doo /; # gives oodabbadabbay |
|
$fred = something; # context?
@pebbles = something; # context? ($wilma, $betty) = something; # context? ($dino) = something; # context? |
$fred = something; # scalar context
@pebbles = something; # list context ($wilma, $betty) = something; # list context ($dino) = something; # still list context! |
|
Some expressions that provide scalar context
|
$fred = something;
$fred[3] = something; 123 + something something + 654 if (something) { ... } while (something) { ... } $fred[something] = something; |
|
Some expressions that provide list context
|
@fred = something;
($fred, $barney) = something; ($fred) = something; push @fred, something; foreach $fred (something) { ... } sort something reverse something print something |
|
@fred = 6 * 7; # what is the result
|
@fred = 6 * 7; # gets the one-element list (42)
|
|
@wilma = undef;
vs @betty = ( ); |
@wilma = undef; # OOPS! Gets the one-element list (undef)
# which is not the same as this: @betty = ( ); # A correct way to empty an array |
|
Forcing Scalar Context
On occasion, you may need to force scalar context where Perl is expecting a list. In that case, you can use the fake function ? |
scalar
@rocks = qw( talc quartz jade obsidian ); print "How many rocks do you have?\n"; print "I have ", @rocks, " rocks!\n"; # WRONG, prints names of rocks print "I have ", scalar @rocks, " rocks!\n"; # Correct, gives a number |
|
common way to read the lines from STDIN
|
chomp(@lines = <STDIN>); # Read the lines, not the newlines
Beginners often do the same thing in two steps: @lines = <STDIN>; # Read all the lines chomp(@lines); # discard all the newline characters |
|
define your own subroutine, use the keyword
|
sub
sub marine { $n += 1; # Global variable $n print "Hello, sailor number $n!\n"; } You invoke a subroutine from within an expression by using the subroutine name (with the ampersand) &marine; # says Hello, sailor number 1! &marine; # says Hello, sailor number 2! &marine; # says Hello, sailor number 3! &marine; # says Hello, sailor number 4! |
|
All Perl subroutines have a return value
|
As Perl chugs along in a subroutine, it calculates values as part of its series of actions. Whatever calculation is last performed in a subroutine is automatically also the return value.
For example, this subroutine has an addition as the last expression: sub sum_of_fred_and_barney { print "Hey, you called the sum_of_fred_and_barney subroutine!\n"; $fred + $barney; # That's the return value } |
|
what will this subroutine return
|
last expression evaluated is not the addition anymore; it’s now the print statement, whose return value is normally 1, meaning “printing was successful,”
|
|
example, this subroutine returns the larger value of $fred or $barney
|
sub larger_of_fred_or_barney {
if ($fred > $barney) { $fred; } else { $barney; } } |
|
subroutine arguments
|
$n = &max(10, 15); # This sub call has two parameters
Perl passes the list to the subroutine; that is, Perl makes the list available for the subroutine to use however it needs to. Perl automatically stores the parameter list in the special array variable named @_ for the duration of the subroutine |
|
you could write the subroutine &max that uses arguments
|
sub max {
# Compare this to &larger_of_fred_or_barney if ($_[0] > $_[1]) { $_[0]; } else { $_[1]; } } |
|
you have a subroutine
sub max { # Compare this to &larger_of_fred_or_barney if ($_[0] > $_[1]) { $_[0]; } else { $_[1]; } } What will this cause $n = &max(10, 15, 27); # Oops! |
max ignores the extra parameters since it never looks at $_[2].
|
|
you can create private variables called lexical variables at any time with the my operator
|
sub max {
my($m, $n); # new, private variables for this block ($m, $n) = @_; # give names to the parameters if ($m > $n) { $m } else { $n } } |
|
sub max {
my($m, $n); # new, private variables for this block ($m, $n) = @_; # give names to the parameters if ($m > $n) { $m } else { $n } } simplify this |
my($m, $n) = @_; # Name the subroutine parameters
That one statement creates the private variables and sets their values, so the first parameter now has the easier-to-use name $m and the second has $n. |
|
subroutines often have parameter lists of arbitrary length
|
sub max {
if (@_ != 2) { print "WARNING! &max should get exactly two arguments!\n"; } # continue as before... ... } |
|
&max to allow for any number of arguments, so you can call it like this:
|
$maximum = &max(3, 5, 10, 4, 6);
sub max { my($max_so_far) = shift @_; # the first one is the largest yet seen foreach (@_) { # look at the remaining arguments if ($_ > $max_so_far) { # could this one be bigger yet? $max_so_far = $_; } } $max_so_far; } |
|
Empty Parameter Lists
$maximum = &max(@numbers); What will &max do in the case that the @numbers array is empty? |
will return undef
The first line of the subroutine sets $max_so_far by using shift on @_, the (now empty) parameter array. That’s harmless; the array is left empty, and shift returns undef to $max_so_far. Now the foreach loop wants to iterate over @_, but since that’s empty, you execute the loop body zero times. |
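The walk-through above can be tried with a runnable sketch of the same subroutine, called once with several arguments and once with an empty list. An explicit return and lexical variables are added here so that it runs cleanly under strict.

```perl
use strict;
use warnings;

sub max {
    my ($max_so_far) = shift @_;      # undef if the parameter list is empty
    foreach (@_) {                    # zero iterations for an empty list
        if ($_ > $max_so_far) {
            $max_so_far = $_;
        }
    }
    return $max_so_far;
}

my $maximum = max(3, 5, 10, 4, 6);    # 10

my @numbers = ();
my $nothing = max(@numbers);          # undef

print "Maximum: $maximum\n";
print "Empty list gives ", (defined $nothing ? $nothing : 'undef'), "\n";
```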
|
my($num) = @_; # list context, same as ($num) = @_;
my $num = @_; # scalar context, same as $num = @_; |
In the first one, $num gets the first parameter, as a list-context assignment;
in the second, it gets the number of parameters, in a scalar context |
|
foreach (1..10) {
my($square) = $_ * $_; # private variable in this loop print "$_ squared is $square.\n"; } variable $square is ?????????? to the enclosing block |
$square is private to the enclosing block; in this case, that’s the block of the foreach loop.
|
|
my $fred, $barney;
vs my($fred, $barney); |
my $fred, $barney; # WRONG! Fails to declare $barney
my($fred, $barney); # declares both |
|
strict Pragma
|
strict pragma tells Perl’s internal compiler that it should enforce some good programming rules for the rest of this block or source file.
use strict; # Enforce some good programming rules Starting with Perl 5.12, you implicitly use this pragma when you declare a minimum Perl version: use 5.012; # loads strict for you |
|
pragma def
|
is a hint to a compiler, telling it something about the code.
|
|
What if you want to stop your subroutine right away? You can use the _______ operator
|
return operator immediately returns a value from a subroutine:
my $result = &which_element_is("dino", @names);
sub which_element_is {
my($what, @array) = @_;
foreach (0..$#array) { # indices of @array's elements
if ($what eq $array[$_]) {
return $_; # return early once found
}
}
-1; # element not found (return is optional here)
} |
|
Omitting the Ampersand
If the compiler sees the subroutine definition before the invocation, or if Perl can tell from the syntax that it is a subroutine call, the ampersand can be omitted |
my @cards = shuffle(@deck_of_cards); # No & necessary on &shuffle
sub division { $_[0] / $_[1]; # Divide first param by second } |
|
If the subroutine has the same name as a Perl built-in, you must use the ampersand to call your version
|
sub chomp {
print "Munch, munch!\n"; } &chomp; # That ampersand is not optional! |
|
Declaring our variable with state tells Perl to retain the variable’s value between calls to the subroutine and to make the variable private to the subroutine
|
use 5.010;
sub marine { state $n = 0; # private, persistent variable $n $n += 1; print "Hello, sailor number $n!\n"; } |
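To see the difference state makes, here is a sketch that compares it with an ordinary my variable. The subroutine names next_sailor and always_one are hypothetical, chosen only for this comparison.

```perl
use 5.010;
use strict;
use warnings;

# state: initialized once, value kept between calls
sub next_sailor {
    state $n = 0;
    $n += 1;
    return $n;
}

# my: a fresh variable on every call, so the count never advances
sub always_one {
    my $n = 0;
    $n += 1;
    return $n;
}

my @with_state = (next_sailor(), next_sailor(), next_sailor());
my @with_my    = (always_one(),  always_one(),  always_one());

say "state: @with_state";   # state: 1 2 3
say "my:    @with_my";      # my:    1 1 1
```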
|
Since the line-input operator will return undef when you reach end-of-file, this is handy for dropping out of loops:
|
while (defined($line = <STDIN>)) {
print "I saw $line"; } |
|
Another way to read input is with the diamond operator
|
<>
This is useful for making programs that work like standard Unix utilities: while (defined($line = <>)) { chomp($line); print "It was $line that I saw!\n"; } So, if you run this program with the invocation arguments fred, barney, and betty, it will say something like: “It was [a line from file fred] that I saw!”, “It was [another line from file fred] that I saw!”, on and on until it reaches the end of file fred. Then, it will automatically go on to file barney, printing out one line after another, and then on through file betty. |
|
@ARGV array
|
This is a special array that is preset by the Perl
interpreter as the list of the invocation arguments. The diamond operator looks in @ARGV to determine what filenames it should use |
|
you can process three specific files, regardless of what the user chose on the command line:
|
@ARGV = qw# larry moe curly #; # force these three files to be read
while (<>) { chomp; print "It was $_ that I saw in some stooge-like file!\n"; } |
|
print (2+3)*4; # Oops
|
When Perl sees this line of code, it prints 5, just as you asked. Then it takes the return value from print, which is 1, and multiplies that times 4.
|
|
printf operator takes a format string followed by a list of things to print
|
printf "Hello, %s; your password expires in %d days!\n",
$user, $days_to_die;
The format string holds a number of conversions; each conversion begins with a percent sign (%) and ends with a letter. |
|
%g, which automatically chooses floating-point, integer, or even exponential notation, as needed
|
printf "%g %g %g\n", 5/2, 51/17, 51 ** 17; # 2.5 3 1.0683e+29
|
|
%d format means a decimal integer, truncated as needed:
|
printf "in %d days!\n", 17.85; # in 17 days!
|
|
In Perl, you most often use printf for columnar data, since most formats accept a field
width. If the data won’t fit, the field will generally be expanded as needed: |
printf "%6d\n", 42; # output like ````42 (the ` symbol stands for a space)
printf "%2d\n", 2e3 + 1.95; # 2001 |
|
%s conversion means a string
|
printf "%10s\n", "wilma"; # looks like `````wilma
A negative field width is left-justified (in any of these conversions):
printf "%-15s\n", "flintstone"; # looks like flintstone````` |
|
%f conversion (floating-point)
|
printf "%12f\n", 6 * 7 + 2/3; # looks like ```42.666667
printf "%12.3f\n", 6 * 7 + 2/3; # looks like ``````42.667
printf "%12.0f\n", 6 * 7 + 2/3; # looks like ``````````43 |
|
What does this code do?
my @items = qw( wilma dino pebbles );
my $format = "The items are:\n" . ("%10s\n" x @items);
## print "the format is >>$format<<\n"; # for debugging
printf $format, @items; |
The x operator replicates the given string a number of times given by @items (which is being used in a scalar context). In this case, that’s 3, since there are 3 items, so the resulting format string is the same as if you wrote it as "The items are:\n%10s\n%10s\n%10s\n". And the output prints each item on its own line, right-justified in a 10-character column, under a heading line.
|
|
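The conversions on the preceding cards can be checked without printing anything by using sprintf, which returns the string that printf would have printed. This is only a sketch; the variable names and the "Items:" heading are made up for illustration.

```perl
use strict;
use warnings;

# sprintf builds the same string that printf would print.
my $right = sprintf "%6d",   42;            # right-justified in 6 columns
my $left  = sprintf "%-6d|", 42;            # left-justified, then a bar
my $money = sprintf "%.2f",  6 * 7 + 2/3;   # rounded to two decimals

# Build one %10s conversion per item, as on the card above.
my @items  = qw( wilma dino pebbles );
my $report = sprintf "Items:\n" . ("%10s\n" x @items), @items;

print "[$right] [$left] [$money]\n";
print $report;
```

Keeping the formatted text in a variable makes it easy to inspect the padding before sending it anywhere.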
filehandle
|
A filehandle is the name in a Perl program for an I/O connection between your Perl process and the outside world.
|
|
There are six special filehandle names that Perl already uses
|
STDIN
STDOUT STDERR DATA ARGV ARGVOUT |
|
open operator
|
tells Perl to ask the operating system to open a connection between your program and the outside world
|
|
open CONFIG, 'dino';
open CONFIG, '<dino'; |
Opens a filehandle called CONFIG to a file called dino; whatever the file holds will come into our program through the filehandle named CONFIG.
The second line of code does the same thing, but says so explicitly by using the < sign. |
|
open BEDROCK, '>fred';
|
Opens the filehandle BEDROCK for output to the new file fred.
The > is required for output. If the file already exists, it is replaced. |
|
open LOG, '>>logfile';
|
Opens the file for appending: if the file already exists, new data is added to the end; if it doesn't exist, it is created.
|
|
“three-argument” open:
|
Starting with Perl 5.6, you can use the three-argument form:
open CONFIG, '<', 'dino';
open BEDROCK, '>', $file_name;
open LOG, '>>', &logfile_name(); |
|
If you want to write your data to a file with a particular encoding, you do the same thing with one of the write modes:
|
open BEDROCK, '>:encoding(UTF-8)', $file_name;
open LOG, '>>:encoding(UTF-8)', &logfile_name(); |
|
open always tells you if it succeeded or failed by returning true for success or false for failure
|
my $success = open LOG, '>>', 'logfile'; # capture the return value
if ( ! $success ) {
    # The open failed
    ...
} |
|
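The three open modes on these cards can be tried together in a short round trip. This is a sketch under a few assumptions: it uses a scratch file from the core File::Temp module rather than any file named on the cards, and it uses lexical filehandles, which a later card introduces.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; tempfile() returns a handle and a name.
my ($tmp_fh, $path) = tempfile( UNLINK => 1 );
close $tmp_fh;

# '>' creates or truncates the file.
open my $out_fh, '>', $path or die "Cannot write $path: $!";
print {$out_fh} "first entry\n";
close $out_fh or die "Cannot close $path: $!";

# '>>' appends to the existing content.
open my $log_fh, '>>', $path or die "Cannot append to $path: $!";
print {$log_fh} "second entry\n";
close $log_fh or die "Cannot close $path: $!";

# '<' reads the file back.
open my $in_fh, '<', $path or die "Cannot read $path: $!";
chomp( my @entries = <$in_fh> );
close $in_fh;

print "$_\n" for @entries;
```

Each open is checked with or die, so a failure reports the reason held in $! instead of silently continuing.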
Closing a Filehandle
|
When you are finished with a filehandle, you may close it with the close operator like this:
close BEDROCK;
Perl automatically closes a filehandle if you reopen it (that is, if you reuse the filehandle name in a new open) or if you exit the program. |
|
error handling when opening a file
|
if ( ! open LOG, '>>', 'logfile' ) {
die "Cannot create logfile: $!"; } |
|
$!
|
When the system refuses to do something you’ve requested, $! gives you the reason as a human-readable message.
|
|
die with an error message if not enough arguments are passed to the program
|
if (@ARGV < 2) {
    die "Not enough arguments\n";
}
If there aren’t at least two command-line arguments, that program will say so and quit. |
|
warn function
|
Use the warn function to cause a warning that acts like one of Perl's built-in warnings. Unlike die, warn does not quit the program.
|
|
autodie pragma
|
You can use the autodie pragma once in your program and automatically get the die if an open fails:
use autodie;
open LOG, '>>', 'logfile'; |
|
Once a filehandle is open for reading, you can read lines from it just like you can read from standard input with STDIN. So, for example, to read lines from the Unix password file:
|
if ( ! open PASSWD, "/etc/passwd") {
    die "How did you get logged in? ($!)";
}
while (<PASSWD>) {
    chomp;
    ...
} |
|
You can use a filehandle open for writing or appending with print or printf, appearing immediately after the keyword but before the list of arguments
|
print LOG "Captain's log, stardate 3.14159\n"; # output goes to LOG
printf STDERR "%d percent complete.\n", $done/$total * 100; |
|
Changing the Default Output Filehandle
select operator |
By default, if you don’t give a filehandle to print (or to printf, as everything we say here about one applies equally well to the other), the output will go to STDOUT.
That default may be changed with the select operator:
select BEDROCK;
print "I hope Mr. Slate doesn't find out about this.\n";
print "Wilma!\n"; |
|
Setting the special $| variable to 1 will set the currently selected filehandle .....
|
to always flush the buffer after each output operation. So if you wanted to be sure that the logfile gets its entries at once, in case you might be reading the log to monitor progress of your long-running program, you could use something like this:
select LOG;
$| = 1; # don't keep LOG entries sitting in the buffer
select STDOUT;
# ... time passes, babies learn to walk, tectonic plates shift, and then...
print LOG "This gets written to the LOG at once!\n"; |
|
Send errors to my private error log
|
if ( ! open STDERR, ">>/home/barney/.error_log") {
die "Can't open error log for append: $!"; } |
|
say built-in (same as print, but adds a new line to the end)
|
These forms all output the same thing:
use 5.010; print "Hello!\n"; print "Hello!", "\n"; say "Hello!"; |
|
say and interpolated array
my @array = qw( a b c d );
say @array;   # output ???
say "@array"; # output ??? |
use 5.010;
my @array = qw( a b c d );
say @array;   # "abcd\n"
say "@array"; # "a b c d\n" |
|
Just like with print, I can specify a filehandle with say
|
use 5.010;
say BEDROCK "Hello!"; |
|
Filehandles in a Scalar
|
You can store a filehandle in a scalar variable; the naming convention is to use _fh at the end of such variable names:
my $rocks_fh;
open $rocks_fh, '<', 'rocks.txt'
    or die "Could not open rocks.txt: $!"; |
|
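A short sketch that combines the last few cards: a filehandle kept in a scalar, and say writing to it. To keep the example self-contained it opens an in-memory file (a reference to the scalar $buffer, a name chosen here for illustration) instead of a file on disk.

```perl
use strict;
use warnings;
use 5.010;

my $buffer = '';
# Opening a reference to a scalar gives an in-memory filehandle.
open my $buf_fh, '>', \$buffer or die "Cannot open in-memory file: $!";

my @array = qw( a b c d );
say {$buf_fh} @array;      # elements run together
say {$buf_fh} "@array";    # elements separated by spaces
close $buf_fh;

print $buffer;
```

The braces around $buf_fh are optional here, but they make it obvious that the scalar is the filehandle and not the first item to print.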
What Is a Hash?
|
A hash is a data structure, not unlike an array in that it can hold any number of values and retrieve them at will. But instead of indexing the values by number, as you did with arrays, you look up hash values by name. That is, the indices, called keys, aren’t numbers; they are arbitrary, unique strings.
|
|
Hash Element Access
|
To access an element of a hash, you use syntax that looks like this:
$hash{$some_key} |
|
Perl won’t mind if you also have a scalar called $family_name and array elements like $family_name[5].
|
But don't rely on this; sharing one name across variable types confuses human readers. |
|
Given
$family_name{'fred'} = 'flintstone';
$family_name{'barney'} = 'rubble';
what will this print?
$foo = 'bar';
print $family_name{ $foo . 'ney' }; |
print $family_name{ $foo . 'ney' }; # prints 'rubble'
|
|
When you store something into an existing hash element, it overwrites the previous value:
$family_name{'fred'} = 'astaire'; |
# gives new value to existing element
$bedrock = $family_name{'fred'}; # gets 'astaire'; old value is lost |
|
Hash elements spring into existence when you first assign to them:
|
$family_name{'wilma'} = 'flintstone'; # adds a new key (and value)
$family_name{'betty'} .= $family_name{'barney'}; # creates the element if needed |
|
To refer to the entire hash, use the
|
percent sign (%) as a prefix. So, the hash you’ve been
using for the last few pages is actually called %family_name. |
|
unwinding the hash;
|
turning a hash into a list of key-value pairs:
@any_array = %some_hash; |
|
you can copy a hash using the obvious syntax of simply assigning one hash to another:
|
my %new_hash = %old_hash;
|
|
you could make an inverse hash:
|
my %inverse_hash = reverse %any_hash;
This takes %any_hash and unwinds it into a list of key-value pairs, making a list like (key, value, key, value, key, value, …). Then reverse turns that list end-for-end, making a list like (value, key, value, key, value, key, …). Now the keys are where the values used to be, and the values are where the keys used to be.
This works properly only if the values in the original hash were unique; otherwise you’d have duplicate keys in the new hash, and keys are always unique. Here’s the rule that Perl uses: the last one in wins. |
|
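A small sketch of the inversion described above. The hash names and addresses are hypothetical, and the values are deliberately unique so that nothing is lost when keys and values trade places.

```perl
use strict;
use warnings;

# Hypothetical data with unique values, so the inversion is lossless.
my %ip_address = (
    fred   => '192.168.0.10',
    barney => '192.168.0.11',
);

# Unwind to (key, value, ...), reverse to (value, key, ...), rebuild.
my %host_name = reverse %ip_address;

print "$host_name{'192.168.0.11'}\n";   # barney
```

With duplicate values, which original key survives depends on the order in which the hash unwinds, so the result should not be relied on.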
=> and hashes
|
my %last_name = ( # a hash may be a lexical variable
'fred' => 'flintstone', 'dino' => undef, 'barney' => 'rubble', 'betty' => 'rubble', ); |
|
keys function
values function |
The keys function yields a list of all the keys in a hash, while the values function gives the corresponding values. |
|
each Function
|
If you wish to iterate over an entire hash, one of the usual ways is to use the each function, which returns a key-value pair as a two-element list.
In practice, the only way to use each is in a while loop, something like this:
while ( ($key, $value) = each %hash ) {
    print "$key => $value\n";
}
First, each %hash returns a key-value pair from the hash, as a two-element list; let’s say that the key is "c" and the value is 3, so the list is ("c", 3). That list is assigned to the list ($key, $value), so $key becomes "c", and $value becomes 3. |
|
If you need to go through the hash in order, simply sort the keys,
|
foreach $key (sort keys %hash) {
    $value = $hash{$key};
    print "$key => $value\n";
    # Or, we could have avoided the extra $value variable:
    # print "$key => $hash{$key}\n";
} |
|
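The same idea with strict-friendly lexical variables, as a sketch. The %books data is borrowed from a later card, and collecting the lines in @by_name (an illustrative name) makes the ordering easy to inspect.

```perl
use strict;
use warnings;

my %books = ( fred => 3, wilma => 1, barney => 0 );

# Sort the keys first, then look each value up by key.
my @by_name = map { "$_ => $books{$_}" } sort keys %books;

print "$_\n" for @by_name;
```

Plain sort compares keys as strings, so the order here is barney, fred, wilma regardless of how the hash stores them internally.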
$books{'fred'} = 3;
$books{'wilma'} = 1;
$books{"barney"} = 0;
$books{"pebbles"} = undef;
It’s easy to see whether an element of the hash is true or false; do this:
if ($books{$someone}) {
    print "$someone has at least one book checked out.\n";
} |
$books{"barney"} = 0; # no books currently checked out
$books{"pebbles"} = undef; # no books EVER checked out; a new library card |
|
exists Function
|
To see whether a key exists in the hash, regardless of its value, use the exists function:
if (exists $books{"dino"}) { print "Hey, there's a library card for dino!\n"; } |
|
delete Function
|
The delete function removes the given key (and its corresponding value) from the hash:
my $person = "betty"; delete $books{$person}; # Revoke the library card for $person |
|
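The difference between a false value, an undefined value, and a missing key is easy to blur, so here is a sketch that separates the three. The helper card_status and its return strings are invented for this example; the %books data follows the cards above.

```perl
use strict;
use warnings;

my %books = ( fred => 3, barney => 0, pebbles => undef );

# Returns a short description of a hypothetical library card's state.
sub card_status {
    my ($name) = @_;
    return 'no card'           unless exists  $books{$name};
    return 'card, never used'  unless defined $books{$name};
    return 'card, nothing out' unless $books{$name};
    return 'has books out';
}

print "$_: ", card_status($_), "\n" for qw( fred barney pebbles dino );

delete $books{fred};    # revoke the card
print "fred: ", card_status('fred'), "\n";
```

The checks run from weakest to strongest: exists asks only about the key, defined asks about the value, and the plain truth test treats 0 as false.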
%ENV hash
|
Your Perl program runs in a certain environment, and it can inspect that environment through the %ENV hash. For example, you will probably see a PATH key in %ENV:
print "PATH is $ENV{PATH}\n"; |
|
To match a pattern (regular expression) against the contents of $_, simply put the pattern between a pair of forward slashes (/). The simplest sort of pattern is just a sequence of literal characters
|
$_ = "yabba dabba doo";
if (/abba/) { print "It matched!\n"; } |
|
To match any sort of space, you use \p{Space}:
|
if (/\p{Space}/) { # 26 different possible characters
print "The string has some whitespace.\n"; } |
|
If you want to match a digit, you use the Digit property:
|
if (/\p{Digit}/) { # 411 different possible characters
print "The string has a digit.\n"; } |
|
How about matching two hex digits, [0-9A-Fa-f], next to each other:
|
if (/\p{Hex}\p{Hex}/) {
print "The string has a pair of hex digits.\n"; } |
|
You can also match characters that don’t have a particular Unicode property. Instead of a lowercase p, you use an uppercase one to negate the property:
|
if (/\P{Space}/) { # Not space (many many characters!)
print "The string has one or more non-whitespace characters.\n"; } |
|
the dot (.) is a wildcard character
|
it matches any single character except a newline (which is represented by "\n"). So, the pattern /bet.y/ would match betty.
Or it would match betsy, or bet=y, or bet.y, or any other string that has bet, followed by any one character (except a newline), followed by y. It wouldn’t match bety or betsey, though, since those don’t have exactly one character between the t and the y. The dot always matches exactly one character. |
|
If you wanted the dot to match just a period, you can simply backslash it
|
a backslash in front of any metacharacter makes it nonspecial. So, the pattern /3\.14159/ doesn’t have a wildcard character.
|
|
backslash is the second metacharacter. If you mean a real backslash, just use a pair of them—a rule that applies just as well everywhere else in Perl:
|
$_ = 'a real \\ backslash';
if (/\\/) { print "It matched!\n"; } |
|
star (*) means to match the preceding item zero or more times.
|
/fred\t*barney/ matches any number of tab characters between fred and barney.
|
|
The plus (+) is another quantifier; it means to match the preceding item one or more times.
/fred +barney/ matches if fred and barney are ... |
separated by spaces and only spaces
|
|
question mark (?) quantifier which means that the
|
preceding item is optional, preceding item may occur once or not at all
/bamm-?bamm/ matches either spelling: bamm-bamm or bammbamm |
|
/fred+/ matches strings like
|
freddddddddd because the quantifier only applies to the thing right before it, but strings like that don’t show up often in real life.
|
|
pattern /(fred)+/ matches strings like
|
fredfredfred,
You can use parentheses (“( )”) to group parts of a pattern |
|
And what about the pattern /(fred)*/?
|
That matches strings like hello, world
The star means to match zero or more repetitions of fred. When you’re willing to settle for zero, it’s hard to be disappointed! That pattern will match any string, even the empty string. |
|
When you use the parentheses around the dot, you match any non-newline character.
|
$_ = "abba";
if (/(.)\1/) { # matches 'bb'
    print "It matched same character next to itself!\n";
}
The (.)\1 says that you have to match a character right next to itself. At first try, the (.) matches an a, but when it looks at the back reference, which says the next thing it must match is a, that trial fails. Perl then starts over at the next position, where (.) matches a b and \1 matches the b that follows. |
|
$_ = "yabba dabba doo";
if (/y(....) d\1/) { print "What it will match!\n"; } |
It matches: the same four characters (abba) follow both the y and the d.
|
|
back reference in RegEx
|
(.)\1 says that you have to match a character right next to itself
$_ = "yabba dabba doo";
if (/y(.)(.)\2\1/) { # matches 'abba'
    print "What will this match!\n";
}
It matches the 'abba' that follows the y. |
|
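Both back-reference patterns from the cards can be run with the captures returned in list context, which avoids reading $1 and $2 after the fact. This is a sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "yabba dabba doo";

# \1 must repeat whatever the four dots captured.
my ($repeated) = $text =~ /y(....) d\1/;

# \2\1 mirrors the two captured characters, giving a palindrome.
my ($first, $second) = $text =~ /y(.)(.)\2\1/;

print "repeated: $repeated\n";
print "mirror:   $first$second$second$first\n";
```

A match in list context returns the captured substrings, so a failed match would leave these variables undefined rather than holding stale values.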
| in RegEx
/fred|barney|betty/ will match any string |
that mentions fred, or barney, or betty.
|
|
What will this match
/fred( |\t)+barney/, |
matches if fred and barney are separated by spaces, tabs, or a mixture of the two. The plus means to repeat one or more times; each time it repeats, the ( |\t) has the chance to match either a space or a tab.
|
|
What does /fred( +|\t+)barney/ require of the separators?
|
the separators must be all spaces or all tabs.
|
|
/fred (and|or) barney/ matches any string
|
containing either of the two possible strings: fred and barney, or fred or barney.
|
|
character class in RegExp
|
A character class, a list of possible characters inside square brackets ([]), matches any single character from within the class.
The character class [abcwxyz] may match any one of those seven characters. |
|
For convenience, you may specify a range of characters with a
|
hyphen (-), so that class may also be written as [a-cw-z]
[a-zA-Z] to match any one letter out of that set of 52 |
|
caret (^) at the start of the character class
|
negates it.
[^def] will match any single character except one of those three |
|
you can abbreviate the character class for any digit as
|
\d
use 5.010;
$_ = 'The HAL-9000 requires authorization to continue.';
if (/HAL-[\d]+/) {
    say 'The string mentions some model of HAL computer.';
} |
|
Which shortcut matches any whitespace character?
|
\s
|
|
Matches with m//
|
You can put patterns in pairs of forward slashes, like /fred/, but this is actually a shortcut for m//, the pattern match operator. So, you could write that same expression as m(fred), m<fred>, m{fred}, or m[fred] using those paired delimiters, or as m,fred,, m!fred!, m^fred^, or many other ways using nonpaired delimiters.
As with the qw// operator, you may choose any pair of delimiters to quote the contents. |
|
Case-Insensitive Matching with /i
|
To make a case-insensitive pattern match, so that you can match FRED as easily as fred or Fred, use the /i modifier:
print "Would you like to play a game? ";
chomp($_ = <STDIN>);
if (/yes/i) { # case-insensitive match
    print "In that case, I recommend that you go bowling.\n";
} |
|
Matching Any Character with /s
|
By default, the dot (.) doesn’t match newline
If you might have newlines in your strings, and you want the dot to be able to match them, the /s modifier will do the job.
$_ = "I saw Barney\ndown at the bowling alley\nwith Fred\nlast night.\n";
if (/Barney.*Fred/s) {
    print "That string mentions Fred after Barney!\n";
}
Without the /s modifier, that match would fail, since the two names aren’t on the same line. |
|
Adding Whitespace with /x
|
Since the /x allows whitespace inside the pattern, Perl ignores literal space or tab characters within the pattern.
|
|
Remember that Perl considers comments a type of whitespace, so you can put comments into that pattern to tell other people what you are trying to do:
|
/
    -?      # an optional minus sign
    [0-9]+  # one or more digits before the decimal point
    \.?     # an optional decimal point
    [0-9]*  # some optional digits after the decimal point
/x # end of string |
|
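To see the commented pattern at work, here is a sketch that stores it with qr//x and tries it on a few strings. The \A and \z anchors are an addition not present on the card (they come from later cards) so that the whole string has to be a number; the test strings are made up.

```perl
use strict;
use warnings;

# The card's number pattern, anchored here (an added assumption) so the
# whole string must be a number.
my $number = qr/
    \A
    -?        # an optional minus sign
    [0-9]+    # one or more digits before the decimal point
    \.?       # an optional decimal point
    [0-9]*    # some optional digits after the decimal point
    \z
/x;

for my $candidate (qw( 42 -3.14 7. abc 1.2.3 )) {
    my $verdict = $candidate =~ $number ? 'number' : 'not a number';
    print "$candidate: $verdict\n";
}
```

One thing to watch with /x comments: the pattern's own delimiter must not appear inside a comment, because Perl finds the end of the pattern before it looks at the comments.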
Combining Option Modifiers
|
if (m{
    barney # the little guy
    .*     # anything in between
    fred   # the loud guy
}six) { # all three of /s and /i and /x
    print "That string mentions Fred after Barney!\n";
} |
|
/a tells Perl to use
/u tells Perl to use /l tells Perl to respect the |
/a tells Perl to use ASCII
/u tells Perl to use Unicode
/l tells Perl to respect the locale
use 5.014;
/\w+/a # A-Z, a-z, 0-9, _
/\w+/u # any Unicode word character
/\w+/l # The ASCII version, and word chars from the locale,
       # perhaps characters like OE from Latin-9 |
|
/l modifier to force Perl to interpret the regular expression using the locale’s rules:
|
$_ = <STDIN>;
my $OE = chr( 0xBC ); # get exactly what we intend
if (/$OE/li) { # that's better
    print "Found $OE\n";
} |
|
\A anchor matches
|
\A anchor matches at the absolute beginning of a string, meaning that your pattern will not float down the string at all. This pattern looks for http:// or https:// only at the start of the string:
m{\Ahttps?://}i |
|
\z anchor matches
|
If you want to anchor something to the end of the string, you use \z. This pattern matches .png only at the absolute end of the string:
m{\.png\z}i |
|
\Z vs \z
|
\z matches only at the absolute end of the string, while \Z also allows an optional trailing newline after it.
|
|
find strings that have fred at the end of any line instead of just at the end of the entire string
|
In Perl 5, you can do that with the $ anchor and the /m modifier, which turns on multiline matching. This pattern matches when fred is at the end of any line in a multiline string:
/fred$/m |
|
you can use /\bfred\b/ to match
|
the word fred but not frederick or alfred or manfred mann
The \b anchor matches at the start or end of a group of \w characters. |
|
pattern /\bsearch\B/ will match
|
searches, searching, and searched, but not search or researching.
|
|
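The word-boundary cards can be verified with grep over a small word list. This is a sketch; the list simply reuses the words mentioned on the card.

```perl
use strict;
use warnings;

my @words = qw( search searches searching searched researching );

# \b on both sides: only the standalone word.
my @whole  = grep { /\bsearch\b/ } @words;

# \b before, \B after: search starts a word and more word characters follow.
my @prefix = grep { /\bsearch\B/ } @words;

print "whole:  @whole\n";
print "prefix: @prefix\n";
```

researching fails both tests because its search is preceded by a word character, so there is no boundary in front of it.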
=~
|
The Binding Operator
tells Perl to match the pattern on the right against the string on the left, instead of matching against $_.
my $some_other = "I dream of betty rubble.";
if ($some_other =~ /\brub/) {
    print "Aye, there's the rub.\n";
} |
|
The following (bad) example is supposed to print a word matched from $wilma. But if the match fails, it’s using whatever leftover string happens to be found in $1:
my $wilma = '123';
$wilma =~ /([0-9]+)/;    # What?
$wilma =~ /([a-zA-Z]+)/; # What?
print "Wilma's word was $1... or was it?\n"; # What? |
my $wilma = '123';
$wilma =~ /([0-9]+)/;    # Succeeds, $1 is 123
$wilma =~ /([a-zA-Z]+)/; # BAD! Untested match result
print "Wilma's word was $1... or was it?\n"; # Still 123! |
|
Consider a regular expression where we want to make part of it optional, but only capture another part of it. In this example, you want “bronto” to be optional, but to make it optional you have to group that sequence of characters with parentheses. Later in the pattern, you use an alternation to get either “steak” or “burger,” and you want to know which one you found:
|
if (/(bronto)?saurus (steak|burger)/) {
print "Fred wants a $2\n"; } |
|
noncapturing parentheses
|
Perl’s regular expressions have a way to use parentheses to group things but not trigger the capture groups: add ?: after the opening parenthesis.
Use noncapturing parentheses around “bronto,” and the part that you want to remember now shows up in $1:
if (/(?:bronto)?saurus (steak|burger)/) {
    print "Fred wants a $1\n";
} |
|
What is wrong?
use 5.010;
my $names = 'Fred or Barney';
if ( $names =~ m/(\w+) (and|or) (\w+)/ ) { # matches now
    say "I saw $1 and $2";
} |
The value in $2 is from the alternation and the second name is now in $3 (which we don’t output):
I saw Fred and or |
|
Named Captures
|
(?<LABEL>PATTERN) where you replace LABEL with your own names
You label the first capture name1 and the second one name2, and look in $+{name1} and $+{name2} to find their values:
use 5.010;
my $names = 'Fred or Barney';
if ( $names =~ m/(?<name1>\w+) (?:and|or) (?<name2>\w+)/ ) {
    say "I saw $+{name1} and $+{name2}";
} |
|
Perl gives strange names to many of its built-in variables, names that “break the rules.”
|
In this case, the names are punctuation marks: $&, $`,
and $'. |
|
$&
|
The part of the string that actually matched the pattern is automatically stored in $&:
if ("Hello there, neighbor" =~ /\s(\w+),/) {
    print "That actually matched '$&'.\n";
}
That tells you that the part that matched was " there," (with a space, a word, and a comma). |
|
$`
$' |
Whatever came before the matched section is in $`, and whatever was after it is in $'.
if ("Hello there, neighbor" =~ /\s(\w+),/) {
    print "That was ($`)($&)($').\n";
}
The message shows the string as (Hello)( there,)( neighbor). |
|
automatic match variables
|
$&
$` $' |
|
In Perl 5.10 or higher, instead of $&, $`, and $' (and their side effect of slowing down all matching), you can add the /p modifier and use
|
${^PREMATCH}, ${^MATCH}, or ${^POSTMATCH}.
use 5.010;
if ("Hello there, neighbor" =~ /\s(\w+),/p) {
    print "That actually matched '${^MATCH}'.\n";
}
That tells you that the part that matched was " there," (with a space, a word, and a comma).
if ("Hello there, neighbor" =~ /\s(\w+),/p) {
    print "That was (${^PREMATCH})(${^MATCH})(${^POSTMATCH}).\n";
}
The message shows the string as (Hello)( there,)( neighbor). |
|
quantifier in a pattern (definition)
|
A quantifier in a pattern means to repeat the preceding item a certain number of times. In addition to the *, +, and ? quantifiers, Perl offers a repetition count in curly braces ({}).
So the pattern /a{5,15}/ will match from five to fifteen repetitions of the letter a. |
|
/\w{8}/ will match
|
exactly eight word characters (occurring as part of a larger string, perhaps).
|
|
/,{5}chameleon/ matches
|
“comma comma comma comma comma chameleon”.
|
|
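A sketch of the curly-brace quantifier with explicit counts. The \A and \z anchors are added here (they are not part of the card's patterns) so that the count applies to the whole string; without them, a longer run would still contain a shorter matching run.

```perl
use strict;
use warnings;

# Runs of 4, 5, 15, and 16 copies of the letter a.
my @runs = ( 'a' x 4, 'a' x 5, 'a' x 15, 'a' x 16 );

# Anchored, so only runs of 5 to 15 letters pass.
my @ok = grep { /\Aa{5,15}\z/ } @runs;
print length($_), " a's match\n" for @ok;

# Exactly eight word characters, anchored at both ends.
my $eight = 'password'  =~ /\A\w{8}\z/ ? 'yes' : 'no';
my $nine  = 'passwords' =~ /\A\w{8}\z/ ? 'yes' : 'no';
print "8 chars: $eight, 9 chars: $nine\n";
```

Unanchored, /\w{8}/ would also match the nine-character string, since any eight consecutive word characters inside it are enough.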
At the top of the precedence chart are the
|
parentheses (“( )”), used for grouping and capturing. Anything in parentheses will “stick together” more tightly than anything else.
|
|
second level is the quantifiers.
|
These are the repeat operators—star (*), plus (+), and question mark (?)—as well as the quantifiers made with curly braces, like {5,15}, {3,}, and {5}. These always stick to the item they’re following.
|
|
The third level of the precedence chart holds anchors and sequence.
|
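The anchors are \A, \Z, \z, ^, $, \b, and \B. Sequence means putting one item after another; it acts as an operator even though it doesn't use a metacharacter.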
|
|
next-to-lowest level of precedence is the
|
vertical bar (|) of alternation. Since this is near the bottom of the chart, it effectively cuts the pattern into pieces. It’s near the bottom because you want the letters in the words in /fred|barney/ to stick together more tightly than the alternation. If alternation were higher priority than sequence, that pattern would mean to match fre, followed by a choice of d or b, followed by arney. So, alternation sits below sequence, and the letters within the names stick together.
|
|
At the lowest level, there are the so-called atoms
|
Atoms make up the most basic pieces of the pattern. These are the individual characters, character classes, and back references.
|
|
/\A(fred|barney)\z/, which will match
|
if the whole line has nothing but fred or nothing but barney
|
|
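The precedence rules explain why the parentheses in the pattern above matter. In this sketch (the word list and variable names are made up), the ungrouped version lets each anchor bind to only one alternative, while a noncapturing group applies both anchors to either name.

```perl
use strict;
use warnings;

my @names = qw( fred barney frederick barney_rubble );

# Without grouping, alternation splits the pattern in two:
# "starts with fred" OR "ends with barney".
my @loose = grep { /\Afred|barney\z/ } @names;

# Grouping makes both anchors apply to either alternative.
my @exact = grep { /\A(?:fred|barney)\z/ } @names;

print "loose: @loose\n";
print "exact: @exact\n";
```

frederick slips through the loose pattern because it starts with fred, which is all the first alternative asks for.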
/\A(\w+)\s+(\w+)\z/ matches
|
lines that have a “word,” some required whitespace, and another “word,” with nothing else before or after. That might be used to match lines like fred flintstone, for example. The parentheses around the words aren’t needed for grouping, so they may be intended to save those substrings into the regular expression captures.
|
|
This program is useful to test out a pattern on some strings and see just what it matches, and where
|
#!/usr/bin/perl
while (<>) { # take one input line at a time
    chomp;
    if (/YOUR_PATTERN_GOES_HERE/) {
        print "Matched: |$`<$&>$'|\n"; # the special match vars
    } else {
        print "No match: |$_|\n";
    }
} |