Below are notes I've taken while reading books on Perl and just by coding. There are bits of code with notes and descriptions on what they do.
Open/Close Commands
open(OUT_FILE,">output.txt");       # Open for output (create)
open(OUT_FILE,">>output.txt");      # Open for output (append, or create)
open(OUT_FILE,"input-command |");   # Input filter/redirect
open(OUT_FILE,"| output-command");  # Output filter/redirect
close(OUT_FILE);
Simple Read Loop
open(MY_FILE,"some.dat") or die ("Can't open file.");
while( <MY_FILE> ) {
    print if 1 .. 5;    #print the first 5 lines of the file
}
close(MY_FILE);
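The same read loop can also be written with a lexical filehandle and the three-argument form of open. This is only a sketch: the temporary file and the names $tmp_name, $in and @first_five are made up so the example can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the example is self-contained (hypothetical data).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 8;
close($tmp_fh) or die "Can't close $tmp_name: $!";

# Three-argument open with a lexical filehandle.
open(my $in, '<', $tmp_name) or die "Can't open $tmp_name: $!";
my @first_five;
while (my $line = <$in>) {
    last if $. > 5;            # $. is the line number of the last line read
    push @first_five, $line;
}
close($in);

print @first_five;             # the first 5 lines of the file
```

Keeping the mode ('<') separate from the file name means a name that happens to start with > or | is not mistaken for a mode.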
Printing to a file
open(OUT_FILE,">output.txt");   # Open for output
print OUT_FILE "Yo!\n";
close(OUT_FILE);
Specifying File to open for input on the command line
open(MY_FILE,$ARGV[0]) or die ("Can't open $ARGV[0]: $!\n");
Reading filenames into an array.
@logfilenames = </var/log/*>;
String Concatenation
$string1 = "Hi " . "There!";
Testing if a string contains a sub string
if ($a_string =~ m/Elephant/)  { print "found\n"; }
if ($a_string =~ m/Elephant/i) { print "found\n"; }   # case insensitive search
Search, and Replace on a String
$a_string =~ s/Peking/Beijing/g;
Change String to all upper, or lower case
$string =~ tr/a-z/A-Z/;   # convert to upper case
$string =~ tr/A-Z/a-z/;   # convert to lower case
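As an alternative to tr, Perl's built-in uc, lc and ucfirst functions return a converted copy and leave the original string alone. A small sketch with made-up variable names:

```perl
use strict;
use warnings;

my $string = "Hello World";

my $upper  = uc($string);           # "HELLO WORLD"
my $lower  = lc($string);           # "hello world"
my $capped = ucfirst(lc("pERL"));   # "Perl"

print "$upper\n$lower\n$capped\n";
print "$string\n";                  # the original is unchanged: Hello World
```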
Finding First occurrence of sub string in string
$tmp_ptr = index($current_line,"something");
Grabbing a sub-string from a string
$a_bit_of_string = substr($some_string,$start_pos,$length);
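index and substr are often used together: index finds where something starts (or returns -1 if it is not there) and substr pulls out the text from that position. A short sketch, using a made-up sample line:

```perl
use strict;
use warnings;

my $line  = "user=alice;role=admin";   # hypothetical sample data
my $label = "role=";

my $start = index($line, $label);      # position of "role=", or -1 if absent
my $role  = "";
if ($start != -1) {
    # Skip past the label and take everything to the end of the string.
    $role = substr($line, $start + length($label));
}

my $missing = index($line, "group=");  # -1: not found

print "start=$start role=$role missing=$missing\n";
# prints: start=11 role=admin missing=-1
```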
Splitting Strings by a character
$a_string= "name=john";
($tmp1, $tmp2) = split(/=/,$a_string);   #$tmp1 now has "name" in it, $tmp2 has "john"
Assignments and Operations
$a = 3 - 4;     # Subtracts 4 from 3. Stores in $a
$a = 3 + 4;     # Adds 3 and 4. Stores in $a
$a = 3 * 4;     # Multiply 3 and 4
$a = 3 / 4;     # Divide 3 by 4
$a = 3 ** 4;    # three to the power of four
$a = 3 % 4;     # Remainder of 3 divided by 4
++$x;           # Increment $x. Then return its value
$x++;           # Return $x. Then increment its value
--$x;           # Decrement $x. Then return its value
$x--;           # Return $x. Then decrement its value
$x = $y . $z;   # Concatenate $y and $z
$x = $y x $z;   # $y repeated $z times
Assigning values
$x = $y;        # Assign $y to $x
$x += $y;       # Add $y to $x
$x -= $y;       # Subtract $y from $x
$x .= $y;       # Append $y onto $x
When a value is assigned with $x = $y, Perl copies the value of $y into $x. Changing $y afterwards does not change $x.
$a = 'rock';
$b = 'paper';
print $a . $b;          #prints: rockpaper
print $a. '123' .$b;    #prints: rock123paper
print "$a 123 $b";      #prints: rock 123 paper
Global Scalar Variables
The $_ variable
Lots of Perl functions and operators read or modify the contents of $_ if you do not explicitly specify a scalar variable on which they are to operate. These functions and operators work with the $_ variable by default:
* The pattern-matching operator
* The substitution operator
* The translation operator
* The <> operator, if it appears in a while or for conditional expression
* The chop function
* The print function
* The study function
print ("found") if ($_ =~ /xyz/);
print ("found") if (/xyz/);      #You can leave off =~ if using $_ to match
s/abc/xyz/;                      #Substitution operator uses the $_ variable if you do not specify a variable using =~
$substitcount = s/abc/xyz/g;     #Substituting inside $_, returns the number of substitutions performed
tr/a-z/A-Z/;                     #Translates all lowercase letters in the value stored in $_ to their uppercase
$transcount = tr/z/z/;           #Counts the number of z's in $_ and stores that count in the scalar $transcount
while (<>) {
    #Resulting input line is assigned to the scalar variable $_
}
while (<>) {
    chop;     #Uses $_ and removes its last character (the newline). chomp is safer: it only removes a trailing newline
    print;    #Prints whats in $_
}
print;        #Just prints whats in $_ by default
The $0 variable
The $0 variable contains the name of the Perl script you are running.
print ("Name of current script printing this is: $0\n");
The $< and $> variables
The $< variable contains the real user ID and $> contains the effective user ID of the process running the program. Each one holds a single numeric user ID (unlike the group variables $( and $), which can hold a list).
print ("UserID running this script is: $<\n");
The $( and $) variables
The $( variable contains the real group ID and $) contains the effective group ID of the process running the program. On systems where a user can be in more than one group at once, $( and $) contain a list of group IDs separated by spaces. Use the split function to retrieve them.
print ("GroupID(s) running this script are: $(\n");
The $] variable
The $] variable contains the version number of the Perl interpreter that is running, as a decimal number (for example 5.036000 for Perl 5.36.0).
print ("Version of Perl running this script: $]\n");
The $/ variable
The $/ variable contains the current input line separator. The newline character is the default.
$/ = "->";    #Set the input line separator to ->. It will keep reading a line until it hits ->
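To see the input separator at work without touching a real file, the sketch below reads from an in-memory string. The data and the names $data, $fh and @records are made up for illustration; local restores the old value of $/ when the block ends.

```perl
use strict;
use warnings;

my $data = "alpha->beta->gamma";   # hypothetical input

# Opening a reference to a scalar gives a filehandle that reads from the string.
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

my @records;
{
    local $/ = "->";               # changed only inside this block
    while (my $rec = <$fh>) {
        chomp $rec;                # chomp removes a trailing $/, here "->"
        push @records, $rec;
    }
}
close($fh);

print join(",", @records), "\n";   # prints: alpha,beta,gamma
```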
The $\ variable
The $\ variable contains the current output line separator, which is automatically printed after every call to print. It is undefined by default, which means nothing extra is printed.
$\ = "->";
print ("Hello");    #prints: Hello->   (you will be shown -> after every print statement)
The $, variable
The $, variable contains the character or sequence of characters printed between elements when print is called. It is undefined by default, so nothing is printed between elements.
$, = "->";
$x = "foo";
$y = "bar";
print ($x, $y);    #prints: foo->bar
The $" variable
The $" variable contains the array element separator used when an array is interpolated into a double-quoted string. Defaults to a single blank space.
@array = ("x", "y", "z");
print ("@array\n");    # Prints: x y z
$" = ",";
@array = ("x", "y", "z");
print ("@array\n");    # Prints: x,y,z
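Because these separators are global, a change made in one place affects every later print or interpolation. One way to limit that, sketched below with made-up variable names, is to change them with local inside a block, or to avoid them altogether with join.

```perl
use strict;
use warnings;

my @array = ("x", "y", "z");

# join does not depend on any global separator.
my $joined = join(",", @array);    # "x,y,z"

my $inside;
{
    local $" = "-";                # only in effect until the end of this block
    $inside = "@array";            # "x-y-z"
}
my $after = "@array";              # back to the default: "x y z"

print "$joined\n$inside\n$after\n";
```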
The $# variable
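The $# variable holds the number output format. Its default was "%.ng", where n is the number of significant digits a double can hold on your system (typically 15). This variable was deprecated and then removed in Perl 5.10, so the example below only works on older versions; on a current Perl use printf or sprintf with the same format string instead.
$x = 21.9876543219876543219876;
$# = "%.5g";
print "$x\n";    # Prints: 21.988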
The $? variable
The $? variable holds the status returned by the last pipe close, backtick command or system operator. A value of 0 means everything looks like it went ok. To retrieve the actual exit value, use the >> operator to shift the value eight bits to the right: $returncode = $? >> 8;.
$command = `hostname`;
if ($? != 0) {
    die ("\nProgram did not exit correctly. Try a test manually.\n");
}
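The status in $? packs more than the exit value: -1 means the command could not be started, the low 7 bits hold a signal number if the command was killed, and the high byte holds the exit value. A sketch of pulling these apart follows. The child command is just the running perl ($^X) asked to exit with status 3, chosen so the example behaves the same everywhere; $status and $exit_code are made-up names.

```perl
use strict;
use warnings;

# Run a child process that exits with status 3.
# $^X is the path of the perl binary running this script.
system($^X, '-e', 'exit 3');
my $status = $?;

my $exit_code;
if ($status == -1) {
    print "failed to execute: $!\n";
}
elsif ($status & 127) {
    printf "child died with signal %d\n", $status & 127;
}
else {
    $exit_code = $status >> 8;     # the high byte is the exit value
    print "child exited with value $exit_code\n";
}
```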
The $! variable
The $! variable holds the system error message. When a system library function fails, the error code it sets is assigned to this variable.
open (FILE1, "nonexistantfile") or die ("\nProgram died trying to open file with error: $!\n");
The $. variable
The $. variable contains the line number of the last line read from an input file.
open (FILE1, "filename") || die ("Can't open file1\n");
$input = <FILE1>;
print ("line number is $.\n");
close(FILE1);
The $$ variable
The $$ variable contains the process id of the script you're running.
print ("The process id of this script is: $$\n");
The $ARGV variable
When the <> operator reads from a file for the first time, it assigns the name of the file to the $ARGV system variable. The Perl interpreter reads input from each file named on the command line. The code below needs to be executed from the command line like: ./program file1 . The name of the file you list is then read into the $ARGV variable.
while (<>) {
    print ("Filename being read currently is: $ARGV \n");
    exit();
}
The $^T variable
The $^T variable contains the time at which your program began running. This time is in the same format as is returned by the time function: the number of seconds since January 1, 1970.
($Second, $Minute, $Hour, $Day, $Month, $Year, $WeekDay, $DayOfYear, $IsDaylightSavings) = localtime($^T);
$Month += 1;
$Year += 1900;
if ($Month < 10) { $Month = "0" . $Month; }
if ($Hour < 10) { $Hour = "0" . $Hour; }
if ($Minute < 10) { $Minute = "0" . $Minute; }
if ($Second < 10) { $Second = "0" . $Second; }
if ($Day < 10) { $Day = "0" . $Day; }
print "Program started at Time: $Hour:$Minute:$Second Date: $Month-$Day-$Year\n";
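The zero padding above can also be left to strftime from the standard POSIX module, which takes the list returned by localtime directly. A minimal sketch that produces the same layout ($stamp is a made-up name):

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# %H:%M:%S and %m-%d-%Y are already zero padded, and %Y is the full year,
# so no manual adjustment of the month or year is needed.
my $stamp = strftime("Time: %H:%M:%S Date: %m-%d-%Y", localtime($^T));

print "Program started at $stamp\n";
```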
Pattern System Variables
The $1 $2 $3 and so on variables
In a pattern match you can enclose a sub-pattern in parentheses, like /(\W+)/. After there is a pattern match, the system variables $1, $2, and so on get set to the text matched by the subpatterns enclosed in parentheses.
$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) {    #Match one or more digits and an uppercase letter after digits
    print "Found $1 $2\n";        #Prints: Found 123 A
}
The $& variable
In a pattern match you can use $& to retrieve the entire text that was matched.
$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) {    #Match one or more digits and an uppercase letter after the digits
    print "Found $&\n";           #Prints: Found 123A
}
The $` and $' variables
In a pattern match you can use $& to retrieve the entire text that was matched. The rest of the string is stored in two other system variables. The unmatched text preceding the match is stored in the $` variable. The unmatched text following the match is stored in the $' variable.
$test = "123ABC";
if ($test =~ /(\d)([A-Z])/) {    #Match a digit and an uppercase letter
    print "Found $` before $& and $' after it\n";    #Prints: Found 12 before 3A and BC after it
}
The $+ variable
In a pattern match the $+ variable holds the text matched by the last subpattern enclosed in parentheses.
$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) {    #Match one or more digits and an uppercase letter after the digits
    print "Last matched subpattern was $+\n";    #Prints: Last matched subpattern was A
}
Array System Variables
The @_ variable
The @_ variable is defined inside each subroutine. It is a list (array) of all the arguments passed to the subroutine.
yellowsub ("1starg","2ndarg");    # run subroutine with 2 arguments. Prints: Argument1: 1starg Argument2: 2ndarg
sub yellowsub { print "Argument1: $_[0] Argument2: $_[1]\n"; }    # create subroutine
The @ARGV variable
The @ARGV array is set when you run a Perl program from the command line. You can specify the values that are to be passed to the program by including them on the command line.
#execute program from command line: ./test.pl foo bar
print ("@ARGV\n");    #prints: foo bar
The @F variable
If you specify the -n or -p option, you can also supply the -a option. This option tells the Perl interpreter to break each input line into individual words. It will throw away all tabs and spaces. These words are stored in the built-in array variable @F.
The @INC variable
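The @INC array variable contains a list of directories that are searched for files requested by the functions require and use. The directories specified by the -I option are searched first. Then come the Perl library directories (for example /usr/local/lib/perl5; the exact paths depend on how Perl was installed). Older versions of Perl searched the current working directory last, but it was removed from the default @INC in Perl 5.26.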
The %INC variable
The associative array %INC lists the files requested by the require function that have already been found. When the require function finds a file, the hash element $INC{file} is defined. The key of this element is the name of the file as it was requested. The value of this hash element is the location of the actual file.
The %ENV variable
The %ENV hash lists the environment variables defined for the program and their values. The names of the environment variables are the hash keys, and the values of the variables are the hash values.
print $ENV{PATH};    #Prints the PATH of the user executing the script
The %SIG variable
This hash contains one element for each available signal. The signal name serves as the key for the element. For example, the INT (interrupt) signal is represented by the $SIG{"INT"} element. Assigning a subroutine reference to an element installs that subroutine as the handler for the signal.
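A minimal sketch of using this element to install a handler. It assumes a Unix-like system, and to keep the example self-contained the script sends the INT signal to itself with kill instead of waiting for Ctrl-C. The counter $caught is a made-up name.

```perl
use strict;
use warnings;

my $caught = 0;

# Install a handler: this code runs when the process receives SIGINT
# instead of the default action (terminating the program).
$SIG{INT} = sub { $caught++ };

# Send the INT signal to our own process ($$ is our process id).
kill 'INT', $$;

print "INT signals caught: $caught\n";

# Put the default behaviour back.
$SIG{INT} = 'DEFAULT';
```

Setting an element to 'IGNORE' makes the program ignore that signal, and 'DEFAULT' restores the normal behaviour.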
Looping thru an Array, and examining each value
foreach (@some_array){
    print $_;    # the value is in $_ by default
}
Adding a key/value pair to a hash
$hashname{key} = "newvalue";
Keeping a running count of the times a string is equal to a certain value
$different_strings{$a_string}++;
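A running count is usually kept with a plain post-increment on the hash element; a key that has not been seen yet starts from an undefined value, which counts as 0. A short sketch with made-up data:

```perl
use strict;
use warnings;

my @words = qw(apple pear apple fig apple pear);   # hypothetical input
my %count;

# Each time a word is seen, add one to its entry.
$count{$_}++ for @words;

for my $word (sort keys %count) {
    print "$word: $count{$word}\n";
}
# prints:
# apple: 3
# fig: 1
# pear: 2
```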
Looping thru an Associative Array, and examining each value
foreach (values %some_assoc_array){
    print $_;    # the value is in $_ by default
}
Looping thru an Associative Array, and printing each key and value
foreach $key (keys(%hash_1)) {
    print "$key => $hash_1{$key}\n";
}
Print the keys of an Associative Array in sorted order (alphabetical/ascending)
foreach $key (sort (keys(%hash_1))) {
    print "$key => $hash_1{$key}\n";
}
Testing if an Associative Array has a key/value
if (!exists $some_assoc_array{$the_key}) {
    print "key not in array";
}
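There are several ways to ask whether a hash "has" something, and they answer slightly different questions. The sketch below, with a made-up hash %h, shows how exists, defined and a comparison against the empty string differ.

```perl
use strict;
use warnings;

my %h = (empty => "", zero => 0, nothing => undef);   # hypothetical data

# exists: is the key present at all, whatever its value?
my $has_empty   = exists $h{empty}   ? 1 : 0;    # 1
my $has_missing = exists $h{missing} ? 1 : 0;    # 0

# defined: is the key present AND its value not undef?
my $def_nothing = defined $h{nothing} ? 1 : 0;   # 0, although the key exists

# Comparing with "" cannot tell a stored empty string from a missing key
# (and would warn about an uninitialized value for the missing one).
my $empty_is_blank = ($h{empty} eq "") ? 1 : 0;  # 1, yet the key exists

print "$has_empty $has_missing $def_nothing $empty_is_blank\n";   # prints: 1 0 0 1
```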
Testing if an Associative Array has any key/value. Is it empty?
if (%some_assoc_array) {
    print "Hash is not empty. It has data!";
}
Hashes of Arrays
%families = (
    smith => [ "jim", "bob" ],
    jones => [ "roger", "jan", "roy" ],
    kent  => [ "mark", "mag", "bert" ],
);
Add another Array to the hash of arrays
$families{bluejean} = [ "norma jean", "jack", "pam", "port" ];
Append new members to an existing array in the hash
push @{ $families{smith} }, "jen", "jack";
Accessing (changing) the first element of an array in the hash
# Change "jim" to "Jimmy"
$families{smith}[0] = "Jimmy";
You can print all of the families in the arrays by looping through the keys of the hash
for $family ( keys %families ) {
    print "$family => @{ $families{$family} }\n";
}
Sort the arrays in the hash by how many elements they have
for $family ( sort { @{$families{$b}} <=> @{$families{$a}} } keys %families ) {
    print "$family => @{ $families{$family} }\n"
}
Loop thru the hash of arrays. Sort on the second value (name) in the arrays. Print the key and only values 0 and 1.
for $family ( sort { $families{$b}[1] cmp $families{$a}[1] } keys %families ) {
    print "Key => $family\n Value0 => $families{$family}[0]\n Value1 => $families{$family}[1]\n";
}
IF statement
if ( $string eq "Hi There" )
if ( "A" gt "B" )
if ( $num == 10)
if ( $num != 10)
if ( $num > 10)
if ($x > 208 && $x < 275 && $y > 64 && $y < 78 )    # && is AND, || is OR
Subroutine example
# create subroutine
sub write_current_date {
}
# Run subroutine
write_current_date ();
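A slightly fuller sketch of a subroutine that takes arguments and returns a value. The name format_date and its parameters are made up for illustration. Arguments arrive in @_ and are usually copied into named variables on the first line.

```perl
use strict;
use warnings;

# Build a zero-padded date string from a year, month and day.
sub format_date {
    my ($year, $month, $day) = @_;    # copy the arguments out of @_
    return sprintf("%04d-%02d-%02d", $year, $month, $day);
}

my $date = format_date(2024, 3, 9);
print "$date\n";                      # prints: 2024-03-09
```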
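Print a formatted line
#Define the Header Line
format STDOUT_TOP =
Name            Address             Num
--------------- ------------------- -------
.
#A dot on a line by itself marks the end of a format
#Define the detail Line ($name, $address and $num are placeholder variables)
format STDOUT =
@<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<@######
$name,          $address,           $num
.
write;    #Prints the header once, then one detail line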
Network example
use Socket;
# use port 9999 as default
$port = shift || 9999;
# create a socket, make it reusable
socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname('tcp')) or die "socket: $!";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, 1) or die "setsock: $!";
# grab a port on this machine
$paddr = sockaddr_in($port, inet_aton("127.0.0.1"));
# bind to a port, then listen
bind(SERVER, $paddr) or die "bind: $!";
listen(SERVER, SOMAXCONN) or die "listen: $!";
accept(CLIENT, SERVER);
select CLIENT;    #select filehandle
$| = 1;           #make filehandle hot so data shows up unbuffered
print CLIENT "\ntest\n";
# telnet to port 9999 on localhost to see your printed word above.
Checking if system call failed or not.
@args = ("wget","-qN","-T15");
$rc = 0xffff & system @args;
if ($rc != 0) {
    printf "system(%s) returned %#04x: ", "@args", $rc;
    die ("\nProgram did not exit correctly. Try a test manually.\n");
}
I needed a simple way to search a file for a pattern on one line and replace it with two lines of text. I could not find an easy way to do it with an editor like Vi, or with sed or awk, though I'm betting they can. They would do a pattern match in the search part, but the replace part would not take the same kind of escapes, such as a newline (\n). So of course Perl comes to the rescue, and it's as easy as pie. Below is an example of how to do it. The line is executed from a shell. It finds the word "test" and replaces it with the words test1 and test2 on separate lines. It also keeps a backup of the original file (the file named at the end of the command, the word "file" below) with the extension .bak. You can also use a "*" instead of a file name so it will work on all the files in a directory.
perl -i.bak -pe 's/test/test1\ntest2/' file
Perl.com puts it best when describing the different ways to use this feature of Perl:
perl -pe 'some code' < input.txt > output.txt
This takes records from input.txt, carries out some kind of transformation, and writes the transformed record to output.txt. In some cases you don't want to write the changed data to a different file; it's often more convenient if the altered data is written back to the same file.
You can get the appearance of this using the -i option. Actually, Perl renames the input file and reads from this renamed version while writing to a new file with the original name. If -i is given a string argument, then that string is appended to the name of the original version of the file. For example, to change all occurrences of "PHP" to "Perl" in a data file you could write something like this:
perl -i -pe 's/\bPHP\b/Perl/g' file.txt
Perl reads the input file a line at a time, making the substitution, and then writing the results back to a new file that has the same name as the original file -- effectively overwriting it. If you're not so confident of your Perl abilities you might take a backup of the original file, like this:
perl -i.bak -pe 's/\bPHP\b/Perl/g' file.txt
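The same in-place edit can be done from inside a script instead of from the shell. The -i switch corresponds to the special variable $^I, and the <> operator takes its file list from @ARGV. The sketch below is one way to do it, not the only one. It builds its own sample file in a temporary directory so it can be run as is; the names $file, $dir and @lines are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a small sample file to edit (hypothetical data).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";
open(my $out, '>', $file) or die "Can't write $file: $!";
print {$out} "a test line\nno match here\n";
close($out);

{
    # Same effect as: perl -i.bak -pe 's/test/test1\ntest2/' sample.txt
    local ($^I, @ARGV) = ('.bak', $file);
    while (<>) {
        s/test/test1\ntest2/;
        print;                 # goes to the new version of the file, not the screen
    }
}

# Read the result back to show what changed.
open(my $in, '<', $file) or die "Can't read $file: $!";
my @lines = <$in>;
close($in);
print @lines;
```

Using local keeps the change to $^I and @ARGV confined to the block, so the rest of the script reads and prints normally. The untouched original is left next to the edited file as sample.txt.bak.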