Perl notes
Posted on 11-03-2006 02:13:00 UTC | Updated on 11-03-2006 02:13:00 UTC
Section: /software/perl/

Below are notes I've taken while reading books on Perl and just by coding. There are bits of code with notes and descriptions on what they do.

Perl Notes

Files

Strings

Scalars

System Variables

Arrays

Associative Arrays (Hashes)

If Statement

Formatted Reports

Subroutines

Network

Misc Stuff


Files

Open/Close Commands

open(OUT_FILE,">output.txt");     # Open for output (create)
open(OUT_FILE,">>output.txt"); # Open for output (append, or create)
open(IN_FILE,"input-command |");     # Input filter/redirect (read the output of a command)
open(OUT_FILE,"| output-command");   # Output filter/redirect
close(OUT_FILE);
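
The two-argument open calls above still work, but newer Perl code often uses the three-argument form with a lexical filehandle, which keeps the mode separate from the file name. Here is a minimal sketch of that style. The scratch file comes from File::Temp and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not touch real data.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# Three-argument open: filehandle, mode, file name.
open(my $out, '>', $tmp_name) or die "Can't open $tmp_name for writing: $!";
print $out "first line\n";
close($out);

# Open the same file again, this time for append.
open($out, '>>', $tmp_name) or die "Can't open $tmp_name for appending: $!";
print $out "second line\n";
close($out);

# Read the file back into an array, one line per element.
open(my $in, '<', $tmp_name) or die "Can't open $tmp_name for reading: $!";
my @lines = <$in>;
close($in);

print @lines;
```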

Simple Read Loop


open(MY_FILE,"some.dat") or die ("Can't open file.");
while( <MY_FILE> ) {
print if 1 .. 5; #print the first 5 lines of the file
}
close(MY_FILE);

Printing to a file

open(OUT_FILE,">output.txt");  # Open for output
print OUT_FILE "Yo!\n";
close(OUT_FILE);

Specifying File to open for input on the command line

open(MY_FILE,$ARGV[0]) or die ("Can't open $ARGV[0]: $!\n");

Reading filenames into an array

@logfilenames = </var/log/*>;

Strings

String Concatenation

$string1 = "Hi " . "There!";

Testing if a string contains a sub string

if ($a_string =~ m/Elephant/)

if ($a_string =~ m/Elephant/i) # case insensitive search

Search, and Replace on a String

$a_string =~ s/Peking/Beijing/g;

Change String to all upper, or lower case

  $string =~ tr/a-z/A-Z/;  # convert to upper case
  $string =~ tr/A-Z/a-z/;  # convert to lower case

Finding the first occurrence of a sub string in a string

  $tmp_ptr = index($current_line,"something");

Grabbing a sub-string from a string

    $a_bit_of_string = substr($some_string,$start_pos,$length);

Splitting Strings by a character

$a_string= "name=john";
($tmp1, $tmp2) = split(/=/,$a_string);
#$tmp1 now has "name" in it, $tmp2 has "john"

Scalars

Assignments and Operations


$a = 3 - 4;     # Subtracts 4 from 3. Stores in $a
$a = 3 + 4;	# Adds 3 and 4. Stores in $a
$a = 3 * 4;	# Multiply 3 and 4
$a = 3 / 4;	# Divide 3 by 4
$a = 3 ** 4;	# three to the power of four
$a = 3 % 4;	# Remainder of 3 divided by 4
++$x;		# Increment $x. Then return its value
$x++;		# Return $x. Then increment its value
--$x;		# Decrement $x. Then return its value
$x--;		# Return $x. Then decrement its value
$x = $y . $z;	# Concatenate $y and $z
$x = $y x $z;	# $y repeated $z times

Assigning values

$x = $y;	# Assign $y to $x
$x += $y;	# Add $y to $x
$x -= $y;	# Subtract $y from $x
$x .= $y;	# Append $y onto $x

When a value is assigned with $x = $y, Perl copies the value of $y into $x. Changing $y afterwards will not
change $x.

$a = 'rock';
$b = 'paper';
print $a . $b; #prints: rockpaper
print $a. '123' .$b; #prints: rock123paper
print "$a 123 $b"; #prints: rock 123 paper
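
The copy behaviour described above can be checked with a couple of lines. This is just a small sketch with made-up values:

```perl
use strict;
use warnings;

my $y = 'rock';
my $x = $y;       # $x gets its own copy of the value in $y
$y = 'paper';     # changing $y afterwards...

print "x=$x y=$y\n";   # ...leaves $x alone. Prints: x=rock y=paper
```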

System Variables

Global Scalar Variables

The $_ variable

Lots of Perl functions and operators will modify the contents of $_ if you
do not explicitly specify a scalar variable on which they are to operate.

These functions and operators work with the $_ variable by default:

    * The pattern-matching operator
    * The substitution operator
    * The translation operator
    * The <> operator, if it appears in a while or for conditional expression
    * The chop function
    * The print function
    * The study function 

print ("found") if ($_ =~ /xyz/);
print ("found") if (/xyz/); #You can leave off =~ if using $_ to match

s/abc/xyz/; #Substitution operator uses the $_ variable if you do not specify a variable using =~
$substitcount = s/abc/xyz/g; #Substituting inside $_, returns the number of substitutions performed

tr/a-z/A-Z/; #Translates all lowercase letters in the value stored in $_ to their uppercase
$transcount = tr/z/z/; #Counts the number of z's in $_ and stores the count in $transcount.
#$_ itself is not changed because each z is replaced with a z.

while (<>) {
 #Resulting input line is assigned to the scalar variable $_
}

while (<>) {
  chop; #Removes the last character of $_ (here the newline). chomp removes it only if it is a newline
  print; #Prints what's in $_
}

print; #Just prints what's in $_ by default


The $0 variable

The $0 variable contains the name of the Perl script you are running.

print ("Name of current script printing this is: $0\n");

The $< and $> variables


The $< variable contains the real user ID and $> contains the effective user ID of the process running the program.
Each one holds a single numeric user ID.

print ("UserID running this script is: $<\n");


The $( and $) variables

The $( variable contains the real group ID and $) contains the effective group ID of the process running the program.
If the system supports membership in more than one group at a time, $( and $) contain a list of group IDs separated
by spaces. The first number is the group ID itself and the rest are the other groups the user is in. Use the split function to retrieve them.

print ("GroupID(s) running this script are: $(\n");

The $] variable

The $] variable contains the version of the Perl interpreter that is running, as a decimal number such as 5.008008.

print ("Info about the Perl installed on this system: $]\n");

The $/ variable


The $/ variable contains the current input record (line) separator. The default is a newline character.

$/ = "->"; #Set the input line separator to ->. It will keep reading a line until it hits ->
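
To see the separator in action, here is a small sketch that reads "->" separated records. It assumes the data sits in a string and reads it through an in-memory filehandle. The sample data and variable names are invented for the example.

```perl
use strict;
use warnings;

my $data = "alpha->beta->gamma";
my @records;

{
    local $/ = "->";    # change the separator only inside this block
    open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
    while (my $record = <$fh>) {
        chomp $record;  # chomp removes whatever $/ is set to, not just newlines
        push @records, $record;
    }
    close($fh);
}

print join("|", @records), "\n";   # Prints: alpha|beta|gamma
```

Using local puts $/ back to a newline when the block ends, so later reads are not affected.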

The $\ variable

The $\ variable contains the current output record (line) separator. This is automatically printed after every call to print.
It is empty by default, which means nothing extra is printed.

$\ = "->";
print ("Current output line separator is: " . $\); #You will be shown -> after every print statement

The $, variable

The $, variable contains the character or sequence of characters that is printed between elements when print is called.
It is empty by default.

$, = "->";
$x = "foo";
$y = "bar";
print ($x, $y); #prints: foo->bar

The $" variable

The $" contains the array element separator. Defaults to a single blank space.

@array = ("x", "y", "z");
print ("@array\n"); # Prints: x y z
$" = ",";
@array = ("x", "y", "z");
print ("@array\n"); # Prints: x,y,z

The $# variable

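The $# variable holds the number output format. Defaults to 20-digit floating point number in compact format.
This variable was deprecated and its special behavior was removed in Perl 5.10, so the example below only works on older versions. Use printf or sprintf instead.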

$x = 21.9876543219876543219876;
$# = "%.5g";
print "$x\n"; # Prints: 21.988

The $? variable

The $? variable holds the status returned by the last pipe close, backtick command or system operator.
A value of 0 means everything looks like it went ok. The actual exit value is stored in the upper eight bits,
so to retrieve it use the >> operator to shift right by eight bits: $returncode = $? >> 8;.

$command = `hostname`;
if ($? != 0) {
  die ("\nProgram did not exit correctly. Try a test manually.\n");
}
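
Here is a short sketch of pulling the exit value out of $? with the shift described above. To keep it self-contained it runs a second copy of Perl as the command ($^X holds the path of the running interpreter). The exit value 3 is just an example.

```perl
use strict;
use warnings;

# Run a child Perl that exits with status 3.
system($^X, '-e', 'exit 3');

my $exit_code = $? >> 8;    # upper eight bits: the exit value of the command
my $signal    = $? & 127;   # lower seven bits: signal number, if a signal killed it

print "exit code: $exit_code, signal: $signal\n";   # Prints: exit code: 3, signal: 0
```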

The $! variable

The $! variable is the system error message variable. When a system library function fails, the error code
it generates gets assigned to this variable. Used in a string, $! gives the matching error message.

open (FILE1, "nonexistantfile") or die ("\nProgram died trying to open file with error: $!\n");

The $. variable

The $. variable contains the line number of the last line read from an input file.

open (FILE1, "filename") || die ("Can't open file1\n");
$input = <FILE1>;
print ("line number is $.\n");
close(FILE1);

The $$ variable

The $$ variable contains the process ID of the script you are running.

print ("The process id of this script is: $$\n");

The $ARGV variable

When the <> operator reads from a file for the first time, it assigns the name of the file to the $ARGV system variable.
The Perl interpreter reads input from each file named on the command line. The code below needs to be executed from the
command line like: ./program file1 . The name of the file you list is then stored in the $ARGV variable.

while (<>) {
  print ("Filename being read currently is: $ARGV \n");
  exit();
}

The $^T variable

The $^T variable contains the time at which your program began running.
This time is in the same format as is returned by the time function:
the number of seconds since January 1, 1970.

($Second, $Minute, $Hour, $Day, $Month, $Year, $WeekDay, $DayOfYear, $IsDaylightSavings) = localtime($^T);
$Month += 1;
$Year += 1900;
if ($Month < 10) { $Month = "0" . $Month; }
if ($Hour < 10) { $Hour = "0" . $Hour; }
if ($Minute < 10) { $Minute = "0" . $Minute; }
if ($Second < 10) { $Second = "0" . $Second; }
if ($Day < 10) { $Day = "0" . $Day; }

print "Program started at Time: $Hour:$Minute:$Second Date: $Month-$Day-$Year\n";
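
The manual zero padding above works fine. As an alternative, here is a sketch that builds the same style of string with strftime from the POSIX module. The format string is my own choice to match the output above.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# %H, %M, %S, %m and %d are already zero padded, and %Y is the four digit year.
my $started = strftime("Time: %H:%M:%S Date: %m-%d-%Y", localtime($^T));

print "Program started at $started\n";
```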

Pattern System Variables

The $1 $2 $3 and so on variables

In a pattern match you can enclose a sub-pattern in parentheses. Like: /(\W+)/.
After there is a pattern match, the system variables $1, $2, and so on get set
to the subpatterns enclosed in parentheses.

$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) { #Match one or more digits and an uppercase letter after digits
  print "Found $1 $2\n"; #Prints: Found 123 A
}

The $& variable

In a pattern match you can use $& to retrieve the entire text that was matched.

$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) { #Match one or more digits and an uppercase letter after the digits
  print "Found $&\n"; #Prints: Found 123A
}

The $` and $' variables

In a pattern match you can use $& to retrieve the entire text that was matched. The rest of the string is
stored in two other system variables. The unmatched text preceding the match is stored in
the $` variable. The unmatched text following the match is stored in the $' variable.

$test = "123ABC";
if ($test =~ /(\d)([A-Z])/) { #Match a digit and an uppercase letter
  print "Found $` before $& and $' after it\n"; #Prints: Found 12 before 3A and BC after it
}

The $+ variable

In a pattern match the $+ variable contains the text matched by the last subpattern enclosed in parentheses.

$test = "123ABC";
if ($test =~ /(\d+)([A-Z])/) { #Match one or more digits and an uppercase letter after the digits
  print "Last matched subpattern was $+\n"; #Prints: Last matched subpattern was A
}

Array System Variables

The @_ variable

The @_ variable is defined inside each subroutine. It is a list (array) of all the arguments passed to the subroutine.

yellowsub ("1starg","2ndarg"); # run subroutine with 2 arguments. Prints: Argument1: 1starg Argument2: 2ndarg
sub yellowsub { print "Argument1: $_[0] Argument2: $_[1]\n"; } # create subroutine

The @ARGV variable

The @ARGV array gets set when running a Perl program from the command line. You can specify the values that are
to be passed to the program by including them on the command line.

#execute program from command line: ./test.pl foo bar
print ("@ARGV\n"); #prints: foo bar

The @F variable

If you specify the -n or -p option, you can also supply the -a option. This option tells the Perl interpreter to break
each input line into individual words. It will throw away all tabs and spaces. These words are stored in the built-in
array variable @F.
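
With -a, each input line is split the same way split(' ') would split it. Here is a sketch of doing that by hand inside a script. The sample lines are made up, and @F is declared as an ordinary array just to mirror the option.

```perl
use strict;
use warnings;

# From the shell the same idea would be: perl -lane 'print $F[1]' somefile
my @lines = ("  foo   bar\tbaz\n", "one two\n");
my @second_words;

for (@lines) {
    my @F = split(' ');         # splits $_ on runs of whitespace, ignoring leading whitespace
    push @second_words, $F[1];  # keep the second word of each line
}

print "@second_words\n";   # Prints: bar two
```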

The @INC variable

The @INC array variable contains a list of directories that are searched for files requested by the function require.
The directories specified by the -I option are searched first. Then the Perl library directories (for example
/usr/local/lib/perl5; the exact paths depend on the installation). Perl versions before 5.26 also search the current working directory last.
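
A quick way to see the search path is to print @INC. The sketch below also adds a directory to the front of the list with use lib. The directory /tmp/my_perl_lib is a made-up example and does not need to exist.

```perl
use strict;
use warnings;

# Hypothetical directory, added to the front of @INC at compile time.
use lib '/tmp/my_perl_lib';

# Print the search path in the order Perl will use it.
print "$_\n" for @INC;

# Count how many times our directory shows up in the list.
my $found = grep { $_ eq '/tmp/my_perl_lib' } @INC;
```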

The %INC variable

The associative array %INC lists the files requested by the require function that it has already found.
When the require function finds a file, the hash element $INC{file} is defined. The key of this element is the name of the file as it was requested.
The value of this hash element is the location of the actual file.
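
Here is a small sketch of looking a module up in %INC after it has been loaded. File::Basename is used only because it ships with Perl.

```perl
use strict;
use warnings;

require File::Basename;    # a core module, loaded here just as an example

# The key is the path-style name of the module, the value is where it was found.
my $location = $INC{'File/Basename.pm'};

print "File::Basename was loaded from: $location\n";
```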

The %ENV variable

The %ENV hash lists the environment variables defined for the program and their values. The environment variables are
the array subscripts, and the values of the variables are the values of the array elements.

print $ENV{PATH}; #Prints the PATH of the user executing the script

The %SIG variable

This hash contains one element for each available signal. The signal name serves as the key for the element.
For example, the INT (interrupt) signal is represented by the $SIG{"INT"} element. Assigning a subroutine reference to an element installs a handler for that signal.
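
Below is a minimal sketch of catching the INT signal through %SIG. It assumes a Unix-like system, and to keep it self-contained the script sends the signal to itself with kill. The $interrupted flag is a name I made up.

```perl
use strict;
use warnings;

my $interrupted = 0;

# Install a handler for the INT signal (what Ctrl-C sends).
$SIG{INT} = sub { $interrupted = 1; };

# Send INT to this process ($$ is our own process ID) to trigger the handler.
kill('INT', $$);
sleep(1) unless $interrupted;   # give the signal a moment if it has not arrived yet

print "Caught INT\n" if $interrupted;

# Put the default behaviour back.
$SIG{INT} = 'DEFAULT';
```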

Arrays

Looping thru an Array, and examining each value

foreach (@some_array){
   print $_; # the value is in $_ by default
}

Associative Arrays (Hashes)

Adding a key/value pair to a hash

$hashname{key} = "newvalue";

Keeping a running count of the times a string is equal to a certain value

$different_strings{$a_string}++;

Looping thru an Associative Array, and examining each value

foreach (values %some_assoc_array){
   print $_; # the value is in $_ by default
} 

Looping thru an Associative Array, and printing each key and value

foreach $key (keys(%hash_1)) {
   print "$key => $hash_1{$key}\n";
}

Print the keys of an Associative Array in sorted order (alphabetical/ascending)

foreach $key (sort (keys(%hash_1))) {
   print "$key => $hash_1{$key}\n";
}

Testing if an Associative Array has a key

if (!exists $some_assoc_array{$the_key}) {
   print "key not in array";
}

Testing if an Associative Array has any key/value. Is it empty?

if (%some_assoc_array) {
print "Hash is not empty. It has data!";
}

Hashes of Arrays

%families = (
    smith => [ "jim", "bob" ],
    jones => [ "roger", "jan", "roy" ],
    kent  => [ "mark", "mag", "bert" ],
);

Add another Array to the hash of arrays

$families{bluejean} = [ "norma jean", "jack", "pam", "port" ];

Append new members to an existing array in the hash


push @{ $families{smith} }, "jen", "jack";

Accessing (changing) the first element of an array in the hash

# Change "jim" to "Jimmy"
$families{smith}[0] = "Jimmy";

You can print all of the families in the arrays by looping through the keys of the hash

for $family ( keys %families ) {
    print "$family =>  @{ $families{$family} }\n";
}

Sort the arrays in the hash by how many elements they have

for $family ( sort { @{$families{$b}} <=> @{$families{$a}} } keys %families ) {
    print "$family => @{ $families{$family} }\n"
}

Loop thru the hash of arrays. Sort on the second value (name) in the arrays. Print the key and only values 0 and 1.

for $family ( sort { $families{$b}[1] cmp $families{$a}[1] } keys %families ) {
  print "Key => $family\n";
  print "Value0 => $families{$family}[0]\n";
  print "Value1 => $families{$family}[1]\n";
}

IF Statement

IF statement

if ( $string eq "Hi There" )
if ( "A" gt "B" )   # gt is the string "greater than" operator
if ( $num == 10)
if ( $num != 10)
if ( $num > 10)
if ($x > 208 && $x < 275 && $y > 64 && $y < 78 )   # && is AND, || is OR

Subroutines

Subroutine example

# create subroutine
sub write_current_date
{
  print scalar(localtime), "\n"; # print the current date and time
}

# Run subroutine
write_current_date ();

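Formatted Reports

Print a formatted report line

#Define the Header Line
format STDOUT_TOP =
Name             Address                  Num
---------------- -------------------- -------
.
#The dot above marks the end of the format. It must be alone on its line.

#Define the detail Line
format STDOUT =
@<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<< @######
$name,           $address,            $num
.

#Print one detail line. The variable names and values here are only examples.
($name, $address, $num) = ("John", "12 Main St", 42);
write;
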

Network

Network example


use Socket;

# use port 9999 as default
$port = shift || 9999;

# create a socket, make it reusable
socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname('tcp')) or die "socket: $!";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, 1) or die "setsock: $!";

# grab a port on this machine
$paddr = sockaddr_in($port, inet_aton("127.0.0.1"));

# bind to a port, then listen
bind(SERVER, $paddr) or die "bind: $!";
listen(SERVER, SOMAXCONN) or die "listen: $!";
accept(CLIENT, SERVER);

select CLIENT; #select filehandle
$| = 1; #make filehandle hot so data shows up unbuffered

print CLIENT "\ntest\n";

# telnet to port 9999 on localhost to see your printed word above.


Misc

Checking if system call failed or not.

@args = ("wget","-qN","-T15");
$rc = 0xffff & system @args;
if ($rc != 0) {
  printf "system(%s) returned %#04x: ", "@args", $rc;
  die ("\nProgram did not exit correctly. Try a test manually.\n");
}


Doing perl search and replace from the commandline
Posted on 10-11-2005 02:27:00 UTC | Updated on 10-11-2005 02:27:00 UTC
Section: /software/perl/

I needed to find a simple way to search for a pattern of text on one line in a file and replace it with two lines of text. I could not find an easy way to do it with editors like Vi, sed or awk, though I'm betting they can. They would do the pattern match in the search part, but the replace part would not take the same type of commands, like a new line (\n). So of course Perl comes to the rescue and it's as easy as pie. Below is the example of how to do it. The line is executed from a shell. It finds the word "test" and replaces it with the words test1 and test2, each on a separate line. It also backs up the original file specified at the end (the word "file" below) with the extension .bak. You can also use a "*" instead of a file name so it will work on all the files in a directory.

perl -i.bak -pe 's/test/test1\ntest2/' file

Really it's put best from Perl.com on different ways to use this feature of Perl:

perl -pe 'some code' < input.txt > output.txt

This takes records from input.txt, carries out some kind of transformation, and writes the transformed record to output.txt. In some cases you don't want to write the changed data to a different file, it's often more convenient if the altered data is written back to the same file.

You can get the appearance of this using the -i option. Actually, Perl renames the input file and reads from this renamed version while writing to a new file with the original name. If -i is given a string argument, then that string is appended to the name of the original version of the file. For example, to change all occurrences of "PHP" to "Perl" in a data file you could write something like this:

perl -i -pe 's/\bPHP\b/Perl/g' file.txt

Perl reads the input file a line at a time, making the substitution, and then writing the results back to a new file that has the same name as the original file -- effectively overwriting it. If you're not so confident of your Perl abilities you might take a backup of the original file, like this:

perl -i.bak -pe 's/\bPHP\b/Perl/g' file.txt
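
The same in-place edit can also be done from inside a script by setting the $^I variable, which is what the -i option sets. Below is a sketch under some assumptions: it builds its own throwaway file in a temporary directory so it can be run safely, and the file name and sample text are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a small test file in a temporary directory (removed when the script exits).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file.txt";
open(my $out, '>', $file) or die "Can't create $file: $!";
print $out "I like PHP\nPHP and more PHP\n";
close($out);

{
    # Same effect as: perl -i.bak -pe 's/\bPHP\b/Perl/g' file.txt
    local ($^I, @ARGV) = ('.bak', $file);
    while (<>) {
        s/\bPHP\b/Perl/g;
        print;    # goes into the rewritten file, not to the terminal
    }
}

# Read the edited file back to check the result.
open(my $in, '<', $file) or die "Can't open $file: $!";
my $new_contents = do { local $/; <$in> };
close($in);

print $new_contents;
```

The original text ends up in file.txt.bak, just like with -i.bak on the command line.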
