Quoted from http://www.linuxforums.org/programming/learn_perl_in_10_easy_lessons__lesson_4.html
Parsing files
There are many ways to parse a text file. In Perl, a file whose data is organized line by line with delimiters is very easy to parse.
Let's study a simple example. We have a set of employees in a file called employees.txt. In this file, each line represents an employee. The fields for each employee are separated by tabs: the first column is the name of the employee, the second column is the department and the third one is the salary. Here is an excerpt of the file:
Mr John Doe R&D 21000
Miss Gloria Dunne HR 23000
Mr Jack Stevens HR 45000
Mrs Julie Fay R&D 30000
Mr Patrick Reed R&D 33000
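Since tabs are hard to tell apart from spaces in a plain listing, it may help to generate the sample file with explicit "\t" separators. The following is only a sketch for recreating the five records shown above; the @records array and the $out filehandle are names chosen for this illustration.

```perl
use strict;
use warnings;

# Sample records from the text: name, department, salary.
my @records = (
    [ 'Mr John Doe',       'R&D', 21000 ],
    [ 'Miss Gloria Dunne', 'HR',  23000 ],
    [ 'Mr Jack Stevens',   'HR',  45000 ],
    [ 'Mrs Julie Fay',     'R&D', 30000 ],
    [ 'Mr Patrick Reed',   'R&D', 33000 ],
);

# Write one line per employee, joining the three fields with a tab.
open(my $out, '>', 'employees.txt') or die "Cannot write employees.txt: $!";
print {$out} join("\t", @$_), "\n" for @records;
close($out) or die "Cannot close employees.txt: $!";
```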
In order to obtain some statistics, the HR department wants to establish a list of all male employees who work in the R&D department and whose salary is more than 25000.
To obtain this list, we design a simple Perl script, which:
1. opens the employees.txt file
2. loops through each line
3. identifies the name, department and salary of the employee
4. skips to the next line if the employee is female (the name does not start with "Mr")
5. skips to the next line if the salary is less than or equal to 25000
6. skips to the next line if the department is not "R&D"
7. prints the name and the salary of the employee on the screen
To do this, we'll introduce two Perl functions:
* "chomp" removes the trailing newline from the end of a line. For instance, chomp $variable removes the newline at the end of $variable, if there is one.
* "split" cuts a string into parts wherever it finds a delimiter. For instance, split /o/, "hello world" returns a list containing "hell", " w" and "rld". In our example we'll split the lines on the tab delimiter, which in Perl is written "\t".
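The two functions can be tried on their own before using them in the script. This is a small sketch; the sample string and the variable names $line, @fields and @parts are made up for the illustration.

```perl
use strict;
use warnings;

# chomp removes one trailing newline, if present, and modifies the variable in place.
my $line = "Mr John Doe\tR&D\t21000\n";
chomp $line;

# split cuts the string at each tab and returns the pieces as a list.
my @fields = split /\t/, $line;
print scalar(@fields), " fields: ", join(" | ", @fields), "\n";
# prints: 3 fields: Mr John Doe | R&D | 21000

# The example from the text: splitting "hello world" on the letter o.
my @parts = split /o/, "hello world";
print join(",", map { "[$_]" } @parts), "\n";
# prints: [hell],[ w],[rld]
```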
Here is the script which establishes the list of male employees from the R&D department with a salary greater than 25000. To make things a bit clearer, comments have been added to the script (comments in Perl start with a # sign):
use strict;
use warnings;

# open the employees file, stopping with an error message if that fails
open(my $employees, '<', 'employees.txt') or die "Cannot open employees.txt: $!";
# for each line
while (my $line = <$employees>) {
    # remove the trailing newline
    chomp $line;
    # split the line on tabs
    # and get the different elements
    my ($name, $department, $salary) = split /\t/, $line;
    # go to the next line unless the name starts with "Mr "
    next unless $name =~ /^Mr /;
    # go to the next line unless the salary is more than 25000
    next unless $salary > 25000;
    # go to the next line unless the department is R&D
    next unless $department eq "R&D";
    # since all employees here are male,
    # remove the title in front of their name
    $name =~ s/^Mr //;
    # print the name and the salary, separated by a tab
    print "$name\t$salary\n";
}
close($employees);
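To check the filtering logic without a file on disk, the same steps can be wrapped in a function that takes the lines as a list. This is only a sketch: select_employees, its threshold parameter and the @sample array are hypothetical names introduced here, not part of the original lesson.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "name<TAB>salary" strings for the male R&D
# employees whose salary is greater than the given threshold.
sub select_employees {
    my ($threshold, @lines) = @_;
    my @selected;
    for my $line (@lines) {
        chomp(my $copy = $line);              # work on a copy, leave the input untouched
        my ($name, $department, $salary) = split /\t/, $copy;
        next unless defined $salary;          # skip lines with fewer than three fields
        next unless $name =~ s/^Mr //;        # keep only "Mr " names and strip the title
        next unless $department eq 'R&D';
        next unless $salary > $threshold;
        push @selected, "$name\t$salary";
    }
    return @selected;
}

# The five sample records from the text, with explicit tabs.
my @sample = (
    "Mr John Doe\tR&D\t21000\n",
    "Miss Gloria Dunne\tHR\t23000\n",
    "Mr Jack Stevens\tHR\t45000\n",
    "Mrs Julie Fay\tR&D\t30000\n",
    "Mr Patrick Reed\tR&D\t33000\n",
);

print "$_\n" for select_employees(25000, @sample);
```

With the sample data, only Patrick Reed (33000) passes all three tests: John Doe earns too little, Jack Stevens works in HR, and the other two names do not start with "Mr ".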