Using Perl and Regular Expressions to Process HTML Files - Part 5

  Contributed by : The Administrator
  Posted on : 2008-10-28




In Part 1 we had a quick look at what Perl and regular expressions are, and introduced the idea of using them to process HTML files. In Part 2 we developed a Perl script to process a single HTML file. In Part 3 we looked at one way of processing multiple files. In Part 4 we looked at how to read in all the files in the current directory. In this, the last part, we'll look at how to read in specific files in specific directories.

In Part 4 we wrote a script that enabled us to read in all the files in the current directory. Sometimes, however, you might need to process files that are located in different directories. The script listed below, script4.pl, does this.

script4.pl

1 @allfiles = glob("file1.htm directory1/subdirectory1/*.shtm directory2/*.htm");
2 foreach $file (@allfiles) {            # process each matching file in turn
3     rename $file, "$file.bak";         # keep the original as a backup
4     open (IN, "<$file.bak");           # read from the backup
5     open (OUT, ">$file");              # write to the original filename
6     while ($line = <IN>) {
7         $line =~ s/<h1>/<h1 class="big">/;
8         print OUT $line;
9     }
10     close IN;
11     close OUT;
12 }

The only new line here is line 1, which uses the glob function to build a list of files from three space-separated patterns. The first pattern names file1.htm in the current directory, the second matches every file ending in .shtm in directory1/subdirectory1, and the third matches every file ending in .htm in directory2. The asterisk (*) is a wildcard that matches any sequence of characters in a filename.
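As a rough sketch of how the same idea might look as a standalone script, the version below turns on strict and warnings, uses lexical filehandles, and checks each rename and open. It uses real angle brackets in the readline operator and in the HTML tags. The sub names find_files and add_h1_class are our own, chosen for illustration. The sketch also assumes you want to skip names that do not exist: glob hands back a wildcard-free name such as file1.htm even when there is no such file, so the list is filtered with -f.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Expand whitespace-separated patterns and keep only names that exist as
# plain files (glob returns a wildcard-free name such as file1.htm even
# when no such file exists). find_files is a hypothetical helper name.
sub find_files {
    my ($patterns) = @_;
    return grep { -f $_ } glob($patterns);
}

# Add class="big" to the first bare <h1> tag on each line of one file,
# keeping the original as "<name>.bak". add_h1_class is a hypothetical name.
sub add_h1_class {
    my ($file) = @_;
    rename $file, "$file.bak" or die "Cannot rename $file: $!";
    open my $in,  '<', "$file.bak" or die "Cannot read $file.bak: $!";
    open my $out, '>', $file       or die "Cannot write $file: $!";
    while (my $line = <$in>) {
        $line =~ s/<h1>/<h1 class="big">/;
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $file: $!";
}

# Same three patterns as line 1 of script4.pl.
add_h1_class($_) for find_files(
    "file1.htm directory1/subdirectory1/*.shtm directory2/*.htm");
```

If none of the patterns match an existing file, the script does nothing. Otherwise each processed file ends up with a .bak copy next to it, just as in script4.pl.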

Running the script

C:\>perl script4.pl

About the Author: John Dixon is a web developer working for My Health Questions Matter, a company that helps users of the health service to ask the right questions when discussing their medical condition with health professionals. John is also interested in computer history, and maintains http://www.computernostalgia.net, a site dedicated to the history of the computer. John also provides web development services to large and small clients via his own company John Dixon Technology Limited.

John Dixon - EzineArticles Expert Author
 
 





Specifications
  Submission Date  28-Oct-2008
  Last Update  28-Oct-2008

