Cool Web Solutions with Perl

[This article was originally published in c|net's Builder.com on May 5, 1998. "Perl Primer" is Builder.com's updated article about Perl.]

Perl is an essential and inescapable part of Web development as we know it. Ask any hardcore Web developer what they consider to be their most indispensable tools, and they're likely to mention Perl right away.

Perl (practical extraction and report language) is both a workhorse for back-end data processing and a prototype development tool. Before the Web, Unix administrators and developers embraced Perl as an improvement over traditional scripting languages such as awk and sed. Perl creator Larry Wall never intended for it to take the place of robust compiled languages like C, but developers found that Perl let them accomplish many of the same tasks with much less code.

Perl is now a decade old, but it's hardly old news. The devoted members of the Perl community are constantly finding new ways to modularize Perl, link it to other common languages, and make it work with various operating systems. In fact, efforts are underway to make Perl the top scripting language for XML.

If you're new to Perl--or just want to get up to speed--check out "What is Perl?" and "What's new with Perl?" If you're already familiar with Perl, here's your chance to catch up on four of the language's most interesting and useful new developments:

1. JPL, a useful toolkit that meshes Java and Perl

2. ActiveState's Perl for Win32, a port that lets you write and run Perl scripts on 32-bit Windows operating systems

3. PerlScript, an ActiveX scripting engine that accepts Perl syntax

4. MacPerl, the marriage of Perl and MacOS

Regardless of your platform or development environment, you can use Perl--and its new features--to automate time-consuming tasks and add power to your Web site.

What is Perl?

Perl is an easy-to-learn yet surprisingly versatile and robust programming language. Larry Wall's intention in creating Perl was to "rehumanize" computing technology by giving users and developers an accessible computer language to make software, administration, and processing solutions easy to implement. He hoped that programmers with backgrounds in C, sed, awk, Unix shell scripting, Lisp, or even Microsoft Excel macros could pick up the concepts and semantics of Perl easily. Using their existing knowledge, they could essentially speak Perl with the accent of their native tongue.

Perl's facility for automating system administration tasks and processing interactive Web sites--as well as its powerful regular expressions engine and its strong process, file, and text manipulation functionalities--makes it ideal for:

Taken together, that adds up to a terrific Web development language. Perl is also the most popular language for writing common gateway interface (CGI) scripts, which are used for most back-end Web processes (such as meshing databases with Web pages).
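
As a minimal sketch of the text processing described above, the following script counts requests per page in a handful of log lines. The log format, the sample data, and the count_hits name are assumptions made for illustration; they are not the output of any particular Web server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical log lines: host, quoted request, status code.
my @log = (
    'host1 "GET /index.html" 200',
    'host2 "GET /about.html" 200',
    'host1 "GET /index.html" 304',
    'host3 "POST /cgi-bin/form.pl" 200',
);

# Count requests per path with a regular expression and a hash.
sub count_hits {
    my %hits;
    for my $line (@_) {
        # Capture the path that follows the request method.
        next unless $line =~ /"(?:GET|POST) (\S+)"/;
        $hits{$1}++;
    }
    return %hits;
}

my %hits = count_hits(@log);

# Report the paths, busiest first, then alphabetically.
for my $path (sort { $hits{$b} <=> $hits{$a} || $a cmp $b } keys %hits) {
    printf "%-22s %d\n", $path, $hits{$path};
}
```

The regular expression does the extraction and the hash does the reporting, which is the division of labor that gave Perl its name.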

The fact that Perl is nonproprietary and freely distributed hasn't hurt its popularity among Web developers. Independent programmers have developed hundreds of useful--and free--modules and volumes of documentation. But the open development environment hasn't stopped the appearance of commercial applications of Perl, which can be found in everything from OS-specific Perl modules to Web traffic analysis tools.

Best of all, Perl is remarkably simple to learn. Without having to worry about types, compilers, or application program interfaces (APIs), programmers can pick up as much as they need to accomplish a particular task. Concepts and functions build on themselves, so anyone can learn as much or as little of Perl as is required. If you have any programming experience at all (especially in C), you can become productive in Perl in just a few weeks.

If you have no prior Perl experience and would like to get your feet wet, check out our sample Perl script.

What's new with Perl?

In the last couple of years, Perl's source code has been completely rewritten. Perl 4.x is outdated, with its last patch--Perl 4.036--released in 1992. If you haven't already, upgrade to Perl 5.x, which is modular, object-oriented, and well optimized. (As of early May 1998, the current version is 5.004_04.)

Fortunately, Perl 5.x's interface is quite similar to that of Perl 4.x. Many 4.x scripts will have no problem running under the 5.x interpreter. Because Perl remains easy to learn, you can take your time easing into 5.x's object-oriented nature, gradually taking advantage of such new features as multithreading (handling multiple server requests simultaneously) and compiler support for C and Java.
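
To give a feel for the object-oriented side of Perl 5.x, here is a small sketch of a class. The HitCounter package and its methods are hypothetical names chosen for this example; the mechanism itself (a package, a blessed reference, and methods that receive the object as their first argument) is standard Perl 5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hypothetical class; in Perl 5 a class is simply a package.
package HitCounter;

# Constructor: bless a hash reference into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# Methods receive the object as their first argument.
sub increment {
    my $self = shift;
    return ++$self->{count};
}

sub count {
    my $self = shift;
    return $self->{count};
}

package main;

my $counter = HitCounter->new(start => 10);
$counter->increment for 1 .. 3;
print "Hits so far: ", $counter->count, "\n";
```

Nothing here stops you from writing the same logic as plain subroutines, which is why a Perl 4.x programmer can adopt objects gradually.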

Perl developers--including Larry Wall--are committed to keeping Perl accessible, up-to-date, and versatile. With the creation of JPL, a toolkit for creating Java-Perl applications, programmers can now write Java classes with Perl implementations. (JPL is derived from Java and .pl, a common file extension used for Perl scripts.) With JPL applications, programmers can implement the Java Abstract Window Toolkit (AWT)--the standard Java tool for developing graphical user interfaces (GUIs) and the widgets they contain--within a Perl wrapper. JPL utilizes the Java Native Interface (JNI), a standard interface that allows Java to call up native libraries written in C, C++, assembly, or other languages--such as Perl. Similarly, you'll soon be able to wrap Perl scripts within C and C++ code.

In addition to these efforts, Perl now lets you do a number of other important new things:

One of the best indicators of the success of a programming language is the availability of documentation and the enthusiasm of discussion, and Perl enjoys plenty of both. For more information, check out our list of Perl resources.

Java and JPL

Because Perl and Java are both widely used to enhance Web functionality--and most technically savvy Web developers are familiar with both--it makes sense to mix the two.

If you've coded with both Java and Perl, however, you may have noticed major structural and conceptual differences. Java is a strongly typed language: it requires you to declare the type of each piece of data so that the program can check it and--when necessary--convert one data type to another. Perl, in contrast, is typeless, eliminating the need to declare data types within programs. (This is one of the features that makes Perl easy to use.) In spite of these differences, Perl creator Larry Wall developed a toolkit--called JPL--for creating Java and Perl hybrid applications.
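
The Perl half of that contrast can be sketched in a few lines. The variable names and values below are invented for illustration; the point is only that the same scalar is used as text and as a number without any declaration of its type.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A scalar holds a string or a number; no type declaration is needed.
my $width   = "640";          # arrives as text, say from a form field
my $doubled = $width * 2;     # used as a number
my $label   = $doubled . "px"; # used as a string again

# Arrays and hashes can mix kinds of values freely.
my @row  = ("index.html", 42, 3.5);
my %page = (name => "index.html", hits => 42);

print "$label\n";
print "$page{name} has $page{hits} hits\n";
print "Row has ", scalar(@row), " fields\n";
```

In Java, each of these values would need a declared type and an explicit conversion between String and int.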

Combining the robustness of Java and the efficiency of Perl--the best of both programming worlds--the JPL toolkit lets programmers:

Perhaps the best thing about JPL is that you no longer have to choose between Java and Perl. You can use Java for its robustness and effective front-end design, and then turn to Perl for its efficient data manipulation and ease of implementation.

Java and JPL: Initial setup

The JPL toolkit is helpful only if you're familiar with the data types and structures in both Perl and Java. You also need to understand the mechanisms by which classes are defined and objects are instantiated. (At press time, the most comprehensive documentation and code source is O'Reilly's Perl Resource Kit.)

To use JPL, you'll need a fully functional version of the Java Development Kit (version 1.1 or higher) and either Solaris 2.5.1 or Linux 2.0.30. Refer to Brian Jepson's Perl Utilities Guide for specific instructions on how to use the set-up tool or how to copy the necessary files and permissions manually. (You can also download a PDF version of the Perl Utilities Guide.)

Once JPL is set up, you can compile and run the included examples to see how they work. Each JPL compilation produces a number of associated files, each holding a different kind of generated code: files for Java, Perl, and C, plus a Java class and libraries.

Now you're ready to create your own example of a simple JPL file, one that defines a Perl method. Once you understand how JPL works, you can use it to create all sorts of useful utilities and applications, such as an interface to an FTP search server.

Java and JPL: Using AWT via JPL

Once you become familiar with JPL, you'll probably want to experiment with Perl and the Java Abstract Window Toolkit (AWT) in order to create a graphical user interface for a program. You can assign Perl code to a number of aspects within the Java AWT, including components, containers, and layouts; event handling; image processing; peer communication; and data transfer.

The following JPL code--modified from Brian Jepson's Perl Utilities Guide--is a simple example of applying a label component:

/*
 * LabelExample.jpl - An AWT example using JPL code
 */
import java.awt.*;

public class LabelExample extends Frame {

    // Perl method; its name must match the call in main() below.
    perl void labelExample() {{
        use JPL::Class 'java::awt::Label';
        my $setText = getmeth("setText", ['java.lang.String'], []);
        my $add = getmeth("add", ['java.awt.Component'], ['java.awt.Component']);
        my $cnetLabel = java::awt::Label->new();
        $cnetLabel->$setText("cnet");
        $pub->$add($cnetLabel);
        my $new = getmeth("new", ['java.lang.String'], []);
        my $builderLabel = java::awt::Label->$new("BUILDER.COM");
        $pub->$add($builderLabel);
    }}

    public static void main(String[] argv) {
        LabelExample ex = new LabelExample();
        ex.setLayout(new GridLayout(0, 1, 0, 0));
        ex.labelExample();
        ex.pack();
        ex.show();
    }
}

Perl does Windows

There are several Perl ports for 32-bit Windows, but the most widely used and best supported one is Perl for Win32 by ActiveState.

When using Perl on any platform, including Windows, you can divide tasks into two categories: operating-system-independent tasks and operating-system-specific tasks. An example of an OS-independent task is running a complex mathematical or data calculation, which can be performed on any platform. An OS-specific task could be automating a Windows program or changing a Control Panel setting. Perl for Win32 lets you do both. You can even reuse Perl code across platforms for OS-independent tasks.
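
The split between the two kinds of task can be sketched as follows. The average and list_command subroutines are hypothetical, and the choice of a directory-listing command is a deliberately simple stand-in for real OS-specific work; the $^O variable, which holds the name of the operating system Perl was built for, is a standard part of Perl 5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# OS-independent task: the same arithmetic runs on any platform.
sub average {
    my @numbers = @_;
    return 0 unless @numbers;
    my $total = 0;
    $total += $_ for @numbers;
    return $total / @numbers;
}

# OS-specific task: pick a directory-listing command by platform.
# Perl for Win32 reports itself as "MSWin32" in $^O.
sub list_command {
    my $os = shift;
    return $os eq 'MSWin32' ? 'dir' : 'ls';
}

print "Average: ", average(2, 4, 9), "\n";
print "On $^O, list files with: ", list_command($^O), "\n";
```

Keeping the platform test inside one small subroutine is what lets the rest of the script move between Unix and Windows unchanged.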

Perl does Windows: Using Win32 Perl to manipulate the Windows Registry

If your Web site runs on a Windows machine, the Windows Registry--the central database of the Windows 95 and Windows NT operating systems--is useful for remote administration tasks. You can use a Perl script for a Win32 CGI program to store settings (such as the database it uses) in the Registry.

The following script removes components of path-style settings in the Registry. You can use this script to remove one particular kind of path-style setting or several of them, and you can disable remote downloading of ActiveX objects.

# An example that shows how to remove
# a part of a Registry entry.
#
# Gurusamy Sarathy
# gsar@umich.edu
use Win32;
use Win32::Registry;
use strict;
use vars qw($HKEY_LOCAL_MACHINE);

# All three arguments are required.
die <<EOT unless @ARGV == 3;
This tool removes a particular component from any ";"-delimited
string in the Registry. As the boundary case, it will set a key to the
empty string value if the value to be removed matches the single
original component.
Usage: $0 registry_setting keyname string_to_remove
Example:
$0 "Software\\Microsoft\\Windows\\CurrentVersion\\Internet Settings" CodeBaseSearchPath CODEBASE
will disable remote download of ActiveX objects. Do not run this
example if you don't fully understand what that means.
EOT

my $cfg = shift;
my $keyname = shift;
my $remove = shift;
my $regkey;
my %values;

$HKEY_LOCAL_MACHINE->Open($cfg, $regkey)
    or die "failed to open $cfg\n";
$regkey->GetValues(\%values)
    or die "failed to get values from $cfg\n";

if (exists $values{$keyname}) {
    # Each entry is [name, type, value].
    my $curtype = $values{$keyname}[1];
    if ($curtype == REG_SZ) {
        my $cur = $values{$keyname}[2];
        print "$cfg\\$keyname is `$cur'\n";
        my @cur = split /;/, $cur;
        my @new = grep { $_ ne $remove } @cur;
        if (@new < @cur) {
            $cur = join ';', @new;
            # The second argument is reserved and must be 0.
            $regkey->SetValueEx($keyname, 0, REG_SZ, $cur)
                or die "Failed to modify $cfg\\$keyname.\n"
                     . "You probably have insufficient rights to do that.\n";
            print "$cfg\\$keyname has been set to `$cur'\n";
        }
        else {
            warn "$cfg\\$keyname doesn't have `$remove', leaving it as is\n";
        }
    }
    else {
        warn "$cfg\\$keyname is not a string (REG_SZ) type\n";
    }
}
else {
    warn "$cfg\\$keyname doesn't even exist\n";
}
$regkey->Close;
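
The heart of the script above is ordinary string handling, which can be tried on any platform without touching a Registry. The sketch below isolates that step; the remove_component name and the sample value are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove every component equal to $remove from a ";"-delimited string.
# Returns the new string and the number of components removed.
sub remove_component {
    my ($value, $remove) = @_;
    my @cur = split /;/, $value;
    my @new = grep { $_ ne $remove } @cur;
    return (join(';', @new), @cur - @new);
}

# Hypothetical setting value, for illustration only.
my ($new, $removed) = remove_component('CODEBASE;C:\\Cache', 'CODEBASE');
print "New value: $new (removed $removed)\n";
```

Testing the split, grep, and join logic this way before pointing the full script at a live Registry is a cheap safeguard.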

Perl does Windows: Debugging Perl scripts in O'Reilly's WebSite Professional

If you are running WebSite Professional, a popular and robust Windows-based Web server made by O'Reilly Software, you can configure it to run and debug Perl scripts by following these steps:

1. Install the latest version of Perl for Win32 from ActiveState and the ActiveState Perl Debugger on the machine running WebSite Professional.

2. Configure WebSite Professional to run Perl scripts with Perl.exe by opening your WebSite Server Properties and selecting the Mapping property sheet.

3. Create a Standard CGI mapping to the directory that contains your Perl scripts.

4. Create an Association mapping from the extension you use for your Perl scripts (such as .pl) to the Perl.exe file. This is generally located in your Perl executable directory.

5. Compose and run a Perl script.

Here's a simple example of a Perl script for Windows:

#!/bin/perl
print "Content-type: text/html\n\n";
print "CNET's BUILDER.COM\n";

Notice that the first line of the script uses a Unixism--that is, the path /bin/perl probably does not exist on your Windows machine. This is harmless: the server finds Perl.exe through the association mapping you created, and the interpreter reads the first line only for command-line switches.

To run the debugger, simply change the first line of your Perl script to:

#!/bin/perl -d

Then request the script using your Web browser. This technique lets you easily debug your Web-based CGI scripts.

Perl does Windows: Using Perl debuggers on Web server scripts

The common gateway interface (CGI) module, CGI.pm, lets you easily develop Perl scripts for the Web. The module itself has been recently updated and optimized, and new developments in CGI.pm let you run scripts and enter script parameters from the command line. This makes debugging your scripts more efficient, since you can quickly enter and test different parameters.

The ActiveState Perl Debugger--ActiveState's complement to its Perl for Win32 product--also allows you to step through your code as you run it from the command line. If you have a script called perlscript.pl, you can debug it by entering perl -d perlscript.pl at the command line.

If you want to debug a script every time you run it, simply add the -d option, as shown above, to the first line of the script.

Perl does Windows: Running Perl faster

There are various ways to run Perl faster, depending on your server software and operating system. If you use an Apache Web server, the mod_perl module runs Perl scripts 2 to 50 times faster than launching a new Perl process for each request. PerlEx, ActiveState's solution for running Perl programs on Windows NT Web servers, offers the same kind of speedup for Netscape, Microsoft, and O'Reilly Web servers.

Both mod_perl and PerlEx work by reducing the Perl script to a subroutine and then calling that subroutine whenever the script is supposed to execute. This is faster because the server saves time by not having to launch a new process and compile the script each time. Also, data structures created in the script are saved between invocations.

An example of this is reading configuration information that does not change. Code in a BEGIN block that reads data from a configuration file executes once, when the script is first compiled. Subsequent executions of the script will already have the configuration information.
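
The idea can be imitated in a single ordinary Perl process. The sketch below is not how mod_perl or PerlEx is implemented; it only mimics the effect, with a hypothetical handle_request subroutine standing in for the script and a hypothetical load_config subroutine standing in for reading a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $loads = 0;   # how many times the "configuration" was read
my %config;      # survives between calls, as data does in a persistent interpreter

# Stand-in for reading a configuration file (hypothetical values).
sub load_config {
    $loads++;
    return (color => '#ffffff');
}

# The "script", reduced to a subroutine that is called once per request.
sub handle_request {
    %config = load_config() unless %config;   # only the first call pays
    return "<BODY bgcolor=\"$config{color}\">";
}

my @pages = map { handle_request() } 1 .. 3;
print "Served ", scalar(@pages), " requests, loaded config $loads time(s)\n";
```

Three requests are served but the configuration is read only once, which is where the savings come from.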

Using PerlEx, you can dictate that certain code is executed only once by putting it in either a BEGIN or END block. Here is a simple example, courtesy of Dick Hardt of ActiveState:

# Very simple example of preloading a configuration option
#
# We read in the background color from a file and then use
# it in the script without having to read it in on each
# invocation.
#
BEGIN {
    # read in the color we will be using
    if (open CF, "<color.cfg") {
        $color = <CF>;
        chomp $color;    # drop the trailing newline
        close CF;
    } else {
        $color = '#ffffff';
    }
}
print '<HTML><HEAD><TITLE>Color.cfg sample</TITLE></HEAD>';
print "<BODY bgcolor=$color>";

With this method, you can read in custom headers and footers, copyright information, and other data that shows up repeatedly in Web applications.

Embedding PerlScript in HTML

PerlScript is an ActiveX scripting engine that lets you write PerlScript code for any ActiveX host, including servers and browsers. PerlScript joins the ranks of JavaScript and Visual Basic Script as an easy-to-use Web scripting language.

You can use PerlScript to perform typical Web client-scripting tasks such as triggering events when a button is clicked, performing calculations, manipulating text and data, and so on. Because PerlScript depends on ActiveX scripting, client-side PerlScript works only with Microsoft Internet Explorer.

To use PerlScript, you should be familiar with basic HTML and have a working knowledge of Perl. After that, it's a simple exercise to embed PerlScript into an HTML page:

<HTML>
<HEAD>
<TITLE>Web page with embedded PerlScript</TITLE>
<SCRIPT language="PerlScript">
sub ShowExample {
    $window->document->write("This is text from an embedded PerlScript.<br><hr>");
}
</SCRIPT>
</HEAD>
<BODY onLoad="ShowExample()">
</BODY>
</HTML>

MacPerl pointers

One of the best features of the Macintosh operating system is its simplicity and freedom from dealing with cryptic command lines and code. Mac developers, however, often find this freedom limiting. MacPerl, a freely available version of Perl for the Macintosh, is a powerful tool for accomplishing tasks that are very difficult under the Mac Finder's GUI paradigm--including those involving regular desktop applications and Web-related applications.

Some of the things that MacPerl makes much easier include:

Copying a tree of files, including only .html and .gif files

Locating duplicate files, including files with the same contents but different names

Traversing a collection of files, converting all of the GIF files to JPEGs, and updating all <IMG> references appropriately

Invoking a spreadsheet program that generates a chart that is then displayed as an image map on a Web page
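
As one illustration, the duplicate-file task might be sketched like this. The sketch assumes a current Perl with the standard File::Find and Digest::MD5 modules rather than any MacPerl-specific facility, and find_duplicates is a name invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Return groups of files under $dir that have identical contents,
# whatever their names. Each group is a reference to a sorted list.
sub find_duplicates {
    my $dir = shift;
    my %by_digest;
    find(sub {
        return unless -f $_;
        open my $fh, '<', $_ or return;   # skip unreadable files
        binmode $fh;
        my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
        close $fh;
        push @{ $by_digest{$digest} }, $File::Find::name;
    }, $dir);
    return map { [ sort @$_ ] } grep { @$_ > 1 } values %by_digest;
}

# Scan the directory given on the command line, if any.
if (@ARGV) {
    print join("\n  ", "Same contents:", @$_), "\n"
        for find_duplicates($ARGV[0]);
}
```

Comparing digests of the contents rather than file names is what lets the script catch copies that have been renamed.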

In addition, for many Web content developers and graphic designers who work on Macs, MacPerl offers an excellent local CGI-scripting test bed. Even if the Web pages eventually will be served by a Unix machine, testing the scripts on a Macintosh lets you do all the content development on one machine.

MacPerl pointers: Adding custom headers and footers to Web files

If you're a Web content developer and keep all your files locally on your Mac, Perl can make life easier. Instead of manually (and repeatedly) adding a header and footer to each file, you can create custom directives with MacPerl. This header was supplied courtesy of Chris Nandor, author of MacPerl: Power and Ease:

#createdate: Tuesday, May 5, 1998
#title: My Web Page
<P>Here is some text.</P>

The #createdate directive records the date the file was created; you can also add a line in the MacPerl script to obtain the date automatically from the MacOS. The MacPerl script reads in these directives before reading in the rest of the file, and it uses them to set data in the program. For instance:

print <<EOT;
<TITLE>$title</TITLE>
<!-- File created on: $createdate -->
EOT
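
The directive-reading step described above might look something like the following sketch. The read_directives subroutine and the "#name: value" parsing rule are assumptions based on the sample header shown earlier, not Chris Nandor's actual code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a page into its "#name: value" directives and its body.
# Directives are read only from the top of the text.
sub read_directives {
    my @lines = split /\n/, shift;
    my %directive;
    while (@lines && $lines[0] =~ /^#(\w+):\s*(.*)$/) {
        $directive{$1} = $2;
        shift @lines;
    }
    return (\%directive, join("\n", @lines));
}

my $page = "#createdate: Tuesday, May 5, 1998\n"
         . "#title: My Web Page\n"
         . "<P>Here is some text.</P>\n";

my ($d, $body) = read_directives($page);

print <<EOT;
<TITLE>$d->{title}</TITLE>
<!-- File created on: $d->{createdate} -->
$body
EOT
```

Because the directives live in a hash, adding a new one to your pages requires no change to the parsing code.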

The MacPerl script can automatically add the custom header and footer to each file and then invoke a Mac FTP application to transfer all the files to the public Web directory. Here's how:

#!perl -w
use Mac::Apps::Fetch;
$ftp = new Fetch;
$file = 'page.html';
$ftp->host('ftp.domain.com');
$ftp->store("HD:Desktop Folder:$file", "/pub/$file");

This procedure lets you change the custom header and footer whenever you'd like without the drudgery of copying the new set to each file.

Perl resources

General resources
www.perl.com
The central Web site for the Perl community (originally launched by Tom Christiansen).

Comprehensive Perl Archive Network (CPAN)
After www.perl.com, this is the most useful and comprehensive source of Perl information available on the Web. It includes a listing of all the Perl modules. Also, check out CPAN's Perl FAQ.

TPJ (The Perl Journal)
Editor Jon Orwant calls TPJ "the first and only periodical devoted to Perl."

The Perl Institute
A nonprofit organization for "the creators, developers, maintainers, and users of Perl."

O'Reilly & Associates
The technical book publisher is a long-standing Perl supporter and provides a number of resources:
O'Reilly's Perl Resource Center
O'Reilly's Perl Resource Kit information
O'Reilly's about-JPL page

ActiveState Tool Corp.
Formerly known as ActiveWare. A comprehensive source of information on Perl for Win32.

The Perl Clinic
A commercial Perl support service run by the Paul Ingram Group and ActiveState.

The Apache/Perl Integration Project
The information and code source of mod_perl, which lets you write Apache modules entirely in Perl.

"Perl Opens Arms to XML"
Report of a recent summit between Perl programmers and the authors of the Extensible Markup Language (XML).

MacPerl resources

MacPerl Homepage
An important information source for MacPerl. Here you can find a central hub of links to a variety of Perl resources.

The MacPerl Pages
Another important source of information on Macintosh and Perl, including information on MacPerl 5.0. Maintained by Prime Time Freeware.

Newsgroups
comp.lang.perl.misc
General and miscellaneous discussion about Perl.

comp.lang.perl.announce
Moderated newsgroup containing announcements about Perl.

comp.lang.perl.modules
Discussion about Perl modules.

comp.lang.perl.tk
Discussion about using Perl with Tk.

comp.infosystems.www.authoring.cgi
Discussion about general CGI programming, including Perl.

About the author

Mariva H. Aviram is an Internet consultant and writer covering the computer industry. Her last article for BUILDER.COM was "Analyze your Web site's traffic."

The author wishes to thank Ellen Elias, Gurusamy Sarathy, Dick Hardt, Brian Jepson, Rich Morin, Chris Nandor, Matthias Ulrich Neeracher, Jon Orwant, and especially Larry Wall. Thanks also to Steve Linde, BUILDER.COM software engineer; Saul Jimenez, BUILDER.COM software engineer; and Bill Ho, CNET's senior operations manager.