                       .d8888b.    .d8888b.   8888888
                      d88P  Y88b  d88P  Y88b    888
                      888    888  888    888    888
                      888         888           888
                      888         888  88888    888
                      888    888  888    888    888
                      Y88b  d88P  Y88b  d88P    888
                       "Y8888P"    "Y8888P88  8888888

                       PCP => Perl CGI Program (ming)

                                Version 1.0

Shishir Gundavaram <shishir@ora.com>
Tom Christiansen <tchrist@perl.com>

Under Construction

This page needs severe updating. I've a list of things yet to do or fix, but more suggestions are always welcome. See also the Idiot's Guide to Solving Perl CGI Problems.

--tchrist



1.0 - Introduction


Q1.2: What does CGI stand for?

Here is an excellent description that my editors Andy Oram and Linda Mui (they're great!) wrote up:

Common          Assures you that CGI can be used by many
		languages and interact with many different
		types of systems.  It doesn't tie you down to
		one way of doing what you want.

Gateway         Suggests that CGI's strength lies not in what
		it does by itself, but in the potential access
		it offers to other systems such as databases
		and graphic generators.

Interface       Simply means that CGI provides a well-defined
		way to call up its features--in other words,
		that you can write programs that use it.

Q1.4: What is Perl and why do so many people use it for CGI?

The answer is located in the first three lines of the Perl manpage:

Perl is an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information.

Most CGI applications involve manipulating data in some fashion and accessing external programs and applications. Perl provides easy to use tools that make these tasks a cinch.



Q1.6: Is there a mailing list or newsgroup for this kind of thing?

There is a very useful newsgroup, comp.infosystems.www.authoring.cgi, that is "monitored" by numerous CGI experts. However, you should not post a question to this group (or any other group, for that matter) until you have read the FAQ.

Various mailing lists for Perl, CGI, and the Web exist. Here are two of the most popular:

cgi-perl-request@webstorm.com [ Hypermail archive ]

This list is for those who are writing, or interested in writing, Perl 5 modules for CGI. It is not intended for any type of CGI support.

libwww-perl-request@ics.uci.edu [ Hypermail Archive ]

libwww-perl is a Perl library that provides a simple programming interface for writing Web clients and servers.

You can access the Perl 4 distribution at:
http://www.ics.uci.edu/pub/websoft/libwww-perl

The Perl 5 libwww modules are located at:
http://www.oslonett.no/home/aas/perl/www

CPAN: Perl modules may also be retrieved from the multiplexed distributed CPAN system. This chooses a "site near you". For example, LWP modules are retrievable as source or just the readme.



2.0 - Modules


Q2.2: How do I figure out how xyz module works?

Most modules have manpages embedded within the module itself. If that's the case, you can use the pod2text or pod2man script to view the manpage:

% pod2text name_of_module.pm
% pod2man name_of_module.pm | nroff -man | more



Q2.4: What CGI modules are available for Perl 5? Which should I use, and why?

CGI.pm
This wonderful module has some of the same functionality as the CGI::* modules. You can use this if you don't want to deal with multiple modules. We will show you an example of how to use CGI.pm to debug your CGI scripts later in this document.

Lincoln Stein, the author of CGI.pm, has also written an excellent book on the Web and CGI, titled How to Set Up and Maintain a World Wide Web Site.

CGI::* Modules
These modules, most of them originally written by Tim Bunce and now maintained by Lincoln Stein, allow you to create and decode forms, debug your CGI programs and maintain state between forms.

CGI Lite
This lightweight module is an alternative to the CGI::* modules. It is something of a glorified version of the old cgi-lib.pl with some added functionality.

All three of these modules can decode multipart form data (i.e., file uploads).



3.0 - CGI and the WWW Server


Q3.2: What are file access permissions? How do I change them?

File permissions grant read, write, and execute access to users based on their user identification (also known as the uid) and their membership in certain groups. You can use the chmod command to change a file's permissions. Here is an example:

% ls -ls form.cgi

  1 -rwx------  1 shishir       974 Oct 31 22:15 form.cgi*

This has a permission of 0700 (octal), which means that no one besides the owner can read from, write to, or execute this file. Let's use the chmod command to change the permissions:

% chmod 755 form.cgi
% ls -ls form.cgi

  1 -rwxr-xr-x  1 shishir       974 Oct 31 22:15 form.cgi*

This changes the permissions so that users in the same group as "shishir", as well as all other users, have permission to read from and execute this file.

See the manpages for the chmod command for a full explanation of the various octal codes.
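
The same change can be made from inside a Perl program with the built-in chmod function. Here is a minimal sketch, assuming a Unix-like system; the helper name permission_string and the scratch file are illustrative only and not part of any module:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return a file's permission bits as an octal string.
sub permission_string {
    my ($file) = @_;
    my $mode = (stat $file)[2];             # full mode word, including file type
    return sprintf "%04o", $mode & 07777;   # keep only the permission bits
}

# Use a scratch file so the example does not touch a real script.
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

chmod 0700, $file or die "chmod failed: $!";
print "before: ", permission_string($file), "\n";

# Note the leading zero: 0755 is octal, while 755 would be read as decimal.
chmod 0755, $file or die "chmod failed: $!";
print "after:  ", permission_string($file), "\n";
```

Forgetting the leading zero is an easy mistake to make here, because the shell's chmod command takes the digits without one.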



Q3.4: Why am I getting the "Server: Error 500" message?

You can get a server error for the following reasons:



4.0 - Specific Programming Questions


Q4.2: The formmail script looks complicated. Why can't I use a mailto: URL so that it just mails me the info the user filled in?

Unfortunately, the mailto: URL scheme is not supported by all browsers. Relying on it in your document is a limiting factor, because people whose browsers do not support it will not be able to send you mail.



Q4.4: What are STDERR and STDIN and STDOUT connected to in a PCP?

In a CGI environment, STDERR points to the server error log file. You can use this to your advantage by printing debug messages and then checking the log file later on.

Both STDIN and STDOUT point to the browser. Strictly speaking, STDIN points to the server, which interprets the client's (or browser's) request and information, and sends that to the script.

In order to catch errors, you can "dupe" STDERR to STDOUT early on in your script (after outputting the valid HTTP headers):

open (STDERR, ">&STDOUT");

This redirects all of the error messages to STDOUT (i.e., the browser).
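
Put together, a debugging script might look like the sketch below. It assumes Perl 5 with the standard IO::Handle module; the message texts are made up for illustration. Both handles are set to flush immediately so that the messages appear in the order they were printed:

```perl
use strict;
use warnings;
use IO::Handle;

STDOUT->autoflush(1);   # do not let buffered output reorder the page

# The headers must go out before anything else reaches the browser.
print "Content-type: text/plain\n\n";

# From here on, whatever is written to STDERR goes where STDOUT goes.
open(STDERR, ">&STDOUT") or die "Cannot dup STDOUT: $!";
STDERR->autoflush(1);

print STDERR "debug: script started\n";
warn "debug: warnings now show up in the page too\n";
print "Normal output\n";
```

This is a convenience for development only. On a live site you would normally leave STDERR pointing at the error log so that visitors do not see internal messages.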



Q4.6: How can I strip all the html tags from a document with a Perl substitute?

Here is a simple regular expression that will strip HTML tags:

$line =~ s/<(([^>]|\n)*)>//g;

Or you can "escape" certain characters in an HTML tag so that it can be displayed:

$line =~ s/<(([^>]|\n)*)>/&lt;$1&gt;/g;

For more information, see Tom's striphtml program, which is also included in his tour of perl5 regexps.
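
As a rough illustration, the two approaches can be wrapped in subroutines. The names strip_tags and escape_html are invented for this sketch. Like any simple pattern, it is fooled by a ">" inside an attribute value and by comments, which is why a fuller program is worth a look for real documents:

```perl
use strict;
use warnings;

# Remove anything that looks like a tag. [^>] also matches newlines,
# so tags that span several lines are handled.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return $text;
}

# Make the markup visible in a browser instead of removing it.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must come first, or the next two get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

my $line = qq{<A HREF="faq.html">Read the <B>FAQ</B></A>};
print strip_tags($line), "\n";
print escape_html($line), "\n";
```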


Q4.8: Can people read my PCP? If they do, is it a security problem that they know how my code works? How can I hide it?

If you configure your server so that it recognizes that all files in a specific directory (e.g., "cgi-bin"), or files with certain extensions (e.g., ".pl", ".tcl", ".sh"), are CGI programs, then the server will execute these programs rather than return their contents. Users then have no way to see the script itself.

On the other hand, if you allow people to look at your script (by placing it, for example, in the document root directory), it is not a security problem in most cases, provided that there are no security holes in the program. If the program does contain security holes and you allow users to view the program, then they can and will exploit them.



Q4.10: Why shouldn't I have people type in passwords or social security numbers or credit card numbers? Isn't that what TYPE="password" is for?

No! The forms interface allows you to have a "password" field, but it should not be used for anything highly confidential. The main reason for this is that all form data (including "password" fields) gets sent from the browser to the Web server as plain text, and not as encrypted data.

If you want to solicit secure information, you need to use a secure server, such as Netscape's Commerce Server.



Q4.12: Why doesn't my system () output come out in the right order?

This has to do with the way the standard output is buffered. In order for the output to display in the correct order, you need to turn buffering off by using the $| variable:

$| = 1;
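
To see the effect, here is a small sketch that prints a line and then runs an external command. The echo command stands in for whatever program you actually call:

```perl
use strict;
use warnings;

$| = 1;   # flush STDOUT after every print

print "Content-type: text/plain\n\n";
print "Output of the command:\n";

# Without $| = 1, the child's output could reach the server before the
# two lines above whenever STDOUT is a pipe or a file rather than a terminal.
system("echo", "hello from the child process") == 0
    or die "system failed: $?";

print "Done.\n";
```

The child process writes straight to the shared output, while Perl holds its own text in a buffer until the buffer fills or the script ends. Flushing after every print removes that delay.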



Q4.14: How can I access my environment variables? Why are they different sometimes?

You can access the environment variables through the %ENV associative array. Here is a simple script that dumps out all of the environment variables (sorted):

#!/usr/local/bin/perl

print "Content-type: text/plain", "\n\n";
	
foreach $key (sort keys %ENV) {
    print $key, " = ", $ENV{$key}, "\n";
}

exit (0);

Unfortunately, not all browsers set the same environment variables. For example, HTTP_REFERER is not set by all browsers.
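
Because a variable may be missing, it is safer to check before using it. The following sketch uses a made-up helper, env_or_default, to supply a fallback value:

```perl
use strict;
use warnings;

# Hypothetical helper: look up an environment variable, falling back to
# a default when the server or browser did not supply it.
sub env_or_default {
    my ($name, $default) = @_;
    my $value = $ENV{$name};
    return $default unless defined $value && length $value;
    return $value;
}

print "Content-type: text/plain\n\n";
print "Referer: ", env_or_default("HTTP_REFERER",    "(not supplied)"), "\n";
print "Browser: ", env_or_default("HTTP_USER_AGENT", "(not supplied)"), "\n";
```

This also keeps "use of uninitialized value" warnings out of the error log when the script runs with warnings enabled.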



Q4.16: How come when I run it from the command line, my PCP works, but not from the browser?

This most likely is due to permission problems. Remember, your server is probably running as "nobody", "www", or a process with very minimal privileges. As a result, it will not be able to execute your script unless it has permissions to do so.
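
One way to investigate is to have a small script report who it is running as and what it may do with a given file. This is only a diagnostic sketch for Unix-like systems; access_report is a hypothetical helper, and the file tested here is simply the script itself:

```perl
use strict;
use warnings;

# Hypothetical helper: describe what the current process may do with a file.
sub access_report {
    my ($file) = @_;
    return $file . ": does not exist" unless -e $file;
    my @can;
    push @can, "read"    if -r $file;
    push @can, "write"   if -w $file;
    push @can, "execute" if -x $file;
    return $file . ": " . (@can ? join(", ", @can) : "no access");
}

# $> is the effective uid; getpwuid may find no name for it.
my $user = getpwuid($>) || "uid " . $>;

print "Content-type: text/plain\n\n";
print "Running as: $user\n";
print access_report($0), "\n";
```

Run it once from the command line and once through the server. If the two reports differ, the difference usually points to the file or directory whose permissions need changing.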



Q4.18: How do I make a form that maintains state, or has several entry points?

You can use the CGI::MiniSvr module to keep state between multiple entry points.

Or you can create a series of dynamic documents that pass a unique session identification, either as a query, extra path name, or as a hidden field, to each other.
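
The second approach might be sketched as follows. Everything specific here is an assumption made for illustration: the field name "session", the format of the identifier, and the next_step.pl program. The identifier is easy to guess, so it is only suitable for telling sessions apart, not for protecting anything:

```perl
use strict;
use warnings;

# Hypothetical helper: build an identifier from the time, the process ID,
# and a random number.
sub new_session_id {
    return sprintf "%08x%04x%04x", time, $$ & 0xffff, int(rand(0x10000));
}

# Reuse the ID passed in the query string, or start a new session.
# Only the format generated above is accepted.
my $query = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : "";
my ($session) = $query =~ /(?:^|&)session=([0-9a-f]{16})(?:&|\z)/;
$session = new_session_id() unless defined $session;

# Hand the ID to the next step as a hidden field.
print "Content-type: text/html\n\n";
print qq{<FORM METHOD="POST" ACTION="/cgi-bin/next_step.pl">\n};
print qq{<INPUT TYPE="hidden" NAME="session" VALUE="$session">\n};
print qq{<INPUT TYPE="submit" VALUE="Continue">\n};
print qq{</FORM>\n};
```

Each program in the series reads the identifier back and uses it as a key for whatever it has stored about the session, such as a file named after the ID.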



Q4.20: How can I call a PCP without using a <FORM> tag?

You can call a CGI program by simply opening the URL to it:

http://some.machine/cgi-bin/your_program.pl

You can also have a link in a document, such as:

<A HREF="http://some.machine/cgi-bin/your_program.pl">
Click here to access my CGI program</A>



Q4.22: What are all the server response codes and what do they mean?

A CGI program can send a specific response code to the server, which in turn passes it on to the browser. For example, if you want a "No Response" (meaning that the browser will not load a new page), then you need to send a response code of 204 (see above).
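
With an ordinary (non-NPH) CGI program, the usual way to do this is to print a "Status:" header, which the server converts into the real HTTP status line. Here is a minimal sketch; status_header is a made-up helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: build the CGI header block for a given status.
sub status_header {
    my ($code, $reason) = @_;
    return "Status: $code $reason\n\n";
}

# 204 tells the browser to stay on the current page; no body follows.
print status_header(204, "No Content");
```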



Q4.24: How can I automatically include a "Last updated: ..." line at the bottom of all my HTML pages? Or can I only do that for SSI pages? How do I get the date of the CGI script?

If you are dynamically creating documents using CGI, then you can insert a timestamp pretty easily. Here is an example (Perl 5 only):

$last_updated = localtime (time);
print "Last updated: $last_updated\n";

or:

require "ctime.pl";

$last_updated = &ctime (time);
print "Last updated: $last_updated\n";

or even:

chop ($last_updated = `/usr/local/bin/date`);
print "Last updated: $last_updated\n";

You can accomplish this with SSI like this:

<!--#echo var="LAST_MODIFIED"-->
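
The examples above print the current time. To show when the CGI script itself was last changed, you can ask for the file's modification time instead. Here is a sketch, with last_modified as a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: format a file's modification time for display.
# Returns undef if the file cannot be examined.
sub last_modified {
    my ($file) = @_;
    my $mtime = (stat $file)[9];   # seconds since the epoch
    return undef unless defined $mtime;
    return scalar localtime $mtime;
}

# $0 holds the name the script was invoked by.
my $stamp = last_modified($0);

print "Content-type: text/html\n\n";
print "Last updated: ", (defined $stamp ? $stamp : "unknown"), "\n";
```

Passing the name of a data file or template instead of $0 reports when that file was last changed.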



5.0 - Security


Q5.2: What particular security concerns should I be aware of?

Never expose any form data to the shell. All of the following are possible security holes:

However, the second construct can be made safer by passing the arguments as a list, rather than a string -- which the shell will mess with:

system ("/usr/ucb/finger", $form_user);
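
The difference is easy to demonstrate. In the sketch below, echo stands in for finger so that it runs anywhere, and the form value is invented. The pattern check is an extra precaution that is assumed here rather than required by the list form:

```perl
use strict;
use warnings;

# Hypothetical form value containing shell metacharacters.
my $form_user = "nobody; echo INJECTED";

# List form: the program is started directly, no shell is involved, and
# the whole value arrives as one literal argument.
my $status = system("echo", "looking up:", $form_user);
die "command failed: $?" unless $status == 0;

# Extra precaution: accept only values that look like a login name.
# A leading "-" is refused so the value cannot be taken for an option.
sub looks_like_username {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z0-9_][A-Za-z0-9_.-]{0,31}\z/ ? 1 : 0;
}

print looks_like_username($form_user) ? "accepted\n" : "rejected\n";
```

Even without a shell, the called program still sees whatever the user typed, so checking the value against what you expect remains worthwhile.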

You should also look at:



Q5.4: How can I call a program with backticks securely? Is it true that:
@ans = `grep '$user_field' some.file`;
is insecure?

Yes! It's very dangerous! Imagine if $user_field contains something like the following, where the quotes close the ones in the command:

'; rm -fr / ; '

A much safer way to accomplish the above is:

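$pid = open (GREP, "-|");                 # fork; the child's STDOUT feeds GREP
die "Cannot fork: $!\n" unless defined $pid;

if ($pid) {                               # parent: read what the child prints
    @ans = <GREP>;
    close (GREP);
} else {                                  # child: run grep without a shell
    exec ("/usr/local/bin/grep", $user_field, "some.file")
        || die "Error exec'ing command: $!\n";
}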
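
Later versions of Perl 5 (5.8 and up) also accept a list form of the piped open, which forks and runs the command in one step without a shell. Here is a sketch, assuming grep can be found on the PATH; grep_lines and the scratch file are illustrative only:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines of $file that match $pattern,
# running grep without a shell.
sub grep_lines {
    my ($pattern, $file) = @_;
    # "-e" marks the next argument as the pattern even if it starts with "-".
    open(my $grep, "-|", "grep", "-e", $pattern, $file)
        or die "Cannot run grep: $!";
    my @lines = <$grep>;
    close $grep;   # false when grep finds nothing; not treated as an error here
    return @lines;
}

# Scratch data standing in for some.file.
my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "alice\nbob\n; rm -fr / ;\n";
close $fh;

print grep_lines("bob", $file);
print grep_lines("; rm -fr / ;", $file);   # just text to search for
```

The hostile-looking pattern is handed to grep as an ordinary argument, so nothing in it is ever interpreted as a command.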

This document, and all its parts, are Copyright (c) 1996, Shishir Gundavaram and Tom Christiansen. All rights reserved. Permission to distribute this collection, in part or full, via electronic means (emailed, posted or archived) or printed copy is granted provided that no charges are involved, reasonable attempt is made to use the most current version, and all credits and copyright notices are retained. Requests for other distribution rights, including incorporation in commercial products, such as books, magazine articles, or CD-ROMs, should be made to either of the authors.

Copyright 1996 Tom Christiansen.
All rights reserved.