One of the most popular uses for CGI programs is to process information from HTML forms. This chapter gives you an extremely brief overview of HTML and Forms. Next you see how the form information is sent to CGI programs. After being introduced to form processing, a Guest book application is developed.
HTML documents need to have certain tags in order for them to be considered "correct". The <HEAD>..</HEAD> set of tags surround the header information for each document. Inside the header, you can specify a document title with the <TITLE>..</TITLE> tags.
Tip |
HTML tags are case-insensitive. For example, <TITLE> is the same as <title>. However, using all uppercase letters in the HTML tags makes HTML documents easier to understand because you can pick out the tags more readily. |
After the document header, you need to have a set of <BODY>..</BODY> tags. Inside the document's body, you specify text headings by using a set of <H1>..</H1> tags. Changing the number after the H changes the heading level. For example, <H1> is the first level, <H2> is the second level, and so on.
You can use the <P> tag to indicate paragraph endings or use the <BR> tag to indicate a line break. The <B>..</B> and <I>..</I> tags are used to indicate bold and italic text.
The text and tags of the entire HTML document must be surrounded by a set of <HTML>..</HTML> tags. For example:
<HTML>
<HEAD><TITLE>This is the Title</TITLE></HEAD>
<BODY>
<H1>This is a level one header</H1>
<P>This is the first paragraph.
<P>This is the second paragraph and it has <I>italic</I> text.
<H2>This is a level two header</H2>
<P>This is the third paragraph and it has <B>bold</B> text.
</BODY>
</HTML>
Most of the time, you will be inserting or modifying text inside the <BODY>..</BODY> tags.
That's enough about generic HTML. The next section discusses Server-Side Includes. Today, Server-Side Includes are replacing some basic CGI programs, so it is important to know about them.
The inserted information can take the form of a local file or a file referenced by a URL. You can also include information from a limited set of variables - similar to environmental variables. Finally, you can execute programs that can insert text into the document.
Note |
The only real difference between CGI programs and SSI programs is that CGI programs must output an HTTP header as their first line of output. See "HTTP Headers" in Chapter 19, "What Is CGI?," for more information. |
Most web servers need the file extension to be changed from html to shtml in order for the server to know that it needs to look for Server-Side directives. The file extension is dependent on server configuration, but shtml is a common choice.
All SSI directives look like HTML comments within a document. This way, the SSI directives will simply be ignored on web servers that do not support them.
Table 20.1 shows a partial list of SSI directives supported by the WebSite server from O'Reilly. Not all web servers will support all of the directives in the table. You need to check the documentation of your web server to determine what directives it will support.
Note |
Table 20.1 shows complete examples of SSI directives. You need to modify the examples so that they work for your web site. |
Directive | Description |
---|---|
<!--#config timefmt="%c"--> | Changes the format used to display dates. |
<!--#config sizefmt="%d bytes"--> | Changes the format used to display file sizes. You may also be able to specify bytes (to display file sizes with commas) or abbrev (to display the file sizes in kilobytes or megabytes). |
<!--#config errmsg="##ERROR!##"--> | Changes the format used to display error messages caused by wayward SSI directives. Error messages are also sent to the server's error log. |
<!--#echo var=?--> | Displays the value of the variable specified by ?. Several of the possible variables are mentioned in this table. |
<!--#echo var="DOCUMENT_NAME"--> | Displays the full path and filename of the current document. |
<!--#echo var="DOCUMENT_URI"--> | Displays the virtual path and filename of the current document. |
<!--#echo var="LAST_MODIFIED"--> | Displays the last time the file was modified. It will use this format for display: 05/31/96 16:45:40. |
<!--#echo var="DATE_LOCAL"--> | Displays the date and time using the local time zone. |
<!--#echo var="DATE_GMT"--> | Displays the date and time using GMT. |
<!--#exec cgi="/cgi-bin/ssi.exe"--> | Executes a specified CGI program. It must be activated to be used. You can also use a cmd= option to execute shell commands. |
<!--#flastmod virtual="/docs/demo/ssi.txt"--> | Displays the last modification date of the specified file given a virtual path. |
<!--#flastmod file="ssi.txt"--> | Displays the last modification date of the specified file given a relative path. |
<!--#fsize virtual="/docs/demo/ssi.txt"--> | Displays the size of the specified file given a virtual path. |
<!--#fsize file="ssi.txt"--> | Displays the size of the specified file given a relative path. |
<!--#include virtual="/docs/demo/ssi.txt"--> | Displays a file given a virtual path. |
<!--#include file="ssi.txt"--> | Displays a file given a relative path. The relative path can't start with the ../ character sequence or the / character to avoid security risks. |
SSI provides a fairly rich set of features to the programmer. You might use SSI if you had an existing set of documents to which you wanted to add modification dates. You might also have a file you want to include in a number of your pages - perhaps to act as a header or footer. You could just use the SSI include command on each of those pages, instead of copying the document into each page manually. When available, Server-Side Includes provide a good way to make simple pages more interesting.
Before Server-Side Includes were available, a CGI program was needed in order to automatically generate the last modification date text or to add a generic footer to all pages.
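To make the comparison concrete, here is a minimal sketch of the kind of CGI program that was once needed just to show a modification date. The function names lastModified() and lastModifiedPage() are made up for this illustration, and the date format is simply chosen to match the LAST_MODIFIED example in Table 20.1.

```perl
#!/usr/bin/perl -w
use strict;
use POSIX qw(strftime);

# Return the last-modification time of a file as MM/DD/YY HH:MM:SS,
# or nothing (undef in scalar context) if the file cannot be examined.
sub lastModified {
    my ($fileName) = @_;
    my $mtime = (stat($fileName))[9];    # element 9 of stat() is the mtime
    return unless defined $mtime;
    return strftime("%m/%d/%y %H:%M:%S", localtime($mtime));
}

# Build the complete CGI response. Unlike an SSI program, a CGI program
# must send an HTTP header before any other output.
sub lastModifiedPage {
    my ($fileName) = @_;
    my $stamp = lastModified($fileName);
    $stamp = "unknown" unless defined $stamp;
    return "Content-type: text/html\n\n"
         . "<P>Last modified: $stamp\n";
}

# Report on this script itself, just to have something to show.
print lastModifiedPage($0);
```

With Server-Side Includes, the same result needs only a single flastmod directive in the page.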
Your particular web server might have additional directives that you can use. Check the documentation that came with it for more information.
Tip |
If you'd like more information about Server-Side Includes, check out the following web site:
http://www.sigma.net/tdunn/
Tim Dunn has created a nice site that documents some of the more technical aspects of web sites. |
Caution |
I would be remiss if I didn't mention the down side of Server-Side Includes. They are very processor intensive. If you don't have a high-powered computer running your web server and you expect to have a lot of traffic, you might want to limit the number of documents that use Server-Side Includes. |
There are several modifiers or options used with the <FORM> tag. The two most important are METHOD and ACTION:
Most field elements are defined using the <INPUT> tag. Like the <FORM> tag, <INPUT> has several modifiers. The most important are:
Let's look at how to specify a plain text field:
<INPUT TYPE=text NAME=lastName VALUE=WasWaldo SIZE=25 MAXLENGTH=50>
This HTML line specifies an input field with a default value of WasWaldo. The input box will be 25 characters long, although the user can enter up to 50 characters.
At times, you may want the user to be able to enter text without that text being readable. For example, passwords need to be protected so that people passing behind the user can't secretly steal them. In order to create a protected field, use the password type.
<INPUT TYPE=password NAME=password SIZE=10>
Caution |
The password input option still sends the text through the Internet without any encryption. In other words, the data is still sent as clear text. The sole function of the password input option is to ensure that the password is not visible on the screen at the time of entry. |
The <INPUT> tag is also used to define two possible buttons - the submit and reset buttons. The submit button sends the form data to a specified URL - in other words to a CGI program. The reset button restores the input fields on the forms to their default states. Any information that the user had entered is lost. Frequently, the VALUE modifier is used to change the text that appears on the buttons. For example:
<INPUT TYPE=submit VALUE="Process Information">
Hidden fields are frequently used as sneaky ways to pass information into a CGI program. Even though the fields are hidden, the field name and value are still sent to the CGI program when the submit button is clicked. For example, if your script generated an email form, you might include a list of email addresses that will be carbon-copied when the message is sent. Since the form user doesn't need to see the list, the field can be hidden. When the submit button is clicked, the hidden fields are still sent to the CGI program along with the rest of the form information.
The last two input types are checkboxes and radio buttons. Checkboxes let the user indicate either of two responses. Either the box on the form is checked or it is not. The meaning behind the checkbox depends entirely on the text that you place adjacent to it. Checkboxes are used when users can check off as many items as they'd like. For example:
<INPUT TYPE=checkbox NAME=orange CHECKED>Do you like the color Orange?
<INPUT TYPE=checkbox NAME=blue CHECKED>Do you like the color Blue?
Radio buttons force the user to select only one of a list of options. Using radio buttons for a large number of items (say, over five) is not recommended because they take up too much room on a web page. The <SELECT> tag should be used instead. Each grouping of radio buttons must have the same name but different values. For example,
Operating System:<BR>
<INPUT TYPE=radio NAME=os VALUE=Win95>Windows 95
<INPUT TYPE=radio NAME=os VALUE=WinNT>Windows NT
<INPUT TYPE=radio NAME=os VALUE=UNIX CHECKED>UNIX
<INPUT TYPE=radio NAME=os VALUE=OS2>OS/2
CPU Type:<BR>
<INPUT TYPE=radio NAME=cpu VALUE=Pentium>Intel Pentium
<INPUT TYPE=radio NAME=cpu VALUE=Alpha CHECKED>DEC Alpha
<INPUT TYPE=radio NAME=cpu VALUE=Unknown>Unknown
You should always provide a default value for radio buttons because it is assumed that one of them must be selected. Quite often, it is appropriate to provide a "none" or "unknown" radio button (like the "CPU Type" in the above example) so that the user won't be forced to pick an item at random.
Another useful form element is the drop-down list input field specified by the <SELECT>..</SELECT> set of tags. This form element provides a compact way to let the user choose one item from a list. The options are placed inside the <SELECT>..</SELECT> tags. For example,
<SELECT NAME=weekday>
<OPTION SELECTED>Monday
<OPTION>Tuesday
<OPTION>Wednesday
<OPTION>Thursday
<OPTION>Friday
</SELECT>
You can use the SELECTED modifier to make one of the options the default. Drop-down lists are very useful when you have three or more options to choose from. If you have fewer, consider using radio buttons. The <SELECT> tag has additional options that provide you with much flexibility. You can read about these advanced options at:
http://robot0.ge.uiuc.edu/~carlosp/cs317/ft.4-5f.html
The last form element that I should mention is the text box. You can create a multi-line input field or text box using the <TEXTAREA>..</TEXTAREA> set of tags. The <TEXTAREA> tag requires both a ROWS and a COLS modifier. You can place any default text for the text box inside the <TEXTAREA>..</TEXTAREA> tags.
<TEXTAREA NAME=comments ROWS=3 COLS=60></TEXTAREA>
The user's web browser will automatically provide scroll bars as needed. However, the text box will probably not word-wrap. In order to move to the next line, the user must press the enter key.
Note |
If you'd like a more advanced introduction to HTML forms, try this web site: |
http://robot0.ge.uiuc.edu/~carlosp/cs317/ft.1.html
<FORM METHOD=get ACTION=/cgi-bin/gestbook.pl>
The GET method appends all of the form data to the end of the URL used to invoke the CGI script. A question mark is used to separate the original URL (specified by the ACTION modifier in the <FORM> tag) and the form information. The server software then puts this information into the QUERY_STRING environment variable for use in the CGI script that will process the form.
The GET method can't be used for larger forms because some web servers limit the length of the URL portion of a request. (Check the documentation on your particular server.) This means that larger forms might blow up if submitted using the GET method. For larger forms, the POST method is the answer.
The POST method sends all of the form information to the CGI program using the STDIN filehandle. The web server will set the CONTENT_LENGTH environment variable to indicate how much data the CGI program needs to read.
The rest of this section develops a function capable of reading both types of form information. The goal of the function is to create a hash that has one entry for each input field on the form.
The first step is simply to read the form information. The method used to send the information is stored in the REQUEST_METHOD environment variable. Therefore, we can examine it to tell if the function needs to look at the QUERY_STRING environment variable or the STDIN filehandle. Listing 20.1 contains a function called getFormData() that places the form information in a variable called $buffer regardless of the method used to transmit the information.
Pseudocode |
Define the getFormData() function. Initialize a buffer. If the GET method is used, copy the form information into the buffer. If the POST method is used, read the form information into the buffer. |
Listing 20.1-20LST01.PL - The First Step is to Get the Form Information. |
|
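As a rough illustration of the pseudocode above, a first version of getFormData() might look like the following sketch. It is not the original 20LST01.PL listing. It assumes only the three environment variables named in the text (REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH) and returns the raw buffer instead of storing it in a global variable.

```perl
#!/usr/bin/perl -w
use strict;

# Return the raw, still-encoded form data regardless of how it was sent.
sub getFormData {
    my $buffer = "";
    my $method = $ENV{'REQUEST_METHOD'} || "";

    if ($method eq 'GET') {
        # GET: the server copies everything after the ? into QUERY_STRING.
        $buffer = $ENV{'QUERY_STRING'};
        $buffer = "" unless defined $buffer;
    }
    elsif ($method eq 'POST') {
        # POST: the data arrives on STDIN, and CONTENT_LENGTH says how
        # many bytes the program should read.
        read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'} || 0);
    }

    return $buffer;
}
```

A script would call it once, near the top, with something like my $buffer = getFormData();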
Tip |
Since a single function can handle both the GET and POST methods, you really don't have to worry about which one to use. However, because of the limitation regarding URL length, I suggest that you stick with the POST method. |
I'm sure that you find this function pretty simple. But you might be wondering what information is contained in the $buffer variable.
Form information is passed to a CGI program in name=value format and each input field is delimited by an ampersand (&). For example, if you have a form with two fields - one called name and one called age - the form information would look like this:
name=Rolf+D%27Barno&age=34
Can you see the two input fields?
First, split up the information using the & as the delimiter:
name=Rolf+D%27Barno
age=34
Next, split up the two input fields based on the = character:
Field Name: name Field Value: Rolf+D%27Barno
Field Name: age Field Value: 34
Remember the section on URL encoding from Chapter 19? You see it in action in the name field. The name is really Rolf D'Barno. However, with URL encoding, spaces are converted to plus signs and some characters are converted to their hexadecimal ASCII equivalents. If you think about how a single quote might be mistaken for the beginning of an HTML value, you can understand why the ASCII equivalent is used.
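The decoding can be sketched in a few lines of Perl. The function name decodeURL() is taken from the pseudocode used in this chapter, but the body below is only one plausible way to write it.

```perl
#!/usr/bin/perl -w
use strict;

# Undo URL encoding: plus signs back to spaces, %XX back to characters.
sub decodeURL {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # plus signs become spaces
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %27 becomes '
    return $text;
}

print decodeURL("Rolf+D%27Barno"), "\n";            # prints Rolf D'Barno
```

The plus signs are translated first so that an encoded plus sign (%2B) is not turned into a space by mistake.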
Let's add some features to the getFormData() function to split up the input fields and store them in a hash variable. Listing 20.2 shows the new version of the getFormData() function.
Pseudocode |
Declare a hash variable to hold the form's input fields. Call the getFormData() function. Define the getFormData() function. Declare a local variable to hold the reference to the input field hash. Initialize a buffer. If the GET method is used, copy the form information into the buffer. If the POST method is used, read the form information into the buffer. Iterate over the array returned by the split() function. Decode both the input field name and value. Create an entry in the input field hash variable. Define the decodeURL() function. Get the encoded string from the parameter array. Translate all plus signs into spaces. Convert characters coded as hexadecimal digits into regular characters. Return the decoded string. |
Listing 20.2-20LST02.PL - The First Step is to Get the Form Information. |
|
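The splitting step might be sketched as follows. This is not the original 20LST02.PL listing. To keep the example self-contained, the hypothetical parseFormData() function takes the buffer as a parameter instead of reading it from the server, and the hash name %frmFlds is also just an assumption.

```perl
#!/usr/bin/perl -w
use strict;

# Undo URL encoding: plus signs back to spaces, %XX back to characters.
sub decodeURL {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split the buffer into fields and store them through a hash reference.
sub parseFormData {
    my ($hashRef, $buffer) = @_;

    foreach my $pair (split(/&/, $buffer)) {
        next if $pair eq "";                        # skip stray ampersands
        my ($key, $value) = split(/=/, $pair, 2);   # split on the first = only
        $value = "" unless defined $value;          # a field may have no value
        $hashRef->{decodeURL($key)} = decodeURL($value);
    }
}

my %frmFlds;
parseFormData(\%frmFlds, "name=Rolf+D%27Barno&age=34");

foreach my $field (sort keys %frmFlds) {
    print "$field = $frmFlds{$field}\n";
}
```

Passing a reference lets the function fill in the caller's hash directly, so nothing has to be copied when the function returns.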
The getFormData() function could be considered complete at this point. It correctly reads from both the GET and POST transmission methods, decodes the information, and places the input fields into a hash variable for easy access.
There are some additional considerations of which you need to be aware. If you simply display the information that a user entered, there are some risks involved that you may not be aware of. Let's take a simple example. What if the user enters <B>Rolf</B> in the name field and you subsequently displayed that field's value? Yep, you guessed it, Rolf would be displayed in bold! For simple formatting HTML tags this is not a problem, and may even be a feature. However, if the user entered an SSI tag, he or she may be able to take advantage of a security hole - remember the <!--#exec --> tag?
You can thwart would-be hackers by converting every instance of < to &lt; and of > to &gt;. The HTML standard allows for certain characters to be displayed using symbolic codes. This allows you to display a < character without the web browser thinking that a new HTML tag is starting.
If you'd like to give users the ability to retain the character formatting HTML tags, you can test for each tag that you want to allow. When an allowed tag is found, convert it back to use the normal < and > characters.
You might want to check for users entering a series of <P> tags in the hopes of generating pages and pages of blank lines. Also, you might want to convert line endings (created when the user presses the enter key) into spaces so that they are ignored and the text wraps normally when displayed by a web browser. One small refinement is to convert two consecutive newlines into a paragraph (<P>) tag instead of removing them.
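These clean-up rules can be sketched as a single function. The name cleanValue() is hypothetical, and the substitutions simply follow the order in which the steps are described in this section. Only the bold and italic tags are allowed back in.

```perl
#!/usr/bin/perl -w
use strict;

# Make a decoded field value safe to display inside an HTML page.
sub cleanValue {
    my ($value) = @_;

    $value =~ s/(<P>\s*)+/<P>/gi;            # compress runs of <P> tags into one
    $value =~ s/</&lt;/g;                    # turn off every HTML tag...
    $value =~ s/>/&gt;/g;
    $value =~ s/&lt;(\/?[BI])&gt;/<$1>/gi;   # ...then allow <B>, </B>, <I>, </I>
    $value =~ s/\r//g;                       # remove carriage returns
    $value =~ s/\n\n/<P>/g;                  # a blank line starts a paragraph
    $value =~ s/\n/ /g;                      # other line endings become spaces

    return $value;
}

print cleanValue('<B>Rolf</B> <!--#exec cmd="ls"-->'), "\n";
```

Because a tag is only restored when it matches exactly, a bold tag carrying extra attributes stays disabled.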
When you put all of these new features together, you wind up with a getFormData() function that looks like Listing 20.3.
Pseudocode |
Declare a hash variable to hold the form's input fields. Call the getFormData() function. Define the getFormData() function. Declare a local variable to hold the reference to the input field hash. Initialize a buffer. If the GET method is used, copy the form information into the buffer. If the POST method is used, read the form information into the buffer. Iterate over the array returned by the split() function. Decode both the input field name and value. Compress multiple <P> tags into one. Convert < into &lt; and > into &gt; to stop HTML tags from being interpreted. Turn back on the bold and italic HTML tags. Remove unneeded carriage returns. Convert two newlines into an HTML paragraph tag. Convert single newlines into spaces. Create an entry in the input field hash variable. Define the decodeURL() function. Get the encoded string from the parameter array. Translate all plus signs into spaces. Convert characters coded as hexadecimal digits into regular characters. Return the decoded string. |
Listing 20.3-20LST03.PL - The First Step is to Get the Form Information. |
|
Caution |
Tracking security problems seems like a never-ending task but it is very important, especially if you are responsible for a web server. As complicated as the getFormData() function is, it is still not complete. The <TEXTAREA> tag lets users enter an unlimited amount of information. What would happen to your web server if someone used the cut and paste ability in Windows 95 to insert four or five megabytes into your form? Perhaps the getFormData() function should have some type of limitation that any individual field should only be 1,024 bytes in length? |
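One way such a limit might be sketched is shown below. The 1,024-byte figure comes from the caution above. The names limitField() and requestTooLarge() are hypothetical, and the idea of checking CONTENT_LENGTH before reading anything is an added suggestion rather than part of the chapter's listings.

```perl
#!/usr/bin/perl -w
use strict;

use constant MAX_FIELD_LENGTH => 1024;

# Cut a single field value down to the maximum allowed length.
sub limitField {
    my ($value) = @_;
    return substr($value, 0, MAX_FIELD_LENGTH);
}

# Return true if the request announces more data than the script is
# willing to read. Checking first avoids reading megabytes into memory.
sub requestTooLarge {
    my ($maxBytes) = @_;
    my $length = $ENV{'CONTENT_LENGTH'} || 0;
    return $length > $maxBytes;
}

print length(limitField("x" x 5000)), "\n";   # prints 1024
```

A script could call requestTooLarge() before reading STDIN and display an error page instead of processing an oversized form.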
<FORM METHOD=get ACTION=mailto:medined@mtolive.com>
When the form's submit button is clicked, the form's information will be mailed to the email address specified in the <FORM> tag. The information will be URL encoded and all on one line. This means you can't read the information until it has been processed.
It is generally a bad idea to email form information because of the URL encoding that is done. It is better to save the information to a data file so that you can easily read and analyze it later. Sending notifications by email is a good idea. For example, you could tell an email reader that a certain form has been completed and that the log file should be checked. If you want to send email from a CGI script, you can use the sample program from Listing 18.2 in Chapter 18, "Using Internet Protocols."
Before sending any form information, ensure that it has been decoded. If you are using one of the CGI modules or the decoding functions from Chapter 19, "What Is CGI?," then you don't have to worry about this requirement. Otherwise, please reread the section called "URL Encoding" in Chapter 19.
Make sure to include a Reply-To field in the header of your email message because you won't know which login name the CGI program will be running under. Including the Reply-To field will ensure that the reader of the message can easily respond to the email message if needed.
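As an illustration, a notification message carrying a Reply-To field might be assembled like this. The function name buildNotification() and the example.com addresses are made up for the sketch, and the code only builds the message text. Actually sending it is left to whatever mail routine you use.

```perl
#!/usr/bin/perl -w
use strict;

# Build the text of a short notification message.
sub buildNotification {
    my ($to, $replyTo, $subject, $body) = @_;

    # Each header field must stay on one line, so replace any line breaks
    # that a visitor might have slipped into these values.
    for ($to, $replyTo, $subject) {
        s/[\r\n]+/ /g;
    }

    return "To: $to\n"
         . "Reply-To: $replyTo\n"
         . "Subject: $subject\n"
         . "\n"                      # a blank line separates header and body
         . "$body\n";
}

print buildNotification('webmaster@example.com', 'rolf@example.com',
                        'Guest book entry', 'Please check the log file.');
```

Replacing the line breaks matters whenever the reply address comes from a form field, because it keeps a visitor from adding header lines of their own.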
The first thing to look at is how to set environment variables. The method used depends on your operating system. Table 20.2 shows you how to set environment variables for a variety of operating systems.
Operating System or UNIX Shell | Command |
---|---|
csh | setenv HTTP_USER_AGENT "Mozilla" |
ksh or bash | export HTTP_USER_AGENT="Mozilla" |
Win95, WinNT, OS/2 | set HTTP_USER_AGENT=Mozilla |
In order to recreate the environmental variables that a server sets, you need to initialize at least the following environmental variables:
You also need to initialize any other variables that your program needs. Rather than retyping the set commands each time you want to test your CGI program, create a shell or batch file.
The next step is to create a text file that will be substituted for STDIN when the CGI program is run. You only need to create this text file if you are using the POST method. The text file can be called anything you'd like and should contain just one line - the line of form information. For example:
name=Rolf D'Barno&age=34
Notice that you don't need to use URL encoding because the information will not be sent through the Internet.
When you are ready, execute your CGI program from the command line with a command like this:
perl -w gestbook.pl < input.dat
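To summarize, the debugging process follows these steps:
1. Create a batch or shell file that sets the environment variables your CGI program needs.
2. Create a test input file if you are using the POST method.
3. Execute the CGI program from the command line, using redirection to point STDIN to your test input file.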
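The preparation can also be scripted in Perl instead of a shell or batch file. The sketch below assumes the POST method. The function name prepareTestRun() is hypothetical, and only variables mentioned in this chapter are set. Add any others that your program needs.

```perl
#!/usr/bin/perl -w
use strict;

# Write the test input file and set up the environment that a web
# server would normally supply for a POST request.
sub prepareTestRun {
    my ($inputFile, $formData) = @_;

    open(my $out, '>', $inputFile) or die "Cannot write $inputFile: $!";
    print $out $formData;
    close($out);

    $ENV{'REQUEST_METHOD'}  = 'POST';
    $ENV{'CONTENT_LENGTH'}  = length($formData);
    $ENV{'HTTP_USER_AGENT'} = 'Mozilla';
}

# Typical use. Programs started with system() inherit %ENV, so the CGI
# script sees the simulated request:
#
#   prepareTestRun('input.dat', "name=Rolf D'Barno&age=34");
#   system("perl -w gestbook.pl < input.dat");
```

Setting CONTENT_LENGTH from the actual length of the test data keeps the two from drifting apart when you edit the input.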
The sample Guest book application will be presented in two stages. First, an HTML form is used to request information, then the information is saved and all the Guest book entries are displayed by a CGI program. Second, the CGI program is enhanced with better error handling and some new features. Figure 20.1 shows what the finished Guest book will look like.
Fig. 20.1 - The Finished Guest Book
<A HREF="addgest.htm">[Guestbook]</A>
Then place the web page in Listing 20.4 into the virtual root directory of your web server. Clicking on the hypertext link will bring visitors to the Add Entry form.
Pseudocode |
Start the HTML web page. Define the web page header which holds the title. Start the body of the page. Display a header. Display some instructions. Start an HTML form. Start an HTML table. Each row of the table is another input field. Define the submit button. End the table. End the form. End the body of the page. End the page. |
Listing 20.4-ADDGEST.htm - The Add Entry to Guest book HTML Form |
|
The only thing you might need to change in order for this form to work is the ACTION modifier in the <FORM> tag. The directory where you place the CGI program might not be /cgi-bin. The addgest.htm file will generate a web page that looks like the following figure.
Fig. 20.2 - The Add Entry Form
The CGI program in Listing 20.5 is invoked when a visitor clicks on the submit button of the Add Entry HTML form. This program will process the form information, save it to a data file and then create a web page to display all of the entries in the data file.
Pseudocode |
Turn on the warning option. Turn on the strict pragma. Declare a hash variable to hold the HTML form field data. Get the local time and pretend that it is one of the form fields. Get the data from the form. Save the data into a file. Send the HTTP header to the remote web browser. Send the start of page and header information. Send the heading and request a horizontal line. Call the readFormData() function to display the Guest book entries. End the web page. Define the getFormData() function. Declare a local variable to hold the reference to the input field hash. Initialize a buffer. If the GET method is used, copy the form information into the buffer. If the POST method is used, read the form information into the buffer. Iterate over the array returned by the split() function. Decode both the input field name and value. Compress multiple <P> tags into one. Convert < into &lt; and > into &gt; to stop HTML tags from being interpreted. Turn back on the bold and italic HTML tags. Remove unneeded carriage returns. Convert two newlines into an HTML paragraph tag. Convert single newlines into spaces. Create an entry in the input field hash variable. Define the decodeURL() function. Get the encoded string from the parameter array. Translate all plus signs into spaces. Convert characters coded as hexadecimal digits into regular characters. Return the decoded string. Define the zeroFill() function - turns "1" into "01". Declare a local variable to hold the number to be filled. Declare a local variable to hold the string length that is needed. Find the difference between the current string length and the needed length. If the string is big enough (like "12"), then return it. If the string is too short, prefix it with some zeroes. Define the saveFormData() function. Declare two local variables to hold the hash and file name. Open the file for appending. Store the contents of the hash in the data file. Close the file. Define the readFormData() function. Declare a local variable to hold the file name. Open the file for reading. Iterate over the lines of the file. Split the line into four variables using ~ as the delimiter. Print the Guest book entry using a minimal amount of HTML tags. Use a horizontal rule to separate entries. Close the file. |
Listing 20.5-20LST05.PL - A CGI Program to Add a Guest book Entry and Display a Guest book HTML Page |
|
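The file-handling half of the program might be sketched as follows. This is not the original 20LST05.PL listing. The four field names (timestamp, name, email and comments) are assumptions, since the pseudocode only says that each line holds four values separated by ~. To keep the sketch easy to test, readFormData() returns the entries instead of printing them.

```perl
#!/usr/bin/perl -w
use strict;

# Assumed field order for each line of the data file.
my @FIELDS = qw(timestamp name email comments);

# Append one Guest book entry to the data file as a ~ delimited line.
sub saveFormData {
    my ($hashRef, $file) = @_;
    my @values;

    foreach my $field (@FIELDS) {
        my $value = defined $hashRef->{$field} ? $hashRef->{$field} : "";
        $value =~ s/[~\r\n]/ /g;    # keep the delimiter and line breaks out
        push(@values, $value);
    }

    open(my $out, '>>', $file) or die "Unable to open $file: $!";
    print $out join('~', @values), "\n";
    close($out);
}

# Read every entry back, returning a list of hash references.
sub readFormData {
    my ($file) = @_;
    my @entries;

    open(my $in, '<', $file) or die "Unable to open $file: $!";
    while (my $line = <$in>) {
        chomp($line);
        my %entry;
        @entry{@FIELDS} = split(/~/, $line, scalar(@FIELDS));
        push(@entries, \%entry);
    }
    close($in);

    return @entries;
}
```

Giving split() a limit of four keeps empty fields at the end of a line, so an entry with no comment still reads back with all four keys.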
This program introduces no new Perl tricks so you should be able to easily understand it. When the program is invoked, it will read the form information and then save the information to the end of a data file. After the information is saved, the program will generate an HTML page to display all of the entries in the data file.
While the program in Listing 20.5 works well, there are several things that can improve it:
The CGI program in Listing 20.6 implements these new features. If you add ?display to the URL of the script, the script will simply display the entries in the data file. If you add ?add to the URL of the script, it will redirect the client browser to the addgest.htm web page. If no additional information is passed with the URL, the script will assume that it has been invoked from a form and will read the form information. After saving the information, the Guest book page will be displayed.
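The decision just described can be sketched as a small dispatch function. The names chooseAction() and redirectTo() are hypothetical, and the sketch assumes that the form itself is submitted with the POST method, so that QUERY_STRING is empty in that case.

```perl
#!/usr/bin/perl -w
use strict;

# Decide what the script should do based on the text after the ?.
sub chooseAction {
    my ($query) = @_;
    $query = "" unless defined $query;

    return 'form'    if $query eq "";          # invoked from the form
    return 'display' if $query eq 'display';   # just show the entries
    return 'add'     if $query eq 'add';       # send visitor to the form
    return 'error';                            # anything else is a mistake
}

# A CGI program redirects the browser by sending a Location header
# in place of the usual Content-type header.
sub redirectTo {
    my ($url) = @_;
    return "Location: $url\n\n";
}

print chooseAction($ENV{'QUERY_STRING'}), "\n";
```

Keeping the decision in one function makes it easy to check each command from the command line before the script goes anywhere near a web server.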
A debugging routine called printENV() has been added to this listing. If you have trouble getting the script to work, you can call the printENV() routine in order to display all of the environment variables and any form information that was read. Place the call to printENV() right before the </BODY> tag of a web page. The displayError() function calls the printENV() function so that the error can have as much information as possible when a problem arises.
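A routine along these lines might look like the following sketch. Only the name printENV() comes from the text. The helper functions escapeHTML() and formatPairs() and the page layout are assumptions made for the illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Make text safe to show inside an HTML page.
sub escapeHTML {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Format the contents of a hash as a titled, sorted list.
sub formatPairs {
    my ($title, $hashRef) = @_;
    my $html = "<H2>$title</H2>\n<PRE>\n";

    foreach my $key (sort keys %$hashRef) {
        my $value = defined $hashRef->{$key} ? $hashRef->{$key} : "";
        $html .= escapeHTML("$key = $value") . "\n";
    }

    return $html . "</PRE>\n";
}

# Display the environment variables and the form fields that were read.
sub printENV {
    my ($fieldsRef) = @_;
    print formatPairs("Environment Variables", \%ENV);
    print formatPairs("Form Fields", $fieldsRef);
}
```

The values are escaped before being displayed for the same reason that form input is: a stray < in a variable would otherwise be treated as the start of a tag.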
Pseudocode |
Turn on the warning option. Turn on the strict pragma. Declare a hash variable to hold the HTML form field data. Get the local time and pretend that it is one of the form fields. Get the data from the form. Was the program invoked with added URL information? If the display command was used, display the Guest book. If the add command was used, redirect to the Add Entry page. Otherwise, display an error page. If there is no extra URL information, check for blank fields. If there are blank fields, display an error page. Save the form data. Display the Guest book. Exit the program. Define the displayError() function. Display an error page with a specified error message. Define the displayPage() function. Read all of the entries into a hash. Display the Guest book. Define the readFormData() function. Declare local variables for a file name and a hash reference. Open the file for reading. Iterate over the lines of the file. Split the line into four variables using ~ as the delimiter. Create a hash entry to hold the Guest book information. Close the file. Define the getFormData() function. Declare a local variable to hold the reference to the input field hash. Initialize a buffer. If the GET method is used, copy the form information into the buffer. If the POST method is used, read the form information into the buffer. Iterate over the array returned by the split() function. Decode both the input field name and value. Compress multiple <P> tags into one. Convert < into &lt; and > into &gt; to stop HTML tags from being interpreted. Turn back on the bold and italic HTML tags. Remove unneeded carriage returns. Convert two newlines into an HTML paragraph tag. Convert single newlines into spaces. Create an entry in the input field hash variable. Define the decodeURL() function. Get the encoded string from the parameter array. Translate all plus signs into spaces. Convert characters coded as hexadecimal digits into regular characters. Return the decoded string. Define the zeroFill() function - turns "1" into "01". Declare a local variable to hold the number to be filled. Declare a local variable to hold the string length that is needed. Find the difference between the current string length and the needed length. If the string is big enough (like "12"), then return it. If the string is too short, prefix it with some zeroes. Define the saveFormData() function. Declare two local variables to hold the hash and file name. Open the file for appending. Store the contents of the hash in the data file. Close the file. |
Listing 20.6-20LST06.PL - A More Advanced Guest Book |
|
One of the major changes between Listing 20.5 and Listing 20.6 is in the readFormData() function. Instead of actually printing the Guest book data, the function now creates hash entries for it. This change was made so that an error page could be generated if the data file could not be opened. Otherwise, the error message would have appeared in the middle of the Guest book page - leading to confusion on the part of visitors.
A table was used to add two hypertext links to the top of the web page. One link will let visitors add a new entry and the other refreshes the page. If a second visitor has added a Guest book entry while the first visitor was reading, refreshing the page will display the new entry.
A "correct" HTML document will be entirely enclosed inside of a set of <HTML>..</HTML> tags. Inside the <HTML> tag are <HEAD>..</HEAD> (surrounds document identification information) and <BODY>..</BODY> (surrounds document content information) tags.
After the brief introduction to HTML, you read about Server-Side Includes. They are used to insert information into a document at the time that the page is sent to the web browser. This lets the document designer create dynamic pages without needing CGI programs. For example, you can display the last modification date of a document, or include another document such as a standard footer file.
Next, HTML forms were discussed. HTML forms display input fields that query the visitor to your web site. You can display input boxes, checkboxes, radio buttons, selection lists, submit buttons and reset buttons. Everything inside a set of <FORM>..</FORM> tags is considered one form. You can have multiple forms on a single web page.
The <FORM> tag takes two important modifiers. The ACTION modifier tells the web browser the name of the CGI program that gets invoked when the form's submit button is clicked. The METHOD modifier determines how the form information should be sent to the CGI program. If the GET method is used, the information from the form's fields will be available in the QUERY_STRING environment variable. If the POST method is used, the form information will be available via the STDIN filehandle.
The getFormData() function was developed to process form information and make it available via a hash variable. This function is the first line of defense against hackers. By investing time developing this function to close security holes, you are rewarded by having a safer, more stable web site.
Debugging a CGI script takes a little bit of preparation. First, create a batch or shell file that defines the environment variables that your CGI program needs. Then, create a test input file if you are using the POST method. Lastly, execute the CGI program from the command line using redirection to point STDIN to your test input file.
Next, a Guest book application was presented. This application used an HTML form to gather comments from a user. The comments are saved to a data file. Then, all of the comments stored in the data file are displayed. The first version of the Guest book required the user to add an entry before seeing the contents of the Guest book. The second version of the Guest book let users view the contents without this requirement. In addition, better error checking and new features were added.
The next chapter, "Using Perl with Web Servers," explores web server log files and ways to automatically create web pages.