In the last chapter, you were introduced to object-oriented programming. Along the way, you learned some aspects of programming with modules, although you may not have realized it. I believe the shortest definition of a module is a namespace defined in a file. For example, the English module is defined in the English.pm file and the File::Find module is defined in the Find.pm file inside the File subdirectory.
Of course, modules are more than simply a namespace in a file. But, don't be concerned - there's not much more.
Perl 4, the previous major version of Perl, depended on libraries to group functions in units. There were 31 libraries shipped with Perl 4.036. These have been replaced with a standard set of modules. However, the old libraries are still available in case you run across some old Perl scripts that need them.
Libraries - and modules - are generally placed in a subdirectory called Lib. On my machine, the library directory is c:\perl5\lib. If you don't know what your library directory is, ask your system administrator. Some modules are placed in subdirectories like Lib/Net or Lib/File. The modules in these subdirectories are loaded using the subdirectory name, two colons, and the module name. For example, Net::Ping or File::Basename.
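Libraries are made available to your script by using the require compiler directive. Directives may seem like functions, but they aren't. The difference is that compiler directives are carried out when the script is compiled, while functions are executed while the script is running. (Strictly speaking, Perl evaluates require itself at run time; it is the use directive, described below, that does its work at compile time.)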
Note |
You might think the distinction between compiler
directives and functions is minor. And you might be right. I like to be as
precise as possible when using computer terminology. After all, the
computer is precise; why shouldn't we be, too?
Unfortunately, Perl doesn't make it easy to create simple definitions and place every feature into a nice orderly category. So don't get hung up on attaching a label to everything. If you know what something does, the names won't matter a whole lot. |
Some modules are just collections of functions - like the libraries - with some "module" stuff added. Modules should follow these guidelines:
Modules are loaded by the use directive, which is similar to require except it automates the importing of function and variable names.
Modules that are simply a collection of functions can be thought of as classes without constructors. Remember that the package name is the class name. Whenever you see a package name, you're also seeing a class - even if none of the object-oriented techniques are used.
Object-oriented modules keep all function and variable names close to the vest - so to speak. They are not available directly; you access them through the module name. Remember the Inventory_item->new() notation?
However, simple function collections don't have this object-oriented need for secrecy. They want your script to directly access the defined functions. This is done using the Exporter class, @EXPORT, and @EXPORT_OK.
The Exporter class supplies basic functionality that gives your script access to the functions and variables inside the module. The import() function, defined inside the Exporter class, is executed at compile-time by the use compiler directive. The import() function takes function and variable names from the module namespace and places them into the namespace of the script that loaded the module, which is normally main. Thus, your script can access them directly.
You may occasionally see a reference to what may look like a nested module. For example, $Outer::Inner::foo. This really refers to a module named Outer::Inner, so named by the statement: package Outer::Inner;. Module designers sometimes use this technique to simulate nested modules.
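The naming convention can be illustrated with a minimal sketch. The packages Outer and Outer::Inner and their variables are hypothetical names chosen for this example; the point is that Outer::Inner is simply a package whose name happens to contain two colons.

```perl
use strict;

package Outer;
$Outer::foo = 'outer value';           # stored in the %Outer:: symbol table

package Outer::Inner;                  # its own namespace; nothing is shared with Outer
$Outer::Inner::foo = 'inner value';    # stored in the %Outer::Inner:: symbol table

package main;
print("$Outer::foo\n");
print("$Outer::Inner::foo\n");
```

The two $foo variables are unrelated, which is why changing one never affects the other.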
Perl has constructors and destructors that work at the module level as well as the class level. The module constructor is called the BEGIN block, while the module destructor is called the END block.
Pseudocode |
Define a BEGIN block for the main package. Display a string indicating the begin block is executing. Start the Foo package. Define a BEGIN block for the Foo package. Display a string indicating the begin block is executing. |
Listing 15.1-15LST01.PL - Using BEGIN Blocks |
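A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code is an assumption, and the @main::order array is bookkeeping added only for this sketch so that the execution order can be inspected afterward.

```perl
BEGIN {
    print("main\n");
    push(@main::order, "main");    # bookkeeping for this sketch only
}

package Foo;
    BEGIN {
        print("Foo\n");
        push(@main::order, "Foo"); # bookkeeping for this sketch only
    }
```

Each BEGIN block runs as soon as the compiler has finished reading it, so the blocks execute in the order in which they appear in the file.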
This program displays:
main
Foo
Listing 15.2-15LST02.PL - Using END Blocks |
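A minimal sketch of what this listing could contain, modeled on the BEGIN example and on the output shown for it. The exact original code is an assumption.

```perl
END {
    print("Main\n");
}

package Foo;
    END {
        print("Foo\n");
    }
```

END blocks run when the program finishes, in the reverse of the order in which they were defined, which is why the block in the Foo package runs first.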
This program displays:
Foo
Main
Note |
Signals that are sent to your script can bypass the END blocks. So, if your script is in danger of stopping due to a signal, be sure to define a signal-handler function. See Chapter 13, "Handling Errors and Signals," for more information. |
Pseudocode |
Define the dispSymbols() function. Get the hash reference that should be the first parameter. Declare local temporary variables. Initialize the %symbols variable. This is done to make the code easier to read. Initialize the @symbols variables. This variable is also used to make the code easier to read. Iterate over the symbols array displaying the key-value pairs of the symbol table. Call the dispSymbols() function to display the symbols for the Foo package. Start the Foo package. Initialize the $bar variable. This will place an entry into the symbol table. Define the baz() function. This will also create an entry into the symbol table. |
Listing 15.3-15LST03.PL - How to Display the Entries in a Symbol Table |
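A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code and the printf() layout are assumptions, and the text shown in the value column can vary between Perl versions.

```perl
sub dispSymbols {
    my($hashRef) = shift;
    my(%symbols);
    my(@symbols);

    %symbols = %{$hashRef};             # copy the symbol table to make the code easier to read
    @symbols = sort(keys(%symbols));    # the names in the table, in alphabetical order

    foreach (@symbols) {
        printf("%-10.10s| %s\n", $_, $symbols{$_});
    }
}

dispSymbols(\%Foo::);

package Foo;
    $bar = 2;           # places an entry named bar into %Foo::

    sub baz {           # places an entry named baz into %Foo::
        $bar++;
    }
```

The entries exist by the time dispSymbols() is called because the compiler creates them while it reads the whole file, before any statement is executed.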
This program displays:
bar | *Foo::bar
baz | *Foo::baz
This example shows that there are only two things in the %Foo:: symbol table - only those things that the script placed there. This is not the case with the %main:: symbol table. When I display the entries in %main::, I see over 85 items. Part of the reason for the large number of names in the main package is that some variables are forced there. For example, STDIN, STDOUT, STDERR, @ARGV, ARGVOUT, %ENV, @INC, and %SIG are forced into the main namespace regardless of where they are used.
require "Room.pl";
No exporting of symbols is done by the
require directive. So all symbols in the libraries must be explicitly
placed into the main namespace. For example, you might see a library
that looks like this:
package abbrev;
sub main'abbrev {
# code for the function
}
Two things in this code snippet point out that it is Perl 4 code. The
first is that the package name is in all lowercase. And the second is that a
single-quote is used instead of double-colons to indicate a qualifying package
name. Even though the abbrev() function is defined inside the
abbrev package, it is not part of the %abbrev:: namespace
because of the main' in front of the function name.
The require directive can also indicate that your script needs a certain version of Perl to run. For example, if you are using references you should place the following statement at the top of your script:
require 5.000;
And if you are using a feature that is only
available with Perl 5.002 - like prototypes - use the following:
require 5.002;
Perl 4 will generate a fatal error if these lines
are seen.
Note |
Prototypes are not covered in this book. If you are using Perl 5.002 or later, prototypes should be discussed in the documentation that comes with the Perl distribution. |
The use directive will automatically export function and variable names to the main namespace by calling the module's import() function. Most modules don't have their own import() function; instead they inherit it from the Exporter module. You have to keep in mind that the import() function is not applicable to object-oriented modules. Object-oriented modules should not export any of their functions or variables.
You can use the following lines as a template for creating your own modules:
package Module;
require(Exporter);
@ISA = qw(Exporter);
@EXPORT = qw(funcOne $varOne @variable %variable);
@EXPORT_OK = qw(funcTwo $varTwo);
The names in the @EXPORT
array will always be moved into the main namespace. Those names in the
@EXPORT_OK will only be moved if you request them. This small module
can be loaded into your script using this statement:
use Module;
Since use is a compiler directive, the module
is loaded as soon as the compiler sees the directive. This means that the
variables and functions from the module are available to the rest of your
script.
If you need to access some of the names in the @EXPORT_OK array, use a statement like this:
use Module qw(:DEFAULT funcTwo); # $varTwo is not exported.
Once
you add optional elements to the use directive you need to explicitly
list all of the names that you want to use. The :DEFAULT is a short way
of saying, "give me everything in the @EXPORT list."
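The export rules can be sketched in a single file. MyModule and its functions are hypothetical names used only for this example, and the module is defined inline purely for convenience; a real module would live in its own file and be loaded with use.

```perl
use strict;
use warnings;

# MyModule is a hypothetical module, defined inline so the sketch fits in one
# file. Normally this package would live in its own file, MyModule.pm.
BEGIN {
    package MyModule;
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw(funcOne);            # exported by default
    our @EXPORT_OK = qw(funcTwo $varTwo);    # exported only on request
    our $varTwo    = 'two';

    sub funcOne { return 'funcOne called'; }
    sub funcTwo { return 'funcTwo called'; }
}

# For a module in its own file, "use MyModule qw(:DEFAULT funcTwo);" would
# make this import() call automatically at compile time.
BEGIN { MyModule->import(qw(:DEFAULT funcTwo)); }

print(funcOne(), "\n");            # imported because it is listed in @EXPORT
print(funcTwo(), "\n");            # imported because it was requested by name
print($MyModule::varTwo, "\n");    # not imported, so it must be fully qualified
```

Because $varTwo was not requested, it never appears in the main namespace and can only be reached through its fully qualified name.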
Remember all of the new terminology that was developed for objects? The computer scientists have also developed their own term for a compiler directive, and that term is pragma. The use statement controls the other pragmas. Listing 15.4 shows a program that uses the integer pragma.
Listing 15.4-15LST04.PL - Using the integer Pragma |
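A minimal sketch of what such a listing could look like. The exact original code is an assumption; in particular, the bare block that limits the scope of the pragma and the variable names are choices made for this sketch.

```perl
use strict;

my($float, $int);

$float = 10 / 3;
print("Floating point math: $float\n");

{
    use integer;            # in effect until the end of the enclosing block
    $int = 10 / 3;
    print("Integer math: $int\n");
}
```

Pragmas such as integer are lexically scoped, so statements after the closing brace go back to floating point arithmetic.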
This program displays:
Floating point math: 3.33333333333333
Integer math: 3
Pragmas can be turned off using the no
compiler directive. For example, the following statement turns off the
integer pragma:
no integer;
Table 15.1 shows a list of the pragmas that you can
use.
Pragma | Description |
---|---|
integer | Forces integer math instead of floating point or double precision math. |
less | Requests less of something - like memory or cpu time - from the compiler. This pragma has not been implemented yet. |
sigtrap | Enables stack backtracing on unexpected signals. |
strict | Restricts unsafe constructs. This pragma is highly recommended! Every program should use it. |
subs | Lets you predeclare function names. |
Symbolic references use the name of a variable as the reference to the variable. Perl allows them by default, but they are easy to create by accident, and the resulting bugs are hard to find. Listing 15.5 shows a program that uses symbolic references.
Pseudocode |
Declare two variables. Initialize $ref with a reference to $foo. Dereference $ref and display the result. Initialize $ref to $foo. Dereference $ref and display the result. Invoke the strict pragma. Dereference $ref and display the result. |
Listing 15.5-15LST05.PL - Detecting Symbolic References |
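A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code is an assumption, and the eval block is an addition that lets the sketch display the error and keep running instead of stopping.

```perl
my($foo) = "Testing.";
my($ref);

$ref = \$foo;
print("${$ref}\n");     # a real reference: displays the value of $foo

$ref = $foo;            # the backslash is missing, so $ref now holds a string
print("${$ref}\n");     # a symbolic reference to an unused variable: displays an empty line

use strict;             # in effect from here to the end of the file
eval { print("${$ref}\n"); };    # the same statement now fails at run time
print("Error: $@") if $@;
```

Without the eval block, the last dereference would end the program with the error message.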
When run with the command perl 15lst05.pl, this program displays:
Testing.
Can't use string ("Testing.") as a SCALAR ref while "strict refs" in
use at 15lst05.pl line 14.
The second print statement, even though
obviously wrong, does not generate any errors. Imagine if you were using a
complicated data structure like the ones described in Chapter 8, "References." You could spend
hours looking for a bug like this. After the strict pragma is turned
on, however, a run-time error is generated when the same print statement is
repeated. Perl even displays the value of the scalar that attempted to
masquerade as the reference value.
The strict pragma ensures that all variables that are used are either local to the current block or fully qualified. Fully qualifying a variable name simply means adding the name of the package where the variable was defined to the variable name. For example, you would specify the $numTables variable in package Room by saying $Room::numTables. If you are not sure which package a variable is defined in, try using the dispSymbols() function from Listing 15.3. Call the dispSymbols() function once for each package that your script uses.
The last type of problem that strict will generate an error for is the non-quoted word that is not used as a subroutine name or file handle. For example, the following line is good:
$SIG{'PIPE'} = 'Plumber';
And this line is bad:
$SIG{'PIPE'} = Plumber;
Perl 5, without the strict pragma, will do the correct thing in the bad situation and assume that you meant to create a string literal. However, this is considered bad programming practice. Note that a simple word used as a hash key inside the curly braces, as in $SIG{PIPE}, is always treated as a string and is accepted even when strict is in effect.
Tip |
Always use the strict pragma in your scripts. It will take a little longer to declare everything, but the time saved in debugging will more than make up for it. |
Module | Description |
---|---|
Text::Abbrev | Creates an abbreviation table from a list. The abbreviation table consists of the shortest sequence of characters that can uniquely identify each element of the list. |
AnyDBM_File | Provides a framework for accessing multiple DBMs. This is a UNIX-based module. |
AutoLoader | Loads functions on demand. This enables your scripts to use less memory. |
AutoSplit | Splits a package or module into its component parts for autoloading. |
Benchmark | Tracks the running time of code. This module can be modified to run under Windows but some of its functionality will be lost. |
Carp | Provides an alternative to the warn() and die() functions that report the line number of the calling routine. See "Example: The Carp Module" later in the chapter for more information. |
I18N::Collate | Compares 8-bit scalar data according to the current locale. This helps to give an international viewpoint to your script. |
Config | Accesses the Perl configuration options. |
Cwd | Gets the pathname of the current working directory. This module will generate a warning message when used with the -w command line option under the Windows and VAX VMS operating systems. You can safely ignore the warning. |
DynaLoader | Lets you dynamically load C libraries into Perl code. |
English | Lets you use English terms instead of the normal special variable names. |
Env | Lets you access the system environment variables using scalars instead of a hash. If you make heavy use of the environment variables, this module might improve the speed of your script. |
Exporter | Controls namespace manipulations. |
Fcntl | Loads file control definition used by the fcntl() function. |
FileHandle | Provides an object-oriented interface to filehandles. |
File::Basename | Parses a file name and path from a specification. |
File::CheckTree | Runs many filetest checks on a directory tree. |
File::Find | Traverses a file tree. This module will not work under the Windows operating systems without modification. |
Getopt | Provides basic and extended options processing. |
ExtUtils::MakeMaker | Creates a Makefile for a Perl extension. |
IPC::Open2 | Opens a process for both reading and writing. |
IPC::Open3 | Opens a process for reading, writing and error handling. |
POSIX | Provides an interface to IEEE 1003.1 namespace. |
Net::Ping | Checks to see if a host is available. |
Socket | Loads socket definitions used by the socket functions. |
If you need to declare variables that are local to a package, fully qualify your variable name in the declaration or initialization statement, like this:
use strict;
$main::foo = '';
package Math;
$Math::PI = 3.1415;
$Math::PI = $Math::PI;
This code snippet declares two variables: $foo in the main namespace and $PI in the Math namespace. The last statement, which simply assigns $Math::PI to itself, is used to avoid getting warning messages from the -w command line option. Since the variable is inside a package, there is no guarantee that it will be used by the calling script, and the -w command line option generates a warning about any variable that is only used once. By mentioning the variable again in a harmless self-assignment, the warning messages are avoided.
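In Perl 5.6 and later, the our declaration offers another way to create a package variable. The sketch below is an alternative to the technique described above, not the approach used in this chapter; the value of $PI is carried over from the example.

```perl
use strict;
use warnings;

package Math;
our $PI = 3.1415;       # declares $Math::PI and satisfies the strict pragma

package main;
print("PI is $Math::PI\n");
```

A variable declared with our does not trigger the "used only once" warning, so no extra statement is needed.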
Pseudocode |
Load the Carp module. Invoke the strict pragma. Start the Foo namespace. Define the foo() function. Call the carp() function. Call the croak() function. Switch to the main namespace. Call the foo() function. |
Listing 15.6-15LST06.PL - Using the carp() and croak() from the Carp Module |
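A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code is an assumption, and the eval block is an addition that traps the exception raised by croak() so that the message can be displayed. The file name and line numbers in the messages will differ from the output shown.

```perl
use Carp;
use strict;

package Foo;
    sub foo {
        main::carp("carp called at line " . __LINE__ .
            ",\n    but foo() was called");

        main::croak("croak called at line " . __LINE__ .
            ",\n    but foo() was called");
    }

package main;
    # croak() raises an exception; the eval is an addition that lets the sketch trap it.
    eval { Foo::foo(); };
    print(STDERR $@) if $@;
```

Both functions are called through the main:: prefix because the use Carp statement imported them into the main namespace, not into Foo.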
This program displays:
carp called at line 9,
but foo() was called at e.pl line 18
croak called at line 10,
but foo() was called at e.pl line 18
This example uses a compiler symbol, __LINE__, to incorporate the current line number in the string passed to both carp() and croak(). This technique enables you to see both the line number where carp() and croak() were called and the line number where foo() was called.
The Carp module also defines a confess() function which is similar to croak() except that a function call history will also be displayed. Listing 15.7 shows how this function can be used. The function declarations were placed after the foo() function call so that the program flow reads from top to bottom with no jumping around.
Pseudocode |
Load the Carp module. Invoke the strict pragma. Call foo(). Define foo(). Call bar(). Define bar(). Call baz(). Define baz(). Call confess(). |
Listing 15.7-15LST07.PL - Using confess() from the Carp Module |
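A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code is an assumption, and the eval block is an addition that traps the exception raised by confess() so that the trace can be displayed. Line numbers will differ from the output shown.

```perl
use Carp;
use strict;

# confess() raises an exception; the eval is an addition that lets the sketch trap it.
eval { foo(); };
print(STDERR $@) if $@;

sub foo {
    bar();
}

sub bar {
    baz();
}

sub baz {
    confess("I give up!");
}
```

Because of the added eval block, the trace produced by this sketch also contains a final line for the eval itself.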
This program displays:
I give up! at e.pl line 16
main::baz called at e.pl line 12
main::bar called at e.pl line 8
main::foo called at e.pl line 5
This daisy-chain of function
calls was done to show you how the function call history looks when displayed.
The function call history is also called a stack trace. As each function
is called, the address from which it is called gets placed on a stack. When the
confess() function is called, the stack is unwound or read. This lets
Perl print the function call history.
Note |
Some of the same concepts embodied by the special variables are used by the UNIX-based awk program. The English module also provides aliases that match what the special variables are called in awk. |
Tip |
I think that this module is especially useful because it provides aliases for the regular expression matching special variables and the formatting special variables. You'll use the other special variables often enough so that their use becomes second nature. Or else you won't need to use them at all. |
Special Variable | Alias |
---|---|
Miscellaneous | |
$_ | $ARG |
@_ | @ARG |
$" | $LIST_SEPARATOR |
$; | $SUBSCRIPT_SEPARATOR or $SUBSEP |
Regular Expression or Matching | |
$& | $MATCH |
$` | $PREMATCH |
$' | $POSTMATCH |
$+ | $LAST_PAREN_MATCH |
Input | |
$. | $INPUT_LINE_NUMBER or $NR |
$/ | $INPUT_RECORD_SEPARATOR or $RS |
Output | |
$| | $OUTPUT_AUTOFLUSH |
$, | $OUTPUT_FIELD_SEPARATOR or $OFS |
$\ | $OUTPUT_RECORD_SEPARATOR or $ORS |
Formats | |
$% | $FORMAT_PAGE_NUMBER |
$= | $FORMAT_LINES_PER_PAGE |
$- | $FORMAT_LINES_LEFT |
$~ | $FORMAT_NAME |
$^ | $FORMAT_TOP_NAME |
$: | $FORMAT_LINE_BREAK_CHARACTERS |
$^L | $FORMAT_FORMFEED |
Error Status | |
$? | $CHILD_ERROR |
$! | $OS_ERROR or $ERRNO |
$@ | $EVAL_ERROR |
Process Information | |
$$ | $PROCESS_ID or $PID |
$< | $REAL_USER_ID or $UID |
$> | $EFFECTIVE_USER_ID or $EUID |
$( | $REAL_GROUP_ID or $GID |
$) | $EFFECTIVE_GROUP_ID or $EGID |
$0 | $PROGRAM_NAME |
Internal Variables | |
$] | $PERL_VERSION |
$^A | $ACCUMULATOR |
$^D | $DEBUGGING |
$^F | $SYSTEM_FD_MAX |
$^I | $INPLACE_EDIT |
$^P | $PERLDB |
$^T | $BASETIME |
$^W | $WARNING |
$^X | $EXECUTABLE_NAME |
Listing 15.8 shows a program that uses one of the English variables to access information about a matched string.
Pseudocode |
Load the English module. Invoke the strict pragma. Initialize the search space and pattern variables. Perform a matching operation to find the pattern in the $searchSpace variable. Display information about the search. Display the matching string using the English variable names. Display the matching string using the standard Perl special variables. |
Listing 15.8-15LST01.PL - Using the English Module |
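A minimal sketch of what such a listing could look like, written from the pseudocode above. The search space and the pattern are taken from the output shown; the exact original code and the variable names are assumptions.

```perl
use English;
use strict;

my($searchSpace) = "TTTT BBBABBB DDDD";
my($pattern)     = "B+AB+";

$searchSpace =~ m/$pattern/;

print("Search space:   $searchSpace\n");
print("Pattern:        /$pattern/\n");
print("Matched String: $MATCH\n");    # the English variable name
print("Matched String: $&\n");        # the standard Perl special variable
```

The $MATCH variable needs no package qualifier under the strict pragma because the English module imports it into the main namespace.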
This program displays:
Search space: TTTT BBBABBB DDDD
Pattern: /B+AB+/
Matched String: BBBABBB
Matched String: BBBABBB
You can see that the $& and
$MATCH variables are equivalent. This means that you can use another
programmer's functions without renaming their variables and still use the
English names in your own functions.
Pseudocode |
Load the Env module. Invoke the strict pragma. Declare the @files variable. Open the temporary directory and read all of its files. Display the name of the temporary directory. Display the names of all files that end in tmp. |
Listing 15.9-15LST09.PL - Displaying Temporary Files Using the Env Module |
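A minimal sketch of what such a listing could look like, written from the pseudocode above. The exact original code is an assumption. Naming TEMP explicitly in the use statement and falling back to File::Spec->tmpdir() when TEMP is not set are additions that let the sketch run on systems that do not define that environment variable.

```perl
use Env qw(TEMP);       # ties the $TEMP scalar to $ENV{'TEMP'}
use File::Spec;
use strict;

# Assumption: fall back to the system temporary directory when TEMP is not set.
$ENV{'TEMP'} = File::Spec->tmpdir() unless defined($ENV{'TEMP'});

my(@files);

opendir(DIR, $main::TEMP) or die("Cannot open $main::TEMP: $!");
    @files = readdir(DIR);
closedir(DIR);

print("$main::TEMP\n");
foreach (@files) {
    print("\t$_\n") if m/tmp$/i;    # names that end in tmp, in either case
}
```

The scalar stays tied to the %ENV hash, so a change made to $ENV{'TEMP'} while the script runs is visible through $main::TEMP as well.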
This program displays:
C:\WINDOWS\TEMP
~DF182.TMP
~DF1B3.TMP
~DF8073.TMP
~DF8074.TMP
~WRS0003.tmp
~DF6116.TMP
~DFC2C2.TMP
~DF9145.TMP
This program is pretty self-explanatory, except
perhaps for the manner in which the $main::TEMP variable is specified.
The strict pragma requires all variables to be lexically declared or to
be fully qualified. The environment variables are declared in the Env
package, but exported into the main namespace. Therefore, they need to
be qualified using the main:: notation.
The require compiler directive is used to load Perl libraries that were distributed with Perl 4. Modules, however, are loaded with the use directive. In addition to loading the module, use will move variable and function names into the main namespace where your script can easily access them. The name movement is done by using the @EXPORT and @EXPORT_OK arrays.
Next, you read about the BEGIN and END blocks which are like module constructors and destructors. The BEGIN block is evaluated as soon as it is defined. END blocks are evaluated just before your program ends - in reverse order. The last END block defined is the first to be evaluated.
Symbol tables are used to hold the function and variable names for each package. You learned that each symbol table is stored in a hash named after the package name. For example, the symbol table for the Room package is stored in %Room::. Listing 15.3 contained a function - dispSymbols() - that displays all of the names in a given symbol table.
Libraries are loaded using the require compiler directive and modules are loaded with the use directive. Unlike the require directive, use will automatically call a module's import() function to move function and variable names from the module's namespace into the main namespace. The name movement is controlled using the @EXPORT and @EXPORT_OK arrays. Names in @EXPORT are always exported. Those in @EXPORT_OK must be explicitly mentioned in the use statement.
The use directive also controls other directives which are called pragmas. The most useful pragmas are integer and strict. Use the integer pragma when you need fast integer math. And use strict all of the time to enforce good programming habits - like using local variables.
Table 15.2 shows the 25 modules that are distributed with Perl. And then some more light was shed on how the my() function won't create variables that are local to a package. In order to create variables in the package's namespace, you need to fully qualify them with the package name. For example, $Math::PI or $Room::numChairs.
The last section of the chapter looked at specific examples of how to use modules. The Carp, English, and Env modules were discussed. Carp defines three functions: carp(), croak(), and confess() that aid in debugging and error handling. English provides aliases for all of Perl's special variables so that Perl code is easier to understand. And Env provides aliases for environment variables so that you can access them directly instead of through the %ENV hash variable.
In the next chapter, you learn about debugging Perl code. You read about syntax or compile-time errors versus run-time errors. And the strict pragma will be discussed in more detail.