A reference is a scalar value that points to a memory location that holds some type of data. Everything in your Perl program is stored inside your computer's memory. Therefore, all of your variables and functions are located at some memory location. References are used to hold the memory addresses. When a reference is dereferenced, you retrieve the information referred to by the reference.
Note |
Note: I briefly mentioned hashes in Chapter 3, "Variables." Just to refresh your memory, hashes are another name for associative arrays. Since "hash" is shorter than "associative array," I'll be using both terms in this chapter. |
Reference Assignment | How to Dereference |
---|---|
$refScalar = \$scalar; | ${$refScalar} is a scalar value. |
$refArray = \@array; | @{$refArray} is an array value. |
$refHash = \%hash; | %{$refHash} is a hash value. |
$refFunction = \&function; | &{$refFunction} is a function location. |
$refGlob = \*FILE; | $refGlob is a reference to a file handle and seems to be automatically dereferenced by Perl. |
$refRef = \$refScalar; | ${${$refRef}} is a scalar value. |
Essentially all you need to do in order to create a reference is to add the backslash to the front of a value or variable.
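The table above can be tried out with a short script. This is a minimal sketch rather than a listing from the book; the variable names follow the table, and the function() subroutine is a made-up example.

```perl
use strict;
use warnings;

my $scalar = 10;
my @array  = (1, 2, 3);
my %hash   = ("Name" => "Jane");
sub function { return "called with @_"; }    # hypothetical helper

# Create references by prefixing each variable with a backslash.
my $refScalar   = \$scalar;
my $refArray    = \@array;
my $refHash     = \%hash;
my $refFunction = \&function;
my $refRef      = \$refScalar;

# Dereference each one to get back to the original data.
print("Scalar:   ${$refScalar}\n");                 # 10
print("Array:    @{$refArray}\n");                  # 1 2 3
print("Hash:     ${$refHash}{'Name'}\n");           # Jane
print("Function: ", &{$refFunction}("hi"), "\n");   # called with hi
print("Ref-ref:  ${${$refRef}}\n");                 # 10

# A reference points at the original variable, not at a copy.
${$refScalar} = 20;
print("\$scalar is now $scalar\n");                 # 20
```

Because the reference points at the original storage, the assignment through $refScalar changes $scalar itself.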
Errata Note |
The printed version of this book did not mention
Glob references. They can be used to pass file handles to subroutines.
However, it seems like Glob references are not needed since Perl will let
you pass file handles directly (as well as indirectly) to subroutines. For
example:
open(FILE, '<test.dat');
$refGlob = \*FILE;
readLineFromFile($refGlob);   # this line and the next seem to work in the same way.
readLineFromFile(FILE);
sub readLineFromFile {
    my($temp) = shift;
    $line = <$temp>;
}
If you try this example - create a test.dat file first - you will see that the two calls to readLineFromFile() work correctly. |
Let's start off by passing two arrays into a function to show that the function only sees one array.
Pseudocode |
Call firstSub() with two arrays as parameters. Define the firstSub() function. Create local variables and assign elements from the parameter array to them. Print the local arrays. |
firstSub( (1..5), ("A".."E"));
sub firstSub {
my(@firstArray, @secondArray) = @_;
print("The first array is @firstArray.\n");
print("The second array is @secondArray.\n");
}
This program displays:
The first array is 1 2 3 4 5 A B C D E.
The second array is .
Inside the firstSub() function, the
@firstArray variable was assigned the entire parameter array, leaving
nothing for the @secondArray variable. By passing a reference to each
of the two arrays, we can preserve the arrays for use
inside the function. Very few changes are needed to enable the above example to
use references. Take a look.
Pseudocode |
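Call firstSub() using anonymous array constructors (square brackets) to pass a reference to each array. Define the firstSub() function. Create two local scalar variables to hold the array references. Print the local variables, dereferencing them to look like arrays. This is done using the @{} notation. |
# Square brackets build an anonymous array and return a reference to it.
# Note that \(1..5) would instead return a list of references to each element.
firstSub( [1..5], ["A".."E"] );                         # One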
sub firstSub {
my($ref_firstArray, $ref_secondArray) = @_; # Two
print("The first array is @{$ref_firstArray}.\n"); # Three
print("The second array is @{$ref_secondArray}.\n"); # Three
}
This program displays:
The first array is 1 2 3 4 5.
The second array is A B C D E.
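Three things were done to make this example use references:
1. On the line marked "One," a reference to each array is passed instead of the arrays themselves.
2. On the line marked "Two," the references are copied from the parameter array into local scalar variables.
3. On the lines marked "Three," the scalar variables are dereferenced with the @{} notation so that Perl treats them as arrays.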
Pseudocode |
Call firstSub() and pass a reference to a scalar and a reference to an array. Define the firstSub() function. Create two local scalar variables to hold the array references. Print the local variables, dereferencing them to look like arrays. |
Listing 8.1-08LST01.PL - Passing a Scalar Reference When the Function Expects an Array Reference Causes Problems |
|
This program displays:
Not an ARRAY reference at 08lst01.pl line 9.
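The body of Listing 8.1 is not reproduced above. The following is a minimal sketch of a program that fails in the same way, not the original listing. It assumes the firstSub() function from the earlier example and wraps the call in eval so that the fatal error can be captured and shown; the line number in the message will differ from the one printed above.

```perl
use strict;
use warnings;

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_;
    print("The first array is @{$ref_firstArray}.\n");
    print("The second array is @{$ref_secondArray}.\n");
}

# The first argument is a scalar reference, not an array reference.
# eval traps the fatal error so this sketch can report it.
my $ok    = eval { firstSub( \10, ["A".."E"] ); 1 };
my $error = $ok ? "" : $@;
print("Caught: $error");
```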
Perl provides the
ref() function so that you can check the reference type before
dereferencing a reference. The next example shows how to trap the mistake of
passing a scalar reference instead of an array reference.
Pseudocode |
Call firstSub() and pass a reference to each variable. Define the firstSub() function. Create two local scalar variables to hold the array references. Print the local variables if each variable is a reference to an array. Otherwise, print nothing. |
Listing 8.2-08LST02.PL - How to Test for an Array Reference Passed as a Parameter |
|
This program displays:
The second array is A B C D E.
Only the second parameter is
printed because the first parameter - the scalar reference - failed the test on
the line marked "One." The statement modifiers on the lines marked "One" and
"Two" ensure that we are dereferencing an array reference. This prevents the
error message that appeared earlier. Of course, in your own programs you might
want to set an error flag or print a warning.
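The body of Listing 8.2 is not reproduced above. The sketch below shows one way the check could look; it is an assumption about the shape of the listing, not a copy of it. The lines to print are collected in a hypothetical @printed array so that the result can also be inspected by the caller.

```perl
use strict;
use warnings;

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_;
    my @printed;

    # Each string is built only if ref() confirms an array reference.
    push(@printed, "The first array is @{$ref_firstArray}.")
        if (ref($ref_firstArray) eq "ARRAY");      # One
    push(@printed, "The second array is @{$ref_secondArray}.")
        if (ref($ref_secondArray) eq "ARRAY");     # Two

    print("$_\n") foreach @printed;
    return @printed;
}

# A scalar reference followed by an array reference.
my @printed = firstSub( \10, ["A".."E"] );
```

Because the statement modifier is evaluated first, the dereference on the first line never runs for the scalar reference, so no error is raised.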
Table 8.2 shows the different values that the ref() function can return.
Function Call | Return Value |
---|---|
ref( 10 ); | empty string (a false value) |
ref( \10 ); | SCALAR |
ref( \(1, 2) ); | SCALAR |
ref( [1, 2] ); | ARRAY |
ref( {1 => "Joe"} ); | HASH |
ref( \&firstSub ); | CODE |
ref( \\10 ); | REF |
Listing 8.3 shows another example of the ref() function in action.
Pseudocode |
Initialize scalar, array, and hash variables. Pass the variables to the printRef() function. These are non-references so ref() should return a false value for each of them. Pass variable references to the printRef() function. This is accomplished by prefixing the variable names with a backslash. Pass a function reference and a reference to a reference to the printRef() function. Define the printRef() function. Iterate over the parameter array. Assign the reference type to $refType. If the current parameter is a reference then print its reference type; otherwise print that it's a non-reference. |
Listing 8.3-08LST03.PL - Using the ref() Function to Determine the Reference Type of a Parameter |
|
This program displays:
Non-reference Non-reference Non-reference Non-reference Non-reference
SCALAR ARRAY HASH
CODE REF
By using the ref() function you can protect program
code that dereferences variables from producing errors when the wrong type of
reference is used. Notice that five 'Non-reference' strings are displayed. Why?
Because both @array and %hash are 'flattened' when they are moved into the
@_ array, so each of their elements arrives as a separate non-reference parameter.
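The body of Listing 8.3 is not reproduced above. As a rough sketch of the idea (the variable contents and the refTypes() helper are assumptions, chosen so that the flattened call yields five values), the program might look like this:

```perl
use strict;
use warnings;

my $scalar = 10;
my @array  = (1, 2);
my %hash   = ("1" => "Joe");

# Return one label per parameter: the reference type, or
# "Non-reference" when ref() returns a false value.
sub refTypes {
    return map { ref($_) || "Non-reference" } @_;
}

sub printRef {
    print(join(" ", refTypes(@_)), "\n");
}

printRef($scalar, @array, %hash);       # 1 + 2 + 2 flattened values
printRef(\$scalar, \@array, \%hash);    # SCALAR ARRAY HASH
printRef(\&printRef, \\$scalar);        # CODE REF
```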
%record = ( "Name" => "Jane Hathaway",
"Address" => "123 Anylane Rd.",
"Town" => "AnyTown",
"State" => "AnyState",
"Zip" => "12345-1234"
);
The %record associative array can also be considered a
data record with five members. Each member is a single item of
information. The data record is a group of members that relate to a single
topic. In this case, that topic is a customer address. And a database is
one or more data records.
Each member is accessed in the record by using its name as the key. For example, you can access the state member by saying $record{"State"}. In a similar manner all of the members can be accessed.
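As a small illustration of member access, the sketch below reads one member by key and then walks all of them. The loop and the sorted key order are additions for demonstration only.

```perl
use strict;
use warnings;

my %record = ( "Name"    => "Jane Hathaway",
               "Address" => "123 Anylane Rd.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
             );

# Access a single member by using its name as the key.
print("State: $record{'State'}\n");

# Visit every member; keys are sorted only to make the order predictable.
foreach my $member (sort(keys(%record))) {
    print("$member => $record{$member}\n");
}
```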
Of course, a database with only one record is not very useful. By using references, you can build a multiple record array. Listing 8.4 shows two records and how to initialize a database array.
Pseudocode |
Declare a data record called %recordOne as an associative array. Declare a data record called %recordTwo as an associative array. Declare an array called @database with references to the associative arrays as elements. |
Listing 8.4-08LST04.PL - A Database with Two Records |
|
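You can print the address member of the first record like this:
print( ${$database[0]}{"Address"} . "\n");
which displays:
123 Anylane Rd.
Let's dissect the dereferencing expression in this
print statement. Work from the inside out and always evaluate brackets first.
Ignoring the print() function and the newline, you
can evaluate this line of code in the following way:
1. The innermost expression is $database[0], which selects the first element of the @database array. That element is a reference to a hash.
2. The curly braces around $database[0] tell Perl to dereference that reference.
3. The leading $ and the trailing {"Address"} subscript are the same notation you would use to fetch one member of a named hash, as in $record{"Address"}. Here the braces containing the reference take the place of the hash name.
Note |
The ${$database[0]}{"Address"} notation is pretty cumbersome, although it lets you see exactly what is happening in terms of referencing and dereferencing. You can also let Perl handle the details by using the arrow notation: $database[0]->{"Address"}. Older versions of Perl also accepted %{$database[0]}->{"Address"}, but that form was deprecated and is a compile-time error in Perl 5.22 and later. |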
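The body of Listing 8.4 is not reproduced above. A minimal sketch of a two-record database could look like the following. The first record reuses the %record data shown earlier; the second record's name comes from the output of the later listings, and its address is a placeholder because the original value is not shown here.

```perl
use strict;
use warnings;

my %recordOne = ( "Name"    => "Jane Hathaway",
                  "Address" => "123 Anylane Rd.",
                  "Town"    => "AnyTown",
                  "State"   => "AnyState",
                  "Zip"     => "12345-1234"
                );

my %recordTwo = ( "Name"    => "Kevin Hughes",
                  "Address" => "(address not shown)"    # placeholder value
                );

# Each element of @database is a reference to one record.
my @database = ( \%recordOne, \%recordTwo );

print( ${$database[0]}{"Address"} . "\n");    # explicit dereference
print( $database[0]->{"Address"} . "\n");     # arrow notation, same result
```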
The variable declaration in the above example uses three variables to define the data's structure. We can condense the declaration down to one variable as shown in Listing 8.5.
Pseudocode |
Declare an array called @database with two associative arrays as elements. Because the associative arrays are not being assigned directly to a variable, they are considered anonymous. Print the value associated with the "Name" key for the first element of the @database array. Print the value associated with the "Name" key for the second element of the @database array. |
Listing 8.5-08LST05.PL - Declaring the Database Structure in One Shot |
|
This program displays:
Jane Hathaway
Kevin Hughes
Let's analyze the dereferencing code in the first print
line.
Even though the structure declarations in the last two examples look different, they are equivalent. You can confirm this because the structures are dereferenced the same way. What's happening here? Perl is creating anonymous associative array references that become elements of the @database array.
In the previous example, each hash had a name - %recordOne and %recordTwo. In the current example, there is no variable name directly associated with the hashes. If you use an anonymous variable in your programs, Perl will automatically provide a reference to it.
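The body of Listing 8.5 is not reproduced above. The sketch below shows what a one-shot declaration with anonymous hashes might look like; the records are abbreviated and the second address is a placeholder.

```perl
use strict;
use warnings;

# Curly braces in a list build an anonymous hash and return a
# reference to it, so no %recordOne or %recordTwo is needed.
my @database = (
    { "Name" => "Jane Hathaway", "Address" => "123 Anylane Rd." },
    { "Name" => "Kevin Hughes",  "Address" => "(address not shown)" },
);

print( ${$database[0]}{"Name"} . "\n");
print( ${$database[1]}{"Name"} . "\n");

# Each element really is a hash reference.
print( ref($database[0]) . "\n");    # HASH
```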
We can explore the concepts of data records a bit further using this basic example. So far, we've used hash references as elements of an array. When one data type is stored inside of another data type, this is called nesting data types. You can nest data types as often and as deeply as you'd like.
At this stage of the example, ${$database[0]}{"Name"} was used to dereference the "Name" member of the first record. This type of dereferencing uses an array subscript to tell Perl which record to look at. However, you could use an associative array to hold the records. With an associative array, you could look at the records using a customer number or other id value. Listing 8.6 shows how this can be done.
Pseudocode |
Declare a hash called %database with two keys, MRD-100 and MRD-250. Each key has a reference to an anonymous hash as its value. Find the reference to the hash associated with the key "MRD-100." Then print the value associated with the key "Name" inside the first hash. Find the reference to the hash associated with the key "MRD-250." Then print the value associated with the key "Name" inside the second hash. |
Listing 8.6-08LST06.PL - Using an Associative Array to Hold the Records |
|
This program displays:
Jane Hathaway
Kevin Hughes
You should be able to follow the same steps that we used
previously to decipher the print statement in this listing. The key is that the
associative array index is surrounded by the curly brackets instead of the
square brackets used previously.
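The body of Listing 8.6 is not reproduced above. A sketch of the same idea, with abbreviated records and the customer numbers named in the pseudocode, might be:

```perl
use strict;
use warnings;

# The outer hash is keyed by customer number; each value is a
# reference to an anonymous hash holding that customer's record.
my %database = (
    "MRD-100" => { "Name" => "Jane Hathaway", "Zip" => "12345-1234" },
    "MRD-250" => { "Name" => "Kevin Hughes" },
);

# Curly braces select the record by key instead of by position.
print( ${$database{"MRD-100"}}{"Name"} . "\n");
print( ${$database{"MRD-250"}}{"Name"} . "\n");
```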
There is one more twist that I'd like to show you using this data structure. Let's see how to dynamically add information. First, we'll look at adding an entire data record and then we'll look at adding new members to an existing data record. Listing 8.7 shows how you can use a standard hash assignment to dynamically create a data record.
Pseudocode |
Assign a reference to a hash to the "MRD-300" key in the %database associative array. Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable. Print the value associated with the key "Name" inside the hash referenced by $refCustomer. Print the value associated with the key "Address" inside the hash referenced by $refCustomer. |
Listing 8.7-08LST07.PL - Creating a Record Using Hash Assignment |
|
This program displays:
Nathan Hale
999 Centennial Ave.
Notice that by using a temporary variable
($refCustomer) the program code is more readable. The alternative would
be this:
print(${$database{"MRD-300"}}{"Name"} . "\n");
Most
programmers would agree that using the temporary variable aids in the
understanding of the program.
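The body of Listing 8.7 is not reproduced above. The following sketch follows the pseudocode and uses the record values that appear in the output of Listing 8.8; it should be read as an approximation of the listing rather than a copy.

```perl
use strict;
use warnings;

my %database;

# A plain hash assignment creates the whole record at once.
$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

# The temporary variable holds the reference to the new record.
my $refCustomer = $database{"MRD-300"};

print( $refCustomer->{"Name"} . "\n");
print( $refCustomer->{"Address"} . "\n");
```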
Our last data structure example shows how to add members to an existing customer record. Listing 8.8 shows how to add two phone number members to customer record MRD-300.
Pseudocode |
Assign a reference to an anonymous function to $codeRef. This function will print the elements of the %database hash. Since each value in the %database hash is a reference to another hash, the function has an inner loop to dereference the sub-hash. Assign a reference to a hash to the "MRD-300" key in the %database associative array. Call the anonymous routine by dereferencing $codeRef to print the contents of %database. This is done by surrounding the code reference variable with curly braces and prefixing it with a & to indicate that it should be dereferenced as a function. Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable. Add "Home Phone" as a key to the hash associated with the "MRD-300" key. Add "Business Phone" as a key to the hash associated with the "MRD-300" key. Call the anonymous routine by dereferencing $codeRef to print the contents of %database. |
Listing 8.8-08LST08.PL - How to Dynamically Add Members to a Data Structure |
&{$codeRef};
$refCustomer = $database{"MRD-300"};
$refCustomer->{"Home Phone"} = "(111) 511-1322";
$refCustomer->{"Business Phone"} = "(111) 513-4556";
# print database after dynamic changes.
&{$codeRef}; |
This program displays:
MRD-300 = {
Town => AnyTown
State => AnyState
Name => Nathan Hale
Zip => 12345-1234
Address => 999 Centennial Ave.
};
MRD-300 = {
Town => AnyTown
State => AnyState
Name => Nathan Hale
Home Phone => (111) 511-1322
Zip => 12345-1234
Business Phone => (111) 513-4556
Address => 999 Centennial Ave.
};
This example does two new things. The first thing is that it uses an
anonymous function referenced by $codeRef. This is done for
illustration purposes. There is no reason to use an anonymous function. There
are actually good reasons for you not to do so in normal programs. I think that
anonymous functions make programs much harder to understand.
Note |
When helper functions are small and easily understood, I like to place them at the beginning of code files. This helps me to quickly refresh my memory when coming back to view program code after time spent doing other things. |
The second thing is that a regular hash assignment statement was used to add values. You can use any of the array functions with these nested data structures.
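Only the tail of Listing 8.8 survives above, so here is a self-contained sketch of the whole idea. The anonymous function is an assumption about what the missing part did: it builds the report as a string and sorts the keys, so the member order will differ from the output shown earlier.

```perl
use strict;
use warnings;

my %database;

# Anonymous function that formats every record in %database.
# The inner loop dereferences each sub-hash.
my $codeRef = sub {
    my $report = "";
    foreach my $id (sort(keys(%database))) {
        $report .= "$id = {\n";
        foreach my $member (sort(keys(%{$database{$id}}))) {
            $report .= "  $member => $database{$id}->{$member}\n";
        }
        $report .= "};\n";
    }
    return $report;
};

$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

print(&{$codeRef}());    # before the changes

# Regular hash assignments add new members to the existing record.
my $refCustomer = $database{"MRD-300"};
$refCustomer->{"Home Phone"}     = "(111) 511-1322";
$refCustomer->{"Business Phone"} = "(111) 513-4556";

print(&{$codeRef}());    # after the changes
```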
Pseudocode |
Call the makeLine() function from inside a double-quoted string. Define the makeLine() function. Return the dash character repeated a specified number of times. The first element in the parameter array is the number of times to repeat the dash. |
print("Here are 5 dashes ${\makeLine(5)}.\n");
print("Here are 10 dashes ${\makeLine(10)}.\n");
sub makeLine {
return("-" x $_[0]);
}
This program displays:
Here are 5 dashes -----.
Here are 10 dashes ----------.
The trick in this example is that the
backslash turns the scalar return value into a reference and then the dollar
sign and curly braces turn the reference back into a scalar value that the
print() function can interpret correctly. If the backslash character is
not used to create the reference to the scalar return value then the
${} dereferencing operation does not have a reference to dereference
and you will get an "uninitialized value" warning (or, under use strict, a fatal error) instead of the function's output.
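The same trick can be checked in a small script. The sketch below also shows the related @{[ ]} form, which is not used in this chapter but is a common Perl idiom that wraps the return values in an anonymous array and dereferences it in place.

```perl
use strict;
use warnings;

sub makeLine {
    return("-" x $_[0]);
}

# Scalar form: take a reference to the return value, then dereference it.
my $viaScalarRef = "Here are 5 dashes ${\makeLine(5)}.";

# Array form: build an anonymous array, then dereference it.
my $viaArrayRef  = "Here are 5 dashes @{[ makeLine(5) ]}.";

print("$viaScalarRef\n");
print("$viaArrayRef\n");
```

Both lines print the same text; which form to use is largely a matter of taste.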
There are five types of references that you can use in Perl. You can have a reference to scalars, arrays, hashes, functions, and other references. If you need to determine what type of reference is passed to a function, use the ref() function.
The ref() function returns a string that indicates which type of reference was passed to it. If the parameter was not a reference, a false value is returned. You discovered that it is always a good idea to check reference types to prevent errors caused by passing the wrong type of reference. An example was given that caused an error by passing a scalar reference when the function expected an array reference.
NOTE: The return value of ref() for a non-reference appeared to be platform dependent when this was written. On Windows 95, ref() on a non-reference returned an undefined value, whereas on Linux it returned a defined empty string. This difference may simply have been a bug on one of the platforms. Perl's documentation specifies the empty string, so the safest approach is to test the return value of ref() for truth (or compare it with a type name such as "ARRAY") rather than testing whether it is defined.
A lot of time was spent discussing data records and how to access information stored in them. You learned how to step through dissecting a dereferencing expression; how to dynamically add new data records to an associative array; and how to add new data members to an existing record.
The last thing covered in this chapter was how to interpolate function calls inside double-quoted strings. You'll use this technique - at times - to avoid using temporary variables when printing or concatenating the output of functions to other strings.
Chapter 9, "Using Files," introduces you to opening, reading, and writing files. You'll find out how to store the data records you've constructed in this chapter in a file for long-term storage.
print("${\ref(\(1..5))}");
print(${$database{"MRD-100"}}{"Zip"} . "\n");
$ref = \\\45;
"Best days to call" => ["Monday", "Thursday" ]