A reference is a scalar value that points to a memory location that holds some type of data. Everything in your Perl program is stored inside your computer's memory. Therefore, all of your variables and functions are located at some memory location. References are used to hold the memory addresses. When a reference is dereferenced, you retrieve the information referred to by the reference.

Reference Types

There are six types of references. A reference can point to a scalar, an array, a hash, a function, a glob, or another reference. Table 8.1 shows how each type is created with the assignment operator and how to dereference it using curly braces.

Note

I briefly mentioned hashes in Chapter 3, "Variables." Just to refresh your memory, hash is another name for an associative array. Although "hash" is shorter than "associative array," I'll be using both terms in this chapter.

Table 8.1 - The Six Types of References
Reference Assignment	How to Dereference
$refScalar = \$scalar;	${$refScalar} is a scalar value.
$refArray = \@array;	@{$refArray} is an array value.
$refHash = \%hash;	%{$refHash} is a hash value.
$refFunction = \&function;	&{$refFunction}() calls the function.
$refGlob = \*FILE;	*{$refGlob} is a glob. $refGlob can also be used directly wherever a file handle is expected, as in <$refGlob>.
$refRef = \$refScalar;	${${$refRef}} is a scalar value.

Essentially all you need to do in order to create a reference is to add the backslash to the front of a value or variable.
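The assignments and dereferences in Table 8.1 can be tried out with a short program. This is only a sketch: the names $count, @colors, %ages, and sayHello() are made up for illustration and do not appear elsewhere in this chapter.

```perl
use strict;
use warnings;

# Hypothetical sample data, used only for this illustration.
my $count  = 10;
my @colors = ("red", "green", "blue");
my %ages   = ("Jane" => 34);
sub sayHello { return "Hello, $_[0]!"; }

# The backslash creates a reference to each kind of value.
my $refScalar   = \$count;
my $refArray    = \@colors;
my $refHash     = \%ages;
my $refFunction = \&sayHello;
my $refRef      = \$refScalar;

# A type character plus curly braces dereferences each one.
print("Scalar:   ${$refScalar}\n");                     # 10
print("Array:    @{$refArray}\n");                      # red green blue
print("Hash:     ${$refHash}{'Jane'}\n");               # 34
print("Function: " . &{$refFunction}("Jane") . "\n");   # Hello, Jane!
print("Ref:      ${${$refRef}}\n");                     # 10
```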

Errata Note

The printed version of this book did not mention Glob references. They can be used to pass file handles to subroutines. However, it seems like Glob references are not needed since Perl will let you pass file handles directly (as well as indirectly) to subroutines. For example:

open(FILE, '<test.dat');
$refGlob = \*FILE;
readLineFromFile($refGlob);  # this line and the next seem to work in the same way.
readLineFromFile(FILE);

sub readLineFromFile {
    my($temp) = shift;
    $line = <$temp>;
}

If you try this example - create a test.dat file first - you will see that the two calls to readLineFromFile() work correctly.

Example: Passing Parameters to Functions

Back in Chapter 5, "Functions," we talked about passing parameters to functions. At the time, we were not able to pass more than one array to a function. This was because functions only see one array (the @_ array) when looking for parameters. References can be used to overcome this limitation.

Let's start off by passing two arrays into a function to show that the function only sees one array.

Pseudocode

Call firstSub() with two arrays as parameters.
Define the firstSub() function.
Create local variables and assign elements from the parameter array to them.
Print the local arrays.

firstSub( (1..5), ("A".."E"));

sub firstSub {
    my(@firstArray, @secondArray) = @_;

    print("The first array is  @firstArray.\n");
    print("The second array is @secondArray.\n");
}

This program displays:

The first array is  1 2 3 4 5 A B C D E.
The second array is .

Inside the firstSub() function, the @firstArray variable was assigned the entire parameter array, leaving nothing for the @secondArray variable. By passing references to the @arrayOne and @arrayTwo, we can preserve the arrays for use inside the function. Very few changes are needed to enable the above example to use references. Take a look.

Pseudocode

Call firstSub() using the backslash operator to pass a reference to each array.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables, dereferencing them to look like arrays. This is done using the @{} notation.

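# A backslash in front of a parenthesized list such as \(1..5) yields a
# reference to each element, so the arrays are given names first.
@arrayOne = (1..5);
@arrayTwo = ("A".."E");

firstSub( \@arrayOne, \@arrayTwo );                       # One

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_;           # Two

    print("The first array is  @{$ref_firstArray}.\n");   # Three
    print("The second array is @{$ref_secondArray}.\n");  # Three
}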

This program displays:

The first array is  1 2 3 4 5.
The second array is A B C D E.

Three things were done to make this example use references:

In the line marked "One," backslashes were added to indicate that a reference to the array should be passed.
In the line marked "Two," the references were taken from the parameter array and assigned to scalar variables.
In the lines marked "Three," the scalar values were dereferenced. Dereferencing means that Perl will use the reference as if it were a normal data type - in this case, an array variable.
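Because the function receives references rather than copies, it can also change the caller's arrays. The following is a minimal sketch of that idea; appendItems() and @letters are hypothetical names introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: append items to the array that $refArray
# points to and return the new number of elements.
sub appendItems {
    my($refArray, @items) = @_;

    push(@{$refArray}, @items);
    return scalar(@{$refArray});
}

my @letters = ("A".."C");
my $count   = appendItems(\@letters, "D", "E");

# The caller's array was changed through the reference.
print("The array now holds $count elements: @letters.\n");
```

This displays "The array now holds 5 elements: A B C D E." because push() worked on the original array, not on a copy of it.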

Example: The ref() Function

Using references to pass arrays into a function worked well and it was easy, wasn't it? However, what happens if you pass a scalar reference to the firstSub() function instead of an array reference?

Pseudocode

Call firstSub() and pass a reference to a scalar and a reference to an array.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables, dereferencing them to look like arrays.

Listing 8.1-08LST01.PL - Passing a Scalar Reference When the Function Expects an Array Reference Causes Problems

firstSub( \10, ["A".."E"] );   # square brackets create an array reference

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_;

    print("The first array is  @{$ref_firstArray}.\n");
    print("The second array is @{$ref_secondArray}.\n");
}

This program displays:

Not an ARRAY reference at 08lst01.pl line 6.

Perl provides the ref() function so that you can check the reference type before dereferencing a reference. The next example shows how to trap the mistake of passing a scalar reference instead of an array reference.

Pseudocode

Call firstSub() and pass a reference to each variable.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables if each variable is a reference to an array. Otherwise, print nothing.

Listing 8.2-08LST02.PL - How to Test for an Array Reference Passed as a Parameter

firstSub( \10, ["A".."E"] );   # square brackets create an array reference

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_;

    print("The first array is  @{$ref_firstArray}.\n")
        if (ref($ref_firstArray) eq "ARRAY");             # One

    print("The second array is @{$ref_secondArray}.\n")
        if (ref($ref_secondArray) eq "ARRAY");            # Two
}

This program displays:

The second array is A B C D E.

Only the second parameter is printed because the first parameter - the scalar reference - failed the test on the line marked "One." The statement modifiers on the lines marked "One" and "Two" ensure that we are dereferencing an array reference. This prevents the error message that appeared earlier. Of course, in your own programs you might want to set an error flag or print a warning.
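One way to report the problem instead of silently printing nothing is to warn and return the undefined value. The sketch below assumes a hypothetical function named countElements(); it is not one of the chapter's listings.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number of elements in the
# referenced array, or undef (after printing a warning) if the
# argument is not an array reference.
sub countElements {
    my($refArray) = @_;

    if (ref($refArray) ne "ARRAY") {
        warn("countElements() expected an array reference.\n");
        return;                      # undef in scalar context
    }
    return scalar(@{$refArray});
}

my @letters = ("A".."E");
my $good    = countElements(\@letters);
my $bad     = countElements(\10);     # a scalar reference by mistake

print("Good call returned $good.\n");
print("Bad call returned ", defined($bad) ? $bad : "undef", ".\n");
```

The caller can then test the return value with defined() and decide whether to stop or carry on.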

Table 8.2 shows the different values that the ref() function can return.

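Table 8.2 - Using the ref() Function
Function Call	Return Value
ref( 10 );	empty string (not a reference)
ref( \10 );	SCALAR
ref( \(1, 2) );	SCALAR
ref( [1, 2] );	ARRAY
ref( {1 => "Joe"} );	HASH
ref( \&firstSub );	CODE
ref( \\10 );	REF

Errata Note

The printed version of this book indicated that ref( \(1, 2) ); would return ARRAY, which is erroneous. A backslash in front of a parenthesized list produces a reference to each element of the list, not a reference to an array, and in a scalar context only the last of those references is kept. Therefore, $c = \(1, 3); results in $c being a reference to a scalar that has a value of 3, and $c = \(1..3); behaves the same way. To get a reference to an array with three elements (1, 2, and 3), use $c = [1..3]; or take a reference to a named array with $c = \@array;. Likewise, curly braces already create a reference to an anonymous hash, so ref( \{1 => "Joe"} ); returns REF rather than HASH.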
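The return values of ref() can be checked with a short program. This is a sketch only; the @pair and @checks variables and the descriptions are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @pair = (1, 2);   # hypothetical sample array

# Pair each description with what ref() returns for that argument.
my @checks = (
    ["non-reference",            ref(10)],
    ["reference to a scalar",    ref(\10)],
    ["backslash before a list",  ref(\(1, 2))],
    ["reference to named array", ref(\@pair)],
    ["anonymous array",          ref([1, 2])],
    ["anonymous hash",           ref({1 => "Joe"})],
    ["reference to a reference", ref(\\10)],
);

foreach my $check (@checks) {
    my($label, $type) = @{$check};

    # ref() returns an empty string for a non-reference.
    $type = "(not a reference)" if ($type eq "");
    print("$label: $type\n");
}
```

Note that the backslash in front of the list (1, 2) reports SCALAR, while the square brackets and the named array both report ARRAY.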

Listing 8.3 shows another example of the ref() function in action.

Pseudocode

Initialize scalar, array, and hash variables.
Pass the variables to the printRef() function. These are non-references so ref() should return a false value.
Pass variable references to the printRef() function. This is accomplished by prefixing the variable names with a backslash.
Pass a function reference and a reference to a reference to the printRef() function.
Define the printRef() function.
Iterate over the parameter array.
Assign the reference type to $refType.
If the current parameter is a reference then print its reference type, otherwise print that it's a non-reference.

Listing 8.3-08LST03.PL - Using the ref() Function to Determine the Reference Type of a Parameter

$scalar = 10;
@array  = (1, 2);
%hash   = ( "1" => "Davy Jones" );

# I added extra spaces around the parameter list
# so that the backslashes are easier to see.
printRef(  $scalar,  @array,  %hash );
printRef( \$scalar, \@array, \%hash );
printRef( \&printRef, \\$scalar );

# print the reference type of every parameter.
sub printRef {
    foreach (@_) {
        $refType = ref($_);
        if (defined($refType) && $refType ne '') {
            print "$refType ";
        }
        else {
            print "Non-reference ";
        }
    }
    print("\n");
}

This program displays:

Non-reference Non-reference Non-reference Non-reference Non-reference
SCALAR ARRAY HASH
CODE REF

By using the ref() function you can protect program code that dereferences variables from producing errors when the wrong type of reference is used. Notice that five 'Non-reference' strings are displayed. Why? Because both @array and %hash are 'flattened' when they are moved into the @_ array: the first call passes one scalar, two array elements, and one key and one value from the hash.
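The flattening can be confirmed by counting the parameters that arrive in @_. This is a minimal sketch; countParams() is a hypothetical helper and is not part of Listing 8.3.

```perl
use strict;
use warnings;

# Hypothetical helper: return how many parameters arrived in @_.
sub countParams {
    return scalar(@_);
}

my $scalar = 10;
my @array  = (1, 2);
my %hash   = ("1" => "Davy Jones");

# 1 scalar + 2 array elements + 1 key and 1 value = 5 parameters.
my $flattened  = countParams($scalar, @array, %hash);

# One reference per variable = 3 parameters.
my $referenced = countParams(\$scalar, \@array, \%hash);

print("Without references: $flattened parameters.\n");
print("With references:    $referenced parameters.\n");
```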

Example: Creating a Data Record

Perl's associative arrays (hashes) are extremely useful when it comes to storing information in a way that facilitates easy retrieval. For example, you could store customer information like this:

%record = ( "Name"    => "Jane Hathaway",
            "Address" => "123 Anylane Rd.",
            "Town"    => "AnyTown",
            "State"   => "AnyState",
            "Zip"     => "12345-1234"
);

The %record associative array can also be considered a data record with five members. Each member is a single item of information. The data record is a group of members that relate to a single topic. In this case, that topic is a customer address. And a database is one or more data records.

Each member is accessed in the record by using its name as the key. For example, you can access the state member by saying $record{"State"}. In a similar manner all of the members can be accessed.
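As a small sketch of accessing every member in turn, the following loop visits the keys of the %record hash shown above. Sorting the keys is a choice made here for readable output and is not something the original example requires.

```perl
use strict;
use warnings;

my %record = ( "Name"    => "Jane Hathaway",
               "Address" => "123 Anylane Rd.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
);

# A hash has no built-in order of its own, so visit the
# members in alphabetical order of their names.
foreach my $member (sort(keys(%record))) {
    print("$member: $record{$member}\n");
}
```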

Of course, a database with only one record is not very useful. By using references, you can build a multiple record array. Listing 8.4 shows two records and how to initialize a database array.

Pseudocode

Declare a data record called %recordOne as an associative array.
Declare a data record called %recordTwo as an associative array.
Declare an array called @database with references to the associative arrays as elements.

Listing 8.4-08LST04.PL - A Database with Two Records

%recordOne = ( "Name"    => "Jane Hathaway",
               "Address" => "123 Anylane Rd.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
);

%recordTwo = ( "Name"    => "Kevin Hughes",
               "Address" => "123 Allways Dr.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
);

@database = ( \%recordOne, \%recordTwo );

You can print the address member of the first record like this:

print( $database[0]->{"Address"} . "\n");

which displays:

123 Anylane Rd.

Let's dissect the dereferencing expression in this print statement. Work from left to right. Ignoring the print() function and the newline, you can evaluate this line of code in the following way.

The expression starts with $database[0]. The [0] subscript means that we'll be looking at the first element of an array, and the name to its left tells us that the array is database.
The value stored in that element is a scalar. In this case it is the reference to %recordOne.
The -> is the infix dereference operator. It tells Perl to dereference the thing on the left (the reference stored in the database array in this case) and to apply the subscript on the right to the result.
The curly braces on the right tell Perl that the reference points to an associative array; square brackets would indicate an array.
The 'thing' inside the curly braces is the key value or "Address." Notice that it is written exactly as if a regular hash key were being used.

Note

The arrow notation lets Perl handle the details of dereferencing. If you want to see exactly what is happening, you can write ${$database[0]}{"Address"} instead: the curly braces around $database[0] dereference it, and the leading $ says that a single scalar member is being fetched. You can also drop the arrow between two subscripts and write $database[0]{"Address"}. The printed version of this book used %{$database[0]}->{"Address"}. Perl has long deprecated using a hash as a reference in this way, and since version 5.22 it is a fatal error.
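The following sketch compares three ways of writing the same lookup. It uses a shortened copy of the first record, and the variable names $explicit, $arrow, and $implied are made up for the illustration.

```perl
use strict;
use warnings;

# A shortened copy of the first record, for illustration only.
my %recordOne = ( "Name"    => "Jane Hathaway",
                  "Address" => "123 Anylane Rd." );
my @database  = ( \%recordOne );

my $explicit = ${$database[0]}{"Address"};   # dereference with ${...}
my $arrow    = $database[0]->{"Address"};    # infix arrow operator
my $implied  = $database[0]{"Address"};      # arrow omitted between subscripts

# All three expressions fetch the same member.
print("$explicit\n$arrow\n$implied\n");
```

Each line of output is 123 Anylane Rd., so which form to use is a matter of readability.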

The variable declaration in the above example uses three variables to define the data's structure. We can condense the declaration down to one variable as shown in Listing 8.5.

Pseudocode

Declare an array called @database with two associative arrays as elements. Because the associative arrays are not being assigned directly to a variable, they are considered anonymous.
Print the value associated with the "Name" key for the first element of the @database array.
Print the value associated with the "Name" key for the second element of the @database array.

Listing 8.5-08LST05.PL - Declaring the Database Structure in One Shot

@database = (
    { "Name"    => "Jane Hathaway",
      "Address" => "123 Anylane Rd.",
      "Town"    => "AnyTown",
      "State"   => "AnyState",
      "Zip"     => "12345-1234"
    },
    { "Name"    => "Kevin Hughes",
      "Address" => "123 Allways Dr.",
      "Town"    => "AnyTown",
      "State"   => "AnyState",
      "Zip"     => "12345-1234"
    }
);

print($database[0]->{"Name"} . "\n");
print($database[1]->{"Name"} . "\n");

This program displays:

Jane Hathaway
Kevin Hughes

Let's analyze the dereferencing code in the first print line.

The [0] subscript means that we'll be looking at the first element of an array. Looking left, the name of the array is database.
That element holds a reference to the first anonymous associative array.
The -> is the infix dereference operator. It tells Perl to dereference the thing on the left (the reference stored in the database array in this case) and to apply the subscript on the right to the result.
The curly braces on the right show that the reference points to an associative array. The 'thing' inside them is the key value or "Name". Notice that it is written exactly as if a regular hash key were being used.

Even though the structure declarations in the last two examples look different, they are equivalent. You can confirm this because the structures are dereferenced the same way. What's happening here? Perl is creating anonymous associative array references that become elements of the @database array.

In the previous example, each hash had a name - %recordOne and %recordTwo. In the current example, there is no variable name directly associated with the hashes. If you use an anonymous variable in your programs, Perl will automatically provide a reference to it.

We can explore the concepts of data records a bit further using this basic example. So far, we've used hash references as elements of an array. When one data type is stored inside of another data type, this is called nesting data types. You can nest data types as often and as deeply as you'd like.

At this stage of the example, an expression such as $database[0]->{"Name"} dereferences the "Name" member of the first record. This type of dereferencing uses an array subscript to tell Perl which record to look at. However, you could use an associative array to hold the records. With an associative array, you could look at the records using a customer number or other id value. Listing 8.6 shows how this can be done.

Pseudocode

Declare a hash called %database with two keys, MRD-100 and MRD-250. Each key has a reference to an anonymous hash as its value.
Find the reference to the hash associated with the key "MRD-100." Then print the value associated with the key "Name" inside the first hash.
Find the reference to the hash associated with the key "MRD-250." Then print the value associated with the key "Name" inside the second hash.

Listing 8.6-08LST06.PL - Using an Associative Array to Hold the Records

%database = (
    "MRD-100" => { "Name"    => "Jane Hathaway",
                   "Address" => "123 Anylane Rd.",
                   "Town"    => "AnyTown",
                   "State"   => "AnyState",
                   "Zip"     => "12345-1234"
                 },
    "MRD-250" => { "Name"    => "Kevin Hughes",
                   "Address" => "123 Allways Dr.",
                   "Town"    => "AnyTown",
                   "State"   => "AnyState",
                   "Zip"     => "12345-1234"
                 }
);

# The older %{...}->{...} form is a fatal error in Perl 5.22 and later.
print($database{"MRD-100"}->{"Name"} . "\n");
print($database{"MRD-250"}->{"Name"} . "\n");

This program displays:

Jane Hathaway
Kevin Hughes

You should be able to follow the same steps that we used previously to decipher the print statement in this listing. The key is that the associative array index is surrounded by the curly brackets instead of the square brackets used previously.
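Keying the records by customer id also makes it easy to visit every record. The sketch below uses shortened records, and listCustomers() is a hypothetical helper that is not part of Listing 8.6.

```perl
use strict;
use warnings;

# Shortened records, for illustration only.
my %database = (
    "MRD-100" => { "Name" => "Jane Hathaway", "Town" => "AnyTown" },
    "MRD-250" => { "Name" => "Kevin Hughes",  "Town" => "AnyTown" },
);

# Hypothetical helper: build one "id: name" line per record,
# in sorted id order, from a reference to the database hash.
sub listCustomers {
    my($refDatabase) = @_;
    my @lines;

    foreach my $id (sort(keys(%{$refDatabase}))) {
        push(@lines, "$id: $refDatabase->{$id}->{'Name'}");
    }
    return @lines;
}

print("$_\n") foreach (listCustomers(\%database));
```

This displays "MRD-100: Jane Hathaway" followed by "MRD-250: Kevin Hughes". Passing \%database rather than %database keeps the hash from being flattened into @_.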

There is one more twist that I'd like to show you using this data structure. Let's see how to dynamically add information. First, we'll look at adding an entire data record and then we'll look at adding new members to an existing data record. Listing 8.7 shows how you can use a standard hash assignment to dynamically create a data record.

Pseudocode

Assign a reference to a hash to the "MRD-300" key in the %database associative array.
Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable.
Print the value associated with the key "Name" inside the hash referenced by $refCustomer.
Print the value associated with the key "Address" inside the hash referenced by $refCustomer.

Listing 8.7-08LST07.PL - Creating a Record Using Hash Assignment

$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

$refCustomer = $database{"MRD-300"};

# The older %{...}->{...} form is a fatal error in Perl 5.22 and later.
print($refCustomer->{"Name"} . "\n");
print($refCustomer->{"Address"} . "\n");

This program displays:

Nathan Hale
999 Centennial Ave.

Notice that by using a temporary variable ($refCustomer) the program code is more readable. The alternative would be this:

print($database{"MRD-300"}->{"Name"} . "\n");

Most programmers would agree that using the temporary variable aids in the understanding of the program.

Our last data structure example shows how to add members to an existing customer record. Listing 8.8 shows how to add two phone number members to customer record MRD-300.

Pseudocode

Assign a reference to an anonymous function to $codeRef. This function will print the elements of the %database hash. Since each value in the %database hash is a reference to another hash, the function has an inner loop to dereference the sub-hash.
Assign a reference to a hash to the "MRD-300" key in the %database associative array.
Call the anonymous routine by dereferencing $codeRef to print the contents of %database. This is done by surrounding the code reference variable with curly braces and prefixing it with a & to indicate that it should be dereferenced as a function.
Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable.
Add "Home Phone" as a key to the hash associated with the "MRD-300" key.
Add "Business Phone" as a key to the hash associated with the "MRD-300" key.
Call the anonymous routine by dereferencing $codeRef to print the contents of %database.

Listing 8.8-08LST08.PL - How to Dynamically Add Members to a Data Structure

$codeRef = sub {
    while (($key, $value) = each(%database)) {
        print("$key = {\n");
        while (($innerKey, $innerValue) = each(%{$value})) {
            print("\t$innerKey => $innerValue\n");
        }
        print("};\n\n");
    }
};

$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

# print database before dynamic changes.
&{$codeRef}();

$refCustomer = $database{"MRD-300"};
$refCustomer->{"Home Phone"}     = "(111) 511-1322";
$refCustomer->{"Business Phone"} = "(111) 513-4556";

# print database after dynamic changes.
&{$codeRef}();

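This program displays (the order of the members may differ on your system, because Perl does not store hash entries in any particular order):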

MRD-300 = {
        Town => AnyTown
        State => AnyState
        Name => Nathan Hale
        Zip => 12345-1234
        Address => 999 Centennial Ave.
};

MRD-300 = {
        Town => AnyTown
        State => AnyState
        Name => Nathan Hale
        Home Phone => (111) 511-1322
        Zip => 12345-1234
        Business Phone => (111) 513-4556
        Address => 999 Centennial Ave.
};

This example does two new things. The first thing is that it uses an anonymous function referenced by $codeRef. This is done for illustration purposes. There is no reason to use an anonymous function. There are actually good reasons for you not to do so in normal programs. I think that anonymous functions make programs much harder to understand.

Note

When helper functions are small and easily understood, I like to place them at the beginning of code files. This helps me to quickly refresh my memory when coming back to view program code after time spent doing other things.

The second thing is that a regular hash assignment statement was used to add values. You can use any of the array and hash functions with these nested data structures.
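As a sketch of using the built-in hash functions on a nested record, the following program applies exists(), delete(), and keys() through a reference. The record is shortened, and $memberCount is a name made up for this illustration.

```perl
use strict;
use warnings;

# A shortened record, for illustration only.
my %database = (
    "MRD-300" => { "Name" => "Nathan Hale", "Zip" => "12345-1234" },
);
my $refCustomer = $database{"MRD-300"};

# Add a member only if it is not already present.
$refCustomer->{"Home Phone"} = "(111) 511-1322"
    unless (exists($refCustomer->{"Home Phone"}));

# Remove a member, then count what is left.
delete($refCustomer->{"Zip"});
my $memberCount = scalar(keys(%{$refCustomer}));

print("Record MRD-300 has $memberCount members: ",
      join(", ", sort(keys(%{$refCustomer}))), ".\n");
```

This displays "Record MRD-300 has 2 members: Home Phone, Name." because the changes made through $refCustomer affect the record stored in %database.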

Example: Interpolating Functions inside Double-Quoted Strings

You can use references to force Perl to interpolate the return value of a function call inside double-quoted strings. This helps to reduce the number of temporary variables needed by your program.

Pseudocode

Call the makeLine() function from inside a double-quoted string.
Define the makeLine() function.
Return the dash character repeated a specified number of times. The first element in the parameter array is the number of times to repeat the dash.

print("Here are  5 dashes ${\makeLine(5)}.\n");
print("Here are 10 dashes ${\makeLine(10)}.\n");

sub makeLine {
    return("-" x $_[0]);
}

This program displays:

Here are  5 dashes -----.
Here are 10 dashes ----------.

The trick in this example is that the backslash turns the scalar return value into a reference and then the dollar sign and curly braces turn the reference back into a scalar value that the print() function can interpret correctly. If the backslash character is not used to create the reference to the scalar return value, then the ${} dereferencing operation does not have a reference to dereference. Perl then tries to use the returned string as the name of a variable, and you will get an "uninitialized value" warning (or, under use strict, a fatal error) instead of the dashes.
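A related idiom, not covered in the text above, uses an anonymous array to interpolate a function that returns a list. The sketch below shows both forms side by side; firstThree() is a hypothetical function added only for the comparison.

```perl
use strict;
use warnings;

sub makeLine {
    return("-" x $_[0]);
}

# Hypothetical function that returns a list instead of one scalar.
sub firstThree {
    return (1, 2, 3);
}

# ${\ ...} interpolates one scalar; @{[ ... ]} builds an anonymous
# array from a list and interpolates all of its elements.
my $scalarForm = "Dashes: ${\ makeLine(3)}";
my $listForm   = "Numbers: @{[ firstThree() ]}";

print("$scalarForm\n$listForm\n");
```

This displays "Dashes: ---" and "Numbers: 1 2 3". The list elements are separated by spaces, just as they are when an ordinary array is interpolated.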

Summary

In this chapter you learned about references. References are scalar variables used to hold memory locations. When references are dereferenced, the actual value is returned. For example, if the reference is assigned like this: $refScalar = \10; then dereferencing $refScalar yields 10 and looks like this: ${$refScalar}. You can always create a reference to a value or variable by preceding it with a backslash. Dereferencing is accomplished by surrounding the reference variable in curly braces and preceding the left curly brace with a character denoting what type of reference it is. For example, use @ for arrays and & for functions.

There are six types of references that you can use in Perl. You can have a reference to scalars, arrays, hashes, functions, globs, and other references. If you need to determine what type of reference is passed to a function, use the ref() function.

The ref() function returns a string that indicates which type of reference was passed to it. If the parameter was not a reference, a false value is returned; Perl's documentation specifies the empty string. You discovered that it is always a good idea to check reference types to prevent errors caused by passing the wrong type of reference. An example was given that caused an error by passing a scalar reference when the function expected an array reference.

NOTE: It seems that the return value for ref() on a non-reference has not always been consistent across platforms. On Windows 95, ref on a non-reference returned an undefined value, whereas on Linux it returned a defined empty string. This behavior may simply have been a bug on one of the platforms. The upshot of this note is that you should test whether the return value of ref() is true or false, or compare it against a specific type name, rather than depend on it being undefined.

A lot of time was spent discussing data records and how to access information stored in them. You learned how to step through dissecting a dereferencing expression; how to dynamically add new data records to an associative array; and how to add new data members to an existing record.

The last thing covered in this chapter was how to interpolate function calls inside double-quoted strings. You'll use this technique - at times - to avoid using temporary variables when printing or concatenating the output of functions to other strings.

Chapter 9, "Using Files," introduces you to opening, reading, and writing files. You'll find out how to store the data records you've constructed in this chapter to a file for long-term storage.

Review Questions

What is a reference?
How many types of references are there?
What does the ref() function return if passed a non-reference as a parameter?
What notation is used to dereference a reference value?
What is an anonymous array?
What is a nested data structure?
What will the following line of code display?
```
print("${\ref(\(1..5))}");
```
Using the %database hash in Listing 8.6, what will the following line of code display?
```
print($database{"MRD-100"}->{"Zip"} . "\n");
```

Review Exercises

Write a program that will print the dereferenced value of $ref in the following line of code.
```
$ref = \\\45;
```
Write a function that removes the first element from each array passed to it. The return value of the function should be the number of elements removed from all arrays.
Add error checking to the function written in the previous exercise so that the undef value is returned if one of the parameters is not an array reference.
Write a program based on Listing 8.7 that adds a data member indicating which weekdays a salesman may call the customer with an id of MRD-300. Use the following as an example:
```
"Best days to call" => ["Monday", "Thursday" ]
```