In this chapter, we'll take a look at some of the ways that Perl handles data. All computer programs use data in some way. Some use it to personalize the program. For example, a mail program might need to remember your name so that it can greet you upon starting. Another program - say one that searches your hard disk for files - might remember your last search parameters in case you want to perform the same search twice.
A literal is a value that is represented "as is" or hard-coded in your source code. When you see the four characters 45.5 in a program, they refer to a value of forty-five and a half. Perl uses four types of literals: numbers, strings, arrays, and associative arrays.
Associative arrays will be discussed in Chapter 3, "Variables." Numbers, strings, and regular arrays will be discussed in the sections below.
Note |
For those of you who are not familiar with
non-decimal numbering systems, here is a short explanation.
In decimal notation - or base ten - when you see the value 15 it signifies (1 * 10) + 5 or 15₁₀. The subscript indicates which base is being used. In octal notation - or base eight - when you see the value 15 it signifies (1 * 8) + 5 or 13₁₀. In hexadecimal notation - or base 16 - when you see the value 15 it signifies (1 * 16) + 5 or 21₁₀. Base 16 needs 6 characters in addition to 0 to 9 so that each position can have a total of 16 values. The letters A-F are used to represent 10-15. So the value BD₁₆ is equal to (B₁₆ * 16) + D₁₆ or (11₁₀ * 16) + 13₁₀ which is 189₁₀. |
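The arithmetic in the note above can be checked from Perl itself. Here is a minimal sketch that uses Perl's built-in oct() and hex() functions, which read a string of octal or hexadecimal digits and return the number it stands for. The variable names are illustrative only; variables themselves are covered in Chapter 3.

```perl
use strict;
use warnings;

# oct() and hex() are Perl built-ins that interpret a string of digits
# in base 8 or base 16 and return the corresponding number.
my $octal_15 = oct("15");    # (1 * 8) + 5
my $hex_15   = hex("15");    # (1 * 16) + 5
my $hex_bd   = hex("BD");    # (11 * 16) + 13

print "15 in octal is $octal_15 in decimal\n";
print "15 in hexadecimal is $hex_15 in decimal\n";
print "BD in hexadecimal is $hex_bd in decimal\n";
```

The three values printed should be 13, 21, and 189, matching the note.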
If you will be using very large or very small numbers, you might also find scientific notation to be of use.
Note |
If you're like me, you probably forgot most of the math you learned in high school. However, scientific notation has always stuck with me. Perhaps because I liked moving decimal points around. Scientific notation looks like 10.23E+4, which is equivalent to 102,300. You can also represent small numbers if you use a negative sign. For example, 10.23E-4 is .001023. Simply move the decimal point to the right if the exponent is positive and to the left if the exponent is negative. |
Errata Note |
The printed version of this book indicated that 10.23E+4 was equivalent to 1,023,000 which was incorrect. The correct number is 102,300. This problem was spotted by Andy Poulsen. |
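If you'd rather let Perl move the decimal point for you, the following sketch simply prints the two values from the note. The variable names are my own and are only there for illustration.

```perl
use strict;
use warnings;

# Scientific notation literals: the exponent says how far, and in
# which direction, the decimal point moves.
my $large = 10.23E+4;    # decimal point moves 4 places to the right
my $small = 10.23E-4;    # decimal point moves 4 places to the left

print $large, "\n";      # prints 102300
print $small, "\n";      # prints 0.001023
```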
First, here are some integers.
Pseudocode |
An integer. Integers are numbers with no decimal components.
An integer in octal format. This number is 35, or (4 * 8) + 3, in base 10.
An integer in hexadecimal format. This number is also 35, or (2 * 16) + 3 in base 10. |
123
043
0x23
Now, some numbers and fractions - also called floating point values. You will frequently see these values referred to as floats for simplicity's sake.
Pseudocode |
A float with a value in the tenths place. You can also say 100 and 5/10.
A float with a fraction value out to the thousandths place. You can also say 54 and 534/1000. |
100.5
54.534
Here's a very small number.
Pseudocode |
A very small float value. You can represent this value in scientific notation as 3.4E-5. |
.000034
Note |
The real value of single-quoted strings won't become apparent until you read about variable interpolation in the section "Example: Variable Interpolation" in Chapter 3, "Variables." |
Pseudocode |
A literal that describes one of my favorite role-playing characters.
A literal that describes the blessed cleric that frequently helps WasWaldo stay alive. |
'WasWaldo the Illusionist'
'Morganna the Fair'
Strings are pretty simple, huh? But what if you wanted to use a single quote inside the literal? If you did this, Perl would think you wanted to end the string early and a compiler error would result. Perl uses the backslash (\) character to indicate that the normal function of the single quote - ending a literal - should be ignored for a moment.
Tip |
The backslash character is also called an escape character - perhaps because it lets the next character escape from its normal interpretation. |
Pseudocode |
A literal that comments on WasWaldo's fighting ability. Notice how the single quote is used.
Another comment from the peanut gallery. Notice that double quotes can be used directly inside single-quoted strings. |
'WasWaldo can\'t hit the broad side of a barn.'
'Morganna said, "WasWaldo can\'t hit anything."'
The single-quotes are used here specifically so that the double-quotes can be used to surround the spoken words. Later in the section on double-quoted literals, you'll see that the single-quotes can be replaced by double-quotes if you'd like.
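To convince yourself that the backslash is only an instruction to Perl and does not end up in the string, you can print one of these literals. This is a small sketch; the variable name $insult is an assumption of mine, used only so the string can be printed.

```perl
use strict;
use warnings;

# The backslash tells Perl that the next single quote does not end
# the literal. The backslash itself is not stored in the string.
my $insult = 'WasWaldo can\'t hit the broad side of a barn.';

print $insult, "\n";
```

The line that appears on your screen contains the apostrophe in can't but no backslash.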
You must know only one more thing about single-quoted strings. You can add a line break to a single-quoted string simply by adding line breaks to your source code - as demonstrated by Listing 2.1.
Pseudocode |
Tell Perl to begin printing.
More lines for Perl to display.
The single quote ends the string literal. |
Listing 2.1 - 02LST01.PL - Using Embedded Line Breaks to Skip to a New Line |
|
Figure 2.1 shows a bill of goods displayed using one long, single-quoted literal.
Fig. 02.1 - A bill of goods displayed as one long single-quoted literal.
You can see that with single-quoted literals, even the line breaks in your source code are part of the string.
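Here is a minimal sketch of the technique. The items and prices are made up for illustration and are not taken from Listing 2.1, and the $bill variable is only there so the literal can be printed in one place.

```perl
use strict;
use warnings;

# Every line break typed between the single quotes becomes part of
# the string. The dollar signs need no backslash in a single-quoted
# literal.
my $bill = 'Bill of Goods

Bread:  $34.45
Fruit:  $45.00
        ======
        $79.45
';

print $bill;
```

Notice that the closing quote sits on a line of its own, so the last line of the bill also ends with a line break.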
Note |
Variables - which are described in Chapter 3, "Variables" - are simply locations in the computer's memory where Perl holds the various data types. They're called variables because the content of the memory can change as needed. |
The basic double-quoted string is a series of characters surrounded by double quotes. If you need to use the double quote inside the string, you can use the backslash character.
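As a quick sketch of this rule, the two literals below describe the same text, once with single quotes on the outside and once with double quotes. The sentence and the variable names are illustrative assumptions of mine.

```perl
use strict;
use warnings;

# Inside single quotes the double quote needs no backslash.
my $single = 'The sign read, "Keep out."';

# Inside double quotes each double quote must be escaped.
my $double = "The sign read, \"Keep out.\"";

print $double, "\n";
print "Both literals hold the same characters.\n" if $single eq $double;
```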
Pseudocode |
This literal is similar to one you've already seen. Just the quotes are different.
Another literal that uses double quotes inside a double-quoted string. |
"WasWaldo the Illusionist"
"Morganna said, \"WasWaldo can't hit anything.\""
Notice how the backslash in the second line is used to escape the double quote characters. And the single quote can be used without a backslash.
One major difference between double- and single-quoted strings is that double-quoted strings have some special escape sequences that can be used. Escape sequences represent characters that are not easily entered using the keyboard or that are difficult to see inside an editor window. Table 2.1 shows all of the escape sequences that Perl understands. The examples following the table will illustrate some of them.
Table 2.1 - Escape Sequences |
|
Tip |
In the next chapter, "Variables," you'll see why you might need to use a backslash when using the $ and @ characters. |
Pseudocode |
This literal represents the following: WasWaldo is 34 years old. The \u is used twice in the first word to capitalize the w characters. And the hexadecimal notation is used to represent the age using the ASCII codes for 3 and 4.
This literal represents the following: The kettle was HOT! The \U capitalizes all characters until a \E sequence is seen. |
"\uwas\uwaldo is \x33\x34 years old."
"The kettle was \Uhot\E!"
Actually, this example isn't too difficult, but it does involve looking at more than one literal at once and it's been a few pages since our last advanced example. Let's look at the \t and \n escape sequences. Listing 2.2 - a program displaying a bill with several items - will produce the output shown in Figure 2.2.
Pseudocode |
Display a literal as the first, second, and third lines of the output.
Display literals that show what was purchased.
Display a separator line.
Display the total. |
Listing 2.2 - 02LST02.PL - Using Tabs and Newline Characters to Print |
|
Errata Note |
The printed version of this book was missing a semi-colon in the third line of the above listing. |
Fig. 02.2 - A bill of goods displayed using newline and tab characters.
Tip |
Notice that Figures 2.1 and 2.2 look identical. This illustrates a cardinal rule of Perl - there's always more than one way to do something. |
This program uses two methods to cause a line break.
I recommend using the \n character so that when looking at your code in the future, you can be assured that you meant to cause a line break and did not simply press the ENTER key by mistake.
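To show what that looks like in practice, here is a minimal sketch that builds a small bill using only \n and \t. The items and prices are assumptions for illustration and are not taken from Listing 2.2. The period joins the pieces together, and the \$ keeps Perl from treating the dollar sign specially inside double quotes.

```perl
use strict;
use warnings;

# \n ends a line and \t moves to the next tab stop, so the layout of
# the output does not depend on how the source code is formatted.
my $bill = "Bill of Goods\n\n"
         . "Bread:\t\$34.45\n"
         . "Fruit:\t\$45.00\n"
         . "\t======\n"
         . "\t\$79.45\n";

print $bill;
```

Every line break in the output comes from an explicit \n, so none of them can be the result of an accidental press of the ENTER key.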
Caution |
If you are a C/C++ programmer, this material is not new to you. However, Perl strings are not identical to C/C++ strings because they have no ending NULL character. If you are thinking of converting C/C++ programs to Perl, take care to modify any code that relies on the NULL character to end a string. |
Let's see how to use the back-quoted string to display a directory listing of all text files in the perl5 directory. Figure 2.3 shows what the output of such a program might look like.
Pseudocode |
Print the directory listing. |
print `dir *.txt`;
Fig. 02.3 - Using a back-quoted string to display a directory.
All of the escape sequences used with double-quoted strings can be used with back-quoted strings.
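The output of a back-quoted string does not have to be printed right away. In this sketch I assume that your operating system has an echo command, which is true of most command shells, and the variable name $output is my own.

```perl
use strict;
use warnings;

# The text between the back-quotes is handed to the operating system.
# Whatever the command displays comes back to Perl as a string.
my $output = `echo hello`;

print "The command displayed: $output";
```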
This example shows an empty array, an array of numbers and an array of strings. Figure 2.4 shows the output of Listing 2.3.
Pseudocode |
Print the contents of an empty array.
Print the contents of an array of numbers.
Print the contents of an array of strings.
Print the contents of an array with different data types. |
Listing 2.3 - 02LST03.PL - Printing Some Array Literals |
|
Fig. 02.4 - The output from Listing 2.3, showing different array literals.
The fourth line of this listing shows that you can mix single- and double-quoted strings in the same array. You can also mix numbers and strings interchangeably, as shown in the last line.
Note |
Listing 2.3 uses the period, or
concatenation, operator to join a string representation of the
empty array with the string "Here is an empty array:" and the
string "<-- Nothing there!\n". You can read more about
operators in Chapter 4, "Operators."
In this and other examples in this chapter, the elements of an array will be printed with no spaces between them. You will see how to print with spaces in the section "Strings Revisited" in Chapter 3, "Variables." |
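The following is a minimal sketch of printing array literals in this way. The values are illustrative assumptions rather than the exact contents of Listing 2.3, and I use Perl's built-in join function with an empty separator to turn each array into a string, which gives the same no-spaces result that print produces by default.

```perl
use strict;
use warnings;

# join('', LIST) glues the elements together with nothing in between.
my $empty   = join('', ());
my $numbers = join('', (12, 014, 0x0c));
my $strings = join('', ("This", "is", 'an', "array", 'of', "strings"));
my $mixed   = join('', ("This", 30, "is", 'a', "mixed array", 'of', 0x08, "items"));

# The period joins the pieces into one string for print.
print "Here is an empty array:" . $empty . "<-- Nothing there!\n";
print $numbers . "\n";
print $strings . "\n";
print $mixed . "\n";
```

The second line of output reads 121212 because 12, octal 014, and hexadecimal 0x0c are three ways of writing the same number.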
While this example is not very "real-world," it gives you the idea behind specifying an array by using sub-arrays.
Pseudocode |
Print an array that consists of two sub-arrays.
Print an array that consists of an array, a string, and another array. |
print (("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"));
print (("Bright Orange", "Burnt"), " Middle ", ("Canary Yellow", "Sunbeam"));
So far, we haven't talked about the internal representations of data types. That's because you almost never have to worry about such things with Perl. However, it is important to know that, internally, the sub-arrays are merged into the main array. In other words, the array:
(("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"))
is exactly equivalent to
("Bright Orange", "Burnt", "Canary Yellow", "Sunbeam")
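You can check this merging for yourself. The sketch below stores both forms in array variables, which are covered in Chapter 3, and compares them; the names @nested and @flat are my own.

```perl
use strict;
use warnings;

# The inner parentheses disappear: both arrays hold four elements.
my @nested = (("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"));
my @flat   = ("Bright Orange", "Burnt", "Canary Yellow", "Sunbeam");

print scalar(@nested), "\n";    # prints 4
print "Same contents\n" if "@nested" eq "@flat";
```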
Perl uses two periods (..) to replace a consecutive series of values. Not only is this method quicker to type - and less prone to error - it is easier to understand. Only the end points of the series are specified; you don't need to manually verify that every value is represented. If the .. is used, then automatically you know that a range of values will be used.
Pseudocode |
Print an array consisting of the numbers from 1 to 15.
Print an array consisting of the numbers from 1 to 15 using the shorthand method. |
print (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
print "\n";
print (1..15);
The two arrays used in the previous example are identical, but they were specified differently.
Note |
The double periods in the array specification are called the range operator. The range operator is also discussed in Chapter 4, "Operators." |
You can also use the shorthand method to specify values in the middle of an array.
Pseudocode |
Print an array consisting of the numbers 1, 2, 7, 8, 9, 10, 14, and 15.
Print an array consisting of the letters A, B, F, G, H, Y, Z. |
print (1, 2, 7..10, 14, 15);
print "\n";
print ("A", "B", "F".."H", "Y", "Z");
The range operator works by taking the lefthand value, adding one to it, then appending that new value to the array. Perl continues to do this until the new value reaches the righthand value. You can use letters with the range operator because Perl can also step a letter forward to the next one, following the same consecutive order that the ASCII table uses for letters.
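Because print shows the elements with no spaces between them, the ranges are easier to inspect with a separator. This sketch uses Perl's built-in join function to put a comma between the elements; the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# The range operator fills in the values between its two end points.
my $numbers = join(",", (1, 2, 7..10, 14, 15));
my $letters = join(",", ("A", "B", "F".."H", "Y", "Z"));
my @full    = (1..15);

print $numbers, "\n";          # prints 1,2,7,8,9,10,14,15
print $letters, "\n";          # prints A,B,F,G,H,Y,Z
print scalar(@full), "\n";     # prints 15
```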
You read about numbers and the three different bases that can be used to represent them - decimal, octal, and hexadecimal. Very large or small numbers can also be described using scientific notation.
Strings were perhaps a bit more involved. Single-, double-, and back-quoted strings are used to hold strings of characters. Back-quoted strings have an additional purpose. They tell Perl to send the string to the operating system for execution.
Escape sequences are used to represent characters that are difficult to enter through the keyboard or that have more than one purpose. For example, using a double quote inside a double-quoted string would end the string before you really intended. The backslash character was introduced to escape the double quote and change its meaning.
The next chapter, "Variables," will show you how Perl uses your computer memory to store data types and also will show you ways that you can manipulate data.