One of the reasons the Internet has blossomed so quickly is that everyone can understand the protocols that are spoken on the net. A protocol is a set of commands and responses. There are two layers of protocols that I'll mention here. The low-level layer is called TCP/IP and while it is crucial to the Internet, we can effectively ignore it for now. The high-level protocols like ftp, smtp, pop, http, and telnet are what you'll read about in this chapter. They use TCP/IP as a facilitator to communicate between computers. The protocols all have the same basic pattern:
Figure 18.1 shows what the protocol for sending mail looks like. The end-user creates a mail message and then the sending system, using the mail protocol, holds a conversation with the receiving system.
Fig. 18.1 - All Protocols Follow This Communications Model.
Internet conversations are done with sockets, in a manner similar to using the telephone or shouting out a window. I won't kid you, sockets are a complicated subject. They are discussed in the "Sockets" section which follows. Fortunately, you only have to learn about a small subset of the socket functionality in order to use the high-level protocols.
Table 18.1 provides a list of the high-level protocols that you can use. This chapter will not be able to cover them all, but if you'd like to investigate further, the protocols are detailed in documents at the http://ds.internic.net/ds/dspg0intdoc.html web site.
Protocol | Port Number | Description |
---|---|---|
auth | 113 | Authentication |
echo | 7 | Checks servers to see if they are running. |
finger | 79 | Lets you retrieve information about a user. |
ftp | 21 | File Transfer Protocol |
nntp | 119 | Network News Transfer Protocol - Usenet News Groups |
pop | 110 | Post Office Protocol, version 3 (POP3) - incoming mail; the older POP2 uses port 109 |
smtp | 25 | Simple Mail Transfer Protocol - outgoing mail |
time | 37 | Time Server |
telnet | 23 | Lets you connect to a host and use it as if you were a directly connected terminal. |
Each protocol is also called a service. Hence the terms mail server and ftp server. Underlying all of the high-level protocols is the very popular Transmission Control Protocol/Internet Protocol, or TCP/IP. You don't need to know about TCP/IP in order to use the high-level protocols. All you need to know is that TCP/IP enables a server to listen and respond to an incoming conversation. Incoming conversations arrive at something called a port. A port is an imaginary place where incoming packets of information can arrive (just like a ship arrives at a sea port). Each type of service (for example, mail or file transfer) has its own port number.
Tip |
If you have access to a UNIX machine, look at the /etc/services file for a list of the services and their assigned port numbers. |
New for the Electronic Edition |
Peter van der Landen pointed out that Windows 95 has
a file called \windows\services which has a list of services and their
assigned port numbers. It's a good bet that Windows NT has the same file,
but I can't verify this yet.
12/27/96 - Bruce Rhodewait pointed out that Windows NT lists its services in C:\WINNT\SYSTEM32\DRIVERS\etc\services. Of course if you installed Windows NT into a different base directory, use that base directory instead of C:\WINNT. |
Errata Note |
The printed version of this book said that you should look in /etc/protocols in order to find the list of services. Peter (from Europe) pointed out this problem. |
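The same lookup can also be made from inside a Perl script. The following is a minimal sketch, assuming the built-in getservbyname() function is implemented on your system. The %defaultPort hash and the servicePort() helper are names made up for this illustration; the fallback values come from Table 18.1.

```perl
use strict;
use warnings;

# Fallback port numbers from Table 18.1, used when the services file
# has no entry or the lookup function is not implemented.
my %defaultPort = (echo => 7, ftp => 21, telnet => 23, smtp => 25, time => 37);

# servicePort() is a hypothetical helper, not a Perl built-in.
sub servicePort {
    my ($service) = @_;
    # In scalar context getservbyname() returns just the port number.
    # The eval traps the fatal error raised on systems without the function.
    my $port = eval { getservbyname($service, 'tcp') };
    return $port || $defaultPort{$service};
}

foreach my $service (sort keys %defaultPort) {
    printf("%-8s %d\n", $service, servicePort($service));
}
```

The fallback mirrors the "|| default" idiom used throughout this chapter for systems that lack the lookup functions.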
In this chapter, we take a quick look at sockets, and then turn our attention to examples that use them. You see how to send and receive mail. Sending mail is done using the Simple Mail Transfer Protocol (SMTP), which is detailed in RFC 821. Receiving mail is done using the Post Office Protocol (POP) as detailed in RFC 1725.
Table 18.2 lists all of the Perl functions that relate to sockets so you have a handy reference. But remember, you probably won't need them all.
Function | Description |
---|---|
accept(NEWSOCKET, SOCKET) | Accepts a socket connection from clients waiting for a connection. The original socket, SOCKET, is left alone, and a new socket is created for the remote process to talk with. SOCKET must have already been opened using the socket() function. Returns true if it succeeded, false otherwise. |
bind(SOCKET, PACKED_ADDRESS) | Binds a network address to the socket handle. Returns true if it succeeded, false otherwise. |
connect(SOCKET, PACKED_ADDRESS) | Attempts to connect to a socket. Returns true if it succeeded, false otherwise. |
getpeername(SOCKET) | Returns the packed address of the remote side of the connection. This function can be used to reject connections for security reasons, if needed. |
getsockname(SOCKET) | Returns the packed address of the local side of the connection. |
getsockopt(SOCKET, LEVEL, OPTNAME) | Returns the socket option requested, or undefined if there is an error. |
listen(SOCKET, QUEUESIZE) | Creates a queue for SOCKET with QUEUESIZE slots. Returns true if it succeeded, false otherwise. |
recv(SOCKET, BUFFER, LENGTH, FLAGS) | Attempts to receive LENGTH bytes of data into a buffer from SOCKET. Returns the address of the sender, or the undefined value if there's an error. BUFFER will be grown or shrunk to the length actually read. However, you must initialize BUFFER before use. For example: my($buffer) = ''; |
select(RBITS, WBITS, EBITS, TIMEOUT) | Examines file descriptors to see if they are ready or if they have exception conditions pending. |
send(SOCKET, BUFFER, FLAGS, [TO]) | Sends a message to a socket. On unconnected sockets you must specify a destination (the TO parameter). Returns the number of characters sent, or the undefined value if there is an error. |
setsockopt(SOCKET, LEVEL, OPTNAME, OPTVAL) | Sets the socket option requested. Returns undefined if there is an error. OPTVAL may be specified as undef if you don't want to pass an argument. |
shutdown(SOCKET, HOW) | Shuts down a socket connection in the manner indicated by HOW. If HOW = 0, all incoming information will be ignored. If HOW = 1, all outgoing information will be stopped. If HOW = 2, then both sending and receiving are disallowed. |
socket(SOCKET, DOMAIN, TYPE, PROTOCOL) | Opens a specific TYPE of socket and attaches it to the name SOCKET. See "The Server Side of a Conversation" for more details. Returns true if successful, false if not. |
socketpair(SOCK1, SOCK2, DOMAIN, TYPE, PROTO) | Creates an unnamed pair of sockets in the specified domain, of the specified type. Returns true if successful, false if not. |
Note |
If you are interested in knowing everything about sockets, you need to get your hands on some UNIX documentation. The Perl set of socket functions is pretty much a duplication of those available using the C language under UNIX. Only the parameters are different because Perl data structures are handled differently. You can find UNIX documentation at http://www.delorie.com/gnu/docs/ on the World Wide Web. |
Programs that use sockets inherently use the client-server paradigm. One program creates a socket (the server) and another connects to it (the client). The next couple of sections will look at both server programs and client programs.
The socket() call will look something like this:
$tcpProtocolNumber = getprotobyname('tcp') || 6;
socket(SOCKET, PF_INET(), SOCK_STREAM(), $tcpProtocolNumber)
or die("socket: $!");
The first line gets the TCP protocol number using the getprotobyname() function. Some systems, such as Windows 95, do not implement this function, so a default value of 6 is provided. Then the socket is created with socket(). The socket name is SOCKET. Notice that it looks just like a file handle. When creating your own sockets, the first parameter is the only thing that you should change; the last three parameters will always be the same as those shown above. The actual meaning of the three parameters is unimportant at this stage. If you are curious, please refer to the UNIX documentation previously mentioned.
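As a self-contained variation on the call above, the sketch below loads the standard Socket module, which supplies the PF_INET and SOCK_STREAM constants, and uses a lexical variable instead of the bareword SOCKET. The variable name $socket is only an illustration, and the eval around getprotobyname() is an assumption about how a system without that function should be handled.

```perl
use strict;
use warnings;
use Socket;    # supplies PF_INET, SOCK_STREAM and related constants

# Fall back to 6 where getprotobyname() is missing or finds nothing.
my $tcpProtocolNumber = eval { getprotobyname('tcp') } || 6;

# A lexical handle ($socket) can be used wherever SOCKET would be.
socket(my $socket, PF_INET, SOCK_STREAM, $tcpProtocolNumber)
    or die("socket: $!");

print("socket created, file descriptor ", fileno($socket), "\n");
close($socket);
```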
Socket names exist in their own namespace. Actually, there are several pre-defined namespaces that you can use. The namespaces are called protocol families because the namespace controls how a socket connects to the world outside your process. For example, the PF_INET namespace used in the socket() function call above is used for the Internet.
Once the socket is created, you need to bind it to an address with the bind() function. The bind() call might look like this:
$port = 20001;
$internetPackedAddress = pack('Sna4x8', AF_INET(), $port, "\0\0\0\0");
bind(SOCKET, $internetPackedAddress)
or die("bind: $!");
All Internet sockets reside on a computer with a
symbolic name. The server's name in conjunction with a port number makes up a
socket's address. For example, www.water.com:20001. Symbolic names also have a
number equivalent known as the dotted decimal address. For example, 145.56.23.1.
Port numbers are a way of determining which socket at www.water.com you'd like
to connect to. All port numbers below 1024 (or the symbolic constant,
IPPORT_RESERVED) are reserved for special sockets. For example, port 37 is
reserved for a time service and 25 is reserved for the smtp service. The value
of 20,001 used in this example was picked at random. The only limitations are:
use a value above 1024 and no two sockets on the same computer should have the
same port number.
Tip |
You can always refer to your own computer using the dotted decimal address of 127.0.0.1 or the symbolic name localhost. |
The second line of this short example creates a full Internet socket address using the pack() function. This is another complicated topic that I will sidestep. As long as you know the port number and the server's address, you can simply plug those values into the example code and not worry about the rest. The important part of the example is the "\0\0\0\0" string. This string holds the four numbers that make up the dotted decimal Internet address. If you already know the dotted decimal address, convert each number to octal and replace the appropriate \0 in the string.
If you know the symbolic name of the server instead of the dotted decimal address, use the following line to create the packed Internet address:
$internetPackedAddress = pack('S n a4 x8', AF_INET(), $port,
    scalar(gethostbyname('www.remotehost.com')));
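The hand-written pack() template depends on how the operating system lays out a socket address, and a Caution later in this chapter notes that Solaris needs a different template. As a hedged alternative, the standard Socket module provides inet_aton() and pack_sockaddr_in(), which build the packed address without a template. The port and address below are example values only.

```perl
use strict;
use warnings;
use Socket;

my $port = 20001;

# inet_aton() converts a dotted decimal address into the 4-byte packed
# form. Given a host name instead, it performs a name lookup.
my $packedIp = inet_aton('127.0.0.1') or die("inet_aton failed");

# pack_sockaddr_in() combines the port and the packed IP address into
# a full socket address suitable for bind() or connect().
my $internetPackedAddress = pack_sockaddr_in($port, $packedIp);

# Unpack the address again to confirm the round trip.
my ($unpackedPort, $unpackedIp) = unpack_sockaddr_in($internetPackedAddress);
print("port: $unpackedPort, address: ", inet_ntoa($unpackedIp), "\n");
```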
After the socket has been
created and an address has been bound to it, you need to create a queue for the
socket. This is done with the listen() function. The listen() call looks like
this:
listen(SOCKET, 5) or die("listen: $!");
This listen() statement
will create a queue that can handle 5 remote attempts to connect. The sixth
attempt will fail with an appropriate error code.
Now that the socket exists, has an address, and has a queue, your program is ready to begin a conversation using the accept() function. The accept() function makes a copy of the socket and starts a conversation with the new socket. The original socket is still available and able to accept connections. You can use the fork() function, in UNIX, to create child processes to handle multiple conversations. The normal accept() function call looks like this:
$addr = accept(NEWSOCKET, SOCKET) or die("accept: $!");
Now that the conversation has been started, use print(), send(), recv(), read(), or write()
to hold the conversation. The examples later in the chapter show how the
conversations are held.
The socket() call for the client program is the same as that used in the server:
$tcpProtocolNumber = getprotobyname('tcp') || 6;
socket(SOCKET, PF_INET(), SOCK_STREAM(), $tcpProtocolNumber)
or die("socket: $!");
After the socket is created, the connect()
function is called like this:
$port = 20001;
$internetPackedAddress = pack('Sna4x8', AF_INET(), $port, "\0\0\0\0");
connect(SOCKET, $internetPackedAddress) or die("connect: $!");
The
packed address was explained in "The
Server Side of a Conversation." The SOCKET parameter has no relation to the
name used on the server machine. I use SOCKET on both sides for convenience.
The connect() function is a blocking function. This means that it will wait until the connection is completed. You can use the select() function to set non-blocking mode, but you'll need to look in the UNIX documentation to find out how. It's a bit complicated to explain here.
After the connection is made, you use the normal input/output functions or the send() and recv() functions to talk with the server.
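To see the server and client steps working together, here is a minimal sketch that runs both ends of a conversation inside one script over the loopback address. It assumes a system where the Socket module's pack_sockaddr_in() and INADDR_LOOPBACK are available. Port 0 asks the operating system to choose a free port, which is a convenience for this sketch rather than something a real server would do. The variable names are illustrative.

```perl
use strict;
use warnings;
use Socket;

my $tcpProtocolNumber = eval { getprotobyname('tcp') } || 6;

# --- Server side: socket(), bind(), listen() ---
socket(my $server, PF_INET, SOCK_STREAM, $tcpProtocolNumber)
    or die("socket: $!");
# Port 0 lets the operating system pick any free port.
bind($server, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die("bind: $!");
listen($server, 5) or die("listen: $!");

# Find out which port was actually assigned.
my ($port) = unpack_sockaddr_in(getsockname($server));

# --- Client side: socket(), connect() ---
socket(my $client, PF_INET, SOCK_STREAM, $tcpProtocolNumber)
    or die("socket: $!");
connect($client, pack_sockaddr_in($port, INADDR_LOOPBACK))
    or die("connect: $!");

# --- Server side again: accept() the queued connection ---
accept(my $conversation, $server) or die("accept: $!");

# The client speaks and the server listens.
defined(send($client, "hello\n", 0)) or die("send: $!");
my $buffer = '';
defined(recv($conversation, $buffer, 1024, 0)) or die("recv: $!");
print("server received: $buffer");

close($_) foreach ($conversation, $client, $server);
```

A real server and client would run as separate programs, usually on different machines. The single-script form works here only because the listen() queue holds the connection until accept() is called.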
The rest of the chapter will be devoted to looking at examples of specific protocols. Let's start out by looking at the time service.
Listing 18.1 contains a program that can retrieve the time from any time server in the world. Modify the example to access your own time server by setting the $remoteServer variable to your server's symbolic name.
Pseudocode |
Turn on the warning compiler option. Load the Socket module. Turn on the strict pragma. Initialize the $remoteServer to the symbolic name of the time server. Set a variable equal to the number of seconds in 70 years. Initialize a buffer variable, $buffer. Declare $socketStructure. Declare $serverTime. Get the tcp protocol and time port numbers, provide defaults in case the getprotobyname() and getservbyname() functions are not implemented. Initialize $serverAddr with the Internet address of the time server. Display the current time on the local machine, also called the localhost. Create a socket using the standard parameters. Initialize $packedFormat with format specifiers. Connect the local socket to the remote socket that is providing the time service. Read the server's time as a 4 byte value. Close the local socket. Unpack the network address from a long (4 byte) value into a string value. Adjust the server time by the number of seconds in 70 years. Display the server's name, the number of seconds difference between the remote time and the local time, and the remote time. Declare the ctime() function. Return a string reflecting the time represented by the parameter. |
Listing 18.1-18LST01.PL - Getting the Time from a Time Service |
|
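A minimal sketch of a time client along the lines of the pseudocode above might look like the following. It is not the book's Listing 18.1. The function names serverTimeToUnix() and getServerTime() are inventions for this sketch, and it uses inet_aton() and pack_sockaddr_in() from the Socket module rather than a $packedFormat template. The network function is defined but not called, so the script runs without a connection.

```perl
use strict;
use warnings;
use Socket;

# Seconds between 1 Jan 1900 (the time service's starting point) and
# 1 Jan 1970 (the UNIX starting point): 70 years including 17 leap days.
my $secsIn70Years = 2208988800;

# Convert the 4-byte reply of a time server into a UNIX time value.
sub serverTimeToUnix {
    my ($reply) = @_;
    return unpack('N', $reply) - $secsIn70Years;
}

# Connect to the time service and return the server's time.
# Defined here for illustration; it is not called below.
sub getServerTime {
    my ($remoteServer) = @_;
    my $proto = eval { getprotobyname('tcp') } || 6;
    my $port  = eval { getservbyname('time', 'tcp') } || 37;
    my $serverAddr = inet_aton($remoteServer)
        or die("cannot find $remoteServer");

    socket(my $socket, PF_INET, SOCK_STREAM, $proto) or die("socket: $!");
    connect($socket, pack_sockaddr_in($port, $serverAddr))
        or die("connect: $!");
    my $buffer = '';
    (read($socket, $buffer, 4) || 0) == 4
        or die("short read from $remoteServer");
    close($socket);
    return serverTimeToUnix($buffer);
}

# Offline demonstration: a hand-made reply for one day after 1 Jan 1970.
my $fakeReply = pack('N', $secsIn70Years + 86400);
print(scalar(gmtime(serverTimeToUnix($fakeReply))), "\n");
```

To query a real server, call getServerTime() with the symbolic name of a time server that you are allowed to use.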
Each operating system has a different method for updating the local time, so I'll leave it in your hands to figure out how to do that.
The next section is devoted to sending mail. First the protocol will be explained and then you see a Perl script that can send a mail message.
Note |
The mail service will be listening for your connection on TCP port 25. But this information will not be important until you see some Perl code later in the chapter. |
The message that you prepare can only use alphanumeric characters. If you need to send binary information (like files), use the MIME protocol. The details of the MIME protocol can be found at the http://ds.internic.net/ds/dspg0intdoc.html web site.
SMTP uses several commands to communicate with mail servers. These commands are described in Table 18.3. The commands are not case-sensitive, which means you can use either Mail or MAIL. However, remember that mail addresses can be case-sensitive.
Command | Description |
---|---|
Basic Commands | |
HELO | Initiates a conversation with the mail server. When using this command you can specify your domain name so that the mail server knows who you are. For example, HELO mailhost2.planet.net. |
MAIL | Indicates who is sending the mail. For example, MAIL FROM:<medined@mtolive.com>. Remember this is not your name, it's the name of the person who is sending the mail message. Any returned mail will be sent back to this address. |
RCPT | Indicates who is receiving the mail. For example, RCPT TO:<rolf@earthdawn.com>. You can indicate more than one user by issuing multiple RCPT commands. |
DATA | Indicates that you are about to send the text (or body) of the message. The message text must end with the following five-character sequence: "\r\n.\r\n". |
QUIT | Indicates that the conversation is over. |
Advanced Commands (see RFC 821 for details) | |
EXPN | Expands a mailing list; the server replies with the members of the list. |
HELP | Asks for help from the mail server. |
NOOP | Does nothing other than get a response from the mail server. |
RSET | Aborts the current conversation. |
SEND | Sends a message to a user's terminal instead of a mailbox. |
SAML | Sends a message to a user's terminal and to a user's mailbox. |
SOML | Sends a message to a user's terminal if they are logged on, otherwise sends the message to the user's mailbox. |
TURN | Reverses the role of client and server. This might be useful if the client program can also act as a server and needs to receive mail from the remote computer. |
VRFY | Verifies the existence and user name of a given mail address. This command is not implemented in all mail servers, and it can be blocked by firewalls. |
Every command will receive a reply from the mail server in the form of a three digit number followed by some text describing the reply. For example, 250 OK or 500 Syntax error, command unrecognized. The complete list of reply codes is shown in Table 18.4. Hopefully, you'll never see most of them.
Code | Description |
---|---|
211 | A system status or help reply. |
214 | Help Message. |
220 | The server is ready. |
221 | The server is ending the conversation. |
250 | The requested action was completed. |
251 | The specified user is not local, but the server will forward the mail message. |
354 | This is a reply to the DATA command. After getting this, start sending the body of the mail message, ending with "\r\n.\r\n". |
421 | The mail server is shutting down. Save the mail message and try again later. |
450 | The mailbox that you are trying to reach is busy. Wait a little while and try again. |
451 | The requested action was not done. Some error occurred in the mail server. |
452 | The requested action was not done. The mail server ran out of system storage. |
500 | The last command contained a syntax error or the command line was too long. |
501 | The parameters or arguments in the last command contained a syntax error. |
502 | The mail server has not implemented the last command. |
503 | The last command was sent out of sequence. For example, you might have sent DATA before sending RCPT. |
504 | One of the parameters of the last command has not been implemented by the server. |
550 | The mailbox that you are trying to reach can't be found or you don't have access rights. |
551 | The specified user is not local; part of the text of the message will contain a forwarding address. |
552 | The mailbox that you are trying to reach has run out of space. Store the message and try again tomorrow or in a few days - after the user gets a chance to delete some messages. |
553 | The mail address that you specified was not syntactically correct. |
554 | The mail transaction has failed for unknown causes. |
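Because every reply starts with a three-digit code, a script can decide what to do by looking at the code alone. The sketch below shows one way to split a reply line and to classify it by its first digit. The helper names parseReply() and replyClass() are hypothetical, and the four classes are my summary of the groupings visible in Table 18.4 rather than wording from the table.

```perl
use strict;
use warnings;

# parseReply() is a hypothetical helper: split "250 OK" into code and text.
sub parseReply {
    my ($reply) = @_;
    $reply =~ s/\r?\n\z//;                        # drop the line ending
    my ($code, $text) = $reply =~ /^(\d{3})[ -]?(.*)$/
        or return;                                # not a valid reply line
    return ($code, $text);
}

# replyClass() is a hypothetical helper: classify a code by its first digit.
sub replyClass {
    my ($code) = @_;
    return 'success'           if $code =~ /^2/;
    return 'send more'         if $code =~ /^3/;
    return 'temporary failure' if $code =~ /^4/;
    return 'permanent failure' if $code =~ /^5/;
    return 'unknown';
}

foreach my $reply ("250 OK\r\n", "354 Enter mail\r\n", "550 No Such User\r\n") {
    my ($code, $text) = parseReply($reply);
    print("$code ($text): ", replyClass($code), "\n");
}
```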
Now that you've seen all of the SMTP commands and reply codes, let's see what a typical mail conversation might look like. In the following conversation, the '>' lines are the SMTP commands that your program issues. The '<' lines are the mail server's replies.
>HELO
<250 saturn.planet.net Hello medined@mtolive.com [X.X.X.X],pleased to meet you
>MAIL From: <(Rolf D'Barno, 5th Circle Archer)>
<250 <(Rolf D'Barno, 5th Circle Archer)>... Sender ok
>RCPT To: <medined@mtolive.com>
<250 <medined@mtolive.com>... Recipient ok
>DATA
<354 Enter mail, end with "." on a line by itself
>From: (Rolf D'Barno, 5th Circle Archer)
>Subject: Arrows
>This is line one.
>This is line two.
>.
<250 AAA14672 Message accepted for delivery
>QUIT
<221 saturn.planet.net closing connection
The lines that begin with '>' are the commands that are sent to the server. Some of the SMTP commands are a bit more
complex than others. In the next few sections, the MAIL, RCPT and DATA commands
are discussed. You will also see how to react to undeliverable mail.
MAIL FROM:<reverse-path>
If the mail server accepts the
command, it will reply with a code of 250. Otherwise, the reply code will be
greater than 400.
In the example shown previously
>MAIL From:<(medined@mtolive.com)>
<250 <(medined@mtolive.com)>... Sender ok
The reverse-path is
different from the name given as the sender following the DATA command. You can
use this technique to give a mailing list or yourself an alias. For example, if
you are maintaining a mailing list for your college alumni, you might want the
name that appears in the reader's mailer to be '87 RugRats instead of your own
name.
RCPT TO:<forward-path>
Only one recipient can be named per
RCPT command. If the recipient is not known to the mail server, the response
code will be 550. You might also get a response code indicating that the
recipient is not local to the server. If that is the case, you will get one of
two responses back from the server: 251 if the server will forward the message, or 551 if it will not, in which case the reply text contains a forwarding address (see Table 18.4).
DATA
After you get the standard 354 response, send the body of the
message followed by a line with a single period to indicate that the body is
finished. When the end of message line is received, the server will respond with
a 250 reply code.
Note |
The body of the message can also include several header items like Date, Subject, To, Cc, and From. |
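The sketch below shows one way to prepare message text for the DATA command. It ends every line with "\r\n" and appends the closing line that holds a single period. It also doubles any period that starts a line, which RFC 821 requires so that such a line is not mistaken for the end of the message. The helper name prepareBody() is hypothetical.

```perl
use strict;
use warnings;

# prepareBody() is a hypothetical helper that readies text for DATA.
sub prepareBody {
    my ($text) = @_;
    my @lines = split(/\r?\n/, $text);
    foreach my $line (@lines) {
        # A line that starts with a period gets a second one so the
        # server does not treat it as the end of the message.
        $line =~ s/^\./../;
    }
    # Every line ends with CRLF; the final "." line ends the body.
    return join('', map { "$_\r\n" } @lines) . ".\r\n";
}

my $body = prepareBody("Subject: Arrows\n\nThis is line one.\n.hidden\n");

# Show the result with the line endings made visible.
(my $visible = $body) =~ s/\r\n/\\r\\n\n/g;
print($visible);
```

The blank line after the Subject header separates the header items from the text of the letter.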
An endless loop happens when an error notification message is sent to a non-existent mailbox. The server keeps trying to send a notification message to the reverse-path specified in the MAIL command.
The answer to this dilemma is to specify an empty reverse path in the MAIL command of a notification message like this:
MAIL FROM:<>
An entire mail session that delivers an error
notification message might look like the following:
MAIL FROM:<>
250 ok
RCPT TO:<@HOST.COM@HOSTW.ARPA>
250 ok
DATA
354 send the mail data, end with .
Date: 12 May 96 12:34:53
From: MEDINED@mtolive.com
To: ROBIN@UIC.HOST.COM
Subject: Problem delivering mail.
Robin, your message to JACK@SILVER.COM was not
delivered.
SILVER.COM said this:
"550 No Such User"
.
250 ok
Caution |
I found out, on 11/30/96, that some mail servers require you to add your domain name to the HELO command. If your server has this requirement then the server response is 501 HELO requires domain address. |
Caution |
The script in Listing 18.2 was tested on Windows 95. Some comments have been added to indicate changes that are needed for SunOS 4.1+ and SunOS 5.4+ (Solaris 2). The SunOS comments were supplied by Qusay H. Mahmoud - also known as perlman on IRC. Thanks, Qusay! |
Pseudocode |
Turn on the warning compiler option. Load the Socket module. Turn on the strict pragma. Initialize $mailTo which holds the recipient's mail address. Initialize $mailServer which holds the symbolic name of your mail server. Initialize $mailFrom which holds the originator's mail address. Initialize $realName which holds the text that appears in the From header field. Initialize $subject which holds the text that appears in the Subject header field. Initialize $body which holds the text of the letter. Declare a signal handler for the Interrupt signal. This handler will trap users hitting Ctrl+c or Ctrl+break. Get the protocol number for the tcp protocol and the port number for the smtp service. Windows 95 and NT do not implement the getprotobyname() or getservbyname() functions so default values are supplied. Initialize $serverAddr with the mail server's Internet address. The $length variable is tested to see if it is defined; if not, the gethostbyname() function failed. Create a socket called SMTP using standard parameters. Initialize $packedFormat with format specifiers. Connect the socket to the port on the mail server. Change the socket to use unbuffered input/output. Normally, sends and receives are stored in an internal buffer before being sent to your script. This line of code eliminates the buffering steps. Create a temporary buffer. The buffer is temporary because it is local to the block surrounded by the curly brackets. Read two responses from the server. My mail server sends two responses when the connection is made. Your server may only send one response. If so, delete one of the recv() calls. Send the HELO command. The sendSMTP() function will take care of reading the response. Send the MAIL command indicating where messages that the mail server sends back (like undeliverable mail messages) should be sent. Send the RCPT command to specify the recipient. Send the DATA command. Send the body of the letter. 
Note that no responses are received from the mail server while the letter is sent. Send a line containing a single period indicating that you are finished sending the body of the letter. Send the QUIT command to end the conversation. Close the socket. Define the closeSocket() function which will act as a signal handler. Close the socket. Call die() to display a message and end the script. Define the sendSMTP() function. Get the debug parameter. Get the smtp command from the parameter array. Send the smtp command to STDERR if the debug parameter was true. Send the smtp command to the mail server. Get the mail server's response. Send the response to STDERR if the debug parameter was true. Split the response into reply code and message, and return just the reply code. |
Listing 18.2-18LST02.PL - Sending Mail with Perl |
|
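The heart of the script described above is the sendSMTP() function. The following is a rough sketch of such a function based only on the pseudocode, not the listing itself. Unlike the pseudocode, it takes the socket as a parameter so that it can be tried without a mail server. The demonstration assumes a UNIX-like system where socketpair() is available, and the canned reply is borrowed from the sample conversation shown earlier.

```perl
use strict;
use warnings;
use Socket;

# Send one SMTP command over $socket, read the reply, and return
# just the reply code. The $socket parameter is an addition for this sketch.
sub sendSMTP {
    my ($debug, $socket, $command) = @_;
    print STDERR ("> $command\n") if $debug;
    defined(send($socket, "$command\r\n", 0)) or die("send: $!");

    my $response = '';
    defined(recv($socket, $response, 200, 0)) or die("recv: $!");
    print STDERR ("< $response") if $debug;

    # Split the response into reply code and message; keep only the code.
    my ($replyCode) = split(/ /, $response, 2);
    return $replyCode;
}

# Offline demonstration: a socket pair stands in for the mail server.
socketpair(my $local, my $fakeServer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die("socketpair: $!");
send($fakeServer, "250 saturn.planet.net Hello\r\n", 0);    # canned reply

my $code = sendSMTP(0, $local, 'HELO mailhost2.planet.net');
print("reply code: $code\n");

# Look at what the stand-in server received.
my $seen = '';
recv($fakeServer, $seen, 200, 0);
print("server saw: $seen");
```

With a real connection, pass the connected mail server socket in place of $local and set the debug parameter to 1 to watch the conversation on STDERR.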
This program displays:
> HELO
< 250 saturn.planet.net Hello medined@stan54.planet.net
[207.3.100.120], pleased to meet you
> MAIL From: <medined@mtolive.com>
< 250 <medined@mtolive.com>... Sender ok
> RCPT To: <~r00tbeer@fundy.csd.unbsj.ca>
< 250 <~r00tbeer@fundy.csd.unbsj.ca>... Recipient ok
> DATA
< 354 Enter mail, end with "." on a line by itself
>
.
< 250 TAA12656 Message accepted for delivery
> QUIT
< 221 saturn.planet.net closing connection
The lines that begin with '>' are the
commands that were sent to the server. The body of the letter is not shown in
the output. However, Figure 18.2 shows how the letter looks when displayed using
Netscape's mail program.
Fig. 18.2 - The Letter Created by 18LST02.pl
Listing 18.2, while long, is very straightforward. In order to use it yourself, you need only change the first two assignments. Change $mailTo to your own email address. And change $mailServer to your own mail server. Now run the script. After a minute or two a new mail message will appear in your mailbox.
Listing 18.3 contains a program that will filter your mail. It will display a report of the authors and subject line for any mail that relates to EarthDawn(tm), a role-playing game from FASA. This program will not delete any mail from the server, so you can experiment with confidence.
Note |
Before trying to run this program, make sure that the POP3Client module (POP3Client.pm) is in the Mail subdirectory of the library directory. You may need to create the Mail subdirectory as I did. On my system, this directory is called c:/perl5/Lib/Mail; it is probably different on your system, though. See your system administrator if you need help placing the file into the correct directory. |
Caution |
This script was tested using Windows 95. You might need to modify it for other systems. On SunOS 5.4+ (Solaris 2), you'll need to change the POP3Client module to use a packing format of 'S n c4 x8' instead of 'S n a4 x8'. Other changes might also be needed. |
Pseudocode |
Turn on the warning compiler option. Load the POP3Client module. The POP3Client module will load the Socket module automatically. Turn on the strict pragma. Declare some variables used to hold temporary values. Define the header format for the report. Define the detail format for the report. Initialize $username to a valid username for the mail server. Initialize $password to a valid password for the user name. Create a new POP3Client object. Iterate over the mail messages on the server. $pop->Count holds the number of messages waiting on the server to be read. Initialize a flag variable. When set true, the script will have a mail message relating to EarthDawn. Iterate over the headers in each mail message. The Head() method of the POP3Client module returns the header lines one at a time in the $_ variable. Store the author's name if looking at the From header line. Store the subject if looking at the Subject line. This is the filter test. It checks to see if the word "EarthDawn" is in the subject line. If so, the $earthDawn flag variable is set to true (or 1). This line is commented out; normally it would copy the text of the message into the @body array. This line is also commented out; it will delete the current mail message from the server. Use with caution! Once deleted, you can't recover the messages. Set the flag variable, $earthDawn, to true. Write a detail line to the report if the flag variable is true. |
Listing 18.3-18LST03.PL - Creating a Mail Filter |
|
This program displays:
Waiting Mail Regarding EarthDawn Pg 1
Sender Subject
---------------------- ---------------------------------
Bob.Schmitt [EarthDawn] Nethermancer
Doug.Stoechel [EarthDawn] Weaponsmith
Mindy.Bailey [EarthDawn] Troubador
When you run this
script, you should change $username, $password, and $mailServer and the filter
test to whatever is appropriate for your system.
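The filter test itself does not depend on the mail server, so it can be tried on its own. The sketch below applies the same idea to a list of header lines held in an array. It does not use the POP3Client module. The helper name filterHeaders() and the sample headers are made up for this illustration, with the sender and subject borrowed from the sample report above.

```perl
use strict;
use warnings;

# filterHeaders() is a hypothetical helper. Given a keyword and the
# header lines of one message, it returns (sender, subject) when the
# subject mentions the keyword, and an empty list otherwise.
sub filterHeaders {
    my ($keyword, @headers) = @_;
    my ($from, $subject) = ('', '');
    foreach my $line (@headers) {
        $from    = $1 if $line =~ /^From:\s*(.*)/i;
        $subject = $1 if $line =~ /^Subject:\s*(.*)/i;
    }
    return ($from, $subject) if $subject =~ /\Q$keyword\E/i;
    return;
}

my @message = (
    'From: Bob.Schmitt',
    'To: medined@mtolive.com',
    'Subject: [EarthDawn] Nethermancer',
);

# Print a detail line only when the filter test succeeds.
if (my ($from, $subject) = filterHeaders('EarthDawn', @message)) {
    printf("%-22s %s\n", $from, $subject);
}
```

Changing the first argument of filterHeaders() is all that is needed to filter on a different word.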
You could combine the filter program with the send mail program (from Listing 18.2) to create an automatic mail-response program. For example, if the subject of a message is "Info," you can automatically send a pre-defined message with information about a given topic. You could also create a program to automatically forward the messages to a covering person while you are on vacation. I'm sure that with a little thought you can come up with a half-dozen ways to make your life easier by automatically handling some of your incoming mail.
Caution |
Windows 95 (and perhaps other operating systems) can't use the SIGALRM interrupt signal. This may cause problems if you use this script on those systems because the program will wait forever when a server does not respond. |
Pseudocode |
Turn on the warning compiler option. Load the Socket module. Turn on the strict pragma. Display a message if the red.planet.net server is reachable. Display a message if the saturn.planet.net server is reachable. Declare the echo() function. Get the host and timeout parameters from the parameter array. If no timeout parameter is specified, 5 seconds will be used. Declare some local variables. Get the tcp protocol and echo port numbers. Get the server's Internet address. If $serverAddr is undefined then the name of the server was probably incorrect and an error message is displayed. Check to see if the script is running under Windows 95. If not under Windows 95, store the old alarm handler function, set the alarm handler to be an anonymous function that simply ends the script, and set an alarm to go off in $timeout seconds. Initialize the status variable to true. Create a socket called ECHO. Initialize $packedFormat with format specifiers. Connect the socket to the remote server. Close the socket. Check to see if the script is running under Windows 95. If not under Windows 95, reset the alarm and restore the old alarm handler function. Return the status. |
Listing 18.4-18LST04.PL - Using the Echo Service |
|
This program will display:
echo: red.planet.net could not be found, sorry.
saturn.planet.net is up.
When dealing with the echo service, you only
need to make the connection in order to determine that the server is up and
running. As soon as the connection is made, you can close the socket.
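The connect-then-close idea can be sketched as follows. This is not the book's Listing 18.4. The function name isReachable() and its optional port parameter are assumptions added so that the check can be tried against a listener on the local machine. It also assumes a system where alarm() works, which the Caution above says is not the case on Windows 95.

```perl
use strict;
use warnings;
use Socket;

# Return 1 if a TCP connection to $host:$port succeeds within $timeout
# seconds, 0 otherwise. $port defaults to 7, the echo service.
sub isReachable {
    my ($host, $port, $timeout) = @_;
    $port    = 7 unless defined($port);
    $timeout = 5 unless defined($timeout);

    my $serverAddr = inet_aton($host);
    return 0 unless defined($serverAddr);    # name could not be found

    my $proto = eval { getprotobyname('tcp') } || 6;
    socket(my $socket, PF_INET, SOCK_STREAM, $proto) or return 0;

    my $status = eval {
        # die() in the handler breaks out of a connect() that hangs.
        local $SIG{ALRM} = sub { die("timeout\n") };
        alarm($timeout);
        my $ok = connect($socket, pack_sockaddr_in($port, $serverAddr));
        alarm(0);
        $ok ? 1 : 0;
    };
    alarm(0);
    close($socket);
    return $status ? 1 : 0;
}

# Offline demonstration: listen on a free loopback port, then probe it.
my $proto = eval { getprotobyname('tcp') } || 6;
socket(my $listener, PF_INET, SOCK_STREAM, $proto) or die("socket: $!");
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die("bind: $!");
listen($listener, 5) or die("listen: $!");
my ($port) = unpack_sockaddr_in(getsockname($listener));

print("127.0.0.1 is ", (isReachable('127.0.0.1', $port) ? "up" : "down"), ".\n");
```

Calling isReachable() with only a host name checks the echo port, as the script described above does.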
Most of the program should be pretty familiar to you by now. However, you might not immediately realize what the return statement in the middle of the echo() function does. The return statement is repeated here:
return(print("echo: $host could not be found, sorry.\n"), 0)
if ! defined($serverAddr);
The statement uses the comma
operator to execute two statements where normally you would see one. The value of the last
statement to be evaluated is the value of the whole series. In this
case, a zero value is returned. I'm not recommending this style of coding, but I
thought you should see it at least once. Now, if you see this technique in
another programmer's scripts you'll understand it better. The return statement
could also be written like this:
if (! defined($serverAddr)) {
    print("echo: $host could not be found, sorry.\n");
    return(0);
}
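The two forms behave the same only when the function is called in scalar context. The following sketch shows the difference. The check() function is a hypothetical stand-in for the lookup inside echo(); it is not part of the listing.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the test inside echo(): print a message and
# return 0 through the comma operator when $addr is undefined.
sub check {
    my ($addr) = @_;
    return (print("address is missing\n"), 0)
        if !defined($addr);
    return 1;
}

my $scalar = check(undef);   # scalar context: only the last value, 0
my @list   = check(undef);   # list context: print's result and the 0

print("scalar result: $scalar\n");
print("list result has ", scalar(@list), " elements\n");
```

In list context the comma acts as a list separator, so the caller receives two values instead of one. That is one more reason to prefer the longer if block, which returns a single zero in either context.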
The program in Listing 18.5 downloads the Perl FAQ in compressed format from ftp.cis.ufl.edu and displays a directory in two formats.
Caution |
The ftplib.pl file can be found on the CD-ROM that accompanies this book. Please put it into your Perl library directory. I have modified the standard ftplib.pl that is available from the Internet to allow the library to work under Windows 95 and Windows NT. |
Pseudocode |
Turn on the warning compiler option. Load the ftplib library. Turn on the strict pragma. Declare a variable to hold directory listings. Turn debugging mode on. This will display all of the protocol commands and responses on STDERR. Connect to the ftp server providing a userid of anonymous and your email address as the password. Use the list() function to get a directory listing without first changing to the directory. Change to the /pub/perl/faq directory. Start binary mode. This is very important when getting compressed files or executables. Get the Perl FAQ file. Use list() to find out which files are in the current directory and then print the list. Use dir() to find out which files are in the current directory and then print the list. Turn debugging off. Change to the /pub/perl/faq directory. Use list() to find out which files are in the current directory and then print the list. |
Listing 18.5-18LST05.PL - Using the ftplib Library |
|
This program displays:
<< 220 flood FTP server (Version wu-2.4(21) Tue Apr 9 17:01:12 EDT 1996) ready.
>> user anonymous
<< 331 Guest login ok, send your complete e-mail address as password.
>> pass .....
<< 230- Welcome to the
<< 230- University of Florida
.
.
.
<< 230 Guest login ok, access restrictions apply.
>> port 207,3,100,103,4,135
<< 200 PORT command successful.
>> nlst pub/perl/faq
<< 150 Opening ASCII mode data connection for file list.
<< 226 Transfer complete.
>> cwd /pub/perl/faq
<< 250 CWD command successful.
>> type i
<< 200 Type set to I.
>> port 207,3,100,103,4,136
<< 200 PORT command successful.
>> retr FAQ.gz
<< 150 Opening BINARY mode data connection for FAQ.gz (75167 bytes).
<< 226 Transfer complete.
>> port 207,3,100,103,4,138
<< 200 PORT command successful.
>> nlst
<< 150 Opening BINARY mode data connection for file list.
<< 226 Transfer complete.
list of /pub/perl/faq
FAQ
FAQ.gz
>> port 207,3,100,103,4,139
<< 200 PORT command successful.
>> list
<< 150 Opening BINARY mode data connection for /bin/ls.
<< 226 Transfer complete.
list of /pub/perl/faq
total 568
drwxrwxr-x 2 1208 31 512 Nov 7 1995 .
drwxrwxr-x 10 1208 68 512 Jun 18 21:32 ..
-rw-rw-r-- 1 1208 31 197446 Nov 4 1995 FAQ
-rw-r--r-- 1 1208 31 75167 Nov 7 1995 FAQ.gz
list of /pub/perl/faq
FAQ
FAQ.gz
I'm sure that you can pick out the different FTP commands
and responses in this output. Notice that the FTP commands and responses are
only displayed when the debugging feature is turned on.
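The port commands in the output deserve a closer look. In the FTP protocol, the six numbers are the four bytes of the client's address followed by the high and low bytes of the port on which the client will accept the data connection. Here is a small sketch that decodes them; the decode_port() function is my own illustration and is not part of ftplib.pl.

```perl
use strict;
use warnings;

# Decode the argument of an FTP PORT command ("h1,h2,h3,h4,p1,p2")
# into a dotted address and a port number. Returns an empty list
# if the argument is not six numbers between 0 and 255.
sub decode_port {
    my ($arg) = @_;
    my @n = split(/,/, $arg);
    return unless @n == 6 && !grep { !/^\d+$/ || $_ > 255 } @n;

    my $address = join('.', @n[0 .. 3]);
    my $port    = $n[4] * 256 + $n[5];   # the high byte comes first
    return ($address, $port);
}

my ($address, $port) = decode_port('207,3,100,103,4,135');
print("$address:$port\n");   # prints 207.3.100.103:1159
```

Notice that the last number increases with each transfer in the output above, which means that the client opened a new port for every data connection.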
Like most services, NNTP uses a client/server model. You connect to a news server and request information using NNTP. The protocol consists of a series of commands and replies. I think NNTP is a bit more complicated than the others because the variety of things you might want to do with news articles is larger.
Caution |
Some of the NNTP commands will result in very large responses. For example, the LIST command will retrieve the name of every newsgroup that your server knows about. Since there are over 10,000 newsgroups it might take a lot of time for the response to be received. |
I suggest using Perl to filter newsgroups or to retrieve all the articles available and create reports or extracts. Don't use Perl for a full-blown news client. Use Java, Visual Basic, or another language that is designed with user interfaces in mind. In addition, there are plenty of great free or inexpensive news clients available, so why reinvent the wheel?
Listing 18.6 contains an object-oriented program that encapsulates a small number of NNTP commands so that you can experiment with the protocol. Only the simplest of the commands have been implemented to keep the example small and uncluttered.
Pseudocode |
Turn on the warning compiler option. Load the Socket module. Turn on the strict pragma. Begin the News package. This also starts the definition of the News class. Define the new() function - the constructor for the News class. Get the class name from the parameter array. Get the name of the news server from the parameter array. Declare a hash with two entries - the class properties. Bless the hash. Call the initialize() function that connects to the server. Define a signal handler to gracefully handle Ctrl+C and Ctrl+Break. Return a reference to the hash - the class object. Define the initialize() function - connects to the news server. Get the class name from the parameter array. Get the protocol number, port number, and server address. Create a socket. Initialize the format for the pack() function. Connect to the news server. Modify the socket to use non-buffered I/O. Call the getInitialResponse() function. Define getInitialResponse() - receive response from connection. Get the class name from the parameter array. Initialize a buffer to hold the response. Get the response from the server. Print the response if debugging is turned on. Define closeSocket() - signal handler. Close the socket. End the script. Define DESTROY() - the destructor for the class. Close the socket. Define debug() - turns debugging on or off. Get the class name from the parameter array. Get the state (on or off) from the parameter array. Turn debugging on if the state is on or 1. Turn debugging off if the state is off or 0. Define send() - send an NNTP command and get a response. Get the class name from the parameter array. Get the command from the parameter array. Print the command if debugging is turned on. Send the command to the news server. Get a reply from the news server. Print the reply if debugging is turned on. Return the reply to the calling routine. Define article() - gets a news article from the server. Get the class name from the parameter array.
Get the article number from the parameter array. Return the response to the ARTICLE command. No processing of the response is needed. Define group() - gets information about a specific newsgroup. Get the class name from the parameter array. Get the newsgroup name from the parameter array. Split the response using space characters as a delimiter. Define help() - gets a list of commands and descriptions from server. Return the response to the HELP command. Define quit() - ends the session with the server. Send the QUIT command. Close the socket. Start the main package or namespace. Declare some local variables. Create a News object. Turn debugging on. Get information about the comp.lang.perl.misc newsgroup. If the reply is good, display the newsgroup information. Turn debugging off. Initialize some loop variables. The loop will execute 5 times. Start looping through the article numbers. Read an article, split the response using newline as the delimiter. Search through the lines of the article for the From and Subject lines. Display the article number, author, and subject. Turn debugging on. Get help from the server. The help text will be displayed because debugging is on. Stop the NNTP session. Define the min() function - find smallest element in parameter array. Store the first element into $min. Iterate over the parameter array. If the current element is smaller than $min, set $min equal to it. Return $min. |
Listing 18.6-18LST06.PL - Using the NNTP Protocol to Read Usenet News |
|
This program displays:
<200 jupiter.planet.net InterNetNews NNRP server INN 1.4 22-Dec-93 ready (post
> GROUP comp.lang.perl.misc
< 211 896 27611 33162 comp.lang.perl.misc
There are 896 articles, from 27611 to 33162.
#27611 From: rtvsoft@clearlight.com
Subject: Re: How do I suppress this error message
#27612 From: aml@world.std.com
Subject: Re: find and replace
#27613 From: hallucin@netvoyage.net
Subject: GRRRR!!!! Connect error!
#27614 From: mheins@prairienet.org
Subject: Re: Why does RENAME need parens?
#27615 From: merlyn@stonehenge.com
Subject: Re: Date on Perl 2ed moved?
#27616 From: Tim
Subject: Re: How do I suppress this error message
> HELP
< 100 Legal commands
authinfo user Name|pass Password
article [MessageID|Number]
body [MessageID|Number]
date
group newsgroup
head [MessageID|Number]
help
ihave
last
list [active|newsgroups|distributions|schema]
listgroup newsgroup
mode reader
newgroups yymmdd hhmmss ["GMT"] [<distributions>]
newnews newsgroups yymmdd hhmmss ["GMT"] [<distributions>]
next
post
slave
stat [MessageID|Number]
xgtitle [group_pattern]
xhdr header [range|MessageID]
xover [range]
xpat header range|MessageID pat [morepat...]
xpath xpath MessageID
Report problems to <medined@mtolive.com>
.
The program previously listed is very useful for hacking but it is
not ready for professional use in several respects. The first problem is that it
pays no attention to how large the incoming article is. It will read up to one
million characters, which is probably not what you want, so you might consider
a different method. The second problem is that it ignores error messages sent
from the server. In a professional program, this is a bad thing to do. Use this
program as a launchpad to a more robust application.
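As a starting point for both problems, here is a minimal sketch of two helper functions. They are not part of Listing 18.6, and the names parse_status() and read_text_response() are my own. The first separates the three-digit status code from the rest of a reply so that error codes (the 400 and 500 series) can be checked. The second reads a multi-line response one line at a time until the line that holds a single period, and gives up if the response grows past a limit that you choose. The sample reply is made up and is read from a string rather than from a socket so that you can run the sketch without a news server.

```perl
use strict;
use warnings;

# Split a status line such as "211 896 27611 33162 comp.lang.perl.misc"
# into its three-digit code and the remaining text.
sub parse_status {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    my ($code, $text) = $line =~ /^\s*(\d{3})(?:\s+(.*))?$/
        or die("Malformed reply: $line\n");
    $text = '' unless defined($text);
    return ($code, $text);
}

# Read lines from $fh up to the terminating "." line. Dies if more than
# $max_bytes characters arrive or the input ends too early.
sub read_text_response {
    my ($fh, $max_bytes) = @_;
    my @lines;
    my $total = 0;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\r?\n\z//;
        return @lines if $line eq '.';   # end of the response
        $line =~ s/^\.\././;             # undo the doubled leading period
        $total += length($line);
        die("Response larger than $max_bytes characters\n")
            if $total > $max_bytes;
        push(@lines, $line);
    }
    die("Connection closed before the end of the response\n");
}

# Made-up sample reply, read through an in-memory file handle.
my $reply = "220 27611 article follows\r\n"
          . "From: someone\@example.com\r\n"
          . "Subject: test\r\n"
          . "\r\n"
          . "..a line that starts with a period\r\n"
          . ".\r\n";
open(my $fh, '<', \$reply) or die("Could not open string: $!\n");

my ($code, $text) = parse_status(scalar(<$fh>));
die("Server reported an error: $code $text\n") if $code >= 400;

my @article = read_text_response($fh, 100_000);
print("Read ", scalar(@article), " lines after status $code\n");
```

With a real connection you would pass the socket handle instead of $fh. The text part of a GROUP reply can then be broken apart with split(' ', $text) to get the article count, the first and last article numbers, and the newsgroup name.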
In order to get you started, there are two files on the CD-ROM, url.pl and url_get.pl. These libraries will retrieve web documents when given a specific URL. Place them into your Perl directory and run the program in Listing 18.7. It will download the Perl home page into the $perlHomePage variable.
Pseudocode |
Load the url_get library. Initialize $perlHomePage with the contents of the Perl home page. |
Listing 18.7-18LST07.PL - Retrieving the Perl Home Page. |
|
The HTTP standard is kept on the http://www.w3.org/pub/WWW/Protocols/HTTP/HTTP2.html web page.
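If you do not have the CD-ROM libraries at hand, the same idea can be sketched with the HTTP::Tiny module, which is bundled with Perl 5.14 and later. This is a substitute for url_get.pl, not a copy of it; the get_page() function name and the example URL in the comment are placeholders of my own.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: return the body of the document at $url,
# or die with the status that HTTP::Tiny reports.
sub get_page {
    my ($url) = @_;
    my $response = HTTP::Tiny->new(timeout => 10)->get($url);
    die("GET $url failed: $response->{status} $response->{reason}\n")
        unless $response->{success};
    return $response->{content};
}

# Usage (needs a network connection; the URL is only a placeholder):
#   my $page = get_page('http://www.example.com/');
#   print(length($page), " characters retrieved\n");
```

HTTP::Tiny reports connection problems through the same response hash as server errors, so the single check of the success field covers both cases. An https URL additionally needs the IO::Socket::SSL module to be installed.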
You started off this chapter with a list of some protocols or services that are available. Then you learned that protocols are sets of commands and responses that both a server and a client understand. The high-level protocols (like mail and file-transfer) rest on top of the TCP/IP protocol. TCP/IP was ignored because, like any good foundation, you don't need to know its details in order to use it.
Servers and clients use a different set of functions. Servers use socket(), bind(), listen(), accept(), close(), and a variety of I/O functions. Clients use socket(), connect(), close(), and a variety of I/O functions.
On the server side, every socket must have an address which consists of the server's address and a port number. The port number can be any number greater than 1024. The name and port are combined using a colon as a delimiter. For example, www.foo.com:4000.
Next, you looked at an example of the time service. This service is useful for synchronizing all of the machines on a network.
SMTP or Simple Mail Transfer Protocol is used for sending mail. There are only five basic commands: HELO, MAIL, RCPT, DATA, and QUIT. These commands were discussed and then a mail-sending program was shown in Listing 18.2.
The natural corollary to sending mail is receiving mail - done with the POP or Post Office Protocol. Listing 18.3 contained a program to filter incoming mail looking for a specific string. It produced a report of the messages that contained that string in the subject line.
After looking at POP, you saw how to use the Echo service to see if a server was running. This service is of marginal use under Windows operating systems because they do not handle the SIGALRM signal, so a process might wait forever for a server to respond.
Then, you looked at FTP or File Transfer Protocol. This protocol is used to send files between computers. The example in Listing 18.5 used object-oriented techniques to retrieve the Perl Frequently Asked Questions file.
NNTP was next. The news protocol can retrieve articles from a news server. While the example was a rather large program, it still only covered a few of the commands that are available.
Lastly, the HTTP protocol was mentioned. A very short, two-line program was given to retrieve a single web page.