Chapter 13
Messaging Facilities: The System V IPC Functions
This chapter introduces you to the Interprocess Communications (IPC) functionality of message queues, shared memory, and semaphores. The IPC facilities provide a clean, consistent solution to passing data between processes on the same machine. (Sockets can extend across platforms and a network.)

An Introduction to the System V IPC

The UNIX System V IPC enables you to send messages between processes through message queues, share areas of memory between processes, and synchronize access to shared resources with semaphores.
Each IPC function is available to calling processes as a system resource. These resources are available to all processes on a system-level basis and can be shared by many processes on the same system. IPC resources are limited to the system they reside on and do not offer networking functionality. Because there are only a limited number of IPC resources on any UNIX system, it's important to free each resource after using it: an IPC resource can exist long after the process that created it has finished executing.

Each IPC resource is referred to as an object in the operating system. To work with IPC resources, you either create an object or use an existing one. IPC objects are created via a get() function for that type of object. Each get() function call requires a unique positive IPC key as the identifier for that object. The kernel converts the key into an ID number, which the get() function returns. The ID is then used by the other related functions to refer to that object for all other operations.

An IPC key is a long integer and is used to name the IPC resource. A key is usually assigned by the programmer but can also be assigned by the system. Keys for shared memory, message queues, and semaphores have to be unique only within each type of object, so the same key number can be assigned to IPC objects of different types. That is, a semaphore with a key of 11 can coexist on the same system with a message queue with a key of 11. However, a second semaphore with a key of 11 cannot exist on the same system. Programmers can force the underlying operating system to assign a key by specifying the &IPC_PRIVATE flag (this is explained shortly).

When you pass a key number to a get() function, an ID is returned. Once an object is created and its ID is returned, the object must be referred to by its ID. You can draw the analogy that a file handle is to a file as an ID is to an IPC resource. The returned IDs are nonnegative if there are no errors.
(In C, a get() call returns -1 if there is an error; the Perl functions described later in this chapter return undef instead.) You can have a unique key created for you by using the &IPC_PRIVATE flag if you are not imaginative enough. The kernel then creates the ID and the key for you.

IPC objects are global. Once created, the object is available to all the processes in the system. In this respect, you have to be careful how you access the available resources because any process with permission can overwrite your shared memory, message queue, or semaphore. Also, your IPC object remains in memory long after your process has gone. You, not the kernel, are responsible for cleanup.

When you create the object, you also have to specify permissions. The format of the permissions is very similar to that of files: three groups of read/write bits for owner, group, and other. The execute permission bits are ignored by the IPC calls. To get access to an existing object, you have to specify 0 for permissions. The following flags are permitted for creating objects:
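As a rough illustration of how keys, IDs, creation flags, and permissions fit together, the following sketch creates a private message queue and then removes it again. It is not one of the chapter's listings. It assumes a current Perl with the bundled IPC::SysV module supplying the constants (rather than the *.ph header files discussed in the next section), and the choice of IPC_CREAT and IPC_EXCL as creation flags is part of that assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# Assumption: constants come from IPC::SysV, not from ipc.ph.
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_EXCL IPC_RMID);

# IPC_PRIVATE lets the kernel pick the key. 0600 restricts access
# to the owner; the flags and permission bits are ORed together.
my $id = msgget(IPC_PRIVATE, IPC_CREAT | IPC_EXCL | 0600);
defined $id or die "msgget failed: $!";
print "created message queue with ID $id\n";

# The object outlives this process unless it is removed explicitly.
msgctl($id, IPC_RMID, 0) or die "msgctl(IPC_RMID) failed: $!";
print "removed message queue $id\n";
```

From this point on, only the ID matters: every later call refers to the object through `$id`, never through the key.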
Using the UNIX System V IPC Functions

With Perl you can access all of the IPC functions via a standard set of library functions. The information required for the functions is consistent with the UNIX system interface; therefore, the UNIX man pages provide enough information about the facilities available on your system.

The constants used with the System V IPC functions are defined in Perl header files. For a Perl installation on a UNIX system, the required information is in the *.ph files. (The ph stands for Perl header.) The following files will be required by most of the Perl scripts you write to utilize the IPC facilities:

    require "ipc.ph";

Keep in mind that this might not work as shown here. Here are the primary reasons an error occurs when you try to include these files with the require statement:
To cure these problems, you'll have to run the h2ph script in the /usr/lib/perl directory. The h2ph script contains a line (around line 9) that sets the variable $perlincl to a directory. On my machine, this variable is set to /usr/lib/perl5/i486-linux/5.002. On your machine, this value might be different. In any event, the value of $perlincl is the directory where the h2ph script stores the *.ph files. Now go to the /usr/include directory and, as root, run the h2ph command as shown here:

    h2ph * sys/*

The include files on your system may require that more subdirectories be included in the paths specified to this program. For example, on a Linux 3.0 system, the command to get most of the required files is this:

    h2ph * sys/* asm/* linux/*

The only clear way to know which files are required is to include the Perl header files in a sample script. If everything goes well, you should be able to get the script to run. The sample script shown in Listing 13.1 gives two ignorable warnings on all three different Linux versions. The script does manage to create the message queue as expected. I cover the topic of message queues in the sections that follow.

Listing 13.1. A sample script to test Perl header file inclusion.

    1 #!/usr/bin/perl

The three required files in the sample script are included for message queues, shared memory, and semaphores, respectively. Only those that are required have to be included. That is, you do not have to include shm.ph if you aren't going to be using shared memory. The ipc.ph file is required for any of these three features.

PERMISSIONS is set to 0666, meaning that any process can work with or even delete the IPC object in question. For a more secure system, you might consider using 0600 to give permissions to the owner only.

The msgget() Function

In Listing 13.1 an IPC message queue was created. To use the System V message-passing facility, you first obtain a message queue ID for a given message queue.
Here's the syntax of the msgget() function:

    $msgid = msgget ($key, $flag);

$key is set to either &IPC_PRIVATE or an arbitrary constant. If $key is &IPC_PRIVATE or $flag has &IPC_CREAT set, the message queue is created and its queue ID is returned in $msgid. If &IPC_EXCL is set as well, the call fails when the object already exists. If msgget() cannot create the message queue, $msgid is set to undef.

The ipcs Command

After running the test script, you can see what the created object looks like by using the ipcs command. The ipcs command lists the status of any IPC objects in the system. Here is the output from the ipcs command after creating the message queue.

    ------ Shared Memory Segments --------

The output from the ipcs command on your machine may be different from the one shown here. However, most of the information should be the same. For instance, in this example one message queue is shown as being created. The ID of this queue is 128; it is owned by khusain and has permissions of 0666, thereby allowing any process to manipulate it. The message queue has no messages in it and is not using any memory for queuing messages.

The msgsnd() and msgrcv() Functions

Use the msgsnd() function to send a message to a message queue. The syntax of the msgsnd function is this:

    $err = msgsnd ($msgid, $message, $flags);

$msgid is the message queue ID returned by msgget(); $message is the content of what you are sending (the content does not have to be text). $flags specifies options to use when sending the message. The msgsnd() function returns a true value if the send operation succeeds and a false value if an error occurs. You can check $! for the errno code if you get a false value back from this call.

Call the msgrcv() function to read messages from a message queue. The syntax of the msgrcv function is this:

    $err = msgrcv ($msgid, $rcvd, $size, $mesgtype, $flags);

$msgid is the ID of the message queue. $rcvd is a scalar variable in which the incoming data is stored.
$size is the maximum number of bytes of message body to accept; the data stored in $rcvd consists of the message type followed by that body. The message type is specified in $mesgtype by the caller. If $mesgtype is 0, the first message on the queue is pulled off, whatever its type. A positive value implies that the first message of the type equal to the value in $mesgtype will be pulled. A negative value of $mesgtype requests the first message with the lowest type that is less than or equal to the absolute value of $mesgtype. $flags specifies options that affect the call. If &IPC_NOWAIT is specified, the function returns immediately with the appropriate error code when no suitable message is available. Without &IPC_NOWAIT (for example, with $flags set to 0), the function waits until there is a message on the queue. The msgrcv() function returns a true value if a message has arrived; otherwise it returns a false value. You can check $! for the errno code if you get a false value back from this call.

Let's see how to send a message. A message starts with a long integer (four bytes on a 32-bit system) followed by the body of the message. These first bytes hold the message type, which identifies each message. It's up to the receiver to know how many bytes to expect from the type of the message. First, Listing 13.2 presents a script that creates a message queue using a unique key and then sends a message on it.

Listing 13.2. Creating a message queue and sending a message on it.

    1 #!/usr/bin/perl

Don't forget to replace lines 2 through 4 with your machine's specific path! Lines 5 and 6 include the header files for the message queue facility. Line 7 sets the permissions to be globally vulnerable; that is, anyone can attach to or even destroy an object created with these permissions. The $ipckey value is set to 42, an arbitrary number chosen to be unique on the system. Had this value been left as &IPC_PRIVATE, a new message queue would be created every time this script is run. Too many queues will eat up system resources, so use these scripts judiciously. The message itself is created in lines 11 and 12 using the pack statement.
The L parameter to pack sets up the message type, and the a* parameter specifies a string of arbitrary length. The message will be 12 bytes long, including the null terminator for the string, but not including the four-byte message type. Line 13 is where the message is actually sent. The &IPC_NOWAIT flag requests that the call return immediately even if the message could not be sent. If you want to wait, leave the flag out (pass 0) instead. Be warned, however, that the script making the call is then suspended until the message is sent. To see if the message made it to the message queue, check the output from the ipcs command:

    ------ Shared Memory Segments --------

Note that there isn't a receiver to receive the message just yet. If we do not create a receiving process, the messages in the queue will just sit there until the queue is destroyed. Queues have to be destroyed manually; the system will not destroy them for you automatically. The queue, with an ID of 512, holds one message with a length of 12 bytes. The message stays in the queue until it's retrieved by something else. That something else is the script shown in Listing 13.3.

Listing 13.3. Receiving messages on the message queue.

    1 #!/usr/bin/perl

Again, don't forget to modify lines 2 through 4 for your machine. Line 15 in Listing 13.3 is of importance to us. Note how the message type is set to 0, even though the message type sent was 1. The message type in the msgrcv() function can take three sets of values: 0, to pull the first message on the queue regardless of its type; a positive value, to pull the first message of exactly that type; and a negative value, to pull the first message with the lowest type less than or equal to the absolute value given.
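The send and receive steps described above can be sketched in a single script. This is not a reconstruction of Listings 13.2 and 13.3: it assumes the IPC::SysV module for the constants, uses a private queue instead of the key 42, and packs the type with the `l!` (native long) template that the perlfunc documentation suggests for msgsnd. The variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# Assumption: constants come from IPC::SysV, not from ipc.ph/msg.ph.
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT IPC_RMID);

my $qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
defined $qid or die "msgget failed: $!";

# Every message starts with a native long holding the message type.
my $type = 1;
my $body = "hello world";
msgsnd($qid, pack("l! a*", $type, $body), IPC_NOWAIT)
    or die "msgsnd failed: $!";

# A type of 0 asks for the first message on the queue, whatever its
# type. 64 is the largest message body we are prepared to accept.
my $rcvd;
msgrcv($qid, $rcvd, 64, 0, IPC_NOWAIT) or die "msgrcv failed: $!";

# The received scalar holds the type followed by the body.
my ($got_type, $got_body) = unpack("l! a*", $rcvd);
print "type=$got_type body=$got_body\n";

# Remove the queue so it does not linger after the script exits.
msgctl($qid, IPC_RMID, 0) or die "msgctl(IPC_RMID) failed: $!";
```

Because the sender and receiver are the same process here, IPC_NOWAIT is used on both calls so that a failure shows up as an error instead of a hang.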
Run the receiver script. The message queue should be empty now. Let's confirm that the message queue is empty by examining the output of the ipcs command. In the following output, look at the information for the message queues. You should see zero for the number of messages and zero bytes used by the queue.

    ------ Shared Memory Segments --------

It's not a good idea to leave IPC objects around in the system. The msgctl() function is used to set options for message queues and send commands that affect them. Generally, this function is used to delete message queues. Here's the syntax of the msgctl function:

    $err = msgctl ($msgid, $msgcmd, $msgarg);

$msgid is the message queue ID. The argument $msgcmd is the command to be sent to the message queue. The list of available commands is defined in the file ipc.ph. Some of the commands that can be specified by $msgcmd set the values of message queue options. If one of these commands is specified, the new value of the option is specified in $msgarg. If an error occurs, msgctl returns the undefined value and sets errno in $!. On success, it returns a true value (the string "0 but true" when the underlying call returns 0).

To delete a queue, use the following command:

    $ret = msgctl($msgid, &IPC_RMID, $NULL);

The returned value will be undefined if there is an error; otherwise, it is "0 but true". Sometimes the message queue can be deleted in a signal handler, like this:

    sub cleanup {

Shared Memory

Message queues are great for sending messages in FIFO order. The major problem with message queues is that they can overflow if no one is there to receive the messages.

Shared memory areas have to be explicitly created before you can use them. To do this, call the shmget function with a key as you did with message queues. Here's the syntax of the shmget function:

    $shmid = shmget ($key, $msize, $flag);

As with message queues, $key is either &IPC_PRIVATE or an arbitrary constant.
If the key is &IPC_PRIVATE or the flag has &IPC_CREAT set, the shared memory segment is created, and its ID is returned in $shmid. $msize is the size of the created shared memory in bytes. If shmget() cannot create the shared memory area, the returned value in $shmid is set to undef. The values for $flag are the same as with message queues. Here are the actions you can perform on a shared memory segment:
The shmwrite() and shmread() Functions

To write data to an area of shared memory, call the shmwrite() function. Here's the syntax of the shmwrite function:

    shmwrite ($shmid, $text, $pos, $size);

$shmid is the shared memory ID returned by shmget. $text is the character string to write to the shared memory, $pos is the number of bytes to skip over in the shared memory before writing to it, and $size is the number of bytes to write. This function returns a true value if the write succeeds or, in the case of an error, a false value. If the data specified by $text is longer than the value specified by $size, only the first $size bytes of $text are written to the shared memory. If the data specified by $text is shorter than the value specified by $size, shmwrite fills the leftover space with null characters. An error also occurs if you attempt to write too many bytes to the shared memory area (that is, if the value of $pos plus $size is greater than the number of bytes in the shared memory segment).

To read data from a segment of shared memory, call the shmread function. Here's the syntax of the shmread function:

    shmread ($shmid, $retval, $pos, $size);

Here, $shmid is the shared memory ID returned by shmget. The $retval variable is a scalar variable (or array element) in which the returned data is to be stored. The data is read starting $pos bytes from the start of the shared memory segment, and $size is the number of bytes to copy. This function returns a true value if the read operation succeeds, or a false value in the case of an error. Only the number of bytes requested is returned in $retval. An error occurs if you attempt to read too many bytes from the shared memory area. In other words, if the value of $pos plus $size is greater than the number of bytes in the shared memory segment, you'll get an error. On errors, the contents of the $retval scalar are undefined.
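To make the padding and bounds behavior concrete, here is a small sketch that writes to a segment, reads the data back, and then attempts a write that runs past the end of the segment. It is an illustration only, separate from Listings 13.4 and 13.5, and it assumes the IPC::SysV module for the constants; the 1,024-byte size and the 64-byte field width are arbitrary choices for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# Assumption: constants come from IPC::SysV, not from ipc.ph/shm.ph.
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

my $segsize = 1024;
my $shmid = shmget(IPC_PRIVATE, $segsize, IPC_CREAT | 0600);
defined $shmid or die "shmget failed: $!";

# Write a short string into a 64-byte field at offset 0. The unused
# part of the field is filled with null characters.
my $text = "data for another process";
shmwrite($shmid, $text, 0, 64) or die "shmwrite failed: $!";

# Read the same 64 bytes back and strip the null padding.
my $raw;
shmread($shmid, $raw, 0, 64) or die "shmread failed: $!";
(my $copy = $raw) =~ s/\0+\z//;
print "read back: '$copy'\n";

# $pos + $size must stay inside the segment; this write does not.
my $ok = shmwrite($shmid, $text, $segsize - 10, 64);
print "out-of-range write ", ($ok ? "succeeded" : "was refused"), "\n";

# Remove the segment; it would otherwise outlive the script.
shmctl($shmid, IPC_RMID, 0) or die "shmctl(IPC_RMID) failed: $!";
```

In a real application the reader would be a different process that obtains the same segment through a shared key, which is the situation the listings that follow describe.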
See Listing 13.4 for a simple Perl script that creates a memory segment and then puts some data in it.

Listing 13.4. The use of shmget() and shmwrite().

    1 #!/usr/bin/perl

Note that in Listing 13.4, the shared memory segment is 1,024 bytes long. The shared memory segment is not automatically destroyed when the creating process is killed. The values and the space for these values in the shared memory area remain there even after the process that created the segment is long gone. A second application can now come in and read from the shared memory segment. This second application is shown in Listing 13.5.

Listing 13.5. The use of shmread().

    1 #!/usr/bin/perl

This little example brings up a very possible and potentially dangerous scenario concerning the use of shared memory to pass data between two applications. Take two processes, A and B, which share data through shared memory. What if process A is in the middle of writing some data that B is reading? There is a high probability that the data read by B could be mangled by A. Nothing prevents B from reading from the same offset to which A is writing. To prevent such potentially erroneous read/write situations, you have to lock the resource against simultaneous use. This is where semaphores come into play. A semaphore allows multiple processes to synchronize access to a resource.

Semaphores

A semaphore is simply a counter in the kernel. Its value is 0 or greater and reflects how much of a resource is available. A value of 0 indicates that the resource is unavailable. When a resource is locked by a process, the value of the semaphore is decremented. When the resource is freed, the value of the semaphore is incremented. A decrement that would take the semaphore below 0 causes the process to block (that is, wait until some other process increments it).
A semaphore is a data structure in the kernel that contains, along with its value, the process ID of the last process to perform a semaphore operation and the number of processes waiting on the semaphore to be 0. A binary semaphore uses a value of either 1 or 0.

To use a semaphore, you must first create it. To do this, call the semget function. Here's the syntax of the semget function:

    $semid = semget ($key, $num, $flag);

The key and flag here are the same as those for shared memory or message queues. If the key is &IPC_PRIVATE or the flag has &IPC_CREAT set, the semaphore is created and its ID is returned in $semid. The $num variable is the number of semaphores to create; together they form an array of semaphores, and each one is addressed by its index. The first element of the array is at index 0. If semget is unable to create the semaphore, $semid is set to undef.

To perform a semaphore operation, call the semop() function. Here's the syntax of the semop() function:

    semop ($semid, $semstructs);

Here, $semid is the semaphore ID returned by semget, and $semstructs is a character string consisting of an array of semaphore structures. Each semaphore structure consists of the following components, each of which is a short integer (as created by the s format character in pack): the number (index) of the semaphore in the array, the operation to perform on it, and the flags for the operation.
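This function returns a true value if the semaphore operation is successful; otherwise, it returns a false value. There are three actions you can take with a semaphore. Each of these actions happens on the elements of the array you created in the semget() function. These actions add a value to or subtract a value from the semaphore: a positive operation value is added to the semaphore; a negative value is subtracted from it, and the caller blocks if the result would fall below 0; and a value of 0 makes the caller wait until the semaphore is 0.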
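The lock and unlock operations can be sketched with a single binary semaphore. The example below is an illustration under stated assumptions, not the chapter's Listing 13.6: it takes its constants from the IPC::SysV module, packs each operation with the `s!3` template that the perlfunc documentation gives for semop, and uses two helper subs, `sem_op` and `sem_value`, that are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# Assumption: constants come from IPC::SysV, not from ipc.ph/sem.ph.
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT IPC_RMID SETVAL GETVAL);

# Create a set that holds one semaphore (index 0).
my $semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
defined $semid or die "semget failed: $!";

# Hypothetical helper: apply one operation to one semaphore.
sub sem_op {
    my ($num, $delta, $flags) = @_;
    return semop($semid, pack("s!3", $num, $delta, $flags));
}

# Hypothetical helper: current value of semaphore 0.
sub sem_value {
    my $v = semctl($semid, 0, GETVAL, 0);
    defined $v or die "semctl(GETVAL) failed: $!";
    return 0 + $v;    # "0 but true" becomes a plain 0
}

# Start at 1: the resource is free.
semctl($semid, 0, SETVAL, 1) or die "semctl(SETVAL) failed: $!";

sem_op(0, -1, 0) or die "lock failed: $!";       # lock: 1 -> 0
my $after_lock = sem_value();

# A second lock would block; with IPC_NOWAIT it fails instead.
my $second = sem_op(0, -1, IPC_NOWAIT);

sem_op(0, 1, 0) or die "unlock failed: $!";      # unlock: 0 -> 1
my $after_unlock = sem_value();

print "after lock: $after_lock\n";
print "second lock ", ($second ? "succeeded" : "was refused"), "\n";
print "after unlock: $after_unlock\n";

semctl($semid, 0, IPC_RMID, 0) or die "semctl(IPC_RMID) failed: $!";
```

Without IPC_NOWAIT, the second decrement would suspend the script until some other process incremented the semaphore, which is exactly the blocking behavior that lets two processes take turns.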
The semctl function enables you to set options for semaphores and issue commands that affect them. Here's the syntax of the semctl function:

    $err = semctl ($semid, $semcmd, $semarg);

$semid is the semaphore ID returned by semget. $semcmd is the command that affects the semaphore; the list of available commands includes &IPC_RMID for removing the resource. Check the ipc.ph file for more commands on your system. Some of the commands that can be specified by $semcmd set the values of semaphore options. If one of these commands is specified, the new value of the option is specified in $semarg. If an error occurs, semctl returns the undefined value; otherwise, it returns 0.

Listing 13.6 shows an example of a parent and child process sharing a shared memory resource using semaphores.

Listing 13.6. Using semaphores and shared memory together.

    1 #!/usr/bin/perl

Line 2 specifies that the buffers be flushed immediately on write. This is a good idea when you are working with forked processes. Lines 5 through 11 set up the include paths for the required header files. In lines 15 through 20 the semaphore is set up and created for the parent and child processes to use. In line 24 the shared memory segment is created. The process forks off into two processes in line 27.

The child process waits for the semaphore by first explicitly setting the local counter to 0 (see line 33). Then it checks for the value of the semaphore in line 35 after packing the parameters into the semaphore structure in line 34. When it breaks out of the semaphore (that is, when the value of the semaphore is 0), the child reads data from the shared memory segment. It then sets the value of the semaphore to 2. The parent, on the other hand, decrements the semaphore by 1. The value is 2 if the child runs first, and thus becomes 1, giving the parent control.
If the child is running and adding to shared memory, the value of the semaphore is 0; the parent's decrement cannot complete, so the parent blocks. When the child then increments the semaphore by 2, the parent's pending decrement goes through, leaving the semaphore at 1 (0 + 2 - 1 = 1), and the parent is started. The child then waits until the semaphore becomes 0, which happens when the parent decrements the semaphore one more time. The shmctl and semctl functions are used to obliterate the IPC resources once you are done with the application.

The IPC::SysV Module

The SysV IPC code in Listing 13.6 was written long ago and has been documented to death in almost all UNIX texts. It looks simple enough already, but it can always be tweaked a little bit more. Some kind folks have tried to make the interface easier to use by providing modules that make it cleaner. Check out the IPC::SysV modules written by Jack Shirazi. For the latest version of this module, please check the CPAN sites listed in Appendix B, "Perl Module Archives." Unfortunately, I could not get this module to compile and work for Perl 5.002. There was no documentation or follow-up address for me to get more information about the module. You can try to get it to work on your system, but with the application interface the way it is now, you should ask yourself this question: Will the module make it simpler? If the answer is yes, by all means go for it!

Applications of IPC

There are several ways that you can apply the IPC tools you have available. Generally, shared memory and semaphores have to be used together. When working with large blocks of data, use shared memory to pass the data between two processes. Synchronize the transfer between processes via the use of a semaphore. What if you have a situation in which more than one process is required to process the data? Semaphores can get clunky at this stage if you are not careful.
In a typical scenario, you could have one process collect data from external devices and then make the data available in shared memory for all other processes. The shared memory area is divided into partitions. Each partition is read by only one process and written to only by the data collector. The data collector updates all the sections of shared memory and then sets a semaphore to the number of processes that are currently waiting to work with this data. Then it sends a message to each of the processes via a message queue. After sending all the messages, the data collector process waits for the semaphore to be 0 again, thereby getting the signal to proceed.

Each data-handling process (client) can wait for messages forever on its message queue. As soon as it receives a message on its queue, the client is guaranteed exclusive access to its partition. After it has processed the data, the client decrements the semaphore. After each client decrements the semaphore, it goes back to the top of its loop and waits on the input message queue again. Once all the clients have decremented the semaphore, it becomes 0 again. This causes the data collector to wake up and begin the process of collecting and updating the shared memory area.

Listings 13.7 and 13.8 show a partial application for such a system. These listings are by no means complete because this would require a full-blown application well beyond the scope of this chapter. The gist of the program is to illustrate how all three types of IPC objects can be used with each other to create relatively complex applications. The server application decrements the semaphore to block (while the clients do what they have to do) and then increments the value of the semaphore. The processes here have to run concurrently in the background. There are three clients for the one server. Obviously, this example is contrived for the book; you might have more clients to handle your tasks.

Listing 13.7. The server of a dummy application.

    1 #!/usr/bin/perl

Listing 13.8 is the client application to pick up the messages from the server. The messages sent can contain additional information for the client; that is, they don't have to be just triggers for the client to proceed with reading. The contents of the messages can contain information about how and where to pick up data from shared memory.

Listing 13.8. The client of a dummy application.

    1 #!/usr/bin/perl

The sample programs shown in Listings 13.7 and 13.8 provided the basis for a prototype of a seismic data collection system. The actual system was written in C for efficiency because of some pretty lengthy mathematical calculations. However, with Perl, we were able to use this code to get a proof-of-concept working model up and running in just one afternoon. The prototype provided us with enough information to consider using the IPC model for the application. In later models of the same application, I was able to extend the processing to remote machines by replacing the message queues with sockets and sending the requisite portions of data along the sockets. The final application was tested further by adding new Perl scripts that share messages and simulate data using shared memory. Listings 13.7 and 13.8 are very similar to the working application and have been created from scratch. It should be relatively painless for you to take this code and extend it into your own prototype.

Summary

Perl is a very powerful tool for prototyping applications. With the capability to access the system facilities, Perl can provide the necessary tools for rapid prototyping. Hopefully, this chapter has provided you with enough information to set up your own applications. This chapter has introduced you to the System V IPC facilities available through Perl. The IPC objects are global and remain in memory long after the processes that created them are gone. With IPC objects, you are limited to one machine.
If your application requires network access, consider using sockets instead. Using message queues, you can send messages between processes. For large data items, message queues might not be very efficient. Consider using shared memory instead. To synchronize the access to the shared memory, you can use semaphores to prevent one process from writing to an area where another process might be reading.