3.2 Perl Functions in Alphabetical Order
/PATTERN/
/PATTERN/ m/PATTERN/ The match operator. See "Regular Expressions" in Chapter 2, The Gory Details. ?PATTERN?
?PATTERN? This is just like the /PATTERN/ search, except that it matches only once between calls to reset, so it finds only the first occurrence of something rather than all occurrences. (In other words, the operator works repeatedly until it actually matches something, then it turns itself off until you explicitly turn it back on with reset.) This may be useful (and efficient) if you want to see only the first occurrence of the pattern in each file of a set of files. Note that m?? is equivalent to ??. The reset operator will only reset instances of ?? that were compiled in the same package that it was. abs
abs VALUE This function returns the absolute value of its argument (or $_ if omitted). accept
accept NEWSOCKET, GENERICSOCKET This function does the same thing as the accept system call--see accept (2). It is used by server processes that wish to accept socket connections from clients. Execution is suspended until a connection is made, at which time the NEWSOCKET filehandle is opened and attached to the newly made connection. The function returns the connected address if the call succeeded, false otherwise (and puts the error code into $!). GENERICSOCKET must be a filehandle already opened via the socket operator and bound to one of the server's network addresses. For example:
unless ($peer = accept NS, S) { die "Can't accept a connection: $!\n"; } See also the example in the section "Sockets" in Chapter 6, Social Engineering. alarm
alarm EXPR This function sends a SIGALRM signal to the executing Perl program after EXPR seconds. On some older systems, alarms go off at the "top of the second," so, for instance, an alarm 1 may go off anywhere between 0 and 1 second from now, depending on when in the current second it is. An alarm 2 may go off anywhere from 1 to 2 seconds from now. And so on. For better resolution, you may be able to use syscall to call the itimer routines that some UNIX systems support. Or you can use the timeout feature of the select function. Each call disables the previous timer, and an argument of 0 may be supplied to cancel the previous timer without starting a new one. The return value is the number of seconds remaining on the previous timer. atan2
atan2 Y, X This function returns the arctangent of Y/X in the range -pi to pi. A quick way to get an approximate value of pi is to say:
$pi = atan2(1,1) * 4; For the tangent operation, you may use the POSIX::tan() function, or use the familiar relation:
sub tan { sin($_[0]) / cos($_[0]) } bind
bind SOCKET, NAME This function does the same thing as the bind system call--see bind (2). It attaches an address (a name) to an already opened socket specified by the SOCKET filehandle. The function returns true if it succeeded, false otherwise (and puts the error code into $!). NAME should be a packed address of the proper type for the socket.
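One way to build such a packed address is with the standard Socket module. The following is only a sketch: it assumes an Internet-domain socket, and the port number 2345 is a made-up value for illustration. It packs the port and the wildcard address, then unpacks the result again to show what NAME contains.

```perl
use strict;
use warnings;
use Socket qw(pack_sockaddr_in unpack_sockaddr_in inet_ntoa INADDR_ANY);

# Hypothetical port, chosen only for illustration.
my $port = 2345;

# Pack the port and the wildcard address into a sockaddr_in structure.
my $sockaddr = pack_sockaddr_in($port, INADDR_ANY);

# Unpack it again to inspect the two fields.
my ($got_port, $got_addr) = unpack_sockaddr_in($sockaddr);
printf "port %d on %s\n", $got_port, inet_ntoa($got_addr);
```

A value built this way is the kind of thing you would pass as NAME.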
bind S, $sockaddr or die "Can't bind address: $!\n"; See also the example in the section "Sockets" in Chapter 6, Social Engineering. binmode
binmode FILEHANDLE This function arranges for the file to be treated in binary mode on operating systems that distinguish between binary and text files. It should be called after the open but before any I/O is done on the filehandle. The only way to reset binary mode on a filehandle is to reopen the file. On systems that distinguish binary mode from text mode, files that are read in text mode have \r\n sequences translated to \n on input and \n translated to \r\n on output. binmode has no effect under UNIX or Plan9. If FILEHANDLE is an expression, the value is taken as the name of the filehandle. The following example shows how a Perl script might prepare to read a word processor file with embedded control codes:
open WP, "$file.wp" or die "Can't open $file.wp: $!\n"; binmode WP; while (read WP, $buf, 1024) {...} bless
bless REF, CLASSNAME bless REF This function looks up the item pointed to by reference REF and tells the item that it is now an object in the CLASSNAME package--or the current package if no CLASSNAME is specified, which is often the case. It returns the reference for convenience, since a bless is often the last thing in a constructor function. (Always use the two-argument version if the constructor doing the blessing might be inherited by a derived class. In such cases, the class you want to bless your object into will normally be found as the first argument to the constructor in question.) See "Objects" in Chapter 5, Packages, Modules, and Object Classes for more about the blessing (and blessings) of objects. caller
caller EXPR caller This function returns information about the stack of current subroutine calls. Without an argument it returns the package name, filename, and line number that the currently executing subroutine was called from:
($package, $filename, $line) = caller; With an argument it evaluates EXPR as the number of stack frames to go back before the current one. It also reports some additional information.
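As a minimal sketch of the argument form (the subroutine names here are invented for illustration), the fourth element of the returned list is the fully qualified name of the subroutine belonging to the requested frame:

```perl
use strict;
use warnings;

# Frame 0 describes the call to the currently running subroutine.
sub current_sub {
    my @frame = caller(0);
    return $frame[3];
}

# Frame 1 describes the call to whoever called us.
sub calling_sub {
    my @frame = caller(1);
    return $frame[3];
}

# A wrapper so that calling_sub has a caller that is itself a subroutine.
sub wrapper { return calling_sub() }

print current_sub(), "\n";   # fully qualified name of current_sub
print wrapper(), "\n";       # fully qualified name of wrapper
```

The same list can be walked frame by frame with a loop: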
$i = 0; while (($pack, $file, $line, $subname, $hasargs, $wantarray) = caller($i++)) { ... } Furthermore, when called from within the DB package, caller returns more detailed information: it sets the list variable @DB::args to be the arguments passed in the given stack frame. chdir
chdir EXPR This function changes the working directory to EXPR, if possible. If EXPR is omitted, it changes to the home directory. The function returns 1 upon success, 0 otherwise (and puts the error code into $!).
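The return value is easy to check directly. The sketch below assumes a Perl that ships the Cwd and File::Spec modules, and it uses a made-up path that is assumed not to exist in order to show the failure case:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

my $start = getcwd();

# File::Spec->tmpdir names a directory that normally exists and is accessible.
my $tmp = File::Spec->tmpdir;
if (chdir $tmp) {
    print "Now in ", getcwd(), "\n";
} else {
    warn "Can't cd to $tmp: $!\n";
}

# Hypothetical path that is assumed not to exist.
my $ok = chdir "/no/such/directory/for/this/example";
print "Second chdir ", ($ok ? "succeeded" : "failed: $!"), "\n";

# Return to where we started.
chdir $start or die "Can't cd back to $start: $!\n";
```

The more usual idiom simply dies on failure: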
chdir "$prefix/lib" or die "Can't cd to $prefix/lib: $!\n"; The following code can be used to move to the user's home directory, one way or another:
$ok = chdir($ENV{"HOME"} || $ENV{"LOGDIR"} || (getpwuid($<))[7]); Alternately, taking advantage of the default, you could say this:
$ok = chdir() || chdir((getpwuid($<))[7]); See also the Cwd module, described in Chapter 7, The Standard Perl Library, which lets you keep track of your current directory. chmod
chmod LIST This function changes the permissions of a list of files. The first element of the list must be the numerical mode, as in chmod (2). (When using nonliteral mode data, you may need to convert an octal string to a decimal number using the oct function.) The function returns the number of files successfully changed. For example:
$cnt = chmod 0755, 'file1', 'file2'; will set $cnt to 0, 1, or 2, depending on how many files got changed (in the sense that the operation succeeded, not in the sense that the bits were different afterward). Here's a more typical usage:
chmod 0755, @executables; If you need to know which files didn't allow the change, use something like this:
@cannot = grep {not chmod 0755, $_} 'file1', 'file2', 'file3'; die "$0: could not chmod @cannot\n" if @cannot; This idiom makes use of the grep function to select only those elements of the list for which the chmod function failed. chomp
chomp VARIABLE chomp LIST chomp This is a slightly safer version of chop (see below) in that it removes only any line ending corresponding to the current value of $/, and not just any last character. Unlike chop, chomp returns the number of characters deleted. If $/ is empty (in paragraph mode), chomp removes all trailing newlines from the selected string (or strings, if chomping a LIST). chop
chop VARIABLE chop LIST chop This function chops off the last character of a string and returns the character chopped. The chop operator is used primarily to remove the newline from the end of an input record, but is more efficient than s/\n$//. If VARIABLE is omitted, the function chops the $_ variable. For example:
    while (<PASSWD>) {
        chop;    # avoid \n on last field
        @array = split /:/;
        ...
    }
If you chop a LIST, each string in the list is chopped:
@lines = `cat myfile`; chop @lines; You can actually chop anything that is an lvalue, including an assignment:
chop($cwd = `pwd`); chop($answer = <STDIN>); Note that this is different from:
$answer = chop($tmp = <STDIN>); # WRONG which puts a newline into $answer, because chop returns the character chopped, not the remaining string (which is in $tmp). One way to get the result intended here is with substr:
$answer = substr <STDIN>, 0, -1; But this is more commonly written as:
chop($answer = <STDIN>); To chop more than one character, use substr as an lvalue, assigning a null string. The following removes the last five characters of $caravan:
substr($caravan, -5) = ""; The negative subscript causes substr to count from the end of the string instead of the beginning. chown
chown LIST This function changes the owner (and group) of a list of files. The first two elements of the list must be the numerical uid and gid, in that order. The function returns the number of files successfully changed. For example:
$cnt = chown $uid, $gid, 'file1', 'file2'; will set $cnt to 0, 1, or 2, depending on how many files got changed (in the sense that the operation succeeded, not in the sense that the owner was different afterward). Here's a more typical usage:
chown $uid, $gid, @filenames; Here's a subroutine that looks everything up for you, and then does the chown:
sub chown_by_name { local($user, $pattern) = @_; chown((getpwnam($user))[2,3], glob($pattern)); } &chown_by_name("fred", "*.c"); Notice that this forces the group of each file to be the gid fetched from the passwd file. An alternative is to pass a -1 for the gid, which leaves the group of the file unchanged. On most systems, you are not allowed to change the ownership of the file unless you're the superuser, although you should be able to change the group to any of your secondary groups. On insecure systems, these restrictions may be relaxed, but this is not a portable assumption. chr
chr NUMBER This function returns the character represented by that NUMBER in the character set. For example, chr(65) is "A" in ASCII. To convert multiple characters, use pack("C*", LIST) instead. chroot
chroot FILENAME This function does the same operation as the chroot system call--see chroot (2). If successful, FILENAME becomes the new root directory for the current process--the starting point for pathnames beginning with "/". This directory is inherited across exec calls and by all subprocesses. There is no way to undo a chroot. Only the superuser can use this function. Here's some code that approximates what many FTP servers do:
chroot +(getpwnam('ftp'))[7] or die "Can't do anonymous ftp: $!\n"; close
close FILEHANDLE This function closes the file, socket, or pipe associated with the filehandle. You don't have to close FILEHANDLE if you are immediately going to do another open on it, since the next open will close it for you. (See open.) However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not. Also, closing a pipe will wait for the process executing on the pipe to complete (in case you want to look at the output of the pipe afterward), and it prevents the script from exiting before the pipeline is finished.[1] Closing a pipe explicitly also puts the status value of the command executing on the pipe into $?. For example:
    open OUTPUT, '|sort >foo';    # pipe to sort
    ...                           # print stuff to output
    close OUTPUT;                 # wait for sort to finish
    die "sort failed" if $?;      # check for sordid sort
    open INPUT, 'foo';            # get sort's results
FILEHANDLE may be an expression whose value gives the real filehandle name. It may also be a reference to a filehandle object returned by some of the newer object-oriented I/O packages. closedir
closedir DIRHANDLE This function closes a directory opened by opendir. See the examples under opendir. connect
connect SOCKET, NAME This function does the same thing as the connect system call--see connect (2). The function initiates a connection with another process that is waiting at an accept (2). The function returns true if it succeeded, false otherwise (and puts the error code into $!). NAME should be a packed network address of the proper type for the socket. For example:
connect S, $destadd or die "Can't connect to $hostname: $!\n"; To disconnect a socket, either close or shutdown. See also the example in the section "Sockets" in Chapter 6, Social Engineering. cos
cos EXPR This function returns the cosine of EXPR (expressed in radians). For example, the following script will print a cosine table of angles measured in degrees:
    # Here's the lazy way of getting degrees-to-radians.
    $pi = atan2(1,1) * 4;
    $piover180 = $pi/180;

    # Print table.
    for ($_ = 0; $_ <= 90; $_++) {
        printf "%3d %7.5f\n", $_, cos($_ * $piover180);
    }
For the inverse cosine operation, you may use the POSIX::acos() function, or use this relation:
sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) } crypt
crypt PLAINTEXT, SALT This function encrypts a string exactly in the manner of crypt (3). This is useful for checking the password file for lousy passwords.[2] Only the guys wearing white hats are allowed to do this.
To see whether a typed-in password $guess matches the password $pass obtained from a file (such as /etc/passwd), try something like the following:
if (crypt($guess, $pass) eq $pass) { # guess is correct } Note that there is no easy way to decrypt an encrypted password apart from guessing. Also, truncating the salt to two characters is a waste of CPU time, although the manpage for crypt (3) would have you believe otherwise. Here's an example that makes sure that whoever runs this program knows their own password:
$pwd = (getpwuid $<)[1]; $salt = substr $pwd, 0, 2; system "stty -echo"; print "Password: "; chop($word = <STDIN>); print "\n"; system "stty echo"; if (crypt($word, $salt) ne $pwd) { die "Sorry...\n"; } else { print "ok\n"; } Of course, typing in your own password to whoever asks for it is unwise. The crypt function is unsuitable for encrypting large quantities of data. Find a library module for PGP (or something like that) for something like that. dbmclose
dbmclose HASH This function breaks the binding between a DBM file and a hash. This function is actually just a call to untie with the proper arguments, but is provided for backward compatibility with older versions of Perl. dbmopen
dbmopen HASH, DBNAME, MODE This binds a DBM file to a hash (that is, an associative array). (DBM stands for Data Base Management, and consists of a set of C library routines that allow random access to records via a hashing algorithm.) HASH is the name of the hash (with a %). DBNAME is the name of the database (without the .dir or .pag extension). If the database does not exist, and a valid MODE is specified, the database is created with the protection specified by MODE (as modified by the umask). To prevent creation of the database if it doesn't exist, you may specify a MODE of undef, and the function will return a false value if it can't find an existing database. If your system supports only the older DBM functions, you may have only one dbmopen in your program. Values assigned to the hash prior to the dbmopen are not accessible. If you don't have write access to the DBM file, you can only read the hash variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy array entry inside an eval, which will trap the error. Note that functions such as keys and values may return huge list values when used on large DBM files. You may prefer to use the each function to iterate over large DBM files. This example prints out the mail aliases on a system using sendmail:
dbmopen %ALIASES, "/etc/aliases", 0666 or die "Can't open aliases: $!\n"; while (($key,$val) = each %ALIASES) { print $key, ' = ', $val, "\n"; } dbmclose %ALIASES; Hashes bound to DBM files have the same limitations as DBM files, in particular the restrictions on how much you can put into a bucket. If you stick to short keys and values, it's rarely a problem. Another thing you should bear in mind is that many existing DBM databases contain null-terminated keys and values because they were set up with C programs in mind. The B News history file and the old sendmail aliases file are examples. Just use "$key\0" instead of $key. There is currently no built-in way to lock generic DBM files. Some would consider this a bug. The DB_File module does provide locking at the granularity of the entire file, however. See the documentation on that module in Chapter 7, The Standard Perl Library for details. This function is actually just a call to tie with the proper arguments, but is provided for backward compatibility with older versions of Perl. defined
defined EXPR This function returns a Boolean value saying whether EXPR has a real value or not. A scalar that contains no valid string, numeric, or reference value is known as the undefined value, or undef for short. Many operations return the undefined value under exceptional conditions, such as end of file, uninitialized variable, system error, and such. This function allows you to distinguish between an undefined null string and a defined null string when you're using operators that might return a real null string. You may also check to see whether arrays, hashes, or subroutines have been allocated any memory yet. Arrays and hashes are allocated when you first put something into them, whereas subroutines are allocated when a definition has been successfully parsed. Using defined on the predefined special variables is not guaranteed to produce intuitive results. Here is a fragment that tests a scalar value from a hash:
print if defined $switch{'D'}; When used on a hash element like this, defined only tells you whether the value is defined, not whether the key has an entry in the hash table. It's possible to have an undefined scalar value for an existing hash key. Use exists to determine whether the hash key exists. In the next example we use the fact that some operations return the undefined value when you run out of data:
print "$val\n" while defined($val = pop(@ary)); The same thing goes for error returns from system calls:
die "Can't readlink $sym: $!" unless defined($value = readlink $sym); Since symbol tables for packages are stored as hashes (associative arrays), it's possible to check for the existence of a package like this:
die "No XYZ package defined" unless defined %XYZ::; Finally, it's possible to avoid blowing up on nonexistent subroutines:
sub saymaybe { if (defined &say) { say(@_); } else { warn "Can't say"; } } See also undef. delete
delete EXPR This function deletes the specified key and associated value from the specified hash. (It doesn't delete a file. See unlink for that.) Deleting from $ENV{} modifies the environment. Deleting from a hash that is bound to a (writable) DBM file deletes the entry from the DBM file. The following naïve example inefficiently deletes all the values of a hash:
foreach $key (keys %HASH) { delete $HASH{$key}; } (It would be faster to use the undef command.) EXPR can be arbitrarily complicated as long as the final operation is a hash key lookup:
delete $ref->[$x][$y]{$key}; For normal hashes, the delete function happens to return the value (not the key) that was deleted, but this behavior is not guaranteed for tied hashes, such as those bound to DBM files. To test whether a hash element has been deleted, use exists. die
die LIST Outside of an eval, this function prints the concatenated value of LIST to STDERR and exits with the current value of $! (errno). If $! is 0, it exits with the value of ($? >> 8) (which is the status of the last reaped child from a system, wait, close on a pipe, or `command`). If ($? >> 8) is 0, it exits with 255. If LIST is unspecified, the current value of the $@ variable is propagated, if any. Otherwise the string "Died" is used as the default. Equivalent examples:
    die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
    chdir '/usr/spool/news' or die "Can't cd to spool: $!\n";
(The second form is generally preferred, since the important part is the chdir.) Within an eval, the function sets the $@ variable equal to the error message that would have been produced otherwise, and aborts the eval, which then returns the undefined value. The die function can thus be used to raise named exceptions that can be caught at a higher level in the program. See the section on the eval function later in this chapter. If the final value of LIST does not end in a newline, the current script filename, line number, and input line number (if any) are appended to the message, as well as a newline. Hint: sometimes appending ", stopped" to your message will cause it to make better sense when the string "at scriptname line 123" is appended. Suppose you are running script canasta:
die "/etc/games is no good"; die "/etc/games is no good, stopped"; which produces, respectively:
    /etc/games is no good at canasta line 123.
    /etc/games is no good, stopped at canasta line 123.
If you want your own error messages reporting the filename and line number, use the __FILE__ and __LINE__ special tokens:
    die '"', __FILE__, '", line ', __LINE__, ", phooey on you!\n";
This produces output like:
"canasta", line 38, phooey on you! do
do BLOCK do SUBROUTINE(LIST) do EXPR The do BLOCK form executes the sequence of commands in the BLOCK, and returns the value of the last expression evaluated in the block. When modified by a loop modifier, Perl executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.) The do SUBROUTINE(LIST) form is a deprecated form of a subroutine call. See "Subroutines" in Chapter 2, The Gory Details. The do EXPR form uses the value of EXPR as a filename and executes the contents of the file as a Perl script. Its primary use is (or rather was) to include subroutines from a Perl subroutine library, so that:
do 'stat.pl'; is rather like:
eval `cat stat.pl`; except that it's more efficient, more concise, keeps track of the current filename for error messages, and searches all the directories listed in the @INC array. (See the section on "Special Variables" in Chapter 2, The Gory Details.) It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop. Note that inclusion of library modules is better done with the use and require operators, which also do error checking and raise an exception if there's a problem. dump
dump LABEL dump This function causes an immediate core dump. Primarily this is so that you can use undump (1) to turn your core dump into an executable binary after having initialized all your variables at the beginning of the program. (The undump program is not supplied with the Perl distribution, and is not even possible on some architectures. There are hooks in the code for using the GNU unexec() routine as an alternative. Other methods may be supported in the future.) When the new binary is executed it will begin by executing a goto LABEL (with all the restrictions that goto suffers). Think of the operation as a goto with an intervening core dump and reincarnation. If LABEL is omitted, the function arranges for the program to restart from the top. Please note that any files opened at the time of the dump will not be open any more when the program is reincarnated, with possible confusion resulting on the part of Perl. See also the -u command-line switch. For example:
    #!/usr/bin/perl
    use Getopt::Std;
    use MyHorridModule;
    %days = (
        Sun => 1,
        Mon => 2,
        Tue => 3,
        Wed => 4,
        Thu => 5,
        Fri => 6,
        Sat => 7,
    );
    dump QUICKSTART if $ARGV[0] eq '-d';
    QUICKSTART:
    getopts('f:');    # Getopt::Std exports getopts in lowercase
    ...
This startup code does some slow initialization, and then calls the dump function to take a snapshot of the program's state. When the dumped version of the program is run, it bypasses all the startup code and goes directly to the QUICKSTART label. If the original script is invoked without the -d switch, it just falls through and runs normally. If you're looking to use dump to speed up your program, check out the discussion of efficiency matters in Chapter 8, Other Oddments, as well as the Perl native-code compiler in Chapter 6, Social Engineering. You might also consider autoloading, which at least makes it appear to run faster. each
each HASH This function returns a two-element list consisting of the key and value for the next value of a hash. With successive calls to each you can iterate over the entire hash. Entries are returned in an apparently random order. When the hash is entirely read, a null list is returned (which, when used in a list assignment, produces a false value). The next call to each after that will start a new iteration. The iterator can be reset either by reading all the elements from the hash, or by calling the keys function in scalar context. You must not add elements to the hash while iterating over it, although you are permitted to use delete. In a scalar context, each returns just the key, but watch out for false keys. There is a single iterator for each hash, shared by all each, keys, and values function calls in the program. This means that after a keys or values call, the next each call will start again from the beginning. The following example prints out your environment like the printenv (1) program, only in a different order:
while (($key,$value) = each %ENV) { print "$key=$value\n"; } eof
eof FILEHANDLE eof() eof This function returns true if the next read on FILEHANDLE will return end of file, or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle name. An eof without an argument returns the end-of-file status for the last file read. Empty parentheses () may be used in connection with the combined files listed on the command line. That is, inside a while (<>) loop eof() will detect the end of only the last of a group of files. Use eof(ARGV) or eof (without the parentheses) to test each file in a while (<>) loop. For example, the following code inserts dashes just before the last line of the last file:
while (<>) { if (eof()) { print "-" x 30, "\n"; } print; } On the other hand, this script resets line numbering on each input file:
    while (<>) {
        print "$.\t$_";
        if (eof) {          # Not eof().
            close ARGV;     # reset $.
        }
    }
Like "$" in a sed program, eof tends to show up in line number ranges. Here's a script that prints lines from /pattern/ to end of each input file:
while (<>) { print if /pattern/ .. eof; } Here, the flip-flop operator (..) evaluates the regular expression match for each line. Until the pattern matches, the operator returns false. When it finally matches, the operator starts returning true, causing the lines to be printed. When the eof operator finally returns true (at the end of the file being examined), the flip-flop operator resets, and starts returning false again. Note that the eof function actually reads a byte and then pushes it back on the input stream with ungetc (3), so it is not very useful in an interactive context. In fact, experienced Perl programmers rarely use eof, since the various input operators already behave quite nicely in while-loop conditionals. See the example in the description of foreach in Chapter 2, The Gory Details. eval
eval EXPR eval BLOCK The value expressed by EXPR is parsed and executed as though it were a little Perl program. It is executed in the context of the current Perl program, so that any variable settings remain afterward, as do any subroutine or format definitions. The code of the eval is treated as a block, so any locally scoped variables declared within the eval last only until the eval is done. (See local and my.) As with any code in a block, a final semicolon is not required. If EXPR is omitted, the operator evaluates $_. The value returned from an eval is the value of the last expression evaluated, just as with subroutines. Similarly, you may use the return operator to return a value from the middle of the eval. If there is a syntax error or run-time error (including any produced by the die operator), eval returns the undefined value and puts the error message in $@. If there is no error, $@ is guaranteed to be set to the null string, so you can test it reliably afterward for errors. Here's a statement that assigns an element to a hash chosen at run-time:
eval "\$$arrayname{\$key} = 1"; (You can accomplish that more simply with soft references--see "Symbolic References" in Chapter 4, References and Nested Data Structures.) And here is a simple Perl shell:
while (<>) { eval; print $@; } Since eval traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as socket or symlink) is implemented. In fact, eval is the way to do all exception handling in Perl. If the code to be executed doesn't vary, you should use the eval BLOCK form to trap run-time errors; the code in the block is compiled only once rather than on each execution, yielding greater efficiency. The error, if any, is still returned in $@. Examples:
    # make divide-by-zero non-fatal
    eval { $answer = $a / $b; }; warn $@ if $@;

    # same thing, but less efficient
    eval '$answer = $a / $b'; warn $@ if $@;

    # a compile-time error (not trapped)
    eval { $answer = };

    # a run-time error
    eval '$answer =';    # sets $@
Here, the code in the BLOCK has to be valid Perl code to make it past the compilation phase. The code in the string doesn't get examined until run-time, and so doesn't cause an error until run-time. With an eval you should be careful to remember what's being looked at when:
    eval $x;          # CASE 1
    eval "$x";        # CASE 2
    eval '$x';        # CASE 3
    eval { $x };      # CASE 4
    eval "\$$x++";    # CASE 5
    $$x++;            # CASE 6
Cases 1 and 2 above behave identically: they run the code contained in the variable $x. (Case 2 has misleading double quotes, making the reader wonder what else might be happening, when nothing is. The contents of $x would in any event have to be converted to a string for parsing.) Cases 3 and 4 likewise behave in the same way: they run the code $x, which does nothing at all except return the value of $x. (Case 4 is preferred since the expression doesn't need to be recompiled each time.) Case 5 is a place where normally you would like to use double quotes to let you interpolate the variable name, except that in this particular situation you can just use symbolic references instead, as in case 6. A frequently asked question is how to set up an exit routine. One common way is to use an END block. But you can also do it with an eval, like this:
    #!/usr/bin/perl
    eval <<'EndOfEval';  $start = __LINE__;
       .
       .            # your ad here
       .
    EndOfEval

    # Cleanup
    unlink "/tmp/myfile$$";
    $@ && ($@ =~ s/\(eval \d+\) at line (\d+)/$0 .
        " line " . ($1+$start)/e, die $@);
    exit 0;
Note that the code supplied for an eval might not be recompiled if the text hasn't changed. On the rare occasions when you want to force a recompilation (because you want to reset a .. operator, for instance), you could say something like this:
eval $prog . '#' . ++$seq; exec
exec LIST This function terminates the currently running Perl script by executing another program in place of itself. If there is more than one argument in LIST (or if LIST is an array with more than one value) the function calls C's execvp (3) routine with the arguments in LIST. This bypasses any shell processing of the command. If there is only one scalar argument, the argument is checked for shell metacharacters. If metacharacters are found, the entire argument is passed to "/bin/sh -c" for parsing.[3] If there are no metacharacters, the argument is split into words and passed directly to execvp (3) in the interests of efficiency, since this bypasses all the overhead of shell processing. Ordinarily exec never returns--if it does return, it always returns false, and you should check $! to find out what went wrong. Note that exec (and system) do not flush your output buffer, so you may need to enable command buffering by setting $| on one or more filehandles to avoid lost output. This statement runs the echo program to print the current argument list:
exec 'echo', 'Your arguments are: ', @ARGV; This example shows that you can exec a pipeline:
exec "sort $outfile | uniq" or die "Can't do sort/uniq: $!\n"; The UNIX execv (3) call provides the ability to tell a program the name it was invoked as. This name might have nothing to do with the name of the program you actually gave the operating system to run. By default, Perl simply replicates the first element of LIST and uses it for both purposes. If, however, you don't really want to execute the first argument of LIST, but you want to lie to the program you are executing about its own name, you can do so. Put the real name of the program you want to run into a variable and then put that variable out in front of the LIST without a comma, kind of like a filehandle for a print statement. (This always forces interpretation of the LIST as a multi-valued list, even if there is only a single scalar in the list.) Then the first element of LIST will be used only to mislead the executing program as to its name. For example:
    $shell = '/bin/csh';
    exec $shell '-sh', @args;      # pretend it's a login shell
    die "Couldn't execute csh: $!\n";

You can also replace the simple scalar holding the program name with a block containing arbitrary code, which simplifies the above example to:

    exec {'/bin/csh'} '-sh', @args;      # pretend it's a login shell

exists
exists EXPR This function returns true if the specified hash key exists in its hash, even if the corresponding value is undefined.
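The three states a hash element can be in are easiest to see side by side. The following is a minimal sketch, not part of the original text: the sample hash and the helper name classify are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: one key per interesting state.
my %hash = (
    missing_value => undef,   # exists, but is not defined
    zero          => 0,       # exists and is defined, but is false
    name          => 'fred',  # exists, is defined, and is true
);

# classify() is an illustrative helper, not a Perl builtin.
sub classify {
    my ($h, $key) = @_;
    return 'absent'    unless exists  $h->{$key};
    return 'undefined' unless defined $h->{$key};
    return 'false'     unless $h->{$key};
    return 'true';
}

print "$_: ", classify(\%hash, $_), "\n"
    for qw(missing_value zero name nobody);
```

Testing with exists does not create the key it asks about, so the hash still has no "nobody" entry afterwards.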
print "Exists\n" if exists $hash{$key}; print "Defined\n" if defined $hash{$key}; print "True\n" if $hash{$key}; A hash element can only be true if it's defined, and can only be defined if it exists, but the reverse doesn't necessarily hold true in either case. EXPR can be arbitrarily complicated as long as the final operation is a hash key lookup:
if (exists $ref->[$x][$y]{$key}) { ... } exit
exit EXPR This function evaluates EXPR and exits immediately with that value. Here's a fragment that lets a user exit the program by typing x or X:
$ans = <STDIN>; exit 0 if $ans =~ /^[Xx]/; If EXPR is omitted, the function exits with 0 status. You shouldn't use exit to abort a subroutine if there's any chance that someone might want to trap whatever error happened. Use die instead, which can be trapped by an eval. exp
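As a sketch of the advice above about preferring die to exit inside subroutines, the fragment below shows a caller trapping the error with eval. The subroutine name risky and its behavior are invented for illustration only.

```perl
use strict;
use warnings;

# risky() is a hypothetical subroutine that reports failure with die
# rather than exit, so that a caller can decide what happens next.
sub risky {
    my ($n) = @_;
    die "negative input: $n\n" if $n < 0;
    return sqrt $n;
}

# The caller traps the error with eval; an exit inside risky() could
# not have been intercepted this way.
my $result = eval { risky(-4) };
if (!defined $result) {
    print "trapped: $@";
    $result = 0;
}
print "result is $result\n";
```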
exp EXPR This function returns e to the power of EXPR. If EXPR is omitted, it gives exp($_). To do general exponentiation, use the ** operator. fcntl
fcntl FILEHANDLE, FUNCTION, SCALAR This function calls UNIX's fcntl (2) function. (fcntl stands for "file control".) You'll probably have to say:
use Fcntl; first to get the correct function definitions. SCALAR will be read and/or written depending on the FUNCTION--a pointer to the string value of SCALAR will be passed as the third argument of the actual fcntl call. (If SCALAR has no string value but does have a numeric value, that value will be passed directly rather than a pointer to the string value.) The return value of fcntl (and ioctl) is as follows:
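    System call returns    Perl returns
    -1                     undefined value
    0                      string "0 but true"
    anything else          that number

Thus Perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system: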
$retval = fcntl(...) or $retval = -1; printf "System returned %d\n", $retval; Here, even the string "0 but true" prints as 0, thanks to the %d format. For example, since Perl always sets the close-on-exec flag for file descriptors above 2, if you wanted to pass file descriptor 3 to a subprocess, you might want to clear the flag like this:
    use Fcntl;
    open TTY, "+>/dev/tty" or die "Can't open /dev/tty: $!\n";
    fileno TTY == 3 or die "Internal error: fd mixup";
    fcntl TTY, &F_SETFD, 0
        or die "Can't clear the close-on-exec flag: $!\n";

(Close-on-exec is a file descriptor flag, so it is changed with F_SETFD; F_SETFL changes the file status flags, such as non-blocking I/O.)

fcntl will produce a fatal error if used on a machine that doesn't implement fcntl (2). On machines that do implement it, you can do such things as modify the close-on-exec flags, modify the non-blocking I/O flags, emulate the lockf (3) function, and arrange to receive the SIGIO signal when I/O is pending. You might even have record-locking facilities. fileno
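The return-value convention for fcntl can be watched in action by reading the descriptor flags before and after clearing them. This is a sketch under some assumptions: it uses a scratch file from File::Temp rather than a terminal, and it assumes a system where fcntl (2) and the F_GETFD, F_SETFD, and FD_CLOEXEC constants from the Fcntl module are available.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use File::Temp qw(tempfile);

# A scratch file gives us a descriptor above 2 to experiment with.
my ($fh, $path) = tempfile(UNLINK => 1);

# Read the descriptor flags; undef would mean the call failed.
my $before = fcntl($fh, F_GETFD, 0);
defined $before or die "Can't read descriptor flags: $!\n";

# Clear all descriptor flags (including close-on-exec), then read them back.
fcntl($fh, F_SETFD, 0) or die "Can't clear close-on-exec: $!\n";
my $after = fcntl($fh, F_GETFD, 0);
defined $after or die "Can't re-read descriptor flags: $!\n";

printf "close-on-exec was %s; flags are now %d, which Perl treats as %s\n",
    (($before & FD_CLOEXEC) ? "set" : "clear"),
    $after,
    ($after ? "true" : "false");
```

After the flags are cleared, the operating system reports 0, which Perl hands back as a value that is numerically zero yet still true, so the "or die" idiom keeps working.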
fileno FILEHANDLE This function returns the file descriptor for a filehandle. (A file descriptor is a small integer, unlike the filehandle, which is a symbol.) It returns undef if the handle is not open. It's useful for constructing bitmaps for select, and for passing to certain obscure system calls if syscall (2) is implemented. It's also useful for double-checking that the open function gave you the file descriptor you wanted--see the example under fcntl. If FILEHANDLE is an expression, its value is taken to represent a filehandle, either indirectly by name, or directly as a reference to a filehandle object. A caution: don't count on the association of a Perl filehandle and a numeric file descriptor throughout the life of the program. If a file has been closed and reopened, the file descriptor may change. Filehandles STDIN, STDOUT, and STDERR start with file descriptors of 0, 1, and 2 (the UNIX standard convention), but even they can change if you start closing and opening them with wild abandon. But you can't get into trouble with 0, 1, and 2 as long as you always reopen immediately after closing, since the basic rule on UNIX systems is to pick the lowest available descriptor, and that'll be the one you just closed. flock
flock FILEHANDLE, OPERATION This function calls flock (2) on FILEHANDLE. See the manual page for flock (2) for the definition of OPERATION. Invoking flock will produce a fatal error if used on a machine that doesn't implement flock (2) or emulate it through some other locking mechanism. Here's a mailbox appender for some BSD-based systems:
    $LOCK_SH = 1;
    $LOCK_EX = 2;
    $LOCK_NB = 4;
    $LOCK_UN = 8;

    sub lock {
        flock MBOX, $LOCK_EX;
        # and, in case someone appended
        # while we were waiting...
        seek MBOX, 0, 2;
    }

    sub unlock {
        flock MBOX, $LOCK_UN;
    }

    open MBOX, ">>/usr/spool/mail/$ENV{'USER'}"
        or die "Can't open mailbox: $!";

    lock();
    print MBOX $msg, "\n\n";
    unlock();

Note that flock is unlikely to work on a file being accessed through a network file system. fork
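The mailbox appender hard-codes the lock constants, whose values can differ between systems. On Perls whose Fcntl module exports the :flock and :seek tags, a variation along the following lines avoids that. It is only a sketch: a scratch file stands in for the mailbox, and the helper name append_locked is made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempfile);

# A scratch file stands in for the mailbox in this sketch.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# append_locked() is a hypothetical helper, not part of any module.
sub append_locked {
    my ($file, $msg) = @_;
    open(my $fh, '>>', $file) or die "Can't open $file: $!\n";
    flock($fh, LOCK_EX)       or die "Can't lock $file: $!\n";
    seek($fh, 0, SEEK_END);   # someone may have appended while we waited
    print {$fh} $msg, "\n";
    close($fh)                or die "Can't close $file: $!\n";   # closing releases the lock
}

append_locked($path, "first message");
append_locked($path, "second message");
```

Closing the filehandle both flushes the buffered output and drops the lock, so no separate unlock step is needed here.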
fork This function does a fork (2) call. If it succeeds, the function returns the child pid to the parent process and 0 to the child process. (If it fails, it returns the undefined value to the parent process. There is no child process.) Note that unflushed buffers remain unflushed in both processes, which means you may need to set $| on one or more filehandles earlier in the program to avoid duplicate output. A nearly bulletproof way to launch a child process while checking for "cannot fork" errors would be:
    FORK: {
        if ($pid = fork) {
            # parent here
            # child process pid is available in $pid
        } elsif (defined $pid) { # $pid is zero here if defined
            # child here
            # parent process pid is available with getppid
        } elsif ($! =~ /No more process/) {
            # EAGAIN, supposedly recoverable fork error
            sleep 5;
            redo FORK;
        } else {
            # weird fork error
            die "Can't fork: $!\n";
        }
    }

These precautions are not necessary on operations which do an implicit fork (2), such as system, backquotes, or opening a process as a filehandle, because Perl automatically retries a fork on a temporary failure in these cases. Be very careful to end the child code with an exit, or your child may inadvertently leave the conditional and start executing code intended only for the parent process. If you fork your child processes, you'll have to wait on their zombies when they die. See the wait function for examples of doing this. The fork function is unlikely to be implemented on any operating system not resembling UNIX, unless it purports POSIX compliance. format
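The two cautions above for fork (end the child with an exit, and wait for it afterwards) can be combined in a small sketch. It assumes a system that implements fork (2); the exit status 7 is arbitrary and chosen only so that it is recognizable.

```perl
use strict;
use warnings;

$| = 1;   # flush STDOUT before forking to avoid duplicated output

my $pid = fork;
die "Can't fork: $!\n" unless defined $pid;

if ($pid == 0) {
    # child: do some work, then exit so we never fall into parent code
    exit 7;
}

# parent: wait for this particular child and decode its exit status
waitpid($pid, 0);
my $status = $? >> 8;
print "child $pid exited with status $status\n";
```

The exit status lives in the high byte of $?, which is why the value is shifted right by eight bits before use.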
    format NAME =
        picture line
        value list
        ...
    .

Declares a named sequence of picture lines (with associated values) for use by the write function. If NAME is omitted, the name defaults to STDOUT, which happens to be the default format name for the STDOUT filehandle. Since, like a sub declaration, this is a global declaration that happens at compile time, any variables used in the value list need to be visible at the point of the format's declaration. That is, lexically scoped variables must be declared earlier in the file, while dynamically scoped variables merely need to be set in the routine that calls write. Here's an example (which assumes we've already calculated $cost and $quantity):
    my $str = "widget";               # A lexically scoped variable.

    format Nice_Output =
    Test: @<<<<<<<< @||||| @>>>>>
          $str,     $%,    '$' . int($num)
    .

    $~ = "Nice_Output";               # Select our format.
    local $num = $cost * $quantity;   # Dynamically scoped variable.

    write;

Like filehandles, format names are identifiers that exist in a symbol table (package) and may be fully qualified by package name. Within the typeglobs of a symbol table's entries, formats reside in their own namespace, which is distinct from filehandles, directory handles, scalars, arrays, hashes, or subroutines. Like those other six types, however, a format named Whatever would also be affected by a local on the *Whatever typeglob. In other words, a format is just another gadget contained in a typeglob, independent of the other gadgets. The "Formats" section in Chapter 2, The Gory Details contains numerous details and examples of their use. The "Per Filehandle Special Variables" and "Global Special Variables" sections in Chapter 2, The Gory Details describe the internal format-specific variables, and the English and FileHandle modules in Chapter 7, The Standard Perl Library provide easier access to them. formline
formline PICTURE, LIST This is an internal function used by formats, although you may also call it. It formats a list of values according to the contents of PICTURE, placing the output into the format output accumulator, $^A. Eventually, when a write is done, the contents of $^A are written to some filehandle, but you could also read $^A yourself and then set $^A back to "". Note that a format typically does one formline per line of form, but the formline function itself doesn't care how many newlines are embedded in the PICTURE. This means that the ~ and ~~ tokens will treat the entire PICTURE as a single line. You may therefore need to use multiple formlines to implement a single record-format, just like the format compiler. Be careful if you put double quotes around the picture, since an @ character may be taken to mean the beginning of an array name. formline always returns true. See "Formats" in Chapter 2, The Gory Details for other examples. getc
getc FILEHANDLE getc This function returns the next byte from the input file attached to FILEHANDLE. At end-of-file, it returns a null string. If FILEHANDLE is omitted, the function reads from STDIN. This operator is very slow, but is occasionally useful for single-character, buffered input from the keyboard. This does not enable single-character input. For unbuffered input, you have to be slightly more clever, in an operating-system-dependent fashion. Under UNIX you might say this:
    if ($BSD_STYLE) {
        system "stty cbreak </dev/tty >/dev/tty 2>&1";
    } else {
        system "stty", "-icanon", "eol", "\001";
    }

    $key = getc;

    if ($BSD_STYLE) {
        system "stty -cbreak </dev/tty >/dev/tty 2>&1";
    } else {
        system "stty", "icanon", "eol", "^@"; # ASCII NUL
    }
    print "\n";

This code puts the next character typed on the terminal in the string $key. If your stty program has options like cbreak, you'll need to use the code where $BSD_STYLE is true; otherwise, you'll need to use the code where it is false. Determining the options for stty is left as an exercise to the reader. The POSIX module in Chapter 7, The Standard Perl Library provides a more portable version of this using the POSIX::getattr() function. See also the Term::ReadKey module from your nearest CPAN site. getgrent
getgrent setgrent endgrent These functions do the same thing as their like-named system library routines--see getgrent (3). These routines iterate through your /etc/group file (or its moral equivalent coming from some server somewhere). The return value from getgrent in list context is:
($name, $passwd, $gid, $members) where $members contains a space-separated list of the login names of the members of the group. To set up a hash for translating group names to gids, say this:
while (($name, $passwd, $gid) = getgrent) { $gid{$name} = $gid; } In scalar context, getgrent returns only the group name. getgrgid
getgrgid GID This function does the same thing as getgrgid (3): it looks up a group file entry by group number. The return value in list context is:
($name, $passwd, $gid, $members) where $members contains a space-separated list of the login names of the members of the group. If you want to do this repeatedly, consider caching the data in a hash (associative array) using getgrent. In scalar context, getgrgid returns only the group name. getgrnam
getgrnam NAME This function does the same thing as getgrnam (3): it looks up a group file entry by group name. The return value in list context is:
($name, $passwd, $gid, $members) where $members contains a space-separated list of the login names of the members of the group. If you want to do this repeatedly, consider slurping the data into a hash (associative array) using getgrent. In scalar context, getgrnam returns only the numeric group ID. gethostbyaddr
gethostbyaddr ADDR, ADDRTYPE This function does the same thing as gethostbyaddr (3): it translates a packed binary network address to its corresponding names (and alternate addresses). The return value in list context is:
($name, $aliases, $addrtype, $length, @addrs) where @addrs is a list of packed binary addresses. In the Internet domain, each address is four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]); In scalar context, gethostbyaddr returns only the host name. See the section on "Sockets" in Chapter 6, Social Engineering for another approach. gethostbyname
gethostbyname NAME This function does the same thing as gethostbyname (3): it translates a network hostname to its corresponding addresses (and other names). The return value in list context is:
($name, $aliases, $addrtype, $length, @addrs) where @addrs is a list of raw addresses. In the Internet domain, each address is four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]); In scalar context, gethostbyname returns only the host address. See the section on "Sockets" in Chapter 6, Social Engineering for another approach. gethostent
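The unpacking step shown for the gethostby* functions can be tried without any name lookup at all. In this sketch the address 192.0.2.10 is a documentation-only example, and inet_aton and inet_ntoa from the Socket module are used to build a packed address and to cross-check the result unpacked by hand.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# A packed four-byte address, like an element of @addrs. Given a
# dotted-quad string, inet_aton() packs it without consulting a resolver.
my $packed = inet_aton('192.0.2.10');

# Unpack the four bytes by hand and join them with dots...
my @octets  = unpack 'C4', $packed;
my $by_hand = join '.', @octets;

# ...or let the Socket module do the same conversion.
my $by_lib = inet_ntoa($packed);

print "by hand: $by_hand\n";
print "by lib:  $by_lib\n";
```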
gethostent sethostent STAYOPEN endhostent These functions do the same thing as their like-named system library routines--see gethostent (3). They iterate through your /etc/hosts file and return each entry one at a time. The return value from gethostent is:
($name, $aliases, $addrtype, $length, @addrs) where @addrs is a list of raw addresses. In the Internet domain, each address is four bytes long, and can be unpacked by saying something like:
($a, $b, $c, $d) = unpack('C4', $addrs[0]); Scripts that use these routines should not be considered portable. If a machine uses a nameserver, it would interrogate most of the Internet to try to satisfy a request for all the addresses of every machine on the planet. So these routines are unimplemented on such machines. getlogin
getlogin This function returns the current login from /etc/utmp, if any. If null, use getpwuid. For example:
$login = getlogin || (getpwuid($<))[0] || "Intruder!!"; getnetbyaddr
getnetbyaddr ADDR, ADDRTYPE This function does the same thing as getnetbyaddr (3): it translates a network address to the corresponding network name or names. The return value in list context is:
($name, $aliases, $addrtype, $net) In scalar context, getnetbyaddr returns only the network name. getnetbyname
getnetbyname NAME This function does the same thing as getnetbyname (3): it translates a network name to its corresponding network address. The return value in list context is:
($name, $aliases, $addrtype, $net) In scalar context, getnetbyname returns only the network address. getnetent
getnetent setnetent STAYOPEN endnetent These functions do the same thing as their like-named system library routines--see getnetent (3). They iterate through your /etc/networks file, or moral equivalent. The return value in list context is:
($name, $aliases, $addrtype, $net) In scalar context, getnetent returns only the network name. getpeername
getpeername SOCKET This function returns the packed socket address of the other end of the SOCKET connection. For example:
use Socket; $hersockaddr = getpeername SOCK; ($port, $heraddr) = unpack_sockaddr_in($hersockaddr); $herhostname = gethostbyaddr($heraddr, AF_INET); $herstraddr = inet_ntoa($heraddr); getpgrp
getpgrp PID This function returns the current process group for the specified PID (use a PID of 0 for the current process). Invoking getpgrp will produce a fatal error if used on a machine that doesn't implement getpgrp (2). If PID is omitted, the function returns the process group of the current process (the same as using a PID of 0). On systems implementing this operator with the POSIX getpgrp (2) system call, PID must be omitted or, if supplied, must be 0. getppid
getppid This function returns the process ID of the parent process. On the typical UNIX system, if your parent process ID changes to 1, your parent process has died and you've been adopted by the init program. getpriority
getpriority WHICH, WHO This function returns the current priority for a process, a process group, or a user. See getpriority (2). Invoking getpriority will produce a fatal error if used on a machine that doesn't implement getpriority (2). For example, to get the priority of the current process, use:
$curprio = getpriority(0, 0); getprotobyname
getprotobyname NAME This function does the same thing as getprotobyname (3): it translates a protocol name to its corresponding number. The return value in list context is:
($name, $aliases, $protocol_number) In scalar context, getprotobyname returns only the protocol number. getprotobynumber
getprotobynumber NUMBER This function does the same thing as getprotobynumber (3): it translates a protocol number to its corresponding name. The return value in list context is:
($name, $aliases, $protocol_number) In scalar context, getprotobynumber returns only the protocol name. getprotoent
getprotoent setprotoent STAYOPEN endprotoent These functions do the same thing as their like-named system library routines--see getprotoent (3). The return value from getprotoent is:
($name, $aliases, $protocol_number) In scalar context, getprotoent returns only the protocol name. getpwent
getpwent setpwent endpwent These functions do the same thing as their like-named system library routines--see getpwent (3). They iterate through your /etc/passwd file (or its moral equivalent coming from some server somewhere). The return value in list context is:
($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) Some machines may use the quota and comment fields for other purposes, but the remaining fields will always be the same. To set up a hash for translating login names to uids, say this:
while (($name, $passwd, $uid) = getpwent) { $uid{$name} = $uid; } In scalar context, getpwent returns only the username. getpwnam
getpwnam NAME This function does the same thing as getpwnam (3): it translates a username to the corresponding passwd file entry. The return value in list context is:
($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) If you want to do this repeatedly, consider caching the data in a hash (associative array) using getpwent. In scalar context, getpwnam returns only the numeric user ID. getpwuid
getpwuid UID This function does the same thing as getpwuid (3): it translates a numeric user id to the corresponding passwd file entry. The return value in list context is:
($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) If you want to do this repeatedly, consider slurping the data into a hash using getpwent. In scalar context, getpwuid returns the username. getservbyname
getservbyname NAME, PROTO This function does the same thing as getservbyname (3): it translates a service (port) name to its corresponding port number. PROTO is a protocol name such as "tcp". The return value in list context is:
($name, $aliases, $port_number, $protocol_name) In scalar context, getservbyname returns only the service port number. getservbyport
getservbyport PORT, PROTO This function does the same thing as getservbyport (3): it translates a service (port) number to its corresponding names. PROTO is a protocol name such as "tcp". The return value in list context is:
($name, $aliases, $port_number, $protocol_name) In scalar context, getservbyport returns only the service port name. getservent
getservent setservent STAYOPEN endservent These functions do the same thing as their like-named system library routines--see getservent (3). They iterate through the /etc/services file or its equivalent. The return value in list context is:
($name, $aliases, $port_number, $protocol_name) In scalar context, getservent returns only the service port name. getsockname
getsockname SOCKET This function returns the packed sockaddr address of this end of the SOCKET connection. (And why wouldn't you know your own address already? Because you might have bound an address containing wildcards to the generic socket before doing an accept. Or because you might have been passed a socket by your parent process--for example, inetd.)
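The wildcard case is easy to reproduce. The following sketch, which assumes a host with a working loopback interface, binds a TCP socket to port 0 (meaning "any free port") and then asks getsockname which port the system actually assigned.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in inet_ntoa);

# Create a TCP socket; protocol 0 lets the system pick the default
# protocol for a stream socket.
socket(my $sock, PF_INET, SOCK_STREAM, 0)
    or die "Can't create socket: $!\n";

# Port 0 asks the system to choose any free port for us.
bind($sock, pack_sockaddr_in(0, INADDR_LOOPBACK))
    or die "Can't bind: $!\n";

# Ask the socket which address and port it actually ended up with.
my $mysockaddr = getsockname($sock)
    or die "Can't get socket name: $!\n";
my ($port, $myaddr) = unpack_sockaddr_in($mysockaddr);

print "Bound to ", inet_ntoa($myaddr), " port $port\n";
close $sock;
```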
use Socket; $mysockaddr = getsockname(SOCK); ($port, $myaddr) = unpack_sockaddr_in($mysockaddr); getsockopt
getsockopt SOCKET, LEVEL, OPTNAME This function returns the socket option requested, or the undefined value if there is an error. See setsockopt for more. glob
glob EXPR This function returns the value of EXPR with filename expansions such as a shell would do. (If EXPR is omitted, $_ is globbed instead.) This is the internal function implementing the <*> operator, except that it may be easier to type this way. For example, compare these two:
@result = map { glob($_) } "*.c", "*.c,v"; @result = map <${_}>, "*.c", "*.c,v"; The glob function is not related to the Perl notion of typeglobs, other than that they both use a * to represent multiple items. gmtime
gmtime EXPR This function converts a time as returned by the time function to a 9-element list with the time correct for the Greenwich time zone (aka GMT, or UTC, or even Zulu in certain cultures, not including the Zulu culture, oddly enough). Typically used as follows:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time); All list elements are numeric, and come straight out of a struct tm (that's a C programming structure--don't sweat it). In particular this means that $mon has the range 0..11, $wday has the range 0..6, and the year has had 1,900 subtracted from it. (You can remember which ones are 0-based because those are the ones you're always using as subscripts into 0-based arrays containing month and day names.) If EXPR is omitted, it does gmtime(time). For example, to print the current month in London:
$london_month = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[(gmtime)[4]]; The Perl library module Time::Local contains a subroutine, timegm( ), that can convert in the opposite direction. In scalar context, gmtime returns a ctime (3)-like string based on the GMT time value. goto
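The offsets on the month and year are a common source of mistakes when printing dates from gmtime. A minimal sketch follows; it assumes a system whose epoch is 1 January 1970 UTC, and the helper name iso_utc is made up for illustration.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# iso_utc() is an illustrative helper: it formats an epoch time as a
# UTC timestamp, undoing the struct tm offsets on month and year.
sub iso_utc {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
    return sprintf "%04d-%02d-%02dT%02d:%02d:%02dZ",
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec;
}

print iso_utc(0), "\n";        # the epoch itself

# timegm() goes the other way; its month argument is 0-based as well.
my $epoch = timegm(0, 0, 12, 25, 11, 2000);   # 25 Dec 2000, 12:00:00 UTC
print iso_utc($epoch), "\n";
```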
goto LABEL goto EXPR goto &NAME goto LABEL finds the statement labeled with LABEL and resumes execution there. It may not be used to go into any construct that requires initialization, such as a subroutine or a foreach loop. It also can't be used to go into a construct that is optimized away. It can be used to go almost anywhere else within the dynamic scope,[4] including out of subroutines, but for that purpose it's usually better to use some other construct such as last or die. The author of Perl has never felt the need to use this form of goto (in Perl, that is--C is another matter).
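As an illustration of reaching for last rather than goto, here is a sketch that leaves two nested loops at once with a loop label. The helper name find_cell and the sample grid are invented for this example.

```perl
use strict;
use warnings;

# find_cell() is a hypothetical helper: it returns the (row, column) of the
# first cell equal to $want, or an empty list if there is none.
sub find_cell {
    my ($grid, $want) = @_;
    my @found;
    ROW: for my $r (0 .. $#$grid) {
        for my $c (0 .. $#{ $grid->[$r] }) {
            if ($grid->[$r][$c] eq $want) {
                @found = ($r, $c);
                last ROW;          # leave both loops at once, no goto needed
            }
        }
    }
    return @found;
}

my @grid = ([qw(a b c)], [qw(d e f)], [qw(g e i)]);
my ($row, $col) = find_cell(\@grid, 'e');
print "found at row $row, column $col\n";
```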
Going to even greater heights of orthogonality (and depths of idiocy), Perl allows goto EXPR, which expects EXPR to evaluate to a label name, whose scope is guaranteed to be unresolvable until run-time since the label is unknown when the statement is compiled. This allows for computed gotos per FORTRAN, but isn't necessarily recommended[5] if you're optimizing for maintainability:
goto +("FOO", "BAR", "GLARCH")[$i]; goto &NAME is highly magical, substituting a call to the named subroutine for the currently running subroutine. This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that this subroutine--and not the original one--had been called in the first place (except that any modifications to @_ in the original subroutine are propagated to the replacement subroutine). After the goto, not even caller will be able to tell that the original routine was called first. grep
grep EXPR, LIST grep BLOCK LIST This function evaluates EXPR or BLOCK in a Boolean context for each element of LIST, temporarily setting $_ to each element in turn. In list context, it returns a list of those elements for which the expression is true. (The operator is named after a beloved UNIX program that extracts lines out of a file that match a particular pattern. In Perl the expression is often a pattern, but doesn't have to be.) In scalar context, grep returns the number of times the expression was true. Presuming @all_lines contains lines of code, this example weeds out comment lines:
@code_lines = grep !/^#/, @all_lines; Since $_ is a reference into the list value, altering $_ will modify the elements of the original list. While this is useful and supported, it can occasionally cause bizarre results if you aren't expecting it. For example:
@list = qw(barney fred dino wilma); @greplist = grep { s/^[bfd]// } @list; @greplist is now "arney", "red", "ino", but @list is now "arney", "red", "ino", "wilma"! Caveat Programmor. See also map. The following two statements are functionally equivalent:
@out = grep { |