7.2 Library Modules

As mentioned earlier, the following library modules are arranged in alphabetical order for easy reference.

AnyDBM_File--Provide Framework for Multiple DBMs

    use AnyDBM_File;

This module is a "pure virtual base class"--it has nothing of its own. It's just there to inherit from the various DBM packages. By default it inherits from NDBM_File for compatibility with earlier versions of Perl. If it doesn't find NDBM_File, it looks for DB_File, GDBM_File, SDBM_File (which is always there--it comes with Perl), and finally ODBM_File.

Perl's dbmopen function (which now exists only for backward compatibility) actually just calls tie to bind a hash to AnyDBM_File. The effect is to bind the hash to one of the specific DBM classes that AnyDBM_File inherits from.

You can override the defaults and determine which class dbmopen will tie to. Do this by redefining @ISA:

    @AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File);

Note, however, that an explicit use takes priority over the ordering of @ISA, so that:

    use GDBM_File;

will cause the next dbmopen to tie your hash to GDBM_File.

You can tie hash variables directly to the desired class yourself, without using dbmopen or AnyDBM_File. For example, by using multiple DBM implementations, you can copy a database from one format to another:

    use Fcntl;         # for O_* values
    use NDBM_File;
    use DB_File;

    tie %oldhash, "NDBM_File", $old_filename, O_RDWR;
    tie %newhash, "DB_File",   $new_filename, O_RDWR|O_CREAT|O_EXCL, 0644;

    while (($key,$val) = each %oldhash) {
        $newhash{$key} = $val;
    }

DBM comparisons

Here's a table of the features that the different DBMish packages offer:

See also

Relevant library modules include: DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. Related manpages: dbm(3), ndbm(3). Tied variables are discussed extensively in Chapter 5, Packages, Modules, and Object Classes, and the dbmopen entry in Chapter 3, Functions, may also be helpful.

You can pick up the unbundled modules from the src/misc/ directory on your nearest CPAN site. Here are the most popular ones, but note that their version numbers may have changed by the time you read this:

    http://www.perl.com/CPAN/src/misc/db.1.85.tar.gz
    http://www.perl.com/CPAN/src/misc/gdbm-1.7.3.tar.gz

AutoLoader--Load Functions Only on Demand

    package GoodStuff;
    use Exporter;
    use AutoLoader;
    @ISA = qw(Exporter AutoLoader);

The AutoLoader module provides a standard mechanism for delayed loading of functions stored in separate files on disk. Each file has the same name as the function (plus a .al suffix), and comes from a directory named after the package, under the auto/ directory. For example, the function named GoodStuff::whatever() would be loaded from the file auto/GoodStuff/whatever.al.

A module using the AutoLoader should have the special marker __END__ prior to the actual subroutine declarations. All code before this marker is loaded and compiled when the module is used. At the marker, Perl stops parsing the file.

When a subroutine not yet in memory is called, the AUTOLOAD function attempts to locate it in a directory relative to the location of the module file itself. As an example, assume POSIX.pm is located in /usr/local/lib/perl5/POSIX.pm. The AutoLoader will look for the corresponding subroutines for this package in /usr/local/lib/perl5/auto/POSIX/*.al.

Lexicals declared with my in the main block of a package using the AutoLoader will not be visible to autoloaded functions, because the given lexical scope ends at the __END__ marker. A module using such variables as file-scoped globals will not work properly under the AutoLoader. Package globals must be used instead. When running under use strict, the use vars pragma may be employed in such situations as an alternative to explicitly qualifying all globals with the package name. Package variables predeclared with this pragma will be accessible to any autoloaded routines, but of course will also be visible outside the module file.

The AutoLoader is a counterpart to the SelfLoader module. Both delay the loading of subroutines, but the SelfLoader accomplishes this by storing the subroutines right there in the module file rather than in separate files elsewhere. While this avoids the use of a hierarchy of disk files and the associated I/O for each routine loaded, the SelfLoader suffers a disadvantage in the one-time parsing of the lines after __DATA__, after which routines are cached. The SelfLoader can also handle multiple packages in a file.

AutoLoader, on the other hand, only reads code as it is requested, and in many cases should be faster. But it requires a mechanism like AutoSplit to be used to create the individual files.

On systems with restrictions on filename length, the file corresponding to a subroutine may have a shorter name than the routine itself. This can lead to conflicting filenames. The AutoSplit module will warn of these potential conflicts when used to split a module.

See the discussion of autoloading in Chapter 5, Packages, Modules, and Object Classes. Also see the AutoSplit module, a utility that automatically splits a module into a collection of files for autoloading.

AutoSplit--Split a Module for Autoloading

    # from a program
    use AutoSplit;
    autosplit_lib_modules(@ARGV)

    # or from the command line
    perl -MAutoSplit -e 'autosplit(FILE, DIR, KEEP, CHECK, MODTIME)' ...

    # another interface
    perl -MAutoSplit -e 'autosplit_lib_modules(@ARGV)' ...

This function splits up your program or module into files that the AutoLoader module can handle. It is mainly used to build autoloading Perl library modules, especially complex ones like POSIX. It is used by both the standard Perl libraries and by the MakeMaker module to automatically configure libraries for autoloading.

The autosplit() interface splits the specified FILE into a hierarchy rooted at the directory DIR. It creates directories as needed to reflect class hierarchy. It then creates the file autosplit.ix, which acts both as a forward declaration for all package routines and as a timestamp for when the hierarchy was last updated.

The remaining three arguments to autosplit() govern other options to the autosplitter. If the third argument, KEEP, is false, then any pre-existing .al files in the autoload directory are removed if they are no longer part of the module (obsoleted functions). The fourth argument, CHECK, instructs autosplit() to check the module currently being split to ensure that it really does include a use specification for the AutoLoader module, and to skip the module if AutoLoader is not detected. Lastly, the MODTIME argument specifies that autosplit() is to check the modification time of the module against that of the autosplit.ix file, and only split the module if it is newer.

Here's a typical use of AutoSplit by the MakeMaker utility via the command line:

    perl -MAutoSplit -e 'autosplit($ARGV[0], $ARGV[1], 0, 1, 1)'

MakeMaker defines this as a make macro, and it is invoked with file and directory arguments. The autosplit() function splits the named file into the given directory and deletes obsolete .al files, after checking first that the module does use the AutoLoader and ensuring that the module isn't already split in its current form.

The autosplit_lib_modules() form is used in the building of Perl. It takes as input a list of files (modules) that are assumed to reside in a directory lib/ relative to the current directory. Each file is sent to the autosplitter one at a time, to be split into the directory lib/auto/.

In both usages of the autosplitter, only subroutines defined following the Perl special marker __END__ are split out into separate files. Routines placed prior to this marker are not autosplit, but are forced to load when the module is first required.

Currently, AutoSplit cannot handle multiple package specifications within one file.

AutoSplit will inform the user if it is necessary to create the top-level directory specified in the invocation. It's better if the script or installation process that invokes AutoSplit has created the full directory path ahead of time. This warning may indicate that the module is being split into an incorrect path.

AutoSplit will also warn the user of subroutines whose names cause potential naming conflicts on machines with severely limited (eight characters or fewer) filename length. Since the subroutine name is used as the filename, these warnings can aid in portability to such systems.

Warnings are issued and the file skipped if AutoSplit cannot locate either the __END__ marker or a specification of the form package Name;. AutoSplit will also complain if it can't create directories or files.

Benchmark--Check and Compare Running Times of Code

    use Benchmark;

    # timeit(): run $count iterations of the given Perl code, and time it
    $t = timeit($count, 'CODE');  # $t is now a Benchmark object

    # timestr(): convert Benchmark times to printable strings
    print "$count loops of 'CODE' took: ", timestr($t), "\n";

    # timediff(): calculate the difference between two times
    $t = timediff($t1, $t2);

    # timethis(): run "code" $count times with timeit(); also, print out a
    # header saying "timethis $count: "
    $t = timethis($count, "CODE");

    # timethese(): run timethis() on multiple chunks of code
    @t = timethese($count, {
        'Name1' => '...CODE1...',
        'Name2' => '...CODE2...',
    });

    # new method: return the current time
    $t0 = new Benchmark;
    # ... your CODE here ...
    $t1 = new Benchmark;
    $td = timediff($t1, $t0);
    print "the code took: ", timestr($td), "\n";

    # debug method: enable or disable debugging
    Benchmark->debug(1);
    $t = timeit(10, ' 5 ** $Global ');
    Benchmark->debug(0);

The Benchmark module encapsulates a number of routines to help you figure out how long it takes to execute some code a given number of times within a loop.

For the timeit() routine, $count is the number of times to run the loop. CODE is a string containing the code to run. timeit() runs a null loop with $count iterations, and then runs the same loop with your code inserted. It reports the difference between the times of execution.

For timethese(), a loop of $count iterations is run on each code chunk separately, and the results are reported separately. The code to run is given as a hash with keys that are names and values that are code. timethese() is handy for quick tests to determine which way of doing something is faster. For example:

    $ perl -MBenchmark -Minteger
    timethese(1000000, { add => '$i += 2', inc => '$i++; $i++' });
    __END__
    Benchmark: timing 1000000 iterations of add, inc...
           add:  4 secs ( 4.52 usr  0.00 sys =  4.52 cpu)
           inc:  6 secs ( 5.32 usr  0.00 sys =  5.32 cpu)

The following routines are exported into your namespace if you use the Benchmark module:

    timeit()
    timethis()
    timethese()
    timediff()
    timestr()

The following routines will be exported into your namespace if you specifically ask that they be imported:

    clearcache()      # clear just the cache element indexed by $key
    clearallcache()   # clear the entire cache
    disablecache()    # do not use the cache
    enablecache()     # resume caching

Notes

Code is executed in the caller's package.

The null loop times are cached, the key being the number of iterations. You can control caching with calls like these:

    clearcache($key);
    clearallcache();
    disablecache();
    enablecache();

Benchmark inherits only from the Exporter class.

The elapsed time is measured using time(2) and the granularity is therefore only one second. Times are given in seconds for the whole loop (not divided by the number of iterations). Short tests may produce negative figures because Perl can appear to take longer to execute the empty loop than a short test.

The user and system CPU time is measured to millisecond accuracy using times(3).

In general, you should pay more attention to the CPU time than to elapsed time, especially if other processes are running on the system. Also, elapsed times of five seconds or more are needed for reasonable accuracy.

Because you pass in a string to be evaled instead of a closure to be executed, lexical variables declared with my outside of the eval are not visible.

Carp--Generate Error Messages

    use Carp;

    carp "Be careful!";          # warn of errors (from perspective of caller)
    croak "We're outta here!";   # die of errors (from perspective of caller)
    confess "Bye!";              # die of errors with stack backtrace

carp() and croak() behave like warn and die, respectively, except that they report the error as occurring not at the line of code where they are invoked, but at a line in one of the calling routines. Suppose, for example, that you have a routine goo() containing an invocation of carp(). In that case--and assuming that the current stack shows no callers from a package other than the current one--carp() will report the error as occurring where goo() was called. If, on the other hand, callers from different packages are found on the stack, then the error is reported as occurring in the package immediately preceding the package in which the carp() invocation occurs. The intent is to let library modules act a little more like built-in functions, which always report errors where you call them from.

confess() is like die except that it prints out a stack backtrace. The error is reported at the line where confess() is invoked, not at a line in one of the calling routines.

Config--Access Perl Configuration Information

    use Config;
    if ($Config{cc} =~ /gcc/) {
        print "built by gcc\n";
    }

    use Config qw(myconfig config_sh config_vars);
    print myconfig();
    print config_sh();
    config_vars(qw(osname archname));

The Config module contains all the information that the Configure script had to figure out at Perl build time (over 450 values).[1]

Shell variables from the config.sh file (written by Configure) are stored in a read-only hash, %Config, indexed by their names. Values set to the string "undef" in config.sh are returned as undefined values. The Perl exists function should be used to check whether a named variable exists.
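
The difference between a missing key and a key whose value is undefined can be sketched as follows. This is only an illustration under stated assumptions: the helper config_status() and the key no_such_config_key are hypothetical names made up for the example, and the final lines assume that an attempted store into the read-only %Config raises an exception that eval can trap.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: classify a %Config key as missing,
# present but undefined, or defined.
sub config_status {
    my ($name) = @_;
    return 'missing'   unless exists $Config{$name};
    return 'undefined' unless defined $Config{$name};
    return 'defined';
}

# no_such_config_key is a made-up name used to show the "missing" case.
for my $name (qw(osname archname cc no_such_config_key)) {
    printf "%-20s %s\n", $name, config_status($name);
}

# %Config is read-only; eval traps the error raised by the attempted store.
my $stored = eval { $Config{osname} = 'bogus'; 1 };
print "store rejected: $@" unless $stored;
```

Testing with exists first matters because a plain truth test such as if ($Config{name}) cannot tell a variable that Configure never recorded from one that it recorded as "undef".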