Chapter 27
Writing Extensions in C
In this chapter you'll work with the Perl XS
language, which is used to create an interface between
Perl and a C library. Such interfaces are called extensions
to Perl because they enable your code to look and feel just like
it is a part of Perl. Extensions are useful in appending extra
functionality to Perl. This chapter covers the basics of writing
extensions. The examples are simple enough to build on in order
to create your own extensions library.
In Perl, XS refers to an interface description language used to
connect C code and Perl scripts. Using the
XS API, you can create a library that can be loaded dynamically
into Perl.
The XS interface defines language components that wrap around
Perl constructs. To use the XS language, you need the xsubpp
compiler to embed the constructs for you. The base construct in
the XS language is the XSUB
function. The XSUB function
is called when calls are made to pass data and control between
Perl and C code. Basically, Perl scripts call the XSUB
routines to get to the C code in the libraries encapsulated by
the XSUB functions.
To ensure that data types are mapped correctly between C and Perl
scripts, the XS compiler maps each C variable type
to a type in Perl. The mapping is maintained in a file called
typemap. When you're looking
at the files of an extension, it's often instructive to see how types are
mapped by reading the typemap
file in the same directory as the source file. typemap
files are covered later in this chapter.
You may be asking yourself why someone would want to write an
extension in C when Perl is a perfectly good working language.
For one thing, Perl is slow compared to C. All of Perl's power
does have drawbacks, whereas your C compiler can generate really
tight code. Another reason is that you might already have source code written
and working in C, and porting this existing code to Perl would
not make much sense. Also, it's possible to write code that can
be accessed from both Perl and C if you write the interfaces to your
core functions correctly.
The best way to show the creation of an extension is by example.
This section steps you through the creation of the futureValue
function for Perl.
For Step 1, go to the directory where you installed the Perl distribution.
This is important. Do not try this in any other directory or you'll
get errors.
For Step 2, you have to run the header-to-extension program h2xs.
To obtain a list of the options for this program, run it with the -h
option (h2xs -h).
Run h2xs -n Finance. This
creates a directory named Finance,
possibly under the subdirectory ext/
if it exists in the current working directory. Four files are
created in the Finance directory:
MANIFEST, Makefile.PL,
Finance.pm, and Finance.xs.
Here's what the output on your terminal will look like:
The MANIFEST file in the
ext/Finance directory contains
the names of the four files created. You have to change directories
to ./ext/Finance to be able
to work with these four files. Also, depending on who installed
your Perl distribution, you might have to run as root.
The contents of file Makefile.PL
are shown in Listing 27.1.
The h2xs script also creates
a .pm file. The contents
of file Finance.pm are shown
in Listing 27.2. The Finance.pm
file is where you add your exports and any Perl code.
All scripts that use Finance.pm
have to tell Perl to use the functions in the Finance
extension with the following command:
    use Finance;
When Perl sees this use command,
it searches for a file named Finance.pm
in the various directories listed in the
@INC array. If it cannot
find the file, Perl stops with an error message.
The .pm file generally
requests that the Exporter and DynaLoader modules also
be loaded. You need them for exporting functions and dynamic loading.
Perl uses the @ISA array
to find any methods that are not found in the current package.
After this setup, the library is loaded as an extension into Perl.
There are two files to look at in the Finance
example we just created. The Finance.xs
file (Listing 27.3) holds the C routines that contain all the
C code for the extension, and the Finance.pm
file (Listing 27.2) contains routines that tell Perl how to load
this extension and what functions are exported.
In Step 4, you have to generate a makefile.
Generating the makefile and invoking the make
command will create
a working version of the library Finance.so
in the ../../lib/ext/Finance
directory. After testing, you can move the finished versions of
these files to the /usr/lib
or /usr/local/lib tree. In
all further testing in this section, you must point the @INC
array to the ../../lib/ext/Finance
location for this Finance.so
file.
You may also see a blib directory
in the ext/Finance directory.
The man page templates for
your extensions are kept here.
Finally, the Finance.xs file,
where you place all the C code for your extension, is shown in Listing
27.3.
Now that you have created the necessary files for your own Perl
extension, you can move to Step 3, which involves adding your
own code to the newly generated extension files. Add some simple
futureValue
code to the Finance.xs file.
Listing 27.4 shows what Finance.xs
looks like with the code addition. Be sure to create this file
because you'll be using it throughout the rest of the chapter.
A function to calculate the future value of an investment is defined
in line 43. The function to calculate the present value of money
to be received in the future from an investment is defined starting
at line 60. Both functions return one value and require three
arguments, which are declared one per line following the function
declaration. For example, for the present value function in line
60, each of the three arguments must be declared on a line of its own.
At line 77, I define a Gordon Growth model function for a stock.
This function only prints something and does not return any values.
At line 88, I define a straight line depreciation model function
that returns more than one value on the calling stack.
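As a rough illustration of the kind of C code that sits behind such functions, here is a minimal sketch of the future value and present value calculations. It is not taken from Listing 27.4: the names future_value, present_value, and growth_factor, the argument order, and the assumption of compound interest over whole periods are all choices made for this sketch.

```c
#include <assert.h>

/* Hypothetical helper: (1 + rate) multiplied by itself `periods` times.
 * A loop is used instead of pow() so that libm is not required. */
static double growth_factor(double rate, int periods)
{
    double factor = 1.0;
    int i;

    assert(periods >= 0);   /* this sketch handles whole, non-negative periods only */
    for (i = 0; i < periods; i++)
        factor *= 1.0 + rate;
    return factor;
}

/* Value of `present` after `periods` periods at `rate` per period
 * (a rate of 0.05 means 5 percent). Name and signature are assumptions. */
double future_value(double present, double rate, int periods)
{
    return present * growth_factor(rate, periods);
}

/* Amount needed today to reach `future` after `periods` periods
 * at `rate` per period. Name and signature are assumptions. */
double present_value(double future, double rate, int periods)
{
    return future / growth_factor(rate, periods);
}
```

Each function takes three arguments and returns a single double, which is the shape the text describes for the XSUB versions.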
Next, run the command perl Makefile.PL. This creates a real makefile, which make needs. The Makefile.PL script checks to see whether your Perl distribution is complete and then writes the makefile for you. If you get any errors at this point, you should check to see whether your Perl distribution is complete. As a check, try running the Makefile.PL script as root. If your Perl distribution was installed by root, you may not have permission to overwrite some files. It won't hurt to try.

If you do not get any errors, proceed. Run make on your newly created makefile. The following output should be pretty close to what you see:

# make

Wait! Before you execute this gem of a script shown above, look at the output in Listing 27.6. The shared version of Finance.so is in the directory ../../lib/auto/Finance/Finance.so. It is important that you copy Finance.so to a directory in your @INC path: either modify the @INC array in the scripts that use Finance.pm or, by default, point to a known test location where this .so file will reside. Perl will search the directories listed in @INC to load the extension module.

Step 5: Test Your Extension Module

Now, in the Test1 directory, create the test script shown in Listing 27.5 and name it t.pl.

Listing 27.5. The test program.
1 #!/usr/bin/perl

Notice that Finance::Gordon is used to explicitly call the Gordon Growth Model function (see line 5). Look up the formula in your finance textbook if you don't believe me.

It would be cumbersome to keep typing Finance:: in front of all your functions. Add the declaration to the @EXPORT array in Finance.pm and remake. Now you can use the function Gordon by itself. The @EXPORT array in the .pm file tells Perl which of the extension's routines should be placed in the calling package's own name space.

Final Considerations

There are some things you should be aware of before you export everything in your module.
Sure, it saves you typing and makes the code easier to read by not having all those Finance:: prefixes everywhere. However, what about the same function name residing in both the main program and the module? In this case, the function in the main program may prevail. Why create an ambiguity when the choice of a good name for a function will suffice?

Also, it's not a good idea to export every function in your extension. After all, the idea behind the extension is to hide some of the functionality and intricacies in the extension from the application that is using this extension module. Most of the time you do not want to export the names of your extension's subroutines because they might accidentally clash with subroutines from other extensions or from the calling program itself.

The xsubpp Compiler

If you examine the makefile for your extension, you'll see a call to a program called xsubpp. This is a preprocessor for XS code. The xsubpp compiler takes the XS code in the .xs file, converts it into C code, and places it in a file whose suffix is .c. The C code created makes heavy use of the C functions within Perl.
An XSUB function is just like a C function in that it takes arguments and returns a value (unless it is declared void). Values may also be returned via pointers to arguments passed to the function.

Now move on to something a little more exotic and create a function that takes arguments and returns something. This function calculates and returns the Julian day for a given calendar day. The Julian day is very important in astronomical calculations because it is a running count of the days since January 1, 4713 B.C. It was devised by the French scholar Joseph Scaliger (1540-1609) in 1583 A.D. The formula for calculating the Julian day is given in many forms in just as many astronomical texts. One version of this formula is:
Julian Day = 367 * YEAR - 7 * (YEAR + (M + 9)/12)/4 + ...

Don't try to shorten the formula by reducing it to an algebraic equivalent, because the formula relies on integer division dropping the digits to the right of the decimal point. One way of implementing this formula is the function for calculating the Julian day shown in Listing 27.6.

Listing 27.6. The Julian.c file.
#include <math.h>
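The formula and the listing are not reproduced in full here, so the following is only a sketch and not the author's Julian.c. It uses a widely published form of the calculation that is valid for Gregorian dates from 1901 through 2099 and relies on integer division to drop fractions at each step. The function name julian_day_number, the range checks, and the choice to return the whole-number Julian Day Number (the value at noon) are assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical helper: whole-number Julian Day Number for a Gregorian
 * calendar date. Every division below is an integer division, so the
 * fractional part is discarded at each step; that truncation is what
 * the formula depends on. */
long julian_day_number(int year, int month, int day)
{
    long y = year;
    long m = month;

    assert(year >= 1901 && year <= 2099);   /* range where this form holds */
    assert(month >= 1 && month <= 12);

    return 367L * y
         - 7L * (y + (m + 9L) / 12L) / 4L
         + 275L * m / 9L
         + day
         + 1721014L;
}
```

For example, julian_day_number(2000, 1, 1) evaluates to 2451545, the commonly quoted Julian Day Number for January 1, 2000.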
Run h2xs -A -n Julian as
before to get the Julian module started. This creates the ext/Julian
directory with the necessary files. This time, however, you'll
be adding a lot more code into the functions in Julian.xs,
as shown in Listing 27.7.
Note that each line containing the arguments after the declaration
of JulianDay is indented
one tab. It is not necessary to have a tab between the type of
variable and the name of the variable. Also note that there
is no semicolon following the declaration of each variable.
Edit the file Makefile.PL
so that the corresponding line looks like this:
Notice that an extra library to link in has been specified; in
this case, it is the math library, libm.
You'll learn later in this chapter how to write XSUBs
that can call every routine in a library.
Generate the makefile and
run make. A test script for
running this program is shown in Listing 27.8.
Now that you've learned some of the ways to create and use C extensions,
here's how to put them all together. First of all, look at the
Julian.c file in Listing
27.9, put together by the Perl script.
The function is called XS,
and the arguments are specified via the dXSARGS
keyword.
The meanings of keywords such as newXS,
SvIV, and so on in the output
C file are explained in Chapter 25, "Perl
Internal Files and Structures." However, for the moment,
concentrate on the compiler itself and how it expects input.
The functions compiled by the xsubpp
compiler are referred to as XSUB.
You specify the parameters that are passed into the XSUB
just after you declare the function return value and name. The
list of parameters looks very C-like, but the lines must be indented
by a tab stop, and each line should not have an ending semicolon.
The list of output parameters occurs after the OUTPUT:
directive. The default value returned is RETVAL.
The use of RETVAL tells Perl
that you want to send this value back as the return value of the
XSUB function. You still
have to set RETVAL to something.
You can also specify which variables used in the XSUB
function should be placed into the respective Perl variables that
are passed in.
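In plain C terms, the difference between RETVAL and an output parameter resembles the difference between a function's return value and a value written back through a pointer argument. The sketch below is ordinary C, not XS. The name is_friday, the pointer parameter, the convention that 0 means Sunday, and the restriction to non-negative Julian Day Numbers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the given Julian Day Number falls
 * on a Friday, 0 otherwise. The day of the week (0 = Sunday through
 * 6 = Saturday) is also written through day_of_week when that pointer
 * is not NULL, which plays the role of an output parameter. */
int is_friday(long julian_day_number, int *day_of_week)
{
    int dow;

    assert(julian_day_number >= 0);   /* sketch covers non-negative day numbers only */
    dow = (int)((julian_day_number + 1L) % 7L);
    if (day_of_week != NULL)
        *day_of_week = dow;
    return dow == 5;
}
```

The return value corresponds to what an XSUB would hand back through RETVAL, while day_of_week corresponds to a parameter that would be listed again in the OUTPUT: section so that the caller sees the update.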
The xsubpp compiler uses
rules to convert from Perl's internal data types to C's data types.
These rules are stored in the typemap
file. The rules in typemap
contain mappings for converting ints,
unsigned ints, and so on
into Perl scalars. Arrays are mapped to char**
or void pointers, and so
on. The typemap file with
all the mappings is located in the ExtUtils
directory under the Perl installation.
The typemap file is split
into three sections. The first section is a mapping of various
C data types into a tag value. The second section is for converting
input parameters to C, and the third is for outputting parameters
from C to Perl.
Take a look at Listing 27.9 again. Note the SvIV
for the month declaration.
Now look in the typemap file
for the declaration of int.
You'll see it defined as T_IV.
Now go to the second INPUT
part in the file to see how T_IV
is mapped for input. You'll see the following lines, which map
the integer from the Perl variable:
Similarly, in the OUTPUT
section of the typemap file,
you'll see the following lines to generate a returned value. This
fragment places an integer into the ST
array (which is indexed from 0
on up for all the incoming and outgoing arguments of a function):
If you forgot to create the typemap file, you might see output that looks like this:

Error: 'const char *' not in typemap in Julian.xs, line XXX

This error means that you have used a C data type that xsubpp doesn't know how to convert between Perl and C. The solution is to create a custom typemap file that tells xsubpp how to do the conversions. You can define your own typemap entries if you use parameter types that you cannot find in the existing typemap file. For example, the type double is understood by Perl, but not double *. In this case you have to make an entry in the typemap file to convert the pointer to double to something Perl will understand. Try a void pointer.

The bootstrap Function

All Perl extensions require a Perl module with a call to the bootstrap function, which loads the extension into Perl. The module's functions are twofold: export all the extension's functions and global references to variables to the Perl script using the extension, and load the XSUBs into Perl using dynamic linking. Thus, you require two modules: the Exporter to export your functions and the DynaLoader for dynamic loading. See the following example for the Julian package:

package Julian; # my package

Passing Arguments

Parameters are passed into an XSUB function via an argument stack. The same stack is used to store the XSUB's return value. All Perl functions are stack oriented and use indexes in their own stack to access their variables. The stacks are organized bottom up and can be indexed using the ST(index) macro. The first position on that stack that belongs to the active function is referred to as index 0 for that function. The positions on the stack are referred to as ST(0) for the first item, ST(1) for the next, and so on. The incoming parameters and outgoing return values for an XSUB always start at position 0. Parameters are pushed left to right.
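To make the indexing concrete, here is a toy model of such a stack in plain C. It does not use Perl's headers or its real ST() macro: the names ToyStack, TOY_ST, toy_push_args, and toy_future_value are invented for this sketch, and the slots hold plain doubles rather than Perl scalars.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_SIZE 32

/* Toy argument stack: NOT Perl's real stack, just an array of doubles. */
typedef struct {
    double slots[TOY_STACK_SIZE];
    size_t top;                 /* number of slots currently in use */
} ToyStack;

/* Slot i of the active call, counted from that call's own base index. */
#define TOY_ST(stack, base, i) ((stack)->slots[(base) + (i)])

/* Push the arguments left to right and return the base index of the call. */
size_t toy_push_args(ToyStack *stack, const double *args, size_t count)
{
    size_t base = stack->top;
    size_t i;

    assert(base + count <= TOY_STACK_SIZE);
    for (i = 0; i < count; i++)
        stack->slots[stack->top++] = args[i];
    return base;
}

/* Toy stand-in for an XSUB: reads (present, rate, periods) from slots
 * 0 to 2 and leaves its single return value in slot 0. Returns the
 * number of return values left on the stack. */
size_t toy_future_value(ToyStack *stack, size_t base, size_t items)
{
    double value;
    int periods;
    int i;

    assert(items == 3);
    value = TOY_ST(stack, base, 0);
    periods = (int)TOY_ST(stack, base, 2);
    for (i = 0; i < periods; i++)
        value *= 1.0 + TOY_ST(stack, base, 1);

    TOY_ST(stack, base, 0) = value;   /* return value reuses position 0 */
    stack->top = base + 1;            /* only the return value remains */
    return 1;
}
```

The point of the model is that both the first incoming argument and the outgoing result live at position 0 relative to the call's base, while anything the caller had on the stack below that base is left untouched.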
The RETVAL Variable and the OUTPUT Section

The OUTPUT section of the .xs file is where you place return values. The return value is always ST(0). However, this value will not be set unless the OUTPUT section with RETVAL is defined. You must have these two lines in a function to get it to return a value:

OUTPUT:
    RETVAL

You also have to remember to set RETVAL somewhere in the code to have a value to return. The type of RETVAL is the type of the function you declared at the top. So, the JulianDay function has a long RETVAL, whereas the futureValue function has a double RETVAL. For return types of void, the RETVAL variable is not defined and you cannot use it.

Input parameters in an XSUB are normally initialized with the values pushed on the argument stack at the time of the call. Entries in the typemap file are used to map the Perl values into their C counterparts in the XSUB function. You can also write the code that would be generated by the xsubpp compiler yourself to gain access to a variable. For example, in the following function, the first argument is accessed via the SvPV map function:

GetDayOfWeek(julianDay,dayOfWeek)

In this example, dayOfWeek is assigned a value of 0 as the default value. This is done so that if nothing is passed in for dayOfWeek, it will be set to 0. Defensive programming like this makes the package easier to use. You can even place assignments in the parameter list, like this:

DayOfWeek(julianDay,dayOfWeek = 0)

The default values set in the parameters may only be numbers or strings, not pointers. Also, you must define such values in right-to-left order in the parameter list. Thus, the following line would cause unspeakable errors from xsubpp:

DayOfWeek(dayOfWeek = 0, julianDay)

To allow the XSUB for DayOfWeek() to have a default date value, you could rearrange the parameters to the XSUB.
A Perl program will then be able to call DayOfWeek() with either of the following statements:

$status = DayOfWeek( $julianDay );

The code in the Julian.xs file would look like the following:

int

The XSUB code generated for this segment of code would look like this:

XS(XS_Julian_DayOfWeek)

In this code fragment, the special variable items tells the routine how many parameters have been passed into the function. The items variable is tested to see how to initialize jday in this example. The sv_newmortal() function is used to initialize the return value for this XSUB function.

The use of ellipses (...) for passing variable-length argument lists is also supported in XSUBs. Your function can easily get the number of arguments passed into it by looking at the special items variable. The name items is reserved, and the xsubpp compiler supplies items for all XSUBs. Using the items variable lets you accept an unknown number of arguments in your XSUB function.

Keywords

There are several special keywords in the .xs file that can also be used when writing extensions.

The MODULE Keyword

The MODULE keyword is used to start the XS code and to specify the name of the package currently being defined. There is only one MODULE keyword per .xs file. All text before the MODULE keyword is not processed in any way by xsubpp. Do not modify the code before the MODULE keyword unless you have to; any code you add there is passed through unchanged to the final C file. Here's the syntax for the MODULE keyword:

MODULE = packageName

The packageName is used as the name of the bootstrap function for this module extension. The MODULE line is generated for you by h2xs.

The PACKAGE Keyword

On occasion, you may have more than one package per module. In this case, the PACKAGE keyword is used to indicate which package within the module contains the code that follows. Generally, the name following the PACKAGE keyword is the same as that following the MODULE keyword.
The PACKAGE keyword is used with the MODULE keyword and must follow on the same line as the MODULE keyword. You have to edit the .xs file yourself to specify which package gets which function.

The CODE: Keyword

The CODE: keyword is used to indicate where the real C code for a function begins. Use just C code until you start a new block with another keyword, such as OUTPUT:. You can use C comments (/*...*/), ampersands, and so on, and they will not be touched by the xsubpp compiler.

xsubpp recognizes certain C preprocessor directives that are allowed within the CODE: block. It also recognizes # used for Perl-style comments. The compiler passes the preprocessor directives that it recognizes through to the final C file untouched and removes the commented lines. Comments can be added to XSUBs by placing # at the beginning of the line, too. Nested comments are not supported.

Be careful not to make the comment look like a C preprocessor directive! The xsubpp compiler could be confused if a Perl comment begins to look like a C preprocessor directive. The following is a bad idea:

#define a variable

Is the above line a comment or a C statement? I do not know how this will be interpreted.

If you are going to mess with the argument stack, though, you'll want to use the PPCODE: keyword, which will be discussed later in the chapter.

The OUTPUT: Keyword

The OUTPUT: keyword specifies the return values from a function. You have seen it used earlier with RETVAL to return a value. The OUTPUT: keyword generates code that maps the XSUB function's variables back to those in the Perl program calling the XSUB. This keyword is used after the code in the CODE: area. The RETVAL variable is not the default return variable in the CODE: area. Only by specifying it after the OUTPUT: keyword are you letting xsubpp know that it's a return variable for this function.

The OUTPUT: keyword also lists the input parameters for use as output variables.
This may be necessary when a parameter has been modified within the function and the programmer would like the update to be seen by Perl. Say that you define an isFriday() function in the Julian package that also reports the day of the week. The function returns true (1) if the day is a Friday. The day of the week is returned in the second parameter passed to the function. The function is shown in the Julian.xs file as this:

int

This example uses the NO_INIT keyword to show that dayOfWeek is an output value. The xsubpp compiler normally generates code to read the values of all function parameters from the argument stack and assign them to C variables upon entry to the function. NO_INIT tells the compiler that these parameters are used for output rather than for input and that they are assigned before the function terminates. Thus, the function isFriday() uses the dayOfWeek variable only as an output variable and does not care about its initial contents.

PPCODE: for Returning More Than One Value

The PPCODE: keyword can be used instead of the CODE: keyword. The PPCODE: keyword tells the xsubpp compiler that the code in the XSUB will be modifying the return stack itself. You'll use the PPCODE: keyword to return lists instead of single values. Only void functions allow the use of the PPCODE: keyword. Look at the following section of code from the Finance module and extension. It returns the straight line depreciation list for an asset. The function is called depreciateSL:

void

Note the use of the PPCODE: keyword to show that you will be messing with the stack. The EXTEND(sp,n) macro is used to make room on the argument stack for n return values. The PPCODE: directive causes the xsubpp compiler to create a stack pointer called sp, which is conveniently used in the EXTEND() call. The values are then pushed onto the stack with the PUSHs() macro, using the value returned from the sv_2mortal() call to map each value to the stack.
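Before a list can be pushed onto the stack, its values have to be computed. As a hedged sketch in plain C (this is not the author's depreciateSL listing), the helper below fills an array with the book value at the end of each year under straight-line depreciation. The function name, its parameters, and the decision to report book values rather than yearly charges are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the end-of-year book values for `years`
 * years of straight-line depreciation into `out`. Returns the number
 * of values written, or 0 if `years` is zero or `out` is too small. */
size_t straight_line_book_values(double cost, double salvage,
                                 size_t years, double *out, size_t out_len)
{
    double yearly_charge;
    size_t i;

    assert(out != NULL);
    if (years == 0 || years > out_len)
        return 0;

    /* The same amount is written off every year. */
    yearly_charge = (cost - salvage) / (double)years;
    for (i = 0; i < years; i++)
        out[i] = cost - yearly_charge * (double)(i + 1);
    return years;
}
```

In an XSUB that uses PPCODE:, each element of such an array would then be wrapped in a scalar and pushed onto the stack in turn, after EXTEND() has made room for the whole list.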
The way to call the depreciateSL function is shown in Listing 27.10:

Listing 27.10. Using the Julian.xs file.
1 #!/usr/bin/perl

Lines 13 through 15 run a for loop to test different values for a Julian day. Here's the output from this script:

-============= first part ===============-
First assign it all back to a list
value = 900.000000 for 5 years

Returning Undef and Empty Lists

In Perl, there are times when you'll want to return empty or undefined lists. You have to use the PPCODE: block, not the CODE: block. To return an empty list, just return nothing. The Perl code sends back a pointer to an empty list.

To return undef, you have to set the value of ST(0) to undef. You can use the sv_newmortal() call to initialize a return value to undef and assign it to ST(0):

ST(0) = sv_newmortal(); /* Undefine the return value. */

In later sections of the code, you can use the sv_setnv() function to set the value to something else if you want. Here's the syntax for this call:

sv_setnv(ST(index), value);

Alternatively, you can set the ST(index) value to sv_undef, where a pointer to a list in ST(index) was expected. More than likely, the index value is 0, unless you are returning references to arrays. For example, to return an undefined list, use this statement:

ST(0) = &sv_undef;
The BOOT: Keyword

The BOOT: keyword is used to add code to the extension's bootstrap function. The bootstrap function is generated by the xsubpp compiler and normally holds the statements necessary to register any XSUBs with Perl. With the BOOT: keyword, the programmer can tell the compiler to add extra statements to the bootstrap function.

This keyword may be used any time after the first MODULE keyword and should appear on a line by itself. The first blank line after the keyword terminates the code block. A sample usage is shown here:

BOOT:

The Listings of Modules

The listing of Julian.xs using the BOOT: and other keywords is shown in Listing 27.11.

Listing 27.11. The Julian.xs file.
#ifdef __cplusplus

Summary

Extensions to Perl are written in the XS language. Templates for creating the modules are written by the h2xs program. Module-specific Perl code is kept in the .pm file, and the C code is kept in the .xs file. The xsubpp program parses the .xs file to generate a C file, which in turn is compiled and linked to form the shared library (with an .so extension). xsubpp expects the .xs files to follow the XS language syntax and specification. After compilation, the .so files with the extension code can be moved into your installation directory.