Running a Perfect Internet Site with Linux 14-5141.doc

Notice: This material is excerpted from Running A Perfect Internet Site with Linux, ISBN: 0-7897-0514-1. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.

Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290 or at support@mcp.com.

Chapter 14 - Maintaining Your System

An important aspect of site maintenance is maintaining your servers. Because these programs perform all of the services your site offers to both your own users and other people on the Internet, you've got to keep them running smoothly.

In this chapter, you learn how to maintain your e-mail, Web, Gopher, news, FTP, and finger servers.

Maintaining Your E-mail Servers

It's important to keep your e-mail running smoothly, as it's in many ways the most essential service on your site. If you lose access to e-mail, you lose communication with your users and the rest of the Internet.

You have one server to maintain in this case, Sendmail. It's not the server software itself you're maintaining here. You'll be keeping track of the log files, mail spool space, and other mail-related features.

Maintaining Sendmail

There are only a few things you need to do to maintain your Sendmail server. The rest is taken care of by the software itself. Things to take care of include the following:

Maintaining Mailing Lists

Most mailing list maintenance involves adding and removing users by hand. While the software does handle this aspect itself, there are times you'll need to do the following:

The rest of mailing list maintenance involves wading through e-mail that comes back to the list owner for various reasons. A lot of this mail is:

In cases like these, if you run the list and have no need to save the error messages that arrive in your mailbox, just delete them.

Maintaining Your Web Server

Maintaining a Web server is a little more involved than maintaining your e-mail servers, but not by much. Most of the work simply involves keeping things up to date.

Checking Links

If you have Web pages set up for your site in general (e.g., a Web page with a list of frequently asked user tech support questions), and it has any links to outside sources, it's important to check it regularly for accuracy. Things on the Internet change constantly, so you may find that the links you point to go up and down or move to another URL. The sooner you find out, the more likely you are to get a page letting you know where the new resource is.

To check out a link, just click it and see if it takes you where it's supposed to.
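If a page has more than a handful of links, clicking each one gets tedious. The following is a minimal sketch, not a finished tool, of how the same check could be scripted in Python. The names LinkCollector, outside_links, and check_link are made up for this illustration, and the sketch assumes the page is available as a local HTML file.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the HREF value of every <A> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def outside_links(html_text):
    """Return the absolute http/https links found in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    return [u for u in collector.links if u.startswith(("http://", "https://"))]


def check_link(url, timeout=10):
    """Return a short status string for url: a status code or an error."""
    try:
        # HEAD asks for the headers only; some servers may refuse it.
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as reply:
            return str(reply.status)
    except HTTPError as err:
        return str(err.code)
    except (URLError, OSError) as err:
        return "failed: %s" % err
```

To use it, read your page into a string, pass it to outside_links, and call check_link on each result. Anything other than 200 deserves a look by hand. Because redirects are followed automatically, a link that has moved can still report 200, so an occasional manual click remains worthwhile.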

Basic HTML

Most HTML formatting commands have an opening code and a closing code. The difference between them is that the closing code has a slash (/) right after its opening angle bracket. For example, the code that begins every document should be <HTML>. The entire Web page goes after this, and at the end of the page, the last code should be </HTML>.

As was shown in chapter 7, "Installing your Web Server," this is the most basic Web page you can have:

<HTML>
<Head><Title>My first test page</Title></Head>
<BODY>
This is my test page.
</BODY>
</HTML>

Note that the page begins and ends with an HTML code. The second code you always need to have is the HEAD code, which marks off the header for the page. Inside the header goes the TITLE code, which defines the page's title. The title is displayed in the title bar at the top of the browser window if you're using Netscape, not within the page itself. Finally, the last necessary code is the BODY code, which marks off the main portion of the page.

You don't have to capitalize the codes. A lot of people do it as a way to make sure they stand out from the rest of the text.

HTML has a number of word-processor-like codes as well. Some of them are:

Code	Meaning
<H1>...</H1>	Main header
<H2>...</H2>	Secondary header. There are six header levels (<H1> through <H6>), each one less prominent than the one before it.
<B>...</B>	Bold
<I>...</I>	Italics
<U>...</U>	Underline
<UL>...</UL>	Bulleted list. Each bulleted item should begin with <LI>. The ending </LI> is optional.

This is a very basic guide to HTML code. For a more in-depth guide, check out QUE's Special Edition Using HTML.

Keeping Track of Usage

You can keep track of accesses to your Web pages by simply looking through your server's log files. However, you can also get a program off the Internet that tracks all sorts of usage statistics, makes graphs, etc. This program is wusage, and you can get it one of the two following ways:

Compiling wusage

To compile wusage, complete the following steps:

  1. Move it to your favorite unpacking and compiling location.
  2. Decompress and untar the file.
  3. Change into the wusage directory.
  4. Edit the Makefile and change the compiler from CC=cc to CC=gcc.
  5. Save and exit the Makefile.
  6. Type make all to compile the program.
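Step 4 can also be done without opening an editor. The sketch below is only an illustration in Python. It assumes the Makefile assigns the compiler on a single line beginning with CC=, as the step describes, and set_compiler is a made-up name.

```python
import re


def set_compiler(makefile_text, compiler="gcc"):
    """Return makefile_text with its CC assignment pointing at compiler."""
    # Match a line such as "CC=cc" or "CC = cc"; other lines are left alone.
    pattern = re.compile(r"^(CC[ \t]*=[ \t]*).*$", re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + compiler, makefile_text)
```

Read the Makefile into a string, pass it through set_compiler, and write the result back before typing make all. Keep a copy of the original Makefile in case the pattern does not match your version of the file.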

Configuring wusage

To configure wusage, do the following (the configuration file is huge, so I'll only discuss the changes you need to make here):

Step 1

Edit the sample file wusage.conf.

Step 2

#Type of server log:
#If your server CAN use COMMON format, then DO, in all cases.

The server you installed does work with the COMMON format (COMMON is a format used by the httpd you installed to format log files), so insert the line:

COMMON

Step 3

#Name of your server as it should be presented:
Quest

Change Quest to the name of your server. For example, my server is simply Renaissoft.

Step 4

#File to use as a prefix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY:
#NOT A URL.
/home/www/prefix.html

If you don't want it to store its prefix file in this path, change it to one appropriate for your needs.

Step 5

#File to use as a suffix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY:
#NOT A URL.
/home/www/suffix.html

If you don't want it to store its suffix file in this path, change it to one appropriate for your needs.

Step 6

#Directory where HTML pages generated by usage program should be located:
/home/www/web/usage

If you don't want it to save the pages it creates to display your usage statistics in the path above, change it to suit your own needs.

Step 7

#URL to which locations of HTML pages should be appended for usage reports:
#(the same as the first line, but in web space, not filesystem space)
/usage

The directory above is relative to your Web server's document root, not to the root of the file system. For example, /usage above would be /public_html/usage. This item should point to the same directory as the item from step 6.

Step 8

#Path of httpd log file:
/home/www/ncsa/logs/access_log

If your httpd log file isn't in the location shown above, change this item to point to the location of your file.

Step 9

#Top-level domain only (i.e., org not cshl.org):
org

Change this item to match your top-level domain. For example, I'm renaissoft.com, so I would enter com instead of org here.

Step 10

#Directories/items that should never register in the top ten:
#To inhibit everything on a path, use /path*
{
}

If you have pages on your site that you don't want mentioned in the statistics even if they're in the top ten, list their full paths here between the braces.

Step 11

#items that should never register at *all*, even:
#for the total access count
{
}

If you have pages you don't even want counted when determining the total number of accesses, put their full paths between the braces.

Step 12

#Sites that should never register in the usage statistics:
{
}

If you don't want to see particular sites listed in your usage statistics (e.g., your own site), enter their domain names between the braces. I included my own site here as {*.renaissoft.com}, because I don't want any of my page testing to count as real accesses.

Step 13

Save and exit the file.

Step 14

Move wusage into your main Web directory (or really wherever you prefer to have it).

Step 15

Move wusage.conf to your main Web conf directory (or elsewhere if you prefer), which is inside the same directory as the cgi-bin one.

Running wusage

To actually use wusage, just use crontab -e to make a cron job to tabulate your site statistics (calling wusage itself) once a week or however often you want. When you refer to the file in the crontab, call it as /fullpath/wusage -c /fullpath/wusage.conf. If you want to go ahead and see how it comes out, enter this same syntax on the command line now.
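If the layout of a crontab line is unfamiliar, the short Python sketch below builds one. It is only an illustration. The schedule (3:00 a.m. every Sunday) is an arbitrary assumption, weekly_cron_line is a made-up name, and /fullpath stands in for your real paths exactly as it does above.

```python
def weekly_cron_line(wusage_path, conf_path, minute=0, hour=3, weekday=0):
    """Build a crontab line that runs wusage once a week.

    The five leading fields are minute, hour, day of month, month, and
    day of week (0 is Sunday); "*" means "every".
    """
    if not 0 <= minute <= 59 or not 0 <= hour <= 23 or not 0 <= weekday <= 6:
        raise ValueError("schedule field out of range")
    return "%d %d * * %d %s -c %s" % (minute, hour, weekday, wusage_path, conf_path)


# Prints: 0 3 * * 0 /fullpath/wusage -c /fullpath/wusage.conf
print(weekly_cron_line("/fullpath/wusage", "/fullpath/wusage.conf"))
```

The printed line is what you would type into the editor that crontab -e opens.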

All you have to do now is edit your home page and add a link to the files wusage generates!
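For a quick look at the raw log without wusage, the following Python sketch counts requests per page. It assumes the log is in the COMMON format chosen in step 2, with lines of the form host, ident, user, [timestamp], "request", status, bytes. CLF_LINE and count_hits are names made up for this illustration.

```python
import re
from collections import Counter

# host ident user [timestamp] "METHOD path protocol" status bytes
CLF_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)')


def count_hits(lines):
    """Count requests per path in an iterable of Common Log Format lines."""
    hits = Counter()
    for line in lines:
        match = CLF_LINE.match(line)
        if match:                      # skip lines that do not parse
            hits[match.group(4)] += 1
    return hits
```

Open your access log, pass the file object to count_hits, and call most_common(10) on the result to see your ten busiest pages. This is a rough count only. Unlike wusage, it applies none of the exclusions set up in steps 10 through 12.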

Cleaning Out the Access Log File

If your site gets a lot of Web page hits, clean out the httpd access log file occasionally (not on a fixed schedule, simply whenever it starts to look excessively large to you). To keep the access log down to a reasonable size without interfering with wusage, first take a look at your wusage page, and jot down the dates of the week covered by its last report.

You may want to back up the directory wusage keeps its HTML files in before proceeding, and compress and save a backup of the httpd access log (from the next step) first.

Now, edit the httpd access_log, and find the entries from the last day of the week for which the last report was finished. Look through these entries and find the last entry from that particular date. Then, delete everything from the oldest entry through the last entry you just found. Don't touch anything dated after the last week that was already reported. Finally, save and exit the file.
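On a large log, finding the cutoff by eye is error-prone. The Python sketch below shows one way the trimming could be scripted. It assumes COMMON-format entries whose timestamps look like [12/Oct/1996:13:55:36 -0700] and English month abbreviations. TIMESTAMP and entries_after are names made up for this illustration.

```python
import re
from datetime import datetime

# Picks the day out of a timestamp such as [12/Oct/1996:13:55:36 -0700]
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def entries_after(lines, last_reported_day):
    """Keep only log lines dated after last_reported_day.

    last_reported_day uses the log's own day format, for example
    "12/Oct/1996". Lines without a readable date are kept, to be safe.
    """
    cutoff = datetime.strptime(last_reported_day, "%d/%b/%Y").date()
    kept = []
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
            if day <= cutoff:
                continue          # already covered by a finished report
        kept.append(line)
    return kept
```

The function only returns the lines to keep. Writing them back to the log is left to you, and the backup advice above still applies before you overwrite anything.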

The folks at Quest Protein Database Center (the programmers of this great utility) sincerely want to know that you're using their software. Feel free to send the author, Thomas Boutell, e-mail at boutell@boutell.com. If you don't mind, include the URL for your usage page. If you'd rather not have it known because, for example, it's for a private site, then just drop him a note saying you use the software.

Maintaining Your Gopher Server

Maintaining a Gopher server is a fairly straightforward process. After all, it's mainly a simpler version of a Web server.

The main task in maintaining your Gopher server is checking your Gopher menu pointers on occasion to make sure they're accurate.

To make a pointer to a menu item outside your own gopher server, you need to create a file containing the necessary information. I'll walk you through an example that points to something that's actually back at my own site. The entry would be as follows:

Name=Renaissoft's Programs
Type=1
Port=70
Path=/Programs
Host=gopher.renaissoft.com

This breaks down as follows:

  1. Name is the name of the menu choice you want to offer. The menu choice someone would select to go to this option is Renaissoft's Programs.
  2. Type refers to the data type of the item. The type I entered in the example is type 1, a directory.
  3. Port is the port the gopher client needs to connect to in order to reach the gopher server you're pointing them to. The port I entered was 70.
  4. Path is the file path the gopher client needs to go to in order to find the item you're pointing to. The directory this menu item points to is in gopher-data/Programs.
  5. Host is the machine the gopher client needs to connect to in order to get the item you're pointing to. The host I pointed to was gopher.renaissoft.com.
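If you keep many pointers, a small script can write the entries and test whether the servers they name still answer. The following is a minimal Python sketch under that assumption. The function names gopher_link, parse_gopher_link, and pointer_reachable are made up for this illustration, and the field order simply copies the example above.

```python
import socket


def gopher_link(name, item_type, port, path, host):
    """Return the five-line link entry in the order used in the example."""
    fields = [("Name", name), ("Type", item_type), ("Port", port),
              ("Path", path), ("Host", host)]
    return "\n".join("%s=%s" % pair for pair in fields) + "\n"


def parse_gopher_link(text):
    """Turn a link entry back into a dict so its fields can be reviewed."""
    entry = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            entry[key.strip()] = value.strip()
    return entry


def pointer_reachable(entry, timeout=10):
    """Return True if a TCP connection to the entry's Host and Port opens."""
    try:
        with socket.create_connection((entry["Host"], int(entry["Port"])), timeout):
            return True
    except OSError:
        return False
```

A successful connection only shows that something is listening on that host and port. It does not prove that the Path still exists, so it is still worth selecting the menu item by hand now and then.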

Maintaining Your News Server

Keeping your news server in good working order is mainly taken care of by the news.daily script. The issue you'll want to keep an eye on is your hard drive usage.

To ensure that news doesn't overrun your hard drive, do the following:

Every day INN runs a script called news.daily, which sends you a report on your site. The report includes statistics such as the hard drive usage of your news spool, along with any errors that occurred during the day.
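Between daily reports, you can check the disk yourself. The Python sketch below reports how full the filesystem holding a directory is. It is an illustration only. The function names and the 90 percent threshold are assumptions, and you would pass in the location of your own news spool.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding path is, as a percentage."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free
    return 100.0 * usage.used / usage.total


def spool_warning(path, limit=90.0):
    """Return a warning string if the filesystem is fuller than limit, else None."""
    used = percent_used(path)
    if used > limit:
        return "%s is %.1f%% full" % (path, used)
    return None
```

Calling spool_warning with your spool directory from a cron job, and mailing yourself any non-empty result, gives an early warning before news overruns the drive.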

Maintaining Your FTP Server

Once again, FTP server maintenance is fairly simple. Server maintenance is fortunately easier and less time-consuming than server installation! It's important to keep an eye on your FTP files, especially if you have a server with an incoming directory where outside users are dropping files off.
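One way to keep an eye on an incoming directory is to list what is in it, oldest first, with sizes. The following Python sketch does that. The name list_incoming is made up for this illustration, and the directory you pass in depends on how your own FTP area is laid out.

```python
import os
import time


def list_incoming(directory, now=None):
    """Return (age_in_days, size_in_bytes, name) for each file, oldest first."""
    now = time.time() if now is None else now
    report = []
    for entry in os.scandir(directory):
        # Look at regular files only, without following symbolic links.
        if entry.is_file(follow_symlinks=False):
            info = entry.stat(follow_symlinks=False)
            age_days = (now - info.st_mtime) / 86400.0
            report.append((age_days, info.st_size, entry.name))
    return sorted(report, reverse=True)
```

The sketch only reports. Deciding what to move, keep, or delete is still up to you.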

Keep the following in mind while maintaining your FTP server:

Maintaining Your Finger Server

Finger servers, due to their simple nature, don't have much to maintain. All you need to do to keep an eye on them is keep track of your dummy users. Keep their info up-to-date, and be sure not to leave old unnecessary ones lying around.
