Running a Perfect Internet Site with Linux 14-5141.doc
Notice: This material is excerpted from Running A Perfect Internet Site with Linux, ISBN: 0-7897-0514-1. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.
Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290 or at support@mcp.com.
An important aspect of site maintenance is maintaining your servers. Because these programs perform all of the services your site offers to both your own users and other people on the Internet, you've got to keep them running smoothly.
In this chapter, you learn how to:
It's important to keep your e-mail running smoothly, as it's in many ways the most essential service on your site. If you lose access to e-mail, you lose communication with your users and the rest of the Internet.
You have one server to maintain in this case, Sendmail. It's not the server software itself you're maintaining here; instead, you'll be keeping track of the log files, mail spool space, and other mail-related features.
There are only a few things you need to do to maintain your Sendmail server. The rest is taken care of by the software itself. Things to take care of include the following:
Most mailing list maintenance involves adding and removing users by hand. While the software does handle this aspect itself, there are times you'll need to do the following:
The rest of mailing list maintenance involves wading through e-mail that comes back to the list owner for various reasons. A lot of this mail is:
For the above examples, if you don't need to save the error messages you'll get in your mailbox (if you run the list), just delete them.
Maintaining Your Web Server
Maintaining a Web server is a little more involved than maintaining your e-mail servers, but not by much. Most of the work simply involves keeping things up to date.
If you have Web pages set up for your site in general (e.g., a Web page with a list of frequently asked user tech support questions), and it has any links to outside sources, it's important to check it regularly for accuracy. Things on the Internet change constantly, so you may find that the links you point to go up and down or move to another URL. The sooner you find out, the more likely you are to get a page letting you know where the new resource is.
To check out a link, just click it and see if it takes you where it's supposed to.
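Clicking every link by hand gets tedious once a page has more than a handful of them. As a minimal sketch (not something from the original chapter), the Python below collects the outside links from a page and reports the ones that fail to load. The names LinkCollector, outside_links, link_is_alive, and report_dead_links are made up for this illustration, and a link that loads may still point at the wrong resource, so an occasional manual check is still worthwhile.

```python
from html.parser import HTMLParser
from urllib.error import URLError
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the HREF target of every <A> code in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def outside_links(html_text):
    """Return the absolute http/https links found in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    return [u for u in collector.links
            if u.startswith(("http://", "https://"))]


def link_is_alive(url, timeout=10):
    """Return True if the URL can be fetched without an error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, ValueError, OSError):
        return False


def report_dead_links(html_text, checker=link_is_alive):
    """Return the outside links that the checker could not reach."""
    return [url for url in outside_links(html_text) if not checker(url)]
```

To use it, read one of your pages into a string and pass it to report_dead_links; the checker argument is there so you can swap in a different test without touching the rest.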
Most HTML formatting commands have an opening code and a closing code. The difference between them is that the closing code has a slash (/) right after the opening angle bracket. For example, the code that begins every document should be <HTML>. The entire Web page goes after this, and at the end of the page, the last code should be </HTML>.
As was shown in chapter 7, "Installing your Web Server," this is the most basic Web page you can have:
<HTML>
<Head><Title>My first test page</Title></Head>
<BODY>
This is my test page.
</BODY>
</HTML>
Note that the page begins and ends with an HTML code. The second code you always need to have is the HEAD code, which marks off the header for the page. Inside the header goes the TITLE code, which defines the page's title; the title is displayed at the top of the browser window if you're using Netscape, not within the page itself. Finally, the last necessary code is the BODY code, which marks off the main portion of the page.
You don't have to capitalize the codes. A lot of people do it as a way to make sure they stand out from the rest of the text.
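A forgotten closing code is one of the easier mistakes to make when editing pages by hand. The following is a rough sketch, in Python, of a checker that pairs opening codes with closing codes and reports the ones that don't match. The class and function names are hypothetical, and the NO_CLOSE set (codes commonly written without a closing code) is an assumption you may want to adjust for your own pages.

```python
from html.parser import HTMLParser

# Codes that are commonly written without a closing code (an assumption
# for this sketch; extend the set to suit your own pages).
NO_CLOSE = {"li", "p", "br", "hr", "img"}


class PairChecker(HTMLParser):
    """Track opening codes and note any that are not closed properly."""

    def __init__(self):
        super().__init__()
        self.open_codes = []  # stack of codes still waiting for a close
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in NO_CLOSE:
            self.open_codes.append(tag)

    def handle_endtag(self, tag):
        if tag in NO_CLOSE:
            return
        if tag not in self.open_codes:
            self.problems.append("unexpected </%s>" % tag.upper())
            return
        # Anything opened after the matching code was never closed.
        while self.open_codes:
            top = self.open_codes.pop()
            if top == tag:
                break
            self.problems.append("missing </%s>" % top.upper())


def unbalanced_codes(html_text):
    """Return a list of problem descriptions; empty means all codes pair up."""
    checker = PairChecker()
    checker.feed(html_text)
    checker.close()
    leftovers = ["missing </%s>" % t.upper() for t in checker.open_codes]
    return checker.problems + leftovers
```

Because the parser lowercases code names before comparing them, <Head> and </HEAD> count as a matching pair, which fits the point that capitalization doesn't matter.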
HTML has a number of word-processor-like codes as well. Some of them are:
Code            Meaning
<H1>...</H1>    Main header
<H2>...</H2>    Secondary header. There are 6 headers, each one less prominent than the one before it.
<B>...</B>      Bold
<I>...</I>      Italics
<U>...</U>      Underline
<UL>...</UL>    Bulleted list. Each bulleted item should begin with <LI>. There is no ending </LI>.
This is a very basic guide to HTML code. For a more in-depth guide, check out QUE's Special Edition Using HTML.
You can keep track of accesses to your Web pages by simply looking through your server's log files. However, you can also get a program off the Internet that tracks all sorts of usage statistics, makes graphs, etc. This program is wusage, and you can get it one of the two following ways:
To compile wusage, complete the following steps:
To configure wusage, do the following (the configuration file is huge, so I'll only discuss the changes you need to make here):
Step 1
Edit the sample file wusage.conf.
Step 2
#Type of server log:
#If your server CAN use COMMON format, then DO, in all cases.
The server you installed does work with the COMMON format (COMMON is a format used by the httpd you installed to format log files), so insert the line:
COMMON
Step 3
#Name of your server as it should be presented:
Quest
Change Quest to the name of your server. For example, my server is simply Renaissoft.
Step 4
#File to use as a prefix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY:
#NOT A URL.
/home/www/prefix.html
If you don't want it to store its prefix file in this path, change it to one appropriate for your needs.
Step 5
#File to use as a suffix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY:
#NOT A URL.
/home/www/suffix.html
If you don't want it to store its suffix file in this path, change it to one appropriate for your needs.
Step 6
#Directory where HTML pages generated by usage program should be located:
/home/www/web/usage
If you don't want it to save the pages it creates to display your usage statistics in the path above, change it to suit your own needs.
Step 7
#URL to which locations of HTML pages should be appended for usage reports:
#(the same as the first line, but in web space, not filesystem space)
/usage
The path above is in Web space, that is, relative to your Web server's document directory rather than to the root of the file system. For example, /usage above would be /public_html/usage. This item should point to the same directory as the item from step 6.
Step 8
#Path of httpd log file:
/home/www/ncsa/logs/access_log
If your httpd log file isn't in the location shown above, change this item to point to the location of your file.
Step 9
#Top-level domain only (i.e., org not cshl.org):
org
Change this item to match your top level domain. For example, I'm renaissoft.com, so I would enter com instead of org here.
Step 10
#Directories/items that should never register in the top ten:
#To inhibit everything on a path, use /path*
{ }
If you have pages on your site that you don't want mentioned in the statistics even if they're in the top ten, list their full paths here between the brackets.
Step 11
#items that should never register at *all*, even:
#for the total access count
{ }
If you have pages you don't even want counted when determining the total number of accesses, put their full paths between the brackets.
Step 12
#Sites that should never register in the usage statistics:
{ }
If you don't want to see particular sites listed in your usage statistics (e.g., your own site), enter the domain name between the brackets. I included my own site here as {*.renaissoft.com}, because I don't want any of my page testing to count as real accesses.
Step 13
Save and exit the file.
Step 14
Move wusage into your main Web directory (or really wherever you prefer to have it).
Step 15
Move wusage.conf to your main Web conf directory (or elsewhere if you prefer), which is in the same directory as the cgi-bin one.
To actually use wusage, just use crontab -e to make a cron job to tabulate your site statistics (calling wusage itself) once a week or however often you want. When you refer to the file in the crontab, call it as /fullpath/wusage -c /fullpath/wusage.conf. If you want to go ahead and see how it comes out, enter this same syntax on the command line now.
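If you haven't written many cron jobs, the line you type into crontab -e has five time fields (minute, hour, day of month, month, day of week) followed by the command. As a small sketch, the Python function below assembles such a line for the wusage call described above. The function name is made up for this example, and the default schedule of 3:00 a.m. every Sunday is only an assumption; pick whatever time suits your site.

```python
import shlex


def weekly_wusage_entry(wusage_path, conf_path, minute=0, hour=3, weekday=0):
    """Build a crontab line that runs wusage once a week.

    The five time fields are: minute, hour, day of month, month, and
    day of week (0 is Sunday). The two * fields mean "every".
    """
    # Quote the paths in case they contain spaces or other odd characters.
    command = "%s -c %s" % (shlex.quote(wusage_path), shlex.quote(conf_path))
    return "%d %d * * %d %s" % (minute, hour, weekday, command)
```

Calling weekly_wusage_entry("/fullpath/wusage", "/fullpath/wusage.conf") returns the line 0 3 * * 0 /fullpath/wusage -c /fullpath/wusage.conf, which you can paste into your crontab after replacing /fullpath with the real locations.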
All you have to do now is edit your home page and add a link to the files wusage generates!
If your site gets a lot of Web page hits, occasionally clean out the httpd access log file (not on a fixed schedule, simply whenever it starts to look excessively large to you). To keep the access log down to a reasonable size without interfering with wusage, first take a look at your wusage page, and jot down the dates of the week covered by its last report.
You may want to back up the directory wusage keeps its HTML files in before proceeding, and compress and save a backup of the httpd access log (from the next step) first.
Now, edit the httpd access_log, and find the entries from the last day of the week for which the last report was finished. Look through these entries and find the last entry from that particular date. Then, delete everything from the oldest entry in the file up to and including the entry you just found. Don't touch anything dated after the last week that was already reported. Finally, save and exit the file.
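If the log is too big to edit comfortably by hand, the same trimming can be scripted. The Python below is only a sketch of the procedure just described: it assumes the log is in COMMON format, where each entry carries its date in square brackets as dd/Mon/yyyy, and the function names are hypothetical. Work on a copy of the log, not the live file.

```python
import re

# Matches the date inside the [dd/Mon/yyyy:hh:mm:ss zone] field of a
# COMMON format log entry.
DATE_FIELD = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def entry_date(line):
    """Return the dd/Mon/yyyy date of a log entry, or None if absent."""
    match = DATE_FIELD.search(line)
    return match.group(1) if match else None


def trim_reported_entries(lines, last_reported_day):
    """Drop everything up to and including the last entry dated
    last_reported_day (given as dd/Mon/yyyy); keep all later entries.

    lines is a list of log entries in the order they appear in the file.
    """
    last_index = None
    for index, line in enumerate(lines):
        if entry_date(line) == last_reported_day:
            last_index = index
    if last_index is None:
        # The date was not found, so leave the log untouched.
        return list(lines)
    return list(lines[last_index + 1:])
```

Read the copied log into a list of lines, call trim_reported_entries with the last day of the last reported week, and write the returned lines back out. If the date never appears in the log, nothing is removed, which is the safer failure.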
The folks at Quest Protein Database Center (the programmers of this great utility) sincerely want to know that you're using their software. Feel free to send the author, Thomas Boutell, e-mail at boutell@boutell.com. If you don't mind, include the URL for your usage page. If you'd rather not have it known because, for example, it's for a private site, then just drop him a note saying you use the software.
Maintaining a Gopher server is a fairly straightforward process. After all, it's mainly a simpler version of a Web server.
The main task in maintaining your Gopher server is checking your Gopher menu pointers on occasion to make sure they're accurate.
To make a pointer to a menu item outside your own gopher server, you need to create a file containing the necessary information. I'll walk you through an example that points to something that's actually back at my own site. The entry would be as follows:
Name=Renaissoft's Programs
Type=1
Port=70
Path=/Programs
Host=gopher.renaissoft.com
This breaks down as follows:
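Since checking pointers mostly means confirming that each entry still has sensible values, it can help to read the entries mechanically. As a hedged sketch, the Python below turns the key=value lines of a pointer entry like the one above into a dictionary and lists any of the five fields from the example that are absent. The function names and the REQUIRED tuple are assumptions made for this illustration, not part of any Gopher server's own tools.

```python
# The five fields that appear in the example entry; treating all of
# them as required is an assumption of this sketch.
REQUIRED = ("Name", "Type", "Port", "Path", "Host")


def parse_gopher_link(text):
    """Turn the key=value lines of a pointer entry into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first = only, so values may contain = themselves.
        key, separator, value = line.partition("=")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def missing_fields(fields):
    """Return the names from REQUIRED that the entry does not define."""
    return [name for name in REQUIRED if name not in fields]
```

Once an entry is parsed, the Host and Port values tell you which server to visit when you check that the pointer still leads somewhere.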
Keeping your news server in good working order is mainly taken care of by the news.daily script. The issue you'll want to keep an eye on is your hard drive usage.
To ensure that news doesn't overrun your hard drive, do the following:
Every day INN runs a script called news.daily, which sends you a report on your site. The report covers statistics for your site, including hard drive usage of your news spool, and errors that occurred during the day.
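If you'd like a warning between daily reports, a few lines of Python can check how full the spool's drive is. This is a minimal sketch and not part of INN: the function names are made up, the 90 percent limit is an arbitrary assumption, and you supply the path to your own news spool.

```python
import shutil


def percent_used(path):
    """Return how full the drive holding path is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def spool_warning(path, limit_percent=90.0):
    """Return a warning string if the drive is at or over the limit,
    otherwise None."""
    used = percent_used(path)
    if used >= limit_percent:
        return "WARNING: %s is %.0f%% full" % (path, used)
    return None
```

Run from cron, a script that prints the result of spool_warning for your spool directory gives you mail only when there is something to worry about, since cron mails any output a job produces.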
Once again, FTP server maintenance is fairly simple. Server maintenance is fortunately easier and less time-consuming than server installation! It's important to keep an eye on your FTP files, especially if you have a server with an incoming directory where outside users are dropping files off.
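One quick way to keep an eye on an incoming directory is to list what's in it, largest files first. The Python below is a small sketch along those lines; the function name is hypothetical, and the directory you pass in is whatever your own FTP server uses for incoming files.

```python
import os


def incoming_report(directory):
    """Return (name, size_in_bytes) for each regular file in directory,
    largest first, with ties sorted by name."""
    entries = []
    with os.scandir(directory) as listing:
        for entry in listing:
            # Skip subdirectories and symbolic links; count real files only.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((entry.name, size))
    return sorted(entries, key=lambda item: (-item[1], item[0]))
```

Anything unexpectedly large at the top of the list is a good candidate for a closer look.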
Keep the following in mind while maintaining your FTP server:
Finger servers, due to their simple nature, don't have much to maintain. All you need to do to keep an eye on them is keep track of your dummy users. Keep their info up-to-date, and be sure not to leave old unnecessary ones lying around.