© 1997 The McGraw-Hill Companies, Inc. All rights reserved.
Any use of this Beta Book is subject to the rules stated in the Terms of Use.

Chapter 11 Proxy Servers

Application gateways, or proxy servers, define a whole different concept in terms of firewalls. To balance out some of the weaknesses of packet-filtering routers, you can use certain software applications in your firewall to forward and filter connections for services such as telnet and FTP. Such an application is referred to as a proxy service, and the host running the proxy service is often called an application gateway.

Many IS&T (information systems and technology) professionals consider application gateways to be a true firewall because the other types lack user authentication. Accessibility is much more restricted than with packet-filtering and circuit-level gateways because it requires a gateway program for every application such as telnet, FTP, and so on.

As a matter of fact, many companies use only a proxy service as their firewall, while others rely on packet filtering alone. Depending on your environment, the size of your company, and the level of protection you want to achieve, one or the other may be all you need. However, as a rule of thumb, you should always consider implementing a proxy service combined with your packet-filtering routers (firewalls), so that you can achieve a more robust level of defense and more flexible access control. You will also find that many firewall products bring you the best of both worlds, combining filtering and proxying features in a single package.

The combination of application gateways and packet-filtering routers to increase the level of security and flexibility of your firewall is therefore the ideal solution for addressing Internet security. These are often called hybrid gateways. They are somewhat common, as they provide internal hosts unobstructed access to untrusted networks while enforcing strong security on connections coming from outside the protected network.

Consider figure 11.1 as an example of a site that uses a packet-filtering router and blocks all incoming telnet and FTP connections. The router allows telnet and FTP packets to go only to the telnet/FTP application gateway. A user connecting to a site system would have to connect first to the application gateway, and then to the destination host, as follows:

A user telnets to the application gateway and enters the name of an internal host
The gateway checks the user’s source IP address and accepts or rejects it according to any access criteria in place
The user might need to be authenticated
The proxy service creates a telnet connection between the gateway and the internal server
The proxy service passes bytes between the two connections
The application gateway logs the connection
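To make these steps a little more concrete, here is a minimal Python sketch of the source check, the byte relay, and the logging. It is only an illustration: the allowed network, the function names, and the use of local socket pairs in place of real telnet connections are all my assumptions, not part of any particular gateway product.

```python
import ipaddress
import socket

# Hypothetical access policy: only this source network may use the gateway.
ALLOWED_SOURCES = [ipaddress.ip_network("192.168.2.0/24")]


def source_allowed(source_ip, allowed=ALLOWED_SOURCES):
    """Accept or reject the caller according to its source IP address."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)


def relay_once(src, dst, bufsize=4096):
    """Copy one chunk of bytes from src to dst and return the byte count."""
    data = src.recv(bufsize)
    if data:
        dst.sendall(data)
    return len(data)


def demo():
    # Two local socket pairs stand in for the user<->gateway and
    # gateway<->internal-server connections.
    user, gw_outside = socket.socketpair()
    gw_inside, server = socket.socketpair()
    log = []
    try:
        if not source_allowed("192.168.2.26"):
            return None, log
        user.sendall(b"login: guest\r\n")
        moved = relay_once(gw_outside, gw_inside)  # pass bytes between the two
        log.append(("192.168.2.26", moved))        # log the connection
        return server.recv(4096), log
    finally:
        for s in (user, gw_outside, gw_inside, server):
            s.close()
```

A real gateway would also authenticate the user and would relay in both directions until either side closes the connection.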

Figure 11.2 shows the details of the virtual connection taking place in figure 11.1 and emphasizes the many benefits of using proxy services. Let’s stop for a moment and try to identify some of these benefits:

Proxy services allow through only those services for which there is a proxy. If an application gateway contains proxies for FTP and telnet, only FTP and telnet are allowed into the protected subnet. All other services are completely blocked. This degree of security is important. A proxy makes sure that only trusted services are allowed through the firewall and prevents untrusted services from being implemented on the firewall without your knowledge.

Let’s take a look at some advantages and disadvantages of application gateways.

There are several advantages to using application gateways over the default mode of permitting application traffic directly to internal hosts. Here are the five main ones:

Hiding information. The names of internal systems (through DNS) are hidden to outside systems. Only the application gateway host name needs to be known to outside systems.
Robust authentication and logging. The traffic can be pre-authenticated before it reaches internal hosts. It can also be logged more efficiently than if logged with standard host logging.
Cost-effectiveness. Authentication/logging software and hardware are located at the application gateway only.
Simpler filtering rules. The rules at the packet-filtering router are simpler than they would be if the router had to filter application traffic and direct it to several specific systems. With application gateways, the router needs only to allow application traffic destined for the application gateway and block the rest.
E-mail. It can centralize e-mail collection and distribution to internal hosts and users. All internal users would have e-mail addresses of the form user@mailbag, where mailbag is the name of the e-mail gateway. The gateway would receive mail from outside users and then forward it to internal systems.

However, nothing is perfect! Application gateways have disadvantages too. With client-server protocols such as telnet, connecting takes two steps, whether inbound or outbound. Some gateways even require client modification. That is not necessarily the case with a telnet application gateway, but it still requires a change in user behavior: the user has to connect to the firewall as opposed to connecting directly to the host. Of course, you could modify a telnet client to make the firewall transparent by allowing a user to specify the destination system (as opposed to the firewall) in the telnet command. The firewall would still serve as the route to the destination system, intercepting the connection and running authentication procedures such as querying for a one-time password.

You can also use application gateways for FTP, e-mail, X Window, and other services.

Note:

Some FTP application gateways have the capability to block put and get commands to specific hosts. They can filter the FTP protocol and block all put commands to the anonymous FTP server. This guarantees that nothing can be uploaded to the server.

So, what are proxies after all? Simply put, proxies are gateway applications used to route Internet and Web access from within a firewall.

If you have used TIA (The Internet Adapter) or TERM, you probably are familiar with the concept of redirecting a connection. Using these programs, you can redirect a port. Proxy servers work in a similar way, by opening a socket on the server and allowing the connection to pass through.

A proxy is a special HTTP server that typically is run on a firewall. A proxy basically does the following:

Receives a request from a client inside the firewall
Sends this request to the remote server outside of the firewall
Reads the response
Sends it back to the client

Usually, the same proxy is used by all of the clients in a subnet. This enables the proxy to efficiently cache documents that are requested by several clients. Figure 11.3 demonstrates these basic functions.
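The basic cycle, together with the shared cache, can be sketched in a few lines of Python. This is only a toy model: the class name and the fetch_upstream callable (which stands in for the real network request to the remote server) are hypothetical, and no expiry or freshness checking is attempted.

```python
class CachingProxy:
    """Forward requests to the remote server and keep a copy of each reply."""

    def __init__(self, fetch_upstream):
        # fetch_upstream is a hypothetical callable: URL in, response bytes out.
        self.fetch_upstream = fetch_upstream
        self.cache = {}

    def handle(self, url):
        """Serve one client request, from the cache when possible."""
        if url in self.cache:
            return self.cache[url]        # another client already asked for it
        body = self.fetch_upstream(url)   # send the request out, read the reply
        self.cache[url] = body
        return body                       # send it back to the client
```

Because every client in the subnet goes through the same CachingProxy object, a document requested by several clients is fetched from the remote server only once.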

The fact that a proxy service is not transparent to the user means that either the user or the client will have to be proxified. Either the user is instructed on how to manage the client in order to access certain services (telnet, FTP), or the client, such as Web clients, should be made proxy-aware.

The caching of documents makes proxies attractive even to those who are not behind a firewall. Setting up a proxy server is not difficult. Today, most Web client programs already have proxy support built in. It is very simple to configure an entire workgroup to use a caching proxy server, which helps to cut down on network traffic costs because many of the documents are retrieved from a local cache after the initial request has been made.

Proxying is a mechanism that makes a firewall safely permeable for users in an organization without creating a potential security hole through which hackers can get into the organization’s protected network.

This application-level proxying is easily supported with minor modifications for the Web client. Most standard out-of-the-box Web clients can be configured to be a proxy client without any need for compilations or special versions. In a way, you should begin to see proxying as a standard method for getting through firewalls, rather than having clients getting customized to support a special firewall method. This is especially important for your Web clients because the source code will probably not be available for modification.

As an example of this procedure, check the Anonymizer site at http://www.anonymizer.com. All connections passing through the Anonymizer are proxified. The outgoing connection is completely redirected and has its address changed, except that here it is done to protect the identity of the client rather than for access control (another benefit of using proxies!). Clients without DNS (Domain Name Service) can still use the Web because the only thing they need is the proxy’s IP address.

Tip:

You can build a proxy-type firewall by using the TIS toolkit if you have experience with UNIX and programming. It contains proxies for telnet, FTP, Gopher, rlogin, and a few other programs. As an alternative, you can use Purveyor 1.1 (http://www.process.com), which offers all of that without the need for UNIX and programming knowledge. Best of all, you won’t need an expensive UNIX box, because it runs on Windows NT and Windows 95.

Organizations using private network address spaces can still use your Web site as long as the proxy is visible to both the private internal net and the Internet, most likely using two separate network interfaces.

Proxying permits high-level logging of client transactions, which includes the client IP address, date and time, URL, byte count, and success code. Another characteristic of proxying is its capability to filter client transactions at the application-protocol level. It can control access to services for individual methods, server and domain, and so on.
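As a rough illustration of both ideas, the following Python sketch filters a transaction by method and domain and then formats a log line with the fields just listed. The policy values, the function names, and the log layout are assumptions made for the example; real proxy servers each have their own configuration syntax and log format.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit

# Hypothetical policy: permitted methods and blocked domains.
ALLOWED_METHODS = {"GET", "HEAD"}
BLOCKED_DOMAINS = {"blocked.example"}


def is_permitted(method, url):
    """Filter a client transaction at the application-protocol level."""
    host = urlsplit(url).hostname or ""
    blocked = any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)
    return method in ALLOWED_METHODS and not blocked


def log_line(client_ip, when, url, byte_count, status):
    """Format one transaction: client IP, date and time, URL, bytes, code."""
    return f"{client_ip} {when.isoformat()} {url} {byte_count} {status}"
```

For example, log_line("192.168.2.26", datetime(1997, 3, 1, 12, 0, tzinfo=timezone.utc), "http://www.example.org/", 2048, 200) produces a single line that records who fetched what, when, how much data was returned, and with which result.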

As for caching, it is more effective on the proxy server than on each client. This helps to save disk space because only a single copy is cached. It also enables more efficient caching of documents. The cache can use predictive algorithms such as look-ahead more effectively because it serves many more clients and so has a much larger sample on which to base its statistics.

Have you ever thought about browsing a Web site when the server is down? It is possible, if you are caching. As long as you connect to the cache server, you can still browse the site even if the server is down.

Usually, Web clients’ developers have no reason to use firewall versions of their code. But in the case of the application-level proxy, the developers might have an incentive: caching! I believe developers should always use their own products, but they usually don’t with firewall solutions such as SOCKS. Moreover, you will see that a proxy is simpler to configure than SOCKS, and it works across all platforms, not only UNIX.

Technically speaking, as shown in figure 11.4, when a client requests a normal HTTP document, the HTTP server gets only the path and keyword portion of the requested URL. It knows its hostname and that its protocol specifier is http:.

When a proxy server receives a request from a client, HTTP is always used for transactions with the proxy server, even when accessing a resource served by a remote server using another protocol such as Gopher or FTP.

A proxy server always has the information necessary to make an actual request to remote hosts specified in the request URL. Instead of specifying only the pathname and possibly search keywords to the proxy server, as figure 11.5 shows, the full URL is specified.
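The difference between the two kinds of request is easy to see in the request line itself. The short Python sketch below builds both forms; the function name and the example URLs are mine, and HTTP/1.0 is assumed as the protocol version.

```python
from urllib.parse import urlsplit


def request_line(url, via_proxy):
    """Build the first line of a GET request for the given URL."""
    if via_proxy:
        # A proxy receives the full URL, so it knows the protocol and host.
        return f"GET {url} HTTP/1.0"
    # An ordinary server receives only the path and keyword portion.
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path} HTTP/1.0"
```

Note that the proxy form works for non-HTTP resources too: a request for ftp://ftp.example.org/pub/README still travels to the proxy as an HTTP GET carrying the full ftp: URL.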

This way, a proxy server behaves like a client to retrieve a document, calling the same protocol module of libwww that the client would call to perform the retrieval. However, it must then create an HTTP reply containing the requested document to send to the client. A Gopher or FTP directory listing is returned to the client as an HTML document.

Caution:

Netscape does not use libwww so if you are using Netscape, you would not be calling a protocol module of libwww from the client.

Therefore, by nature a proxy server has a hybrid function: it must act as both client and server. It is a server when accepting HTTP requests from the clients connecting to it, and a client (to the remote server) when actually retrieving the documents for its own clients.

Note:

In order for you to have a complete proxy server, it must speak all of the Web protocols, especially HTTP, FTP, Gopher, WAIS, and NNTP.

One of the HTTP server programs, CERN’s httpd, has a unique architecture: it is built on top of the WWW Common Library. As a result, unlike other HTTP servers, the CERN httpd speaks all of the Web protocols just as Web clients do. It has been able to run as a protocol gateway since version 2.00, but that was not enough for it to act as a full proxy. With version 2.15, it began to accept full URLs, enabling a proxy to understand which protocol to use when interacting with the target host.

Another important feature with a proxy involving FTP is that if you want to deny incoming connections above port 1023, you can do so by using passive mode (PASV), which is supported.

Caution:

Not all FTP servers support PASV, causing a fallback to normal (PORT) mode. It will fail if incoming connections are refused, but this is what would happen in any case, even if a separate FTP tool were used.

However, before considering caching, you should be aware of at least a couple of problems that can occur and need to be resolved:

Can you keep a document in the cache and still be sure that it is up-to-date?
Can you decide which documents are worth caching, and for how long?

The caching mechanism is disk-based and persistent. It survives restarts of the proxy process as well as restarts of the server machine itself. When the caching proxy server and a Web client are on the same machine, new possibilities are available. You can configure a proxy to use a local cache, making it possible to give demos without an Internet connection.

A great feature of the HTTP protocol is that it contains a HEAD method for retrieving document header information without having to retrieve the document itself. This is useful for telling whether the document has been modified since your last access. But if the document has changed, you have to make a second connection to the remote server to issue the actual GET request and retrieve the document. For that reason, HTTP has been extended with an If-Modified-Since request header, which makes a conditional GET request possible.

If the document has not been modified since the date and time specified, the server returns the special result code 304 (Not Modified) instead of the document. If the document has been modified, the reply is the same as for a normal GET request.
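The server-side decision behind a conditional GET can be sketched as follows. This is a simplified model rather than the code of any real server: the function name is hypothetical, and both timestamps are assumed to be HTTP date strings in the same format.

```python
from email.utils import parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status_code, send_body) for a possibly conditional GET.

    last_modified is the document's modification time on the server;
    if_modified_since is the value of the request header, if present.
    """
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if parsedate_to_datetime(last_modified) <= since:
            return 304, False   # Not Modified: the cached copy is still good
    return 200, True            # behave like a normal GET


doc_time = "Sat, 01 Mar 1997 12:00:00 GMT"
fresh = conditional_get(doc_time, "Sun, 02 Mar 1997 08:30:00 GMT")
stale = conditional_get(doc_time, "Fri, 28 Feb 1997 08:30:00 GMT")
```

Here fresh is (304, False), so the proxy keeps serving its cached copy, while stale is (200, True) and the full document travels over the same connection.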

Tip:

All major HTTP servers already support the conditional GET header.

Just for your information, there is a request header, Pragma: no-cache, which is typically sent by a client’s reload operation. It gives users the opportunity to refresh the cache with no visible modifications in the user interface. The proxy server forwards the no-cache pragma, thus ensuring that if another proxy is also used, the cache on that server is ignored as well.

In summary, seen from the internal network, a proxy server tends to allow much more outbound access than inbound. Generally, it will not allow Archie connections or direct mailing to the internal network unless you configure it to do so.

Also, depending on which proxy server you are using, you should anticipate problems with FTP when doing a GET or an ls because FTP will open a socket on the client and send the information through it. Some proxy servers will not allow it, so if you will be using FTP, make sure the proxy server supports it.

Note:

With Purveyor, a client who does not implement Domain Name Services (DNS) will still be able to access your Web site through Purveyor’s proxy server. The proxy IP address is the only information required.

As the applications for proxies rise, there are many features that are still in their early stages, but the basic features are already there! You should plan on having a proxy server on your firewall. Although caching is a wide and complicated area, it is also one of the parts of the proxy server that needs to be improved.

Tip:

You can provide Internet access for companies using one or more private network address spaces, such as a class A IP address 10.*.*.* by installing a proxy server that is visible to the Internet and to the private network.

I believe the HTTP protocol will be further enhanced as Internet growth continues to explode. In the near future you should see multipart requests and responses becoming a standard, enabling both caching and mirroring software to refresh large amounts of files in a single connection. They are already much needed by Web clients to retrieve all of the inlined images with one connection.

Moreover, proxy architecture needs to be standardized. Proxy servers should have a port number assigned by the Internet Assigned Numbers Authority (IANA). On the client side, there is a need for a fallback mechanism so that a client can connect to a second or third proxy server if the primary proxy fails (much as clients do with DNS servers). But these are just items on a wish list that would certainly improve netsurfing but are not yet available.
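To show what such a client-side fallback might look like, here is a small Python sketch. It is purely hypothetical, since no such standard mechanism exists: the function name and the try_connect callable, which is expected to raise OSError when a proxy cannot be reached, are my own inventions.

```python
def first_working_proxy(proxies, try_connect):
    """Try each proxy in order and return (proxy, connection) for the first
    one that answers. try_connect is a hypothetical callable that raises
    OSError when the proxy is unreachable."""
    for proxy in proxies:
        try:
            return proxy, try_connect(proxy)
        except OSError:
            continue  # this proxy is down, fall back to the next one
    raise OSError("no proxy server could be reached")
```

The client would simply be configured with an ordered list of proxies instead of a single address.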

Tip:

If you need to request parameter assignments (protocols, ports, and so on) from IANA, send your request by e-mail to iana@isi.edu. For SNMP network management private enterprise number assignments, send e-mail to iana-mib@isi.edu.

Taking into consideration the fast growth of the Web (by the time I finish this chapter, the Web will have surpassed FTP and Gopher altogether!), I believe proxy caching has real (and needed) potential. Bits and bytes will need to be returned from a nearby cache rather than from a faraway server in a geographically distant place.

SOCKS

SOCKS is a package that enables hosts behind the firewall to gain full access to the Internet. It redirects requests aimed at Internet sites to a server, which in turn authorizes the connections and transfers data back and forth.

Tip:

If you need more information about SOCKS, you can find it at http://www.socks.nec.com. To join the SOCKS mailing list, send mail to majordomo@syl.dl.nec.com with subscribe SOCKS your@e-mail.address in the body of the mail.

SOCKS was designed to allow hosts behind a firewall to gain full access to the Internet without requiring direct IP reachability. The application client establishes communication with the application server through SOCKS. Usually the application client makes a request to SOCKS, which typically includes the address of the application server, the type of connection, and the user’s identity.

After SOCKS receives the request, it sets up a proper communication channel to the application server. A proxy circuit is then established and SOCKS, representing the application client, relays the application data between the application client and the application server.
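For the curious, this is roughly what such a request looks like on the wire in SOCKS version 4, as I understand the protocol: a version byte, a command byte, the destination port, the destination IP address, and a null-terminated user ID. The Python sketch below builds the request and checks the reply; the function names and the sample values are only illustrative.

```python
import socket
import struct

SOCKS4_VERSION = 4
CMD_CONNECT = 1        # type of connection: an outbound TCP connect
REPLY_GRANTED = 90     # result code for "request granted"


def socks4_connect_request(dest_ip, dest_port, user_id):
    """Pack the server address, connection type, and user identity."""
    return (struct.pack(">BBH", SOCKS4_VERSION, CMD_CONNECT, dest_port)
            + socket.inet_aton(dest_ip)
            + user_id.encode("ascii") + b"\x00")


def request_granted(reply):
    """Check the 8-byte reply: the second byte carries the result code."""
    return len(reply) >= 2 and reply[1] == REPLY_GRANTED
```

Once the request is granted, the client simply uses the same TCP connection as if it were connected to the application server directly, and SOCKS relays the data.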

It is SOCKS that performs several functions such as authentication, message security-level negotiation, authorizations, and so on while a proxy circuit is being set up.

SOCKS performs four basic operations (the fourth being a feature of SOCKS V5):

Connection request
Proxy circuit setup
Application data relay
Authentication (V5)

Figure 11.6 shows a control flow model of SOCKS.

Authentication methods are decided by SOCKS based on the security policy clauses that it defines. If none of the methods declared by the client meets the security requirement, SOCKS drops the communication.
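In SOCKS V5 this negotiation is a short exchange: the client lists the method codes it supports, and the server answers with the one it picked, or with 0xFF if none is acceptable. The following Python sketch models the server’s side of that choice. The method codes follow my reading of the SOCKS V5 specification; the function name and the idea of passing the policy as an ordered list are assumptions of the example.

```python
NO_AUTH = 0x00
GSSAPI = 0x01
USERNAME_PASSWORD = 0x02
NO_ACCEPTABLE_METHODS = 0xFF


def select_method(greeting, policy):
    """Pick an authentication method for a client greeting.

    greeting is the client's message: version, method count, method codes.
    policy lists the methods the site accepts, most preferred first.
    """
    version, nmethods = greeting[0], greeting[1]
    if version != 5:
        raise ValueError("not a SOCKS V5 greeting")
    offered = greeting[2:2 + nmethods]
    for method in policy:
        if method in offered:
            return bytes([5, method])
    # Nothing the client declared meets the policy: refuse.
    return bytes([5, NO_ACCEPTABLE_METHODS])
```

After sending the refusal, the server would drop the connection, just as described above.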

As depicted in figure 11.7, after the authentication method is decided upon, the client and SOCKS begin the authentication process using the chosen method. In this case, SOCKS functions as a firewall.

Through an authentication procedure called GSS-API (Generic Security Service Application Program Interface), clients negotiate with SOCKS about the security of messages. Integrity and privacy are the options that can be applied to the rest of the messages, including the proxy requests coming from the application client as well as SOCKS’s replies to those requests and the application data.

As for UDP-based applications, SOCKS V5 adds another connection request: the UDP association. It provides a virtual proxy circuit for carrying UDP-based application data seamlessly. However, be careful here! The proxy circuits for TCP-based and UDP-based applications are not the same. They differ mainly in two ways:

A UDP proxy circuit is a pair of addresses, one for each communication end-point, which are needed for sending and receiving datagrams.
Application data is encapsulated by UDP proxy headers that include, along with other information, the destination address of a given datagram.
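The encapsulation can be illustrated with a few lines of Python. The header layout used here (two reserved bytes, a fragment number, an address type, the destination address, and the destination port) is my reading of the SOCKS V5 specification; the function names are hypothetical, and only IPv4 destinations are handled.

```python
import socket
import struct

ATYP_IPV4 = 0x01


def wrap_udp(dest_ip, dest_port, payload):
    """Prefix application data with a UDP proxy header."""
    header = (struct.pack(">HBB", 0, 0, ATYP_IPV4)   # reserved, fragment, type
              + socket.inet_aton(dest_ip)
              + struct.pack(">H", dest_port))
    return header + payload


def unwrap_udp(datagram):
    """Split a relayed datagram into (dest_ip, dest_port, payload)."""
    _reserved, _frag, atyp = struct.unpack(">HBB", datagram[:4])
    if atyp != ATYP_IPV4:
        raise ValueError("only IPv4 addresses are handled in this sketch")
    dest_ip = socket.inet_ntoa(datagram[4:8])
    (dest_port,) = struct.unpack(">H", datagram[8:10])
    return dest_ip, dest_port, datagram[10:]
```

Because every datagram carries its own destination address, a single UDP association can serve traffic to many different hosts.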

You can use SOCKS in different network environments. Figure 11.8 shows an example of one of the most popular setups.

A single SOCKS can be utilized as a firewall. SOCKS V5 supports authenticated traversal of multiple firewalls, extending it to build a virtual private network as shown in the figure.

The great advantage of the authentication scheme integrated into SOCKS is that centralized network access makes it much easier to enforce security policy and control network access. Watch out, however: these access points can unfortunately become the bottleneck of internetworking. You must try to balance that out with the hierarchical distribution of SOCKS servers, shadow SOCKS (multiple parallel SOCKS servers), and other mechanisms, while keeping your security policy consistent. Also beware of potential security holes and attacks among multiple SOCKS servers, which affect how acceptable SOCKS is as a secure mechanism for insecure networks.

The integration of SOCKS and the Web has substantially broadened security on the Web. Whereas secure Web-related technologies such as S-HTTP (Secure HyperText Transfer Protocol) and SSL (Secure Sockets Layer) provide message and server authentication, SOCKS can be successfully integrated to provide user authentication and authorization. Furthermore, the security technologies employed on the Web can also be integrated into SOCKS to enhance the security of proxy connections.

Tcpd, the TCP Wrapper

You should be aware that TCP Wrapper is not really a firewall utility, but it provides many of the same effects. By using TCP Wrapper, you can control who has access to your machine and which services they can reach. It also keeps logs of the connections and does basic forgery detection.

TCP Wrapper was written by Wietse Venema of Eindhoven University of Technology in The Netherlands. Its key component is tcpd, a simple wrapper that in action envelops every network daemon run by inetd. The tcpd wrapper is a simple, great tool for writing rules that accept or deny connections. It also enables you to finger a host that attempts to illegally request an rlogin, for example.

You can use tcpd as an auditing tool. It has the capability to log attempted network connections to the wrapped services, which can greatly improve security. Although it has great features, to use it you have to be connected to the Internet, which requires an IP address.

Tip:

If you want to take a look at the source code for TCP Wrapper, you can download it from ftp://ftp.win.tue.nl/pub/security.

Another feature of TCP Wrapper is its support library, libwrap.a. It can be used by many other programs to provide the same wrapper-like defenses for other services.

Keep in mind, though, that TCP Wrapper controls only the machine it is installed on, which makes it a poor choice for network-wide use. Firewalls are much broader and can therefore protect every machine of every architecture.

The major drawback of TCP Wrapper, however, is that it does not work on Apple Macintoshes or Microsoft Windows machines. It’s basically a UNIX security tool.

Setting Up and Configuring the Proxy Server

In order to set up my proxy server I need additional software. For this situation, I need SOCKS.

Note:

You can download SOCKS from

ftp://sunsite.unc.edu/pub/Linux/system/Network/misc/socks-linux-src.tgz

If you care to, you can also download a configuration example, found in the same directory, called socks-config.

Before I start configuring SOCKS, I should be aware that it needs two separate configuration files: one to specify the allowed access and the other to route requests to the appropriate proxy server. I have to make sure the access file is loaded on the server and that the routing file is loaded on every UNIX computer.

I will be using SOCKS version 4.2 beta, but as discussed earlier in this chapter, version 5 is already available. If you’re also using version 4.2 beta, the access file is called sockd.conf. Simply put, it should contain two lines: a permit line and a deny line. For each line I will have three entries:

The identifier (permit/deny). It will be either permit or deny, but I must have both a "permit" and a "deny" line.
The IP address. It holds a four-byte address in typical dotted-decimal IP notation.
The address modifier. It is also a typical four-byte IP address number, acting like a netmask, such as 255.255.255.255.

For example, the line will look like this:

permit 192.168.2.26 255.255.255.255

My goal is to permit every address I want and then deny everything else. Another issue I have to decide is about power users or special ones. I could probably allow some users to access certain services, as well as deny certain users from accessing some of the services that I have allowed in my internal network.
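The way such an access file is evaluated can be modeled in a few lines of Python. This is my own simplified model, not the SOCKS source code: I am assuming that the modifier works like an ordinary netmask (only the bits set in it are compared), that the first matching line wins, and that anything unmatched is denied. The second sample line is likewise just an illustration of a catch-all deny.

```python
import ipaddress

SAMPLE_ACCESS_FILE = """\
permit 192.168.2.26 255.255.255.255
deny 0.0.0.0 0.0.0.0
"""


def parse_rules(text):
    """Turn 'identifier address modifier' lines into (action, addr, mask)."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        action, addr, mask = fields[:3]
        rules.append((action,
                      int(ipaddress.IPv4Address(addr)),
                      int(ipaddress.IPv4Address(mask))))
    return rules


def decide(rules, source_ip):
    """Return the action of the first rule matching the source address."""
    src = int(ipaddress.IPv4Address(source_ip))
    for action, addr, mask in rules:
        if (src & mask) == (addr & mask):
            return action
    return "deny"
```

With these two lines, 192.168.2.26 is permitted and every other address falls through to the deny line, which is exactly the "permit what I want, then deny everything else" pattern.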

However, this is done by using ident, a feature that, if turned on, has the server connect to the ident daemon of the remote host and find out the remote login name of the owner of the client socket. Unfortunately the Trumpet Winsock I am using does not support it, nor do some other systems. If your system supports ident, it is a good feature to use, but keep in mind that it is not trustworthy: you should use it for informational purposes only, as it does not add any security to your system.

One thing I need to watch out for, and I am sure you will have to as well, is not to confuse the name of the routing file in SOCKS, socks.conf, with the name of the access file, sockd.conf. They are so similar that I find it easy to confuse the two. However, their functions are very different.

The routing file is there to tell SOCKS clients when to use SOCKS and when not to. SOCKS is not needed whenever an address has a direct connection to another (through Ethernet, for example), and the loopback address is handled automatically. Each entry in the routing file therefore begins with one of three keywords:

deny, which tells SOCKS to reject a request.
direct, which tells SOCKS which addresses should not go through it (addresses that can be reached without SOCKS).
sockd, which tells the computer which host has the SOCKS server daemon on it (the syntax is sockd @=<serverlist> <IP address> <modifier>). The @= entry enables me to enter a list of proxy servers’ IP addresses.
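A client’s routing decision can be modeled in the same spirit as the access check. The Python sketch below is only an approximation under several assumptions of mine: the sample entries, a comma-separated server list after @=, netmask-style modifiers, and first-match-wins ordering are not taken from the SOCKS documentation.

```python
import ipaddress

SAMPLE_ROUTING_FILE = """\
direct 192.168.2.0 255.255.255.0
sockd @=192.168.2.1 0.0.0.0 0.0.0.0
"""


def _ip(text):
    return int(ipaddress.IPv4Address(text))


def parse_routes(text):
    """Parse entries into (action, servers, address, modifier) tuples."""
    routes = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        action, rest = tokens[0], tokens[1:]
        servers = []
        if action == "sockd" and rest and rest[0].startswith("@="):
            servers = rest[0][2:].split(",")   # assumed list separator
            rest = rest[1:]
        routes.append((action, servers, _ip(rest[0]), _ip(rest[1])))
    return routes


def route_for(routes, dest_ip):
    """Return (action, servers) for the first entry matching dest_ip."""
    dest = _ip(dest_ip)
    for action, servers, addr, mask in routes:
        if (dest & mask) == (addr & mask):
            return action, servers
    return "deny", []
```

In this sample, hosts on the local 192.168.2.0 network are reached directly, and everything else is handed to the SOCKS server at 192.168.2.1.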

Now, for my applications to work with the proxy server, they need to be "sockified." I need one telnet program for direct communication and another for communication through the proxy server. The instructions for sockifying a program are included with SOCKS. Because the programs will be sockified, I will need to change the names of the originals. For example, finger will become finger.orig, ftp will become ftp.orig, and so on. The include/socks.h file holds all of this information.

A nice feature of using Netscape Navigator is that it handles routing and sockifying itself. However, there is another product I plan to use, Purveyor Web Server 1.2, which also works as a proxy for FTP, Gopher, and HTTP.

But one of the reasons I will be using Trumpet Winsock (for Microsoft Windows) is that it comes with built-in proxy server capabilities. I just need to enter the IP address of the server and addresses of all the computers I can reach directly in the setup menu. Trumpet Winsock will then handle all of the outgoing packets.

At this point, I should be done. However, I know I’ll have a problem (and you will too!). The version of SOCKS I am using does not work with UDP, only with TCP. Programs such as Archie use UDP, which means that because SOCKS is my proxy server, it will not be able to work with Archie. Tom Fitzgerald (fitz@wang.com) designed a package called UDPrelay to be used with UDP, but it’s not compatible with Linux yet.


Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.
