Chapter 24
Implementing Cookies
Maintaining a State
When you create a Web site, you normally expect the user to load HTML documents, view
them, navigate from one page to another, and so on. Occasionally, it is important to
enable the Web page to maintain a state. That is, the page “remembers” certain
actions executed by the user during previous sessions.
A classic example of maintaining a given state is a Shopping Cart application,
as implemented, for example, in Netscape’s On-Line Store. The user travels from one
product review to the other, via simple HTML links. When he or she comes across an
interesting product, clicking a button puts the selected product’s data in a “Shopping
Cart.” The Shopping Cart, which is sometimes displayed visually on the page, is
basically a name for a storage mechanism. Since it is often impractical to store the data for
each user on the server, the data is kept on the client side, and that client-side store
is what the Shopping Cart refers to.
Cookies
Cookies are a general mechanism that server-side applications (such as CGI scripts) and
client-side JavaScript scripts can use to store textual data on the client side for the
purpose of retrieving it later. The term cookies was initially used by Netscape,
the pioneer in this area, and was later adopted by other browsers such as Microsoft
Internet Explorer. The name cookies does not have any significant meaning.
Cookies are tidbits of information, stored in a browser-dependent format on the client
machine. Netscape Navigator, for example, holds all cookies in a regular text file named cookies.txt
(in the directory where Navigator is installed), whereas MSIE 3.0 stores cookies in
multiple files, located in a user-provided directory.
Cookies and HTTP
The connection established between the server and the client uses the HyperText Transfer
Protocol (HTTP). Although this protocol is complicated at the implementation level,
it is fairly easy to understand at the conceptual one. When a user requests a page, an HTTP
request is sent to the server, specifying the user’s exact request along with some
additional attributes. As a user, you do not see the data sent to the server as a
result of your request. Among other elements, an HTTP request includes a header that defines
the most important attributes, such as the URL of the requested page. An HTTP request
also includes all valid cookies (explained later in this chapter).
When the server replies to the client’s request, it returns an HTTP response
which also features a header. This header contains important information about the file
being returned, such as its MIME type (discussed later in the book).
The general syntax of an HTTP header is as follows:
Field-name: Information
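As a rough illustration of this syntax, the following sketch splits a single header line at its first colon. The function name parseHeaderLine and the sample Content-type line are assumptions chosen for the example; they are not part of any browser or server API.

```javascript
// Split one "Field-name: Information" line into its two parts.
// parseHeaderLine is a hypothetical helper written for this example.
function parseHeaderLine(line) {
  var colon = line.indexOf(":");
  if (colon === -1) {
    return null; // not a "Field-name: Information" line
  }
  return {
    name: line.substring(0, colon),
    // drop the white space that follows the colon
    info: line.substring(colon + 1).replace(/^\s+/, "")
  };
}

var header = parseHeaderLine("Content-type: text/html");
// header.name holds "Content-type" and header.info holds "text/html"
```

Only the first colon is treated as the separator, so information that itself contains colons (a date with a time, for instance) stays intact.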
When the server returns an HTTP object to the client, it may also transmit some state
information for the client to store as cookies. Since a cookie is plain text rather than
executable code, the server-side script cannot use it to run anything on the client machine.
In addition to its textual value, a cookie contains several attributes, such as the range
of URLs for which the cookie is valid. Any future HTTP request from the client to one of
the URLs in that range will transmit the cookie’s current value back to the server.
Setting an HTTP Cookie
An HTTP cookie is introduced to the client in an HTTP response, usually by a CGI script,
using the following syntax:
Set-Cookie: name=value; expires=date; path=pathName;
domain=domainName; secure
The attributes are as follows:
name=value
name is the name of the cookie by which you can reference it later. Notice
that the only way to access the cookie is by this name. value is the regular
string to be stored as a cookie. Because the value may not contain semicolons, commas, or
white space, it is recommended that the string be encoded using the “%XX” style
(equivalent to the output of JavaScript’s escape function).
The name=value pair is the only required attribute of the Set-Cookie
field.
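The encoding recommendation can be sketched with the built-in escape and unescape functions. The helper names makeCookiePair and readCookiePair, as well as the sample cookie, are assumptions made for this illustration only.

```javascript
// Build a name=value pair whose value is encoded in the "%XX" style.
// makeCookiePair and readCookiePair are hypothetical helpers.
function makeCookiePair(name, value) {
  return name + "=" + escape(value);
}

// Recover the name and the decoded value from a name=value pair.
function readCookiePair(pair) {
  var eq = pair.indexOf("=");
  return {
    name: pair.substring(0, eq),
    value: unescape(pair.substring(eq + 1))
  };
}

var pair = makeCookiePair("visitor", "John Smith; first visit");
// pair holds "visitor=John%20Smith%3B%20first%20visit"
var decoded = readCookiePair(pair);
// decoded.value holds the original string again
```

Encoding the value keeps a space or a semicolon inside it from being mistaken for the separator between attributes.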
expires=date
expires is an optional attribute which specifies the expiration date of the
cookie. The cookie will no longer be stored or sent after that date. The date string
is formatted as follows:
Wdy, DD-Mon-YYYY HH:MM:SS GMT
You will see later that this date format corresponds to the value returned by the Date
object’s toGMTString() method. If expires is not specified, the cookie will expire when
the user’s session ends.
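A minimal sketch of computing an expiration date a given number of days ahead follows. The function name expiresAfterDays and its optional second parameter are assumptions for this example. Note that the exact separators in the string returned by toGMTString() can vary between browsers (some print spaces where the format above shows hyphens).

```javascript
// Return a GMT date string for a moment "days" days after "from".
// expiresAfterDays is a hypothetical helper; "from" defaults to now.
function expiresAfterDays(days, from) {
  var start = from || new Date();
  var msPerDay = 24 * 60 * 60 * 1000;
  var expiry = new Date(start.getTime() + days * msPerDay);
  return expiry.toGMTString();
}

// An expires attribute for a cookie that should last one week.
var attribute = "expires=" + expiresAfterDays(7);
```

Working in milliseconds through getTime() avoids having to handle month and year boundaries by hand.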
domain=domainName
When searching for valid cookies, Navigator compares the domain attributes of
each cookie to the Internet domain name of the host from which the URL will be retrieved.
If there is a tail match, then the cookie will go through a full path matching. “Tail
matching” means that the domain attribute is matched against the tail of the
fully qualified domain name of the host. A domain attribute of “ac.il”, for example,
would tail match “mis.study.ac.il” as well as “mba.haifa.ac.il”.
The domain attribute makes sure that only hosts within the specified domain
can set a cookie for the domain. Domains must have at least two or three periods, which
prevents a cookie from being set for an entire top-level domain such as “.com” or “.edu”.
A domain that falls under one of the seven common top-level domains requires at least
two periods in its domain name:
“com”, “edu”, “net”, “org”, “gov”, “mil”, and “int”. All other
domains require at least three periods in their domainName.
The default value of domain is the host name of the server which generated the
cookie response.
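The two domain rules, tail matching and the minimum number of periods, can be sketched as follows. The names tailMatches, hasEnoughPeriods, and SPECIAL_DOMAINS are assumptions for this illustration. The code compares plain strings as the text describes and is not meant to reproduce Navigator’s actual implementation.

```javascript
// The seven top-level domains that need only two periods.
var SPECIAL_DOMAINS = ["com", "edu", "net", "org", "gov", "mil", "int"];

// True if the domain attribute matches the tail of the host name.
function tailMatches(host, domain) {
  host = host.toLowerCase();
  domain = domain.toLowerCase();
  if (domain.length > host.length) {
    return false;
  }
  return host.substring(host.length - domain.length) === domain;
}

// True if the domain attribute contains the required number of periods.
function hasEnoughPeriods(domain) {
  var labels = domain.toLowerCase().split(".");
  var periods = labels.length - 1;
  var topLevel = labels[labels.length - 1];
  var required = SPECIAL_DOMAINS.indexOf(topLevel) !== -1 ? 2 : 3;
  return periods >= required;
}

tailMatches("mis.study.ac.il", "ac.il");  // true
tailMatches("www.netscape.com", "ac.il"); // false
hasEnoughPeriods(".netscape.com");        // true: two periods, "com"
hasEnoughPeriods(".com");                 // false: only one period
```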
path=pathName
path specifies a subset of URLs in a domain for which a cookie is valid. After
domain matching, the pathname component of the URL is compared with the path
attribute, and, if the match succeeds, the cookie is considered valid and is sent along with
the URL request. The path “/foo”, for example, would match “/foobar”
and “/foo/bar.html”. The path “/” is the most general one. If
the path is not specified, it is assumed to be the same path as the document described by
the header that contains the cookie.
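Path matching as described here amounts to a prefix comparison, which the following sketch illustrates. The function name pathMatches and the sample paths are assumptions for this example.

```javascript
// True if the URL's pathname begins with the cookie's path attribute.
// pathMatches is a hypothetical helper written for this example.
function pathMatches(urlPath, cookiePath) {
  return urlPath.indexOf(cookiePath) === 0;
}

pathMatches("/foobar", "/foo");       // true
pathMatches("/foo/bar.html", "/foo"); // true
pathMatches("/bar/foo", "/foo");      // false: "/foo" is not at the start
pathMatches("/any/page.html", "/");   // true: "/" matches every path
```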
secure
If a cookie is marked secure, it will only be transmitted across a secured
communication channel between the client and the host. Currently, this means that secure
cookies will only be sent to HTTPS (HTTP over SSL) servers. If secure is not specified,
the cookie may be sent over any channel, including unsecured ones.
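To tie the attributes together, here is a minimal sketch that assembles a complete Set-Cookie field from its parts. The function name buildSetCookie and the layout of its options object are assumptions for this illustration; a real CGI script would normally be written in a server-side language and would print the resulting line as part of its response header.

```javascript
// Assemble a Set-Cookie header field from a name, a value, and
// optional attributes. buildSetCookie is a hypothetical helper.
function buildSetCookie(name, value, options) {
  options = options || {};
  var field = "Set-Cookie: " + name + "=" + escape(value);
  if (options.expires) {
    field += "; expires=" + options.expires.toGMTString(); // a Date object
  }
  if (options.path) {
    field += "; path=" + options.path;
  }
  if (options.domain) {
    field += "; domain=" + options.domain;
  }
  if (options.secure) {
    field += "; secure";
  }
  return field;
}

var simple = buildSetCookie("cart", "item 42");
// simple holds "Set-Cookie: cart=item%2042"
var restricted = buildSetCookie("cart", "item 42", { path: "/", secure: true });
// restricted holds "Set-Cookie: cart=item%2042; path=/; secure"
```

Only name=value is always written. Every other attribute is appended only when the caller supplies it, which mirrors the optional status of those attributes.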