2. Questions regarding both Clients and Servers (TCP/SOCK_STREAM)

2.1 How can I tell when a socket is closed on the other end?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):

AFAIK:

If the peer calls

If the peer reboots, or sets

I should also point out that when

If the peer remains unreachable, we should get some other error.

I don't think that

So yes, you must expect

As an example, suppose you are receiving a file down a TCP link; you might handle the return from
2.2 What's with the second parameter in bind()?

The man page shows it as "
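
In practice the second parameter is usually supplied by filling in a protocol-specific address structure and casting its address to `struct sockaddr *`. A minimal IPv4 sketch, assuming an `AF_INET` socket; the helper name `bind_any` is hypothetical:

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind `sock` to `port` on all local IPv4 addresses.
 * Returns the result of bind(): 0 on success, -1 on error. */
int bind_any(int sock, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);    /* port goes in network byte order */

    /* bind() is declared with the generic struct sockaddr *, so the
     * IPv4-specific structure is cast when it is passed in. */
    return bind(sock, (struct sockaddr *)&addr, sizeof addr);
}
```

Passing port 0 asks the system to choose a free port.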
2.3 How do I get the port number for a given service?

Use the
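
A minimal sketch of such a lookup, assuming the standard `getservbyname()` call from `<netdb.h>`. The helper name `service_port` is hypothetical, and the result depends on what the local services database contains.

```c
#include <stddef.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>

/* Hypothetical helper: look up a service by name and protocol
 * (for example "smtp" and "tcp"). Returns the port in host byte
 * order, or -1 if the service is not in the services database. */
int service_port(const char *name, const char *proto)
{
    struct servent *sp = getservbyname(name, proto);

    if (sp == NULL)
        return -1;

    /* s_port is stored in network byte order. */
    return ntohs((unsigned short)sp->s_port);
}
```

When filling in a `struct sockaddr_in`, the `s_port` value can be copied directly since it is already in network byte order.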
2.4 If bind() fails, what should I do with the socket descriptor?

If you are exiting, I have been assured by Andrew that all unixes will close open file descriptors on exit. If you are not exiting though, you can just close it with a regular

2.5 How do I properly close a socket?

This question is usually asked by people who try

2.6 When should I use shutdown()?

From Michael Hunter (mphunter@qnx.com):
2.7 Please explain the TIME_WAIT state.

Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the end that closes first goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me, was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.

Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the closing sequence in the following usenet posting:

Assume that a connection is in ESTABLISHED state, and the client is about to do an orderly release. The client's sequence no. is Sc, and the server's is Ss. The pipe is empty in both directions.
Note: the +1 on the sequence numbers is because the FIN counts as one byte of data. (The above diagram is equivalent to fig. 13 from RFC 793).

Now consider what happens if the last of those packets is dropped in the network. The client has done with the connection; it has no more data or control info to send, and never will have. But the server does not know whether the client received all the data correctly; that's what the last ACK segment is for. Now the server may or may not care whether the client got the data, but that is not an issue for TCP; TCP is a reliable protocol, and must distinguish between an orderly connection close where all data is transferred, and a connection abort where data may or may not have been lost.

So, if that last packet is dropped, the server will retransmit it (it is, after all, an unacknowledged segment) and will expect to see a suitable ACK segment in reply. If the client went straight to CLOSED, the only possible response to that retransmit would be a RST, which would indicate to the server that data had been lost, when in fact it had not been.

(Bear in mind that the server's FIN segment may, additionally, contain data.)

DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.

More commentary from Vic:

The second issue was addressed by Richard Stevens (rstevens@noao.edu, author of "Unix Network Programming", see 1.5 Where can I get source code for the book [book title]?). I have put together quotes from some of his postings and email which explain this. I have brought together paragraphs from different postings, and have made as few changes as possible.
From Richard Stevens (rstevens@noao.edu):

If the duration of the TIME_WAIT state were just to handle TCP's full-duplex close, then the time would be much smaller, and it would be some function of the current RTO (retransmission timeout), not the MSL (the packet lifetime).

A couple of points about the TIME_WAIT state.
A wandering duplicate is a packet that appeared to be lost and was retransmitted. But it wasn't really lost ... some router had problems, held on to the packet for a while (order of seconds, could be a minute if the TTL is large enough) and then re-injected the packet back into the network. But by the time it reappears, the application that sent it originally has already retransmitted the data contained in that packet.

Because of these potential problems with TIME_WAIT assassinations, one should not avoid the TIME_WAIT state by setting the

I have a long discussion of just this topic in my just-released "TCP/IP Illustrated, Volume 3". The TIME_WAIT state is, indeed, one of the most misunderstood features of TCP.

I'm currently rewriting "Unix Network Programming" (see 1.5 Where can I get source code for the book [book title]?) and will include lots more on this topic, as it is often confusing and misunderstood.

An additional note from Andrew:

Closing a socket: if

2.8 Why does it take so long to detect that the peer died?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):

Because by default, no packets are sent on the TCP connection unless there is data to send or acknowledge.

So, if you are simply waiting for data from the peer, there is no way to tell if the peer has silently gone away, or just isn't ready to send any more data yet. This can be a problem (especially if the peer is a PC, and the user just hits the Big Switch...).

One solution is to use the

RFC1122 specifies that this timeout (if it exists) must be configurable. On the majority of Unix variants, this configuration may only be done globally, affecting all TCP connections which have keepalive enabled. The method of changing the value, moreover, is often difficult and/or poorly documented, and in any case is different for just about every version in existence.

If you must change the value, look for something resembling

If you're sending to the peer, though, you have some better guarantees; since sending data implies receiving ACKs from the peer, then you will know after the retransmit timeout whether the peer is still alive. But the retransmit timeout is designed to allow for various contingencies, with the intention that TCP connections are not dropped simply as a result of minor network upsets. So you should still expect a delay of several minutes before getting notification of the failure.

The approach taken by most application protocols currently in use on the Internet (e.g. FTP, SMTP etc.) is to implement read timeouts on the server end; the server simply gives up on the client if no requests are received in a given time period (often of the order of 15 minutes). Protocols where the connection is maintained even if idle for long periods have two choices:
2.9 What are the pros/cons of select(), non-blocking I/O and SIGIO?

Using non-blocking I/O means that you have to poll sockets to see if there is data to be read from them. Polling should usually be avoided since it uses more CPU time than other techniques.

Using

Using

2.10 Why do I get EPROTO from read()?

From Steve Rago (sar@plc.com):
And an additional note from Andrew (andrew@erlenstar.demon.co.uk):

Not quite to do with

On some other implementations, accept seemed to be capable of blocking if this occurred. This is important, since if

2.11 How can I force a socket to send the data in its buffer?

From Richard Stevens (rstevens@noao.edu):

You can't force it. Period. TCP makes up its own mind as to when it can send data. Now, normally when you call

(Snipped suggestion from Andrew Gierth to use

Setting this only disables one of the many tests, the Nagle algorithm. But if the original poster's problem is this, then setting this socket option will help.

A quick glance at tcp_output() shows around 11 tests TCP has to make as to whether to send a segment or not.

Now from Dr. Charles E. Campbell Jr. (cec@gryphon.gsfc.nasa.gov):

As you've surmised, I've never had any problem with disabling Nagle's algorithm. It's basically a buffering method; there's a fixed overhead for all packets, no matter how small. Hence, Nagle's algorithm collects small packets together (no more than .2sec delay) and thereby reduces the amount of overhead bytes being transferred. This approach works well for rcp, for example: the .2 second delay isn't humanly noticeable, and multiple users have their small packets more efficiently transferred. Helps in university settings where most folks using the network are using standard tools such as rcp and ftp, and programs such as telnet may use it, too.

However, Nagle's algorithm is pure havoc for real-time control and not much better for keystroke interactive applications (control-C, anyone?). It has seemed to me that the types of new programs using sockets that people write usually do have problems with small packet delays. One way to bypass Nagle's algorithm selectively is to use "out-of-band" messaging, but that is limited in its content and has other effects (such as a loss of sequentiality) (by the way, out-of-band is often used for that ctrl-C, too).

More from Vic:

So to sum it all up, if you are having trouble and need to flush the socket, setting the

I asked Andrew something to the effect of "What promises does TCP make about when it will get around to writing data to the network?" I thought his reply should be put under this question:

Not many promises, but some.

I'll try and quote chapter and verse on this:

References:
RFC 1122, "Requirements for Internet Hosts" (also STD 3)
The first of the interesting cases is "window closed" (i.e. there is no buffer space at the receiver; this can delay data indefinitely, but only if the receiving process is not actually reading the data that is available).

Vic asks: OK, it makes sense that if the client isn't reading, the data isn't going to make it across the connection. I take it this causes the sender to block after the receive queue is filled?

The sender blocks when the socket send buffer is full, so buffers will be full at both ends.

While the window is closed, the sending TCP sends window probe packets. This ensures that when the window finally does open again, the sending TCP detects the fact. [RFC1122, ss 4.2.2.17]

The second interesting case is "Nagle algorithm" (small segments, e.g. keystrokes, are delayed to form larger segments if ACKs are expected from the peer; this is what is disabled with

Vic asks: Does this mean that my tcpclient sample should set TCP_NODELAY to ensure that the end-of-line code is indeed put out onto the network when sent?

No. tcpclient.c is doing the right thing as it stands; trying to write as much data as possible in as few calls to

The Nagle algorithm only has an effect when a second

Since this delay has negative consequences for certain applications, generally those where a stream of small requests is being sent without response, e.g. mouse movements, the standards specify that an option must exist to disable it. [RFC1122, ss 4.2.3.4]

Additional note: RFC1122 also says:
So programs should avoid calls to

The other possible sources of delay in the TCP are not really controllable by the program, but they can only delay the data temporarily.

Vic asks: By temporarily, you mean that the data will go as soon as it can, and I won't get stuck in a position where one side is waiting on a response, and the other side hasn't received the request? (Or at least I won't get stuck forever)

You can only deadlock if you somehow manage to fill up all the buffers in both directions... not easy.

If it is possible to do this (can't think of a good example though), the solution is to use nonblocking mode, especially for writes. Then you can buffer excess data in the program as necessary.

2.12 Where can I get a library for programming sockets?

There is the Simple Sockets Library by Charles E. Campbell, Jr. PhD. and Terry McRoberts. The file is called ssl.tar.gz, and you can download it from this faq's home page. For C++ there is the Socket++ library which is on ftp://ftp.virginia.edu/pub/socket++-1.10.tar.gz. There is also C++ Wrappers, but I can't find this package anywhere. The file is called C++_wrappers.tar.gz. I have asked the people where it used to be stored where I can find it now, but I never heard back. From http://www.cs.wustl.edu/~schmidt you should be able to find the ACE toolkit. PING Software Group has some libraries that include a sockets interface among other things. You can find them at http://love.geology.yale.edu/~markl/ping.

I don't have any experience with any of these libraries, so I can't recommend one over the other.

2.13 How come select says there is data, but read returns zero?

The data that causes select to return is the EOF because the other side has closed the connection. This causes read to return zero. For more information see 2.1 How can I tell when a socket is closed on the other end?
2.14 What's the difference between select() and poll()?

From Richard Stevens (rstevens@noao.edu):

The basic difference is that

With

2.15 How do I send [this] over a socket?

Anything other than single bytes of data will probably get mangled unless you take care. For integer values you can use

2.16 How do I use TCP_NODELAY?

First off, be sure you really want to use it in the first place. It will disable the Nagle algorithm (see 2.11 How can I force a socket to send the data in its buffer?), which will cause network traffic to increase, with smaller than needed packets wasting bandwidth. Also, from what I have been able to tell, the speed increase is very small, so you should probably do it without

Here is a code example, with a warning about using it from Andrew Gierth:
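
As an illustration only, and not necessarily the listing referred to above, here is a minimal sketch of setting the option with `setsockopt()`. It assumes a TCP socket; the helper name `set_nodelay` is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket
 * when `on` is non-zero, or re-enable it when `on` is zero.
 * Returns 0 on success, -1 on error (errno set by setsockopt()). */
int set_nodelay(int sock, int on)
{
    int flag = on ? 1 : 0;

    /* TCP_NODELAY is a TCP-level option, hence IPPROTO_TCP. */
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);
}
```

Given the cautions in the text, it is worth measuring whether the option helps before leaving it enabled.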
2.17 What exactly does the Nagle algorithm do?

It groups together as much data as it can between ACK's from the other end of the connection. I found this really confusing until Andrew Gierth (andrew@erlenstar.demon.co.uk) drew the following diagram, and explained:

This diagram is not intended to be complete, just to illustrate the point better...

Case 1: client writes 1 byte per

Total segments: 5. (If

Case 2: client writes all data with one

Total segments: 3.

Time for response = RTT (therefore minimum possible).

Hope this makes things a bit clearer...

Note that in case 2, you don't want the implementation to gratuitously delay sending the data, since that would add straight onto the response time.

2.18 What is the difference between read() and recv()?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):
It is unlikely that send()/recv() would be dropped; perhaps someone with a copy of the POSIX drafts for socket calls can check...

Portability note: non-unix systems may not allow

2.19 I see that send()/write() can generate SIGPIPE. Is there any advantage to handling the signal, rather than just ignoring it and checking for the EPIPE error? Are there any useful parameters passed to the signal catching function?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):

In general, the only parameter passed to a signal handler is the signal number that caused it to be invoked. Some systems have optional additional parameters, but they are no use to you in this case.

My advice is to just ignore

There is one situation where you should not ignore

2.20 After the chroot(), calls to socket() are failing. Why?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):

On systems where sockets are implemented on top of Streams (e.g. all SysV-based systems, presumably including Solaris), the

Your system documentation may or may not specify exactly which device nodes are required; I can't help you there (sorry). (Editor's note: Adrian Hall (adrian@waltham.harvard.net) suggested checking the man page for ftpd, which should list the files you need to copy and devices you need to create in the chroot'd environment.)

A less-obvious issue with

2.21 Why do I keep getting EINTR from the socket calls?

This isn't really so much an error as an exit condition. It means that the call was interrupted by a signal. Any call that might block should be wrapped in a loop that checks for

2.22 When will my application receive SIGPIPE?

From Richard Stevens (rstevens@noao.edu):

Very simple: with TCP you get

Basically an RST is TCP's response to some packet that it doesn't expect
and has no other way of dealing with. A common case is when the peer closes the connection (sending you a FIN) but you ignore it because you're writing and not reading. (You should be using

2.23 What are socket exceptions? What is out-of-band data?

Unlike exceptions in C++, socket exceptions do not indicate that an error has occurred. Socket exceptions usually refer to the notification that out-of-band data has arrived. Out-of-band data (called "urgent data" in TCP) looks to the application like a separate stream of data from the main data stream. This can be useful for separating two different kinds of data. Note that just because it is called "urgent data" does not mean that it will be delivered any faster, or with higher priority than data in the in-band data stream. Also beware that unlike the main data stream, the out-of-band data may be lost if your application can't keep up with it.

2.24 How can I find the full hostname (FQDN) of the system I'm running on?

From Richard Stevens (rstevens@noao.edu):

Some systems set the hostname to the FQDN and others set it to just the unqualified host name. I know the current BIND FAQ recommends the FQDN, but most Solaris systems, for example, tend to use only the unqualified host name.

Regardless, the way around this is to first get the host's name (perhaps an FQDN, perhaps unqualified). Most systems support the Posix way to do this using

2.25 How would I put my socket in non-blocking mode?

From Andrew Gierth (andrew@erlenstar.demon.co.uk):

Technically, fcntl(soc, F_SETFL, O_NONBLOCK) is incorrect since it clobbers all other file flags. Generally one gets away with it since the other flags (O_APPEND for example) don't really apply much to sockets. In a similarly rough vein, you would use fcntl(soc, F_SETFL, 0) to go back to blocking mode.

To do it right, use F_GETFL to get the current flags, set or clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

And yes, the flag can be changed either way at will.