
Recently I started working with sockets. I realized that when reading from a network stream, you cannot know how much data is coming in. So either you know in advance how many bytes will be received, or you know which bytes mark the end of the data.

Since I am currently trying to implement a C# WebSocket server, I need to process HTTP requests. An HTTP request can have arbitrary length, so knowing the number of bytes in advance is out of the question. But an HTTP request always has a certain format. It starts with the request-line, followed by zero or more headers, etc. So with all this information it should be simple, right?

Nope.

One approach I came up with was reading all data until a specific sequence of bytes was recognized. The StreamReader class has the ReadLine method which, I believe, works like this. For HTTP a reasonable delimiter would be the empty line separating the message head from the body.

The obvious problem here is the requirement of a (preferably short) termination sequence, like a line break. Even the HTTP specification suggests that these two adjacent CRLFs are not a good choice, since they could also occur at the beginning of the message. And after all, two CRLFs are not a simple delimiter anyway.

So, expanding the method to arbitrary type-3 (regular) grammars, I concluded the best choice for parsing the data is a finite state machine. I can feed the data to the machine byte after byte, just as I am reading it from the network stream. And as soon as the machine accepts the input I can stop reading data. Also, the FSM could immediately capture the significant tokens.

But is this really the best solution? Reading byte after byte and validating it with a custom parser seems tedious and expensive. And the FSM would be either slow or quite ugly. So...

How do you process data from a network stream when the form is known but not the size?

How can classes like the HttpListener parse the messages and be fast at it too?

Did I miss something here? How would this usually be done?

Lucius
  • Why are you trying to recreate something that already exists? – M.Babcock Sep 02 '13 at 00:14
  • @M.Babcock I guess there are a few reasons to do that: wishing to deal with a simpler subset of a feature, needing deeper access than the library provides, or simply wanting to know how these things work. Anyway, while there are some basic WebSocket server implementations around, the .NET library does **not** include one for Windows 7. – Lucius Sep 02 '13 at 08:24

2 Answers


HttpListener and other such components can parse the messages because the format is deterministic. The request format is well documented. The request header is a series of CRLF-terminated lines, followed by a blank line (two CRLFs in a row).

The message body can be difficult to parse, but it's deterministic in that the header tells you what encoding is used, whether it's compressed, etc. Even multi-part messages are not terribly difficult to parse.

Yes, you do need a state machine to parse HTTP messages. And yes you have to parse it byte-by-byte. It's somewhat involved, but it's very fast. Typically you read a bunch of data from the stream into a buffer and then process that buffer byte-by-byte. You don't read the stream one byte at a time because the overhead will kill performance.
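As a rough illustration of that buffer-then-scan approach, here is a minimal sketch. The names `HttpHeadReader`, `ReadHead`, and the `maxHeadBytes` limit are made up for this example and are not how HttpListener does it. It reads whatever chunk is available, walks the chunk byte by byte with a four-state matcher for CRLF CRLF, and hands back any bytes that arrived after the head.

```csharp
using System;
using System.IO;
using System.Text;

static class HttpHeadReader
{
    // Reads chunks from the stream until the blank line (CRLF CRLF) that ends the
    // message head. Bytes received after the terminator are returned in 'leftover'.
    public static string ReadHead(Stream stream, out byte[] leftover, int maxHeadBytes = 16 * 1024)
    {
        var head = new MemoryStream();
        var buffer = new byte[4096];
        int state = 0; // how many bytes of "\r\n\r\n" have been matched so far

        while (true)
        {
            // Read returns once some data is available, up to buffer.Length bytes.
            // A return value of 0 means the remote side closed the connection.
            int n = stream.Read(buffer, 0, buffer.Length);
            if (n == 0)
                throw new EndOfStreamException("Connection closed before the head was complete.");

            for (int i = 0; i < n; i++)
            {
                byte b = buffer[i];
                if (b == '\r') state = (state == 2) ? 3 : 1;
                else if (b == '\n') state = (state == 1) ? 2 : (state == 3) ? 4 : 0;
                else state = 0;

                if (state == 4)
                {
                    head.Write(buffer, 0, i + 1);
                    leftover = new byte[n - i - 1];
                    Array.Copy(buffer, i + 1, leftover, 0, leftover.Length);
                    return Encoding.ASCII.GetString(head.ToArray());
                }
            }

            // No terminator in this chunk: keep it and read the next one.
            head.Write(buffer, 0, n);
            if (head.Length > maxHeadBytes)
                throw new InvalidDataException("Message head exceeds the allowed size.");
        }
    }
}
```

The system call happens once per chunk rather than once per byte, which is where the speed comes from. A real parser would also record the request-line and header tokens during the same pass instead of only looking for the terminator.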

You should take a look at the HttpListener source code to see how it all works. Go to http://referencesource.microsoft.com/netframework.aspx and download the .NET 4.5 Update 1 source.

Be prepared to spend a lot of time digging through that and through the HTTP spec.

By the way, it's not difficult to create a program that handles a small subset of HTTP requests. But I wonder why you'd want to do that when you can just use HttpListener and have all the details handled for you.

Update

You are talking about two different protocols. HTTP and WebSocket are two entirely different things. As the Wikipedia article says:

> The WebSocket Protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.

With HTTP in its simplest form, the server sends the response and then closes the connection; it's a stream of bytes with a defined end. WebSocket is a message-based protocol; it enables a stream of messages. Those messages have to be delineated in some way; the sender has to tell the receiver where the end of the message is. That can be implicit or explicit. There are several different ways this is done:

  1. The sender includes the length of the message in the first few bytes of the message. For example, the first four bytes are a binary integer that says how many bytes follow in that message. So the receiver reads the first four bytes, converts that to an integer, and then reads that many bytes.
  2. The length of the message is implicit. For example, sender and receiver agree that all messages are 80 bytes long.
  3. The first byte of the message is a message type, and each message type has a defined length. For example, message type 1 is 40 bytes, message type 2 is 27 bytes, etc.
  4. Messages have some terminator. In a line-oriented message system, for example, messages are terminated by CRLF. The sender sends the text and then CRLF. The receiver reads bytes until it receives CRLF.
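
To make option 1 concrete, here is a minimal sketch of length-prefixed framing over any `Stream`. The names `LengthPrefixedFraming`, `ReadFully`, `ReadMessage`, and `WriteMessage`, the big-endian byte order, and the `maxLength` cap are all assumptions chosen for illustration. This is the generic scheme from the list, not the WebSocket frame format.

```csharp
using System;
using System.IO;

static class LengthPrefixedFraming
{
    // Reads exactly 'count' bytes. A loop is needed because a single Read
    // may return fewer bytes than requested.
    static byte[] ReadFully(Stream stream, int count)
    {
        var data = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int n = stream.Read(data, offset, count - offset);
            if (n == 0)
                throw new EndOfStreamException("Connection closed in the middle of a message.");
            offset += n;
        }
        return data;
    }

    // Reads a 4-byte big-endian length, then that many payload bytes.
    public static byte[] ReadMessage(Stream stream, int maxLength = 1 << 20)
    {
        byte[] p = ReadFully(stream, 4);
        int length = (p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
        if (length < 0 || length > maxLength)
            throw new InvalidDataException("Unreasonable message length: " + length);
        return ReadFully(stream, length);
    }

    // Writes the 4-byte big-endian length followed by the payload.
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        int len = payload.Length;
        var prefix = new byte[] { (byte)(len >> 24), (byte)(len >> 16), (byte)(len >> 8), (byte)len };
        stream.Write(prefix, 0, prefix.Length);
        stream.Write(payload, 0, len);
    }
}
```

Capping the length protects the receiver from allocating a huge buffer because of a corrupt or hostile prefix.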

Whatever the case, sender and receiver must agree on how messages are structured. Otherwise the case that you're worried about does crop up: the receiver is left waiting for bytes that will never be received.

To handle possible communication problems, set the ReceiveTimeout property on the socket so that a read fails if it takes too long to receive a complete message: Socket.Receive throws a SocketException, and NetworkStream.Read throws an IOException that wraps it. That way, your program won't be left waiting indefinitely for data that is not forthcoming. But this should only happen in the case of communication problems. Any reasonable message format will include a way to determine the length of a message; either you know how much data is coming, or you know when you've reached the end of a message.
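
A minimal sketch of the timeout idea, assuming a connected `TcpClient`; the helper name `ReadWithTimeout` and the -1 return convention are made up for this example. When reading through a `NetworkStream`, the timeout surfaces as an `IOException` whose inner exception is the `SocketException`, so that is what the sketch catches.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

static class TimedRead
{
    // Returns the number of bytes read, 0 if the peer closed the connection,
    // or -1 if nothing arrived within timeoutMs milliseconds.
    public static int ReadWithTimeout(TcpClient client, byte[] buffer, int timeoutMs)
    {
        client.ReceiveTimeout = timeoutMs; // applies to blocking reads on this connection
        NetworkStream stream = client.GetStream();
        try
        {
            return stream.Read(buffer, 0, buffer.Length);
        }
        catch (IOException ex) when (ex.InnerException is SocketException se
                                     && se.SocketErrorCode == SocketError.TimedOut)
        {
            return -1;
        }
    }
}
```

In most cases the sensible reaction to a timeout is to close the connection rather than retry on the same socket.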

Jim Mischel
  • Thanks for your help! Some follow-up questions: Why is it slow reading a stream byte-by-byte? What's the alternative? How can I read the data into a buffer *before* parsing it? (That's basically my original question.) If I read too much the call will just block and wait. Remember, I don't know how much data is coming. Also, I'd really like to take a look at the source code files, but seem to be lacking the software to read them. While the HttpListener handles HTTP communication, the WebSocket feature is sadly not available on Windows 7. And yes, I just want to handle a small subset of HTTP. – Lucius Sep 02 '13 at 08:20
  • @Lucius: [NetworkStream.Read](http://msdn.microsoft.com/en-us/library/system.net.sockets.networkstream.read.aspx) reads *up to* the number of bytes you ask for. So if you ask for 100,000 bytes and there are only 50,000 bytes available, you will get 50,000 bytes in your buffer. Used correctly, you will not be left waiting for data that will never come. – Jim Mischel Sep 02 '13 at 09:50
  • [This seems only to be true if the connection is *closed*](http://stackoverflow.com/a/6958290/925580). Here I am working with a still open connection, since the same connection is later used for WebSocket communication. From my understanding, if I try reading the 100,000 bytes and there are only 50,000 bytes available, `Read` will block until the data is received *or* the connection is closed. – Lucius Sep 02 '13 at 09:57
  • Thanks for the update! It pretty much sums up what my own thoughts were. (My question was not quite clear, I apologize.) What I was worried about were the performance implications when reading and parsing a NetworkStream one byte at a time. But I found the `BufferedStream` class, which seems to be very helpful here. As it seems I will still have to parse the initial handshake byte-by-byte, but that's not too bad, considering it is only done once per connection. Thank you again! – Lucius Sep 02 '13 at 17:36

If you want to send a message, you can simply prepend its size to it: get the number of bytes in the message and write it as a ulong in front of the message. At the receiver, read as many bytes as a ulong occupies, parse them, then read that many bytes from the stream and close it.

In an HTTP header you can read Content-Length: the length of the request body in octets (8-bit bytes).
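
A minimal sketch of looking up that header, assuming the message head has already been received in full as text; the names `HeaderLookup` and `GetContentLength` are made up for this example. Header names are compared case-insensitively, and a missing or malformed value yields null.

```csharp
using System;
using System.Globalization;

static class HeaderLookup
{
    // Returns the Content-Length value from a complete message head,
    // or null when the header is absent or not a plain decimal number.
    public static long? GetContentLength(string head)
    {
        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 1; i < lines.Length; i++) // line 0 is the request-line
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue;

            string name = lines[i].Substring(0, colon).Trim();
            if (!name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase)) continue;

            string value = lines[i].Substring(colon + 1).Trim();
            long length;
            if (long.TryParse(value, NumberStyles.None, CultureInfo.InvariantCulture, out length))
                return length;
            return null;
        }
        return null;
    }
}
```

This only helps for the body; finding the end of the head itself still requires scanning for the blank line.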

Brett Sanderson
  • Hi! The Content-Length header tells me the size of the body. In order to find out whether such a header even exists, and to get its value, I first have to parse the message head (which has arbitrary size). – Lucius Sep 02 '13 at 08:23