Understanding the HTTP Protocol (Part 2)

by Don Parker [Published on 4 Oct. 2005 / Last Updated on 4 Oct. 2005]

In part one of this article series, we covered the HTTP traffic metrics that come from a web browser client. This second part will cover what the web server itself will send, and expand a little more on HTTP itself.

If you missed the first article in this series, please go read Understanding the HTTP Protocol (Part 1).
If you would like to read the next article in this series please check out Understanding the HTTP Protocol (Part 3).

HTTP the protocol

We saw in part one of this series that there are certain values that a web client will send to a web server. This is done to make sure that both the client and the server speak the same language, as it were. These specific values are sent by the client, and then parsed by the web server once it receives them.

Now seems a good time to point out that most everything on the Internet today communicates in a particular fashion: it observes the client/server model. In the case of HTTP, one example is Internet Explorer as the client and IIS as the server. Another is Mozilla Firefox as the client and Apache as the web server. There are some notable exceptions to this rule though. Can you think of one?

Is all the world a client/server model?

Much to the chagrin of organizations like the RIAA and the MPAA, peer-to-peer (P2P) protocols do not observe the strict client/server model. Peer to peer works much like its name suggests, i.e. through direct connections between peers. Many P2P networks do not rely on centralized servers, but are instead made up of client computers. The closest thing to a server in such a network is a supernode, which is once again, you guessed it, a client. Enough said on P2P though, and now back to the topic at hand! Oh! Don’t be fooled into thinking that Trojans operate like P2P applications just because both commonly use ephemeral ports. Trojans very much operate in a client/server configuration. Before I get sidetracked once again, back to HTTP we go.

The devil is in the details

That brings us back to our packet, seen below. We shall now go through the various metrics, as seen in the ASCII content, and expand upon their meaning. Like the title above this paragraph says, the devil is very much in the details. Please note that I will make my comments directly beneath the packet.

10:14:50.526262 IP (tos 0x10, ttl  55, id 29186, offset 0, flags [none], proto: TCP (6), length: 1470) 72.14.207.99.80 > 192.168.1.100.1722: ., cksum 0xbe07 (correct), 3866955399:3866956829(1430) ack 3141402927 win 6432
0x0000:  4510 05be 7202 0000 3706 32aa 480e cf63  E...r...7.2.H..c
0x0010:  c0a8 0164 0050 06ba e67d 0e87 bb3e 012f  ...d.P...}...>./
0x0020:  5010 1920 be07 0000 4854 5450 2f31 2e31  P.......HTTP/1.1
0x0030:  2032 3030 204f 4b0d 0a43 6163 6865 2d43  .200.OK..Cache-C
0x0040:  6f6e 7472 6f6c 3a20 7072 6976 6174 650d  ontrol:.private.
0x0050:  0a43 6f6e 7465 6e74 2d54 7970 653a 2074  .Content-Type:.t
0x0060:  6578 742f 6874 6d6c 0d0a 5365 7276 6572  ext/html..Server
0x0070:  3a20 4757 532f 322e 310d 0a54 7261 6e73  :.GWS/2.1..Trans
0x0080:  6665 722d 456e 636f 6469 6e67 3a20 6368  fer-Encoding:.ch
0x0090:  756e 6b65 640d 0a44 6174 653a 2053 6174  unked..Date:.Sat
0x00a0:  2c20 3330 204a 756c 2032 3030 3520 3134  ,.30.Jul.2005.14
0x00b0:  3a31 343a 3530 2047 4d54 0d0a 0d0a 6132  :14:50.GMT....a2
0x00c0:  630d 0a3c 6874 6d6c 3e3c 6865 6164 3e3c  c..<html><head><
0x00d0:  6d65 7461 2068 7474 702d 6571 7569 763d  meta.http-equiv=
0x00e0:  2263 6f6e 7465 6e74 2d74 7970 6522 2063  "content-type".c

To begin with, we should know where in the packet the HTTP data actually starts. If you still have the TCP/IP and tcpdump flyer I recommended you download, we can easily find out. The very first byte, 45, tells us this is IPv4 with a header length of 5 32-bit words, or 20 bytes. The byte 06 at offset 0x0009 tells us that the protocol being ferried about is TCP. The TCP header therefore begins at offset 0x0014, and the 5 in the byte 50 at offset 0x0020 is the TCP header length, again 5 words or 20 bytes, so we know that there are no options in the TCP header. With this info in hand we know that the HTTP data starts at offset 0x0028, with the bytes 4854 (the "HT" of "HTTP"), and carries on to the end of the packet itself. This is a quick and easy way to orient yourself to the contents of the packet. With that dealt with, let us start breaking out the server’s response as seen in the packet above.
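The offset arithmetic above can be checked with a few lines of Python. This is only a sketch: the bytes are copied from the first four rows of the hex dump, the function name locate_payload is made up for illustration, and it assumes a raw IPv4 packet with no link-layer header in front, as in the dump.

```python
# First 64 bytes of the packet, copied from the hex dump above.
PACKET_HEX = (
    "4510 05be 7202 0000 3706 32aa 480e cf63 "
    "c0a8 0164 0050 06ba e67d 0e87 bb3e 012f "
    "5010 1920 be07 0000 4854 5450 2f31 2e31 "
    "2032 3030 204f 4b0d 0a43 6163 6865 2d43"
)


def locate_payload(packet: bytes) -> dict:
    """Find where the TCP payload starts in a raw IPv4 packet (hypothetical helper)."""
    # Low nibble of byte 0 is the IP header length in 32-bit words.
    ip_header_len = (packet[0] & 0x0F) * 4
    # Byte 9 of the IP header is the protocol number (6 means TCP).
    protocol = packet[9]
    # High nibble of byte 12 of the TCP header is the TCP header length in words.
    tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4
    return {
        "ip_header_len": ip_header_len,
        "protocol": protocol,
        "tcp_header_len": tcp_header_len,
        "payload_start": ip_header_len + tcp_header_len,
    }


packet = bytes.fromhex(PACKET_HEX)
info = locate_payload(packet)
print(info)
# The payload begins with the status line of the HTTP response.
print(packet[info["payload_start"]:info["payload_start"] + 15])
```

A TCP header length greater than 20 bytes would mean options are present, and the payload would start correspondingly later.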

Time to bust out the info!

HTTP/1.1 200 OK

The line above is the first thing to appear in the ASCII content of the packet: it is the status line of the server’s response. It says that the web server is answering with HTTP protocol version 1.1. It also means that the request succeeded: the document the web client requested has been found, and is included in the response. The numerical value 200 is a status code. More on status codes and their role later.
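As a rough illustration, a status line like this one splits cleanly into its three parts. The helper name parse_status_line and the class labels are illustrative choices, not part of any library; the grouping of status codes by their first digit follows the HTTP specification.

```python
# Status code classes, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line: str) -> tuple:
    """Split a status line into (version, code, reason) (hypothetical helper)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    version = parts[0]
    code = int(parts[1])
    # The reason phrase is optional text for humans; clients act on the code.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


version, code, reason = parse_status_line("HTTP/1.1 200 OK\r\n")
print(version, code, reason, "->", STATUS_CLASSES[code // 100])
```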

Cache-Control: private

This cryptic little field means that the document sent to the web client must not be stored by a shared cache such as a proxy, and is intended only for the user requesting the document. That user’s own browser cache may still keep a copy. There is a whole lot more to caching, and how it works. Interesting reading if one is so inclined.

Content-Type: text/html

Shown above is the server telling the client that the included document is in text/html format. That way the web client will know how to render the information.

Server: GWS/2.1

Identified here is the server software being used by, you guessed it, the web server. In this case it is GWS (Google Web Server) version 2.1, the server software used by Google.

Transfer-Encoding: chunked

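HTTP 1.1 supports chunked transfer encoding. What does it do, you ask? Simply put, chunked transfer encoding modifies the body of a message so that it can be transferred as a series of chunks. Each chunk is preceded by its own size indicator, written in hexadecimal. In the packet above, the a2c right after the headers says that the first chunk is 0xa2c, or 2604, bytes long. A final chunk of size zero marks the end of the body. In contrast, a response that is not chunked will normally contain a “Content-Length” field indicating the amount of data being transferred.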
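To make the chunk format concrete, here is a minimal decoder sketch. The function name decode_chunked is made up for this example, it assumes the whole body is already in memory, and it ignores any trailer headers that may follow the last chunk, so treat it as an illustration rather than a complete implementation.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked HTTP body into the original bytes (hypothetical helper)."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, ended by CRLF.
        line_end = body.index(b"\r\n", pos)
        # Anything after a ';' on the size line is a chunk extension; ignore it.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            # A zero-sized chunk marks the end of the body.
            break
        decoded += body[pos:pos + size]
        # Skip the chunk data and the CRLF that follows it.
        pos += size + 2
    return bytes(decoded)


sample = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
print(decode_chunked(sample))   # b'hello world'
print(int("a2c", 16))           # 2604, the first chunk size in the packet above
```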

Date: Sat, 30 Jul 2005 14:14:50 GMT

We can infer from this line that it is the date and time, in GMT, as seen on the server when it generated the response. That one was fairly simple to guess at, and be correct. If only they were all that easy!
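Pulling the header fields covered so far together, the sketch below splits the response head from the packet into name/value pairs and turns the Date value into a real timestamp. The parse_headers helper is hypothetical and deliberately naive (it ignores repeated and folded headers); parsedate_to_datetime comes from Python's standard email.utils module.

```python
from email.utils import parsedate_to_datetime

# The response head exactly as it appears in the packet's ASCII content.
RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Cache-Control: private\r\n"
    "Content-Type: text/html\r\n"
    "Server: GWS/2.1\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Date: Sat, 30 Jul 2005 14:14:50 GMT\r\n"
    "\r\n"
)


def parse_headers(head: str) -> dict:
    """Map lower-cased header names to values (hypothetical, simplified helper)."""
    headers = {}
    # Skip the status line, then read until the blank line that ends the head.
    for line in head.split("\r\n")[1:]:
        if not line:
            break
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_headers(RESPONSE_HEAD)
for name, value in headers.items():
    print(f"{name}: {value}")

# Convert the Date header into a timezone-aware datetime object.
server_time = parsedate_to_datetime(headers["date"])
print(server_time.isoformat())
```

Lower-casing the names is why a lookup works whether the server writes Content-Type or Content-type.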

<html><head><meta http-equiv="content-type"

For those of you who may be unfamiliar with HTML code, and what it looks like, be aware that the line above is indeed HTML code. It is the start of the document carried in the body of the response. A pretty good giveaway that you are seeing HTML is the use of angle brackets, < and >, around each tag. It is via this HTML code that your web browser, e.g. Internet Explorer, knows how to properly render the page to you. The HTML determines the color of the page and how all the information itself is formatted, be it in paragraph form, bullets, or a table.

Should you wish to check out all the HTML contained in the web page you are presently reading (this document!), please click on “View” in your web browser, then click on “Source”. Doing so will display the web page in its source code format. Pretty neat eh! If you want to have more fun, simply paste the web page’s source code into, say, Notepad, make some modifications, save the file, and then open that saved copy in your browser. You will see that you have now modified the contents of your local copy of that page. Once again that would elicit a “pretty neat” comment from me.

On that note we will break this article at this point. In the last part of this series I will cover more of the minutiae of HTTP, and ways of playing with, or modifying, HTTP requests. Till then keep a close eye on your packets!
