two separate TCP connections are established: one between the client and HTTP
proxy, and one between the proxy and server. The use of this intermediary means
that, unlike NAT, the TCP/IP packet headers contain no identifying features to
differentiate streams emanating from different hosts. This renders attacks like
Bellovin's IPid technique useless.
Original host information can still be found, however, in the HTTP headers
of outgoing requests. Plainly configured HTTP proxies pass these headers to the
web server unchanged. If browsers in a network are not all configured identically,
these headers can be used [5] to resolve at least some of the HTTP traffic to
same-host sets. Of course, this assumes that the headers are present, and have
not been scrubbed by an anonymizing proxy.
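As a minimal sketch of this header-keying idea (not the method of [5] itself), requests seen behind a plain proxy could be bucketed by a fingerprint built from browser-supplied headers. The choice of headers in FINGERPRINT_HEADERS, the function names, and the input shape (pairs of connection id and header dictionary) are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical selection of headers; which headers actually differ
# between hosts on a given network is an assumption.
FINGERPRINT_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def fingerprint(headers):
    """Return a hashable tuple of the selected header values ('' if absent)."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    return tuple(lowered.get(name, "") for name in FINGERPRINT_HEADERS)


def group_by_fingerprint(requests):
    """Map each fingerprint to the connection ids whose requests carried it.

    `requests` is an iterable of (conn_id, headers_dict) pairs.
    """
    groups = defaultdict(list)
    for conn_id, headers in requests:
        groups[fingerprint(headers)].append(conn_id)
    return dict(groups)
```

Two hosts running identically configured browsers fall into the same bucket, so a grouping like this can resolve only part of the traffic.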
1.4 Anonymizing HTTP Proxy
The HTTP Anonymizing Proxy performs the same functions as a plain proxy,
but scrubs all non-essential headers from outgoing requests. Without any headers
to uniquely identify distinct hosts, keying on HTTP headers is not at all effective.
The Link Chaining Attack can be an effective technique under the conditions
of an anonymizing web proxy because it operates on the HTTP message
body. Although HTTP headers can be changed by intermediate devices, the
web content itself cannot be changed in any meaningful way without affecting
the browsing experience. The Link Chaining Attack takes advantage of this by
reconstructing individual web pages from the traffic stream and following the
links they contain forward in time to chain TCP connections into user session
fragments.
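The page-reconstruction step depends on recovering the links a response body offers. The following is a hedged sketch of that step using only the Python standard library; the class and function names are hypothetical, and treating both href and src attributes as links is an assumption rather than something the text specifies.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from href/src attributes of an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the page URL and drop
                # the fragment, which is never sent in an HTTP request.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def extract_links(base_url, html):
    """Return the set of absolute URLs referenced by `html`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

Normalizing to absolute, fragment-free URLs lets the extracted links be compared directly with the request URLs observed on other connections.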
1.5 Research Goals
The aim of this work is to develop a technique that aids the analyst's manual
sessionization by grouping TCP connections into fragments that are as large and
accurate as possible. The technique follows the hyperlinks in HTTP messages
to identify the TCP connections that belong together. The theory is described
in section 2 and the experiment is outlined in section 3. Before presenting the
results in section 5 we propose some metrics to evaluate the quality of fragments
isolated by our technique. In the analysis of section 6 we validate the work by
establishing a lower bound on the effective analyst speedup.
2 Theory
The Link Chaining technique coalesces independent TCP connections into same-
host groups by following hyperlinks in web pages. By matching the URLs contained
in the body of an HTTP response of one connection to the URLs in the
HTTP requests of all other connections, and judiciously removing impossible or
improbable links, it is possible to assemble fragments of user sessions.
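The matching step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the Connection record, the chain function, and the max_gap threshold are hypothetical, and the only pruning applied is temporal (a link is impossible if the request precedes the response that offered it, and treated as improbable if it follows too long afterwards).

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    conn_id: int
    requests: list = field(default_factory=list)  # (timestamp, requested URL)
    offered: list = field(default_factory=list)   # (timestamp, URL found in a response body)


def chain(connections, max_gap=300.0):
    """Group connection ids into fragments by following offered links.

    max_gap (seconds) is an assumed cut-off for improbable links.
    """
    parent = {c.conn_id: c.conn_id for c in connections}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Index every offered URL by the connections whose responses contained it.
    offers = {}
    for c in connections:
        for t_off, url in c.offered:
            offers.setdefault(url, []).append((t_off, c.conn_id))

    for c in connections:
        for t_req, url in c.requests:
            for t_off, src in offers.get(url, []):
                if src == c.conn_id:
                    continue
                gap = t_req - t_off
                # Keep only links followed forward in time, within max_gap.
                if 0 <= gap <= max_gap:
                    parent[find(c.conn_id)] = find(src)

    groups = {}
    for c in connections:
        groups.setdefault(find(c.conn_id), []).append(c.conn_id)
    return sorted(sorted(g) for g in groups.values())
```

A time window alone will wrongly merge hosts that follow the same popular link at about the same time, so a practical filter would need further criteria for discarding improbable links.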
The TCP connection is the basic building block in this process. Figure 1
depicts the HTTP requests and responses of two independent TCP connections.