A Passive External Web Surveillance Technique
for Private Networks
Constantine Daicos and Scott Knight
Royal Military College of Canada,
PO Box 17000, Station Forces Kingston, Ontario Canada K7K 7B4
cdaicos@gmail.com, knight-s@rmc.ca
Abstract. The variety and richness of what users browse on the Inter-
net has made the communications of web-browsing hosts an attractive
target for surveillance. We show that passive external surveillance of web-
browsing hosts in private networks is possible despite the anonymizing
effects of NATs and HTTP proxies at the gateway. These devices ef-
fectively anonymize the origin of communication streams, and remove
many identifying features, making it difficult to group web traffic into
mutually disjoint same-host single user sets called sessions. Sessions of-
fer a complete picture of each user's web browsing experience. Without
them, passive external surveillance is of little use. This paper offers a con-
tent analysis technique called Link Chaining that aids the sessionization
process by recovering large pieces of sessions called session fragments.
The technique is based on the knowledge that the majority of down-
loaded web resources are clicked-to from other web pages. By following
hyperlinks in the bodies of HTTP messages in passively collected trace
data, web traffic can be coalesced into session fragments and used
by human analysts to isolate individual users' sessions. The technique
gives the human analyst a significant advantage over manual methods.
The implementation presented here has been tested on accumulated local
data and demonstrates the feasibility of the scheme.
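
The link-following idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `LinkExtractor` and `chain_links`, the `(url, html_body)` input format, and the rule that a link belongs to the first fragment that advertised it are all assumptions made for illustration. Each request is attached to the fragment whose earlier page contained a hyperlink to it; requests that no earlier page linked to start a new fragment.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute hyperlink targets from one HTML body (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page that contained them.
                    self.links.add(urljoin(self.base_url, value))


def chain_links(transactions):
    """Group (url, html_body) pairs, given in capture order, into session fragments."""
    fragments = []   # each fragment is a list of URLs in the order they were seen
    link_owner = {}  # absolute link URL -> index of the fragment that advertised it
    for url, body in transactions:
        idx = link_owner.get(url)
        if idx is None:
            # No earlier page linked here, so this request starts a new fragment.
            idx = len(fragments)
            fragments.append([])
        fragments[idx].append(url)
        parser = LinkExtractor(url)
        parser.feed(body)
        for link in parser.links:
            # Assumption: the first fragment to advertise a link keeps it.
            link_owner.setdefault(link, idx)
    return fragments


trace = [
    ("http://a.example/", '<a href="/news">News</a>'),
    ("http://b.example/", '<a href="http://b.example/shop">Shop</a>'),
    ("http://a.example/news", '<a href="story.html">Story</a>'),
    ("http://b.example/shop", ""),
    ("http://a.example/story.html", ""),
]
fragments = chain_links(trace)
```

In this toy trace the interleaved requests separate into two fragments, one per site. Real traffic would need more care, for example when two users' pages link to the same URL.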
1 Introduction
Given a raw trace of web traffic collected from the outside of a private network,
an adversary performing surveillance can be expected to take three steps:
1. Reconstruct TCP/IP connections from raw packets
2. Organize the connections into user sessions
(mutually disjoint same-host sets)
3. Browse the web content of each session to gather intelligence
Without the effects of gateway devices, the second step is trivial. The ad-
versary logging packets from the outside can group them by the original host's
IP address and produce user sessions. With network address translation (NAT)
and HTTP proxies, however, the original IP address and other identifying infor-
mation are absent, making it very difficult to group traffic into user sessions.
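
A small Python sketch shows why the second step is trivial without a gateway and why it breaks behind one. The record layout (`src_ip`, `url`), the function name `group_by_source_ip`, and the addresses are illustrative assumptions, not part of the original work.

```python
from collections import defaultdict


def group_by_source_ip(connections):
    """Group reconstructed connections into candidate sessions keyed by source IP."""
    sessions = defaultdict(list)
    for conn in connections:
        sessions[conn["src_ip"]].append(conn["url"])
    return dict(sessions)


# Without a gateway, each host's own address is visible to the observer.
direct = [
    {"src_ip": "10.0.0.5", "url": "http://a.example/"},
    {"src_ip": "10.0.0.7", "url": "http://b.example/"},
    {"src_ip": "10.0.0.5", "url": "http://a.example/news"},
]

# Behind a NAT or proxy, every connection appears to come from the gateway
# (hypothetical address), so the grouping key no longer separates hosts.
GATEWAY_IP = "203.0.113.1"
natted = [dict(conn, src_ip=GATEWAY_IP) for conn in direct]

direct_sessions = group_by_source_ip(direct)
natted_sessions = group_by_source_ip(natted)
```

Grouping the direct trace yields one session per host, while the translated trace collapses into a single mixed set under the gateway address.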