with a few mail servers) and an enterprise system (such as a corporate network with
many mail servers possibly of different types).
The principle behind MET's operation is to model baseline email flows to and from
particular individual email accounts and sub-populations of email accounts (e.g.,
departments within an enclave or corporate division) and to continuously monitor
ongoing email behavior to determine whether that behavior conforms to the baseline.
The statistics MET gathers to compute its baseline models of behavior include the
groups of accounts that typically exchange emails (e.g., “social cliques” within an
organization), the frequency of messages, and the typical times and days on which
those messages are exchanged. Statistical distributions are computed over periods of
time that serve as training periods for a behavior profile. These models characterize
typical behavior, so that abnormal deviations of interest can be detected, such as an
unusual burst of email activity indicative of the propagation of an email virus within
a population, or violations of email security policies, such as the outbound
transmission of Word document attachments at unusual hours of the day.
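The baseline-and-deviation idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a profile is a normalized 24-bin histogram of the hours at which an account sends mail and that deviation is measured by L1 distance against a fixed threshold. The function names, the choice of distance, and the threshold value are hypothetical and are not taken from MET itself.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(timestamps):
    """Return a normalized 24-bin histogram of the hours at which mail was sent."""
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


def l1_distance(p, q):
    """Sum of absolute bin differences; ranges from 0 (identical) to 2 (disjoint)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_anomalous(baseline, recent, threshold=1.0):
    """Flag a recent profile that is far from the baseline (threshold is an assumption)."""
    return l1_distance(baseline, recent) > threshold


# Training period: mail sent during office hours over five days.
training = [datetime(2024, 1, day, hour) for day in range(1, 6) for hour in range(9, 18)]
# Recent window: a burst of mail at 3 a.m.
burst = [datetime(2024, 1, 8, 3, minute) for minute in range(10)]

baseline = hourly_profile(training)
print(is_anomalous(baseline, hourly_profile(burst)))
```

A real deployment would likely profile more than send hours (recipients, attachment types, message rates) and would tune the threshold on held-out data.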
EMT provides a set of models that an analyst may use to understand and glean important
information about individual emails, user account behaviors, and abnormal attachment
behaviors for a wide range of analysis and detection tasks. The classifier and the
various profile models are trained by an analyst, who uses EMT's easy-to-use GUI to
manage the training and learning processes. EMT also has an “alert” function, which
provides the means of specifying general conditions that are indicative of abnormal
behavior in order to detect events that may require further inspection and analysis,
including potential account misuse, self-propagating viral worms delivered in email
attachments, likely inbound SPAM email, bulk outbound SPAM, and email accounts that
are being used to launch SPAM.
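One way to picture an alert condition is as a named predicate evaluated over an account's summary statistics. The sketch below is only an illustration under that assumption: the `AlertRule` type, the statistic names, and the thresholds are hypothetical and do not describe EMT's actual alert interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertRule:
    """A named condition over per-account statistics (hypothetical structure)."""
    name: str
    condition: Callable[[Dict[str, float]], bool]


def triggered_alerts(stats: Dict[str, float], rules: List[AlertRule]) -> List[str]:
    """Return the names of all rules whose condition holds for the given statistics."""
    return [rule.name for rule in rules if rule.condition(stats)]


# Illustrative rules; the thresholds and office hours are assumptions.
rules = [
    AlertRule("outbound burst", lambda s: s["outbound_per_hour"] > 100),
    AlertRule(
        "off-hours attachment",
        lambda s: s["attachments_sent"] > 0 and not 8 <= s["hour"] < 18,
    ),
]

print(triggered_alerts({"outbound_per_hour": 250, "attachments_sent": 0, "hour": 14}, rules))
```

Keeping each condition as a separate named rule lets an analyst see which condition fired, which matters when an event is queued for further inspection.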
EMT is also capable of identifying similar user accounts to detect groups of SPAM
accounts that may be used by a “SPAMbot”, or to identify the group of initial victims
of a virus in a large enclave of many hundreds or thousands of users. For example, if a
virus victim is discovered, the short-term behavior profile of that victim can be used
to cluster the set of email accounts that are most similar in their short-term
behavior, so that a security analyst can more effectively detect whether other victims
exist and propagate this information via MET to limit the spread and damage caused by
a new viral incident.
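The victim-similarity search can be sketched as a nearest-neighbor ranking over short-term profiles. The example assumes each profile is a small normalized feature vector and uses L1 distance; the account names, feature values, and distance measure are hypothetical and chosen only to make the idea concrete.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two equal-length profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def most_similar_accounts(victim, profiles, k=3):
    """Rank all other accounts by closeness to the victim's short-term profile."""
    target = profiles[victim]
    ranked = sorted(
        (account for account in profiles if account != victim),
        key=lambda account: l1_distance(profiles[account], target),
    )
    return ranked[:k]


# Hypothetical short-term profiles (e.g., fraction of mail sent in three time bands).
profiles = {
    "alice": [0.1, 0.8, 0.1],
    "bob": [0.1, 0.7, 0.2],
    "carol": [0.6, 0.2, 0.2],
    "dave": [0.3, 0.6, 0.1],
}

# Accounts whose recent behavior most resembles the known victim's.
print(most_similar_accounts("alice", profiles, k=2))
```

The accounts returned first are the ones an analyst would examine first as possible additional victims.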
2 EMT Toolkit
MET, and its associated subsystem MEF (the Malicious Email Filter), was initially
conceived and started as a project in the Columbia IDS Lab in 1999. The initial
research focused on the means to statistically model the behavior of email attachments
and to support the coordinated sharing of information among email servers across a
wide area to identify malicious attachments. In order to properly share such
information, each attachment must be uniquely identified, which is accomplished
through the computation of an MD5 hash of the entire attachment. A new generation of
polymorphic viruses can easily thwart this strategy by morphing each instance of the
attachment that is being propagated. Hence, no single hash would exist to identify the
originating virus and each of its variant progeny. (It is possible to identify the
progenitor by analysis of entry points and attachment contents, as described in the
Malicious Email Filter paper [0].)
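The hashing step, and the reason polymorphism defeats it, can be shown with a short sketch using Python's standard `hashlib`. The function name `attachment_id` and the sample byte strings are illustrative assumptions, not MET's code.

```python
import hashlib


def attachment_id(data: bytes) -> str:
    """Identify an attachment by the MD5 digest of its entire contents."""
    return hashlib.md5(data).hexdigest()


# Two copies of the same attachment share one identifier...
original = b"MZ\x90\x00 example payload bytes"
same_copy = bytes(original)

# ...but a variant that differs by a single byte gets an unrelated identifier,
# which is why a morphing attachment cannot be tracked by one hash.
variant = original + b"\x00"

print(attachment_id(original) == attachment_id(same_copy))
print(attachment_id(original) == attachment_id(variant))
```

Because the digest covers the whole attachment, servers that see identical bytes agree on the identifier without exchanging the attachment itself.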