the words: free, money, big, lose weight, etc. Users can empirically choose some
keywords and give them a higher weight when counting their frequency. To avoid
counting common words, the user can supply a stopword list so that those words are
deleted first, which is a common approach in NLP. We are still doing further
experiments and analysis on this content-based part.
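The weighted keyword counting described above can be sketched as follows. This is a minimal illustration, not EMT's actual implementation: the stopword list, the keyword weights, and the function names are all hypothetical, and two-word keywords such as "lose weight" are matched as adjacent word pairs after stopword removal.

```python
import re
from collections import Counter

# Hypothetical values chosen only for illustration; a real deployment
# would let the user supply both the stopword list and the weights.
STOPWORDS = {"the", "a", "an", "to", "and", "of", "in", "is", "you", "for"}
KEYWORD_WEIGHTS = {"free": 3.0, "money": 2.5, "big": 1.5, "lose weight": 3.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def weighted_keyword_score(text, weights=KEYWORD_WEIGHTS, stopwords=STOPWORDS):
    """Sum weight * frequency over the user-chosen keywords."""
    # Delete stopwords first so they are never counted.
    tokens = [t for t in tokenize(text) if t not in stopwords]
    counts = Counter(tokens)
    # Also count adjacent word pairs so two-word keywords can match.
    counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    # A Counter returns 0 for keywords that never occur.
    return sum(weight * counts[keyword] for keyword, weight in weights.items())


if __name__ == "__main__":
    print(weighted_keyword_score("Free money! Lose weight and win big money"))
```

A message that contains none of the chosen keywords scores zero, so the score can be compared against a user-set threshold or passed on as one feature among several.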
Recent studies have indicated a higher accuracy using weighted naïve Bayes on
email contents. But as these methods become common, spam writers are finding
new and unusual ways to circumvent content analysis filters. These include pasting
encyclopedia entries into the non-rendered HTML section of the email, using formatting
tricks to break apart words, using images, and redirection.
4 Discussion
4.1 Deployment and Testing
It is important to note that testing EMT and MET in a laboratory environment is not
particularly informative, but perhaps suggestive of performance. The behavior models
are naturally specific to a site or particular account(s) and thus performance will vary
depending upon the quality of data available for modeling, and the parameter settings
and thresholds employed.
It is also important to recognize that no single modeling technique in EMT's
repertoire can be guaranteed to have no false negatives, or few false positives. Rather,
EMT is designed to help an analyst or security staff member architect a set of models
whose outcomes provide evidence for some particular detection task. The combination
of this evidence is specified in the alert logic section as simple Boolean combinations
of model outputs, and the overall detection rates will vary depending upon the
user-supplied specifications of threshold logic. In the following section, several
examples are provided to suggest the manner in which an EMT and MET system may be
used and deployed.
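To make the idea of alert logic concrete, the sketch below thresholds each model's output and then combines the resulting Booleans. The model names, scores, thresholds, and the particular rule are hypothetical and chosen only for illustration; the source does not specify EMT's actual alert-logic syntax.

```python
# Hypothetical per-model thresholds; in practice the analyst supplies these.
THRESHOLDS = {
    "clique_violation": 0.5,
    "attachment_flow": 0.8,
    "content_score": 0.7,
}


def fired(scores, thresholds=THRESHOLDS):
    """Turn each model's numeric output into a Boolean via its threshold.

    A model with no score for this message is treated as not firing.
    """
    return {
        name: scores.get(name, 0.0) >= threshold
        for name, threshold in thresholds.items()
    }


def alert(scores, thresholds=THRESHOLDS):
    """Illustrative rule: clique violation AND (attachment flow OR content)."""
    f = fired(scores, thresholds)
    return f["clique_violation"] and (f["attachment_flow"] or f["content_score"])


if __name__ == "__main__":
    scores = {"clique_violation": 0.9, "content_score": 0.75}
    print(alert(scores))  # fires under the default thresholds
    # Raising one threshold changes the outcome for the same evidence.
    print(alert(scores, {**THRESHOLDS, "content_score": 0.8}))
```

Because the same evidence can yield different alerts under different thresholds, detection and false-positive rates depend on the user's settings as much as on the models themselves.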
To help direct our development and refinement of EMT and MET, we deployed
MET at two external locations, and the version of EMT described herein at one
external location. (Because of the sensitivity of the target application, email
analysis, both organizations prefer that their identities be left unknown, a topic we
address in section 6.) It is instructive, however, to consider how MET was used in
one case to detect and stop the spread of a virus.
The particular incident was an inbound copy of the “Hybris” virus, which appeared
sometime in late 2000. Hybris, interestingly enough, does not attack the address books
of Windows-based clients, but rather takes over Winsock to sniff for email addresses
that it targets for its propagation. The recipient of the virus noticed the inbound email
and a clear “clique violation” (the sender email address was a dead giveaway).
Fortunately, the intended victim was a Linux station, so the virus could not propagate
itself, and hence MET detected no abnormal attachment flow. Nonetheless, the intended
victim simply used the MET interface to inspect the attachment he received, and
noticed 4 other internal recipients of the same attachment. Each was notified within the
same office and subsequently eradicated the message, preventing its spread among
other Windows clients within the office.
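The lookup the victim performed, finding the other recipients of the same attachment, can be sketched as follows. This is an assumption-laden illustration: this section does not say how MET identifies attachments, so the sketch assumes they are keyed by a hash of their content (SHA-256 is chosen here only for illustration), and the log format and function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def attachment_id(data: bytes) -> str:
    """Identify an attachment by a hash of its bytes (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def index_recipients(log):
    """Build a map from attachment id to the set of recipients.

    `log` is a hypothetical iterable of (recipient, attachment_bytes) pairs.
    """
    index = defaultdict(set)
    for recipient, data in log:
        index[attachment_id(data)].add(recipient)
    return index


def other_recipients(index, data, me):
    """List everyone except `me` who received the same attachment."""
    return sorted(index.get(attachment_id(data), set()) - {me})


if __name__ == "__main__":
    suspicious = b"suspicious attachment bytes"
    log = [
        ("alice@example.org", suspicious),
        ("bob@example.org", suspicious),
        ("carol@example.org", b"quarterly report"),
        ("dave@example.org", suspicious),
    ]
    index = index_recipients(log)
    print(other_recipients(index, suspicious, "alice@example.org"))
```

Keying on content rather than filename means that renamed copies of the same attachment still map to the same entry.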