A Behavior-Based Approach to Securing Email Systems - Computer Network Security


from intuition, it is easier to detect a potential anomaly if the size of the recipient list of the attack email is large.

Table 3. Simulation of user cliques, with 5 attack strategies.

Attack Strategy                                         Detection Rate
Send to all addresses, one at a time                    0 %
Send many emails, each containing 2 random addresses    13 %
Send many emails, each containing 3 random addresses    49 %
Send many emails, each containing 5 random addresses    96 %
Send 1 email, containing all addresses                  100 %
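The trend in Table 3 can be illustrated with a minimal sketch. It assumes a rule the excerpt does not spell out: an email is flagged when its recipient set is not contained in any clique previously observed for the account. The clique contents and the function name `violates_cliques` are hypothetical and chosen only for illustration.

```python
def violates_cliques(recipients, cliques):
    """Return True if the recipient set fits inside none of the known cliques.

    Assumption: a "clique" is a set of addresses that the account has
    previously emailed together; this is an illustrative rule only.
    """
    recipient_set = frozenset(recipients)
    return not any(recipient_set <= clique for clique in cliques)


# Hypothetical cliques learned for one user account.
cliques = [
    frozenset({"alice", "bob", "carol"}),
    frozenset({"dave", "erin"}),
]
all_addresses = sorted(set().union(*cliques))

# One address at a time: a single recipient always sits inside some clique.
single_flags = [violates_cliques([addr], cliques) for addr in all_addresses]

# A pair drawn from two different cliques is flagged; a pair inside one is not.
mixed_pair_flag = violates_cliques(["alice", "dave"], cliques)
same_clique_flag = violates_cliques(["alice", "bob"], cliques)

# One email to every address spans all cliques, so it is flagged.
everyone_flag = violates_cliques(all_addresses, cliques)
```

Under this assumed rule, larger random recipient lists are more likely to straddle two cliques, which is consistent with the rising detection rates in the table.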

3 Supervised Machine Learning Models

In addition to the attachment and account frequency models, EMT includes an integrated supervised learning feature akin to that implemented in the MEF system previously reported in [12].

3.1 Modeling Malicious Attachments

MEF is designed to extract content features of a set of known malicious attachments, as well as benign attachments. The features are then used to compose a set of training data for a supervised learning program that computes a classifier. Fig. 2 displays an attachment profile including a class label that is either “malicious” or “benign”.

MEF was designed as a component of MET. Each attachment flowing into an email account would first be tested by a previously learned classifier, and if the likelihood of “malicious” were deemed high enough, the attachment would be so labeled, and the rest of the MET machinery would be called into action to communicate the newly discovered malicious attachment, sending reports from MET clients to MET servers.
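The screening step described above can be sketched as follows. This is not the MEF or MET implementation: the class name `AttachmentScreen`, the 0.9 threshold, the toy scoring function, and the in-memory `reports` list (standing in for client-to-server reporting) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AttachmentScreen:
    # Hypothetical classifier: maps attachment bytes to P(malicious).
    score: Callable[[bytes], float]
    # Assumed cutoff; the text only says "deemed high enough".
    threshold: float = 0.9
    # Stand-in for reports sent from a client to a server.
    reports: List[str] = field(default_factory=list)

    def process(self, name: str, payload: bytes) -> str:
        """Label one incoming attachment and record a report if malicious."""
        if self.score(payload) >= self.threshold:
            self.reports.append(name)
            return "malicious"
        return "benign"


def toy_score(payload: bytes) -> float:
    """Toy scorer for the example: treats an 'MZ' executable header as risky."""
    return 0.95 if payload.startswith(b"MZ") else 0.05


screen = AttachmentScreen(score=toy_score)
label_exe = screen.process("invoice.exe", b"MZ\x90\x00")
label_txt = screen.process("notes.txt", b"meeting at noon")
```

In a real deployment the scorer would be the previously learned classifier, and the threshold would be tuned against the acceptable false-positive rate.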

The core elements of MEF are also being integrated into EMT. However, here the features extracted from the training data include content-based features of email bodies (not just attachment features).

The Naïve Bayes learning program is used to compute classifiers over labeled email messages deemed interesting or malicious by a security analyst. The GUI allows the user to mark emails as interesting or not interesting, and then to learn a classifier that is subsequently used to label the remaining unlabeled emails in the database automatically.
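The label-then-classify workflow can be sketched with a small categorical Naïve Bayes written from scratch. This is an illustration under stated assumptions, not EMT's code: the class `NaiveBayes`, the Laplace smoothing constant `alpha`, the feature names, and the sample emails are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.feature_values = defaultdict(set)      # feature -> values seen in training

    def fit(self, emails, labels):
        for email, label in zip(emails, labels):
            self.class_counts[label] += 1
            for feat, val in email.items():
                self.feature_counts[(label, feat)][val] += 1
                self.feature_values[feat].add(val)
        return self

    def log_posteriors(self, email):
        """Unnormalized log P(label | features) for every known label."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # class prior
            for feat, val in email.items():
                count = self.feature_counts[(label, feat)][val]
                k = len(self.feature_values[feat]) + 1  # extra slot for unseen values
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, email):
        scores = self.log_posteriors(email)
        return max(scores, key=scores.get)


# Emails an analyst has marked (hypothetical data and feature names).
labeled = [
    ({"domain": "example.com", "attachments": 0, "mime": "none"}, "not interesting"),
    ({"domain": "example.com", "attachments": 1, "mime": "application/pdf"}, "not interesting"),
    ({"domain": "unknown.test", "attachments": 1, "mime": "application/x-msdownload"}, "interesting"),
    ({"domain": "unknown.test", "attachments": 2, "mime": "application/x-msdownload"}, "interesting"),
]
model = NaiveBayes().fit([e for e, _ in labeled], [y for _, y in labeled])

# An unlabeled email from the database is then marked automatically.
unlabeled = {"domain": "unknown.test", "attachments": 1, "mime": "application/x-msdownload"}
predicted = model.predict(unlabeled)
```

Working in log space avoids numeric underflow when many features are multiplied together, and the smoothing term keeps a single unseen feature value from driving a class probability to zero.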

A Naïve Bayes [5] classifier computes the likelihood that an email is interesting given a set of features extracted from the set of training emails that are specified by the analyst. In the current version of EMT, the set of features extracted from emails includes a set of static features such as domain name, time, sender email name, number of attachments, the MIME-type of the attachment, the likelihood the attachment is
