- Node V: The set of nodes V represents the unique Twitter IDs of users.
- Edge E: An edge (i, j) ∈ E between two nodes v_i, v_j represents a
retweet relation. Retweet relations are derived by combining following
relations with retweet times.
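The text does not spell out how following relations and retweet times are combined, so the following is only a minimal sketch of one plausible reading: each retweeter is linked to the most recent earlier sharer that they follow. The function name `message_passing_edges`, the input layout, the edge direction (source, retweeter), and the fallback to the original author are all assumptions made for illustration, not the authors' procedure.

```python
def message_passing_edges(author, retweets, follows):
    """Build hypothetical retweet-relation edges for a single tweet.

    author   -- ID of the user who posted the original tweet
    retweets -- list of (user_id, retweet_time) pairs
    follows  -- dict mapping user_id -> set of user IDs that user follows
    Returns a list of (source, retweeter) edges.
    """
    # Users who have already shared the tweet, oldest first.
    seen = [author]
    edges = []
    for user, _time in sorted(retweets, key=lambda r: r[1]):
        # Assumption: if the retweeter follows nobody who shared the tweet
        # earlier, attribute the retweet to the original author.
        source = author
        followed = follows.get(user, set())
        # Walk earlier sharers from most recent to oldest.
        for earlier in reversed(seen):
            if earlier in followed:
                source = earlier
                break
        edges.append((source, user))
        seen.append(user)
    return edges


# Toy data (invented): "b" follows the author, "c" follows "b",
# and "d" follows nobody who shared the tweet.
toy_follows = {"b": {"a"}, "c": {"b"}, "d": {"x"}}
toy_retweets = [("c", 2), ("b", 1), ("d", 3)]
toy_edges = message_passing_edges("a", toy_retweets, toy_follows)
print(toy_edges)  # [('a', 'b'), ('b', 'c'), ('a', 'd')]
```

Under this reading, a retweet is traced along the follow chain whenever one exists, so the edge list records who most plausibly passed the message to whom.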
As shown in Figures 2(a) and 2(b), the message-passing graph of a spam tweet
looks different from its Twitter graph. For a normal tweet, the two graphs look
alike. This difference is useful for spam detection.
(a) Twitter graph
(b) Message-passing graph
Fig. 2. This spam tweet was posted by @ followback 707. (a) The original graph
built with the Twitter API cannot highlight automatic retweet behavior. (b) The
automatic retweet behavior of spam collusion accounts is preserved in the
message-passing graph.
3.3 Time Evolution Features Extraction
Graph-Based Features. Since we treat the whole Twitter social network as a
directed graph G(V, E), there are several graph-based features we can use:
Average clustering coefficient: The clustering coefficient C_u of a node u
is defined as shown in Equation 1:
$$ C_u = \frac{2 T_u}{\deg(u)\,(\deg(u) - 1)} \qquad (1) $$
where u ∈ V, deg(u) is the number of neighbours of u, and T_u is the number of
connected pairs among the neighbours of u [22,2]. Spammers and spam collusion
accounts usually blindly retweet tweets posted by small communities. Therefore,
their retweet relations overlap with very high probability. Instead of the
per-node clustering coefficient, we use the average clustering coefficient of
the graph, given in Equation 2:
$$ AC = \frac{1}{n} \sum_{u \in G} C_u \qquad (2) $$
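Equations 1 and 2 can be illustrated with a short, dependency-free sketch. It rests on several assumptions not stated in the text: edges are treated as undirected (the factor 2 in Equation 1 matches the undirected definition), n is taken to be the number of nodes in the graph, and C_u is set to 0 for nodes with fewer than two neighbours, where Equation 1 is undefined. The helper names are hypothetical.

```python
from itertools import combinations


def to_undirected(edges):
    """Turn (i, j) edge pairs into a symmetric adjacency dict of sets."""
    adj = {}
    for i, j in edges:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj


def clustering_coefficient(adj, u):
    """Equation 1: C_u = 2 * T_u / (deg(u) * (deg(u) - 1))."""
    neighbours = adj[u] - {u}  # ignore self-loops
    deg = len(neighbours)
    if deg < 2:
        return 0.0  # assumption: C_u = 0 where Equation 1 is undefined
    # T_u: number of neighbour pairs that are themselves connected.
    t_u = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * t_u / (deg * (deg - 1))


def average_clustering(adj):
    """Equation 2: AC = (1 / n) * sum of C_u over all nodes u."""
    n = len(adj)  # assumption: n is the number of nodes
    if n == 0:
        return 0.0
    return sum(clustering_coefficient(adj, u) for u in adj) / n


# Toy graph (invented): a triangle a-b-c with a pendant node d attached to c.
toy_adj = to_undirected([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(clustering_coefficient(toy_adj, "a"))   # 1.0
print(round(average_clustering(toy_adj), 4))  # 0.5833
```

In the toy graph, nodes a and b each have C_u = 1, node c has C_u = 1/3, and node d contributes 0, giving AC = 7/12. A tightly knit group of accounts that all retweet one another would push AC toward 1.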
 