effectiveness of our model depends on user features and attribute relations extracted
from user-generated multimedia content such as posts. Therefore, the utility of our
model does not depend on whether a user is a celebrity.
Performance analysis. We observed three major types of factors that affect the
results. (1) Prediction difficulty. Our results reveal that user attributes are not
equally difficult to infer. Multi-valued attributes (occupation, interest, sentiment
orientation) are harder to derive than binary-valued attributes (gender, age, and
relationship). For example, most user posts relate to factual activities and do not
express sentimental opinions, which makes
it extremely difficult to judge the user sentiment orientation. (2) Data sparsity and
missing data. Though we focus on popular users on Google+, the posts of some users
are still scarce or noisy. For example, some users only have reposts. We remove users
with scarce posts in a preprocessing step. For the remaining users, whose posts are
noisy, designing effective user features for attribute inference is important. The
results show that our textual and visual user features are effective for attribute
inference. (3) Unexpected annotation. Though the
available profiles from other platforms help ease the annotation, some users have
noisy posts and no external sources to consult, which can make the attribute labeling
unreliable and thus affect the evaluation. For example, it is difficult to accurately
judge the occupation of some users because their posts express occupation information
only implicitly and ambiguously.
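The preprocessing step mentioned under factor (2) can be illustrated with a minimal sketch. The text does not specify how "scarce" is defined, so the Post record, the filter_sparse_users helper, and the min_original_posts threshold below are hypothetical choices made only for illustration; the sketch assumes that reposts do not count toward a user's usable content.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Post:
    """Hypothetical post record; field names are assumptions, not from the source."""
    text: str
    is_repost: bool


def filter_sparse_users(
    posts_by_user: Dict[str, List[Post]],
    min_original_posts: int = 5,  # assumed threshold; the source gives no value
) -> Dict[str, List[Post]]:
    """Keep only users with at least `min_original_posts` non-repost posts."""
    kept: Dict[str, List[Post]] = {}
    for user_id, posts in posts_by_user.items():
        original = [p for p in posts if not p.is_repost]
        if len(original) >= min_original_posts:
            kept[user_id] = posts
    return kept


# Toy data: u1 has enough original posts, u2 only reposts, u3 too few posts.
example = {
    "u1": [Post("hiking today", False)] * 6,
    "u2": [Post("shared link", True)] * 10,
    "u3": [Post("new job", False)] * 2,
}
kept_users = filter_sparse_users(example, min_original_posts=5)
```

With this toy data only u1 survives the filter. A user such as u2, who only has reposts, is dropped regardless of how many posts the account contains.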
In the future, we will investigate the following research directions: (1) The proposed
framework is quite flexible, with potential extensions to more practical applications,
such as personalized recommendation and friend suggestion. (2) A nonlinear
version of the model, e.g., designing the potential functions with kernels, can be
developed to improve the robustness and generalization capability.
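To make direction (2) concrete, the following is a rough sketch of what replacing a linear potential with a kernel-based one could look like. It assumes a potential that scores a feature vector as a weighted sum, and it swaps in an RBF kernel as one possible choice. The function names, the gamma parameter, and the support vectors are all hypothetical and are not taken from the model described in this chapter.

```python
import math
from typing import Sequence


def linear_potential(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear potential: inner product of a weight vector and a feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 0.5) -> float:
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def kernel_potential(
    alphas: Sequence[float],
    support_features: Sequence[Sequence[float]],
    features: Sequence[float],
    gamma: float = 0.5,
) -> float:
    """Nonlinear potential: weighted sum of kernel similarities to support examples."""
    return sum(
        a * rbf_kernel(s, features, gamma)
        for a, s in zip(alphas, support_features)
    )


# Toy comparison on a 2-D feature vector (all values are illustrative).
x = [1.0, 1.0]
linear_score = linear_potential([0.5, -0.25], x)
kernel_score = kernel_potential([2.0, -1.0], [[0.0, 0.0], [1.0, 1.0]], x)
```

The kernel version scores an example by its similarity to a set of support examples rather than by a single weight vector, which is what allows a nonlinear decision surface.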