significantly enhances credit conditions, and the intermediary's bid on a credit listing
has a crucial impact on the resulting interest rate. Pope & Sydnor (2008) analyzed
discrimination on Prosper and found that loan listings with blacks in the attached picture
are 25 to 35 percent less likely to receive funding than those of whites with similar
credit profiles. Badunenko et al. (2010) observed that female borrowers pay on average
higher interest rates than males at the largest German P2P lending platform, because
female borrowers deliberately offer higher interest rates in anticipation that they would
otherwise be discriminated against.
The studies above mainly focus on a single item, or a subset, of loan information. In
this paper, we investigate all the loan information in a uniform framework.
Specifically, we develop a Bayesian network model with all the information in the listing
table, including the requested loan amount, interest rate, loan category, borrower's
credit score, homeowner status, debt-to-income ratio, and monthly loan payment. Using a
large sample of paid and defaulted Prosper loans from 2008 to 2011, we construct a
Tree Augmented Naïve (TAN) Bayesian network model. We then test this model
experimentally on data from 2012 and compare it with logistic regression and
Luo's method (Luo et al., 2011). Experimental results show that the TAN Bayesian
network helps investors make significantly better investment decisions than the other
models.
The rest of this paper is organized as follows. Section 2 provides background on the
TAN Bayesian network model. In Section 3, a Bayesian network model for
P2P lending is built and compared with other investment models. Finally, we conclude the
work in Section 4.
2 Bayesian Networks
The Tree Augmented Naïve (TAN) Bayesian network algorithm (Chow & Liu, 1968) is
used mainly for classification. It efficiently creates a simple Bayesian network model,
allowing each predictor to depend on one other predictor in addition to the target
variable. Its main advantages are its classification accuracy and favorable performance
compared with general Bayesian network models. In this paper, the target variable,
loan status, is simplified to two classes, 1 = paid and 0 = default, so that a listing,
together with its attributes, can be classified as 0 or 1 by the Bayesian network model
constructed from past loans.
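
The structure search behind a TAN classifier can be sketched as follows. This is a minimal illustration that assumes the commonly used construction: weight each pair of predictors by their empirical conditional mutual information given the class, then grow a maximum weighted spanning tree from a chosen root predictor. The function names and the toy data are hypothetical and are not taken from the paper, whose exact procedure may differ.

```python
from collections import Counter
from itertools import combinations
from math import log


def conditional_mutual_information(xi, xj, y):
    """Empirical I(Xi; Xj | Y) for three equal-length sequences of categories."""
    n = len(y)
    n_ijy = Counter(zip(xi, xj, y))
    n_iy = Counter(zip(xi, y))
    n_jy = Counter(zip(xj, y))
    n_y = Counter(y)
    cmi = 0.0
    for (a, b, c), count in n_ijy.items():
        # p(a,b,c) * log( p(a,b|c) / (p(a|c) * p(b|c)) ), written with raw counts
        cmi += (count / n) * log(count * n_y[c] / (n_iy[(a, c)] * n_jy[(b, c)]))
    return cmi


def learn_tan_structure(columns, y, root=0):
    """Return {predictor_index: parent_index}; the root predictor maps to None.

    Every predictor also has the class variable as a parent; that edge is implicit.
    """
    k = len(columns)
    weight = {}
    for i, j in combinations(range(k), 2):
        w = conditional_mutual_information(columns[i], columns[j], y)
        weight[(i, j)] = weight[(j, i)] = w
    # Prim's algorithm for a maximum spanning tree, directing edges away from root.
    parent = {root: None}
    while len(parent) < k:
        best = max(
            ((i, j) for i in parent for j in range(k) if j not in parent),
            key=lambda edge: weight[edge],
        )
        parent[best[1]] = best[0]
    return parent


# Toy data (invented): 1 = paid, 0 = default, with three categorical predictors.
status = [1, 1, 1, 1, 0, 0, 0, 0]
x0 = ["a", "b", "a", "b", "a", "b", "a", "b"]
x1 = list(x0)  # exact copy of x0, so strongly dependent on it given the class
x2 = ["u", "u", "v", "v", "u", "u", "v", "v"]  # independent of x0 given the class

tree = learn_tan_structure([x0, x1, x2], status, root=0)
print(tree)
```

In the toy data the copied predictor attaches to the root because that pair carries all of the conditional mutual information; the third predictor is tied between possible parents, so its attachment depends only on tie-breaking.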
2.1 TAN Classifier Learning Procedure
Let X = (X1, X2, … , Xn) represent a categorical predictor vector and Y represent the
target category. The learning procedure is summarized in Fig. 1 and described in more
detail below.
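
For orientation, a TAN model over this notation is usually written as the factorization below. The symbol \pi(i), denoting the single predictor parent of X_i (empty for the root predictor), is introduced here for illustration only and is not notation from the paper.

```latex
P(Y, X_1, \dots, X_n) = P(Y) \prod_{i=1}^{n} P\bigl(X_i \mid Y, X_{\pi(i)}\bigr),
\qquad
\hat{y} = \arg\max_{y \in \{0, 1\}} P(y) \prod_{i=1}^{n} P\bigl(x_i \mid y, x_{\pi(i)}\bigr)
```

The second expression is the resulting classification rule: a listing is assigned whichever class, paid (1) or default (0), gives the larger product.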