Therefore, we digitize these data by the fixed-width method. For an attribute, let the maximum value of the attribute be $V_{max}$ and the minimum value be $V_{min}$. We set the separation width to $d = (V_{max} - V_{min})/5$. The attribute is then digitized to 0, 1, 2, 3, 4, 5 when the value belongs to $[V_{min}, V_{min}+d]$, $[V_{min}+d, V_{min}+2d]$, $[V_{min}+2d, V_{min}+3d]$, $[V_{min}+3d, V_{min}+4d]$, and $[V_{min}+4d, V_{max}]$, respectively.
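The fixed-width digitization can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `fixed_width_code` is hypothetical, the five intervals are assumed to map to the codes 0 through 4 (the text lists six codes for five intervals), and a value on a shared boundary is assumed to fall into the higher interval.

```python
def fixed_width_code(value, v_min, v_max, n_bins=5):
    """Map a value to an integer code using equal-width intervals.

    Assumptions (not stated in the source): codes run from 0 to n_bins - 1,
    and a value on a shared boundary goes to the higher interval.
    """
    if v_max == v_min:
        # Degenerate attribute: every value lands in the first interval.
        return 0
    d = (v_max - v_min) / n_bins          # separation width d
    code = int((value - v_min) // d)      # index of the interval
    # Clamp so that value == v_max falls in the last interval.
    return min(max(code, 0), n_bins - 1)


codes = [fixed_width_code(v, 0, 100) for v in (0, 19.9, 20, 55, 100)]
```

With `v_min = 0` and `v_max = 100`, the width is `d = 20`, so 55 falls in the third interval and 100 is clamped into the last one.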
3.3 Forecast of Return Probability
The data used to construct the Bayesian network are drawn from the period 2008 to 2010. The network aims to predict the return rate from January 1, 2011 to April 30, 2011. Because we are concerned only with the loans that the model classifies as repaid, the accuracy is calculated as follows:
$$R = \frac{f_{11}}{f_{11} + f_{01}} \qquad (3)$$
where $f_{11}$ is the number of loans with Status=1 and B-Status=1, and $f_{01}$ is the number of loans whose actual Status=0 but B-Status=1. Once the Bayesian network model is built, we can calculate the Bayesian probability from the input information. The Bayesian network algorithm is described in Figure 3.
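As a small illustration of Equation (3), the sketch below counts $f_{11}$ and $f_{01}$ from two parallel lists of labels. The function name `return_precision` and the list-based input are assumptions made for this example; `status` holds the actual Status and `b_status` the model's B-Status.

```python
def return_precision(status, b_status):
    """Compute R = f11 / (f11 + f01) from actual and predicted labels."""
    # f11: actual Status=1 and predicted B-Status=1
    f11 = sum(1 for s, b in zip(status, b_status) if s == 1 and b == 1)
    # f01: actual Status=0 but predicted B-Status=1
    f01 = sum(1 for s, b in zip(status, b_status) if s == 0 and b == 1)
    if f11 + f01 == 0:
        # No loan was classified as repaid, so R is undefined.
        return None
    return f11 / (f11 + f01)


r = return_precision(status=[1, 1, 0, 0, 1], b_status=[1, 0, 1, 0, 1])
```

In this toy input, two loans are correctly classified as repaid and one defaulted loan is wrongly classified as repaid, so R is 2/3.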
Specifically, we select the data from 2008.1 to 2010.1 to build the model and use the data of 2011.1 to check it. Next, we add the 2010.2 data to the learning data and use 2011.2 as the check data, and so on. With the TAN Bayesian method and model, we calculate the Bayesian probability of all the check data. The return rate at different probability levels can then be calculated: we select the loans with B-Status=1 whose Bayesian probability is higher than the parameter θ and compare them with the actual Status. The results are shown in Table 1 and Figure 2.
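The growing learning set described above can be sketched as a generator of (learning periods, check period) pairs. The helper `expanding_windows` is hypothetical, and the period labels after 2010.2 are an assumption about how the pattern continues; the source states only the first two steps.

```python
def expanding_windows(base_train, extra_train, check_periods):
    """Yield (learning periods, check period) pairs.

    The first check period uses only base_train; each later check period
    adds one more entry from extra_train to the learning data.
    """
    train = list(base_train)
    for i, check in enumerate(check_periods):
        if i > 0:
            train.append(extra_train[i - 1])
        yield list(train), check


windows = list(expanding_windows(
    base_train=["2008.1-2010.1"],
    extra_train=["2010.2", "2010.3", "2010.4"],   # assumed continuation
    check_periods=["2011.1", "2011.2", "2011.3", "2011.4"],
))
```

Each pair would then be passed to whatever routine fits the TAN model and scores the check data; that routine is outside the scope of this sketch.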
Table 1. Return Rate of Different Bayesian Network Probability

Month    Real Return Rate   θ=0.5   θ=0.6   θ=0.7   θ=0.8   θ=0.9
2011.1   0.74               0.78    0.81    0.79    0.90    1.00
2011.2   0.84               0.85    0.86    0.88    0.95    1.00
2011.3   0.79               0.81    0.82    0.84    0.89    1.00
2011.4   0.75               0.75    0.76    0.78    0.91    0.95
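The thresholding behind each θ column can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' code: the function name `return_rate_at_threshold` and the toy data are assumptions, and the real return rate column corresponds to taking the mean of the actual Status over all check loans.

```python
def return_rate_at_threshold(probabilities, status, theta):
    """Return rate among loans whose Bayesian probability exceeds theta."""
    # Keep the actual Status of every loan the model scores above theta.
    selected = [s for p, s in zip(probabilities, status) if p > theta]
    if not selected:
        # No loan passes the threshold, so the rate is undefined.
        return None
    return sum(selected) / len(selected)


# Toy data (hypothetical): model probabilities and actual Status per loan.
probs = [0.95, 0.85, 0.65, 0.55, 0.40]
actual = [1, 1, 0, 1, 0]
rates = {theta: return_rate_at_threshold(probs, actual, theta)
         for theta in (0.5, 0.8)}
```

Raising θ keeps fewer loans but, when the model ranks well, a larger share of them are repaid, which is the trend visible across the columns of Table 1.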
In P2P lending, our investment decision model ranks loans from best to worst according to the probability given by the Bayesian network. Investors can choose the top-ranked loans as the candidate set. We find empirical evidence of the effectiveness of our model and of the influence of different parameters. Figure 3 shows that the distribution of Bayesian network probabilities for paid loans is markedly higher than that for defaulted loans.
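Ranking loans and taking the top ones as the candidate set can be sketched as below. The function name `top_candidates`, the `(loan_id, probability)` tuple layout, and the sample values are all hypothetical and chosen only for illustration.

```python
def top_candidates(loans, k):
    """Return the ids of the k loans with the highest Bayesian probability.

    loans is a list of (loan_id, probability) pairs.
    """
    ranked = sorted(loans, key=lambda loan: loan[1], reverse=True)
    return [loan_id for loan_id, _ in ranked[:k]]


candidates = top_candidates([("A", 0.62), ("B", 0.91), ("C", 0.78)], k=2)
```

Here the candidate set is `["B", "C"]`, the two loans with the highest probabilities.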
 