openness, presented as trd. sir and tir measure the change in the secondary and
tertiary industries. We use GDP per capita as the measure of economic development,
presented as dev. The ratios of governmental revenue and expenditure to GDP are
important for explaining the labor income share; we use goi and goe to denote
these two explanatory variables. The scale of state-owned enterprises is measured by
the ratio of employment in state-owned enterprises to total employment, presented as
soe, while tax represents the tax burden, measured by the ratio of net taxes on
production to GDP. All of the data used in this paper are downloaded from the
National Statistics Bureau.
We now introduce the mixed-data nonparametric variable detection method
(Li and Racine, 2004), which is the core econometric method of this paper.
The regression model is

$$Y = g(X) + \varepsilon.$$

The local constant least squares (LCLS) method estimates the unknown function as

$$\hat{g}(x) = \left(i' K(x)\, i\right)^{-1} i' K(x)\, Y, \qquad (1)$$

where $Y = (Y_1, Y_2, \ldots, Y_n)'$, $i$ is an $n \times 1$ vector of ones, and $K(x)$
is a diagonal $n \times n$ matrix of kernel weighting functions for mixed continuous
and discrete data.
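Because $K(x)$ is diagonal, equation (1) reduces to a kernel-weighted average of the $Y_i$. The following minimal sketch illustrates this. The kernel choices (Gaussian for continuous regressors, a match/mismatch kernel with parameter `lam` for unordered discrete regressors), the function names, and the simulated data are assumptions made for illustration; they are not taken from the text.

```python
import numpy as np

def mixed_kernel_weights(x0_c, x0_d, Xc, Xd, h, lam):
    """Product kernel weights for one evaluation point (illustrative helper).

    Assumed kernels: Gaussian for the continuous columns Xc, and a
    match/mismatch kernel for the unordered discrete columns Xd
    (weight 1 if the category matches, lam otherwise).
    Normalizing constants are omitted because they cancel in equation (1).
    """
    u = (Xc - x0_c) / h                                  # scaled distances
    w_c = np.exp(-0.5 * u ** 2).prod(axis=1)             # continuous part
    w_d = np.where(Xd == x0_d, 1.0, lam).prod(axis=1)    # discrete part
    return w_c * w_d

def lcls_estimate(x0_c, x0_d, Xc, Xd, y, h, lam):
    """Local constant estimate: (i'K(x)i)^{-1} i'K(x)Y as a weighted mean."""
    k = mixed_kernel_weights(x0_c, x0_d, Xc, Xd, h, lam)
    return float(k @ y / k.sum())

# Simulated data (purely illustrative): one continuous, one binary regressor.
rng = np.random.default_rng(0)
n = 200
Xc = rng.normal(size=(n, 1))
Xd = rng.integers(0, 2, size=(n, 1))
y = np.sin(Xc[:, 0]) + 0.5 * Xd[:, 0] + 0.1 * rng.normal(size=n)

x0_c, x0_d = np.array([0.0]), np.array([1])
g_local = lcls_estimate(x0_c, x0_d, Xc, Xd, y, np.array([0.3]), np.array([0.1]))

# Same quantity written literally in the matrix form of equation (1).
k = mixed_kernel_weights(x0_c, x0_d, Xc, Xd, np.array([0.3]), np.array([0.1]))
ones, K = np.ones(n), np.diag(k)
g_matrix = float((ones @ K @ y) / (ones @ K @ ones))

# With very large h and lam = 1 every weight is (almost) identical,
# so the estimate collapses to the sample mean of y.
g_flat = lcls_estimate(x0_c, x0_d, Xc, Xd, y, np.array([1e6]), np.array([1.0]))
```

The last call shows why a bandwidth at its upper bound matters: once the weights no longer depend on a regressor, that regressor has no influence on the estimate.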
The local linear least squares (LLLS) method is similar to the LCLS method: both
are weighted least squares estimators whose weights are determined by kernel
functions and bandwidths, but LLLS gives more weight to the point of estimation
than LCLS does.
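As a rough illustration of the weighted least squares idea, the sketch below fits a local line around an evaluation point, using continuous regressors only. The Gaussian kernel and the function name `llls_estimate` are assumptions for illustration; in a mixed-data setting the discrete regressors would enter through the kernel weights.

```python
import numpy as np

def llls_estimate(x0, X, y, h):
    """Local linear fit at x0 (illustrative helper, continuous regressors).

    Regresses y on (1, X - x0) by weighted least squares with Gaussian
    product-kernel weights (an assumed kernel). The intercept is the fitted
    value at x0; the slopes estimate the gradient there.
    """
    u = (X - x0) / h
    k = np.exp(-0.5 * u ** 2).prod(axis=1)            # kernel weights
    Z = np.column_stack([np.ones(len(y)), X - x0])    # local design matrix
    sw = np.sqrt(k)                                   # WLS via rescaling rows
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return float(coef[0]), coef[1:]

# Exactly linear data: the local linear fit reproduces the line.
X = np.linspace(-1.0, 1.0, 50)[:, None]
y = 2.0 + 3.0 * X[:, 0]
fit, grad = llls_estimate(np.array([0.2]), X, y, np.array([0.5]))
```

When the bandwidth of a continuous regressor becomes very large, all observations receive the same weight and this fit coincides with a global linear regression in that regressor.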
The respective bandwidths $(h, \lambda)$ of the variables can be computed by the
cross-validation (C.V.) method. When a bandwidth under the LCLS method reaches its
upper bound, the kernel weighting function for that variable becomes a constant, so
the variable is effectively smoothed out and is detected as irrelevant
(Hall et al. 2007). When a bandwidth under the LLLS method reaches its upper bound,
an ordered or unordered discrete variable is likewise detected as irrelevant, whereas
a continuous variable should be modeled in a linear way.
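The cross-validation step can be sketched as a leave-one-out criterion for the local constant estimator with a single continuous regressor. The Gaussian kernel, the grid search, and the function names are assumptions made for illustration; in practice the criterion is usually minimized numerically over all bandwidths jointly.

```python
import numpy as np

def loo_cv_score(h, x, y):
    """Leave-one-out CV criterion for the local constant estimator
    (one continuous regressor, Gaussian kernel assumed)."""
    d = (x[:, None] - x[None, :]) / h
    K = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(K, 0.0)             # exclude each point from its own fit
    fitted = K @ y / K.sum(axis=1)
    return float(np.mean((y - fitted) ** 2))

def select_bandwidth(x, y, grid):
    """Return the grid value with the smallest leave-one-out criterion."""
    scores = [loo_cv_score(h, x, y) for h in grid]
    return float(grid[int(np.argmin(scores))])

# Illustrative data: y depends strongly and nonlinearly on x.
x = np.linspace(0.0, 2.0 * np.pi, 100)
y = np.sin(x)

upper = 2.0 * x.std()                    # upper bound used for continuous data
grid = np.array([0.05, 0.1, 0.2, 0.5, 1.0, 2.0, upper])
h_star = select_bandwidth(x, y, grid)
relevant = h_star < upper                # bandwidth stays below the bound
```

For a regressor that is unrelated to the outcome, the criterion would typically be minimized at or near the upper bound instead, which is the signal used for variable detection.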
Hall et al. (2007) suggest two standard deviations of a variable as the upper
bound for identifying relevance and linearity. For discrete data, if the bandwidth
reaches its upper bound of one, the variable is detected as irrelevant and should be
dropped from our model.
We adopt the variable detection steps recommended by Henderson et al.
(2011), which rely on the cross-validated bandwidths $(h, \lambda)$:
Step 1: Identification of the relevant variables (determinants) using LCLS.
Step 2: After the irrelevant variables have been detected in Step 1, identification
of the mechanism (linear or not) of the relevant variables using LLLS.
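The two steps can be summarized as a decision rule on the cross-validated bandwidths. The sketch below is a hypothetical helper, not the authors' code: the function name, the variable names `x1`, `x2`, `x3`, `d1`, and all bandwidth values are made-up inputs for illustration, and the upper bounds follow the rule stated above (two standard deviations for continuous variables, one for discrete variables).

```python
def detect_variables(names, is_discrete, sd, bw_lcls, bw_llls):
    """Classify variables from LCLS and LLLS bandwidths (hypothetical helper).

    is_discrete, sd, bw_lcls, bw_llls are dicts keyed by variable name.
    bw_llls only needs entries for variables that survive Step 1.
    """
    def at_bound(name, bw):
        bound = 1.0 if is_discrete[name] else 2.0 * sd[name]
        return bw >= bound

    result = {}
    for name in names:
        # Step 1: an LCLS bandwidth at its upper bound marks the variable
        # as irrelevant; it is not carried into Step 2.
        if at_bound(name, bw_lcls[name]):
            result[name] = "irrelevant"
            continue
        # Step 2: LLLS bandwidth at its upper bound means "irrelevant" for a
        # discrete variable and "enters linearly" for a continuous one.
        if at_bound(name, bw_llls[name]):
            result[name] = "irrelevant" if is_discrete[name] else "relevant, linear"
        else:
            result[name] = "relevant" if is_discrete[name] else "relevant, nonlinear"
    return result

# Made-up bandwidths, for illustration only.
names = ["x1", "x2", "x3", "d1"]
is_discrete = {"x1": False, "x2": False, "x3": False, "d1": True}
sd = {"x1": 1.0, "x2": 0.5, "x3": 2.0, "d1": None}   # sd unused for discrete
bw_lcls = {"x1": 0.4, "x2": 1.7, "x3": 0.9, "d1": 0.2}
bw_llls = {"x1": 0.6, "x3": 5.2, "d1": 0.3}

result = detect_variables(names, is_discrete, sd, bw_lcls, bw_llls)
```

In this made-up example `x2` is dropped in Step 1, `x3` is kept and flagged as linear in Step 2, and `x1` and `d1` remain in the nonparametric part of the model.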