Environmental Engineering Reference
In-Depth Information
This allows for two interpretable outputs:
First, the order in which the different explanatory variables are included in
the regression when reducing lambda is meaningful. It gives an indication of
which variables contribute most to explaining the regressand. [15]
Second, the size and sign of the coefficients of a 'best' model can be
interpreted.
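The first output, the order in which variables enter as lambda is reduced, can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal pure-Python coordinate-descent Lasso run on made-up toy data, and the names soft_threshold, lasso_cd and entry_order are hypothetical. It assumes standardised explanatory variables and a centred dependent variable.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, beta=None, n_iter=200):
    """Minimise (1/2n) * SSR + lam * sum(|beta_j|) by cyclic coordinate descent.

    X is a list of rows (columns assumed standardised), y is assumed centred.
    """
    n, p = len(X), len(X[0])
    beta = list(beta) if beta is not None else [0.0] * p
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes variable j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def entry_order(X, y, lambdas):
    """Indices of variables in the order they first get a non-zero coefficient."""
    order, beta = [], None
    for lam in sorted(lambdas, reverse=True):  # from strong to weak penalty
        beta = lasso_cd(X, y, lam, beta)       # warm start from previous fit
        for j, b in enumerate(beta):
            if b != 0.0 and j not in order:
                order.append(j)
    return order


# Toy data (purely illustrative): three orthogonal, standardised columns.
X = [[float(a), float(b), float(c)]
     for a in (1, -1) for b in (1, -1) for c in (1, -1)]
# Strong effect of variable 0, weaker effect of 1, weakest of 2, plus a small
# disturbance that is orthogonal to all three columns.
y = [3.0 * a + 1.0 * b + 0.2 * c + 0.1 * a * b * c for a, b, c in X]

print(entry_order(X, y, [4.0, 2.0, 0.5, 0.1]))  # [0, 1, 2]
```

In this toy design the variable with the largest effect enters first and the weakest enters last, which is the sense in which the entry order is informative.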
We define the best model as the model that best performs an n-1 prediction
exercise. That is, we do not focus on maximising the goodness-of-fit, but want to
minimise the forecasting error. This indicates which combination of factors is
best able to predict patenting and whether these factors have a positive or
negative impact on the prediction. The standard Lasso does not come with an easy
way to calculate the standard errors of the coefficient estimates, and a Bayesian
approach would help in this regard. In any case, it is interesting to see which
variables are most effective in explaining the variation in the explained variable,
and in which direction this variation appears.
In order to make the results more easily interpretable, all variables are
standardised. Also, model selection is restricted to models with at most 25
explanatory variables.
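The selection rule described above (choose the penalty that minimises the forecasting error, among models with at most 25 explanatory variables) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the prediction exercise is read here as leave-one-out prediction, the Lasso is a minimal pure-Python coordinate descent, the data are made up, and the names lasso_fit, loo_mse and select_lambda are hypothetical. The explanatory variables are assumed to be standardised already.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    return (z - lam) if z > lam else (z + lam) if z < -lam else 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Intercept and Lasso coefficients minimising (1/2n)*SSR + lam*sum|beta_j|."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centre columns
    resid = [v - y_mean for v in y]                             # centre y
    col_ss = [sum(r[j] ** 2 for r in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # column is constant in this sample: keep it at zero
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    intercept = y_mean - sum(m * b for m, b in zip(x_mean, beta))
    return intercept, beta


def loo_mse(X, y, lam):
    """Mean squared error when each observation is predicted from the other n-1."""
    n = len(X)
    total = 0.0
    for i in range(n):
        b0, beta = lasso_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        pred = b0 + sum(x * b for x, b in zip(X[i], beta))
        total += (y[i] - pred) ** 2
    return total / n


def select_lambda(X, y, lambdas, max_active=25):
    """Best lambda by leave-one-out error among models with <= max_active variables."""
    errors = {}
    for lam in lambdas:
        _, beta = lasso_fit(X, y, lam)
        if sum(b != 0.0 for b in beta) <= max_active:
            errors[lam] = loo_mse(X, y, lam)
    best = min(errors, key=errors.get)  # raises ValueError if no model qualifies
    return best, errors


# Toy data (purely illustrative): three orthogonal, standardised columns.
X = [[float(a), float(b), float(c)]
     for a in (1, -1) for b in (1, -1) for c in (1, -1)]
y = [3.0 * a + 1.0 * b + 0.2 * c + 0.1 * a * b * c for a, b, c in X]

# The cap is set to 2 here only because the toy problem has three variables.
best, errors = select_lambda(X, y, [10.0, 2.0, 0.5, 0.1], max_active=2)
print(best, errors)
```

The criterion deliberately ignores in-sample fit: a penalty value is preferred only if it predicts held-out observations better, and penalties that produce more variables than the cap are not considered at all.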
We present the results for solar in Table 4. The Lasso algorithm selects only 11
out of the 47,000 variables as being most relevant for predicting solar patenting
behaviour.
The first observation is that rdd_solar and rdd_res, i.e., the spending on
RD&D for solar and the spending on RD&D for all renewables, have a measurable
effect. The delay with which rdd_solar increases patenting appears to be 3-4
years.
A second observation is that pat_total is important. We interpret this variable as
a control for the overall patenting activity in a country/year.
The third important variable is market size. If dep_total is large, the impact of
rdd_solar on patenting gets bigger.
The stability of the above-presented results is confirmed by a plot of the
coefficients selected by the Lasso for a range of lambdas (Fig. 8).
For wind, a larger number of variables have been included in the estimation by
the Lasso.
Again, total patenting (pat_total) is controlling for the general propensity to
patent in a given country in a given year. And patenting in solar (pat_solar) seems
even better suited to control for the propensity to patent in (renewable?) energy
technologies.
Also, RD&D spending on wind technology seems to encourage patenting in this
area. We find rather long and dispersed time-lags for the effect of RD&D on
[15] For shrinkage estimators such as ridge or lasso, the coefficients are
constrained by 'f(betas) < c' for some function f and some constant c. With the
ridge, f is the sum of the squares of the coefficients. Hence in the ridge, all
coefficients are non-zero, but a larger value is assigned to the coefficient that
helps reduce the SSR the most. With the Lasso, f is the sum of the absolute
values of the coefficients. Thus again, we obtain larger betas for variables that
help reduce the SSR, but in addition, the least significant coefficients are
forced to 0.
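Written out, the constraint in this footnote corresponds to the problem below. The notation (y_i, x_{ij}, \beta_j, n, p) is assumed here rather than taken from the source, and the constraint is written with a weak inequality, as is conventional.

```latex
\hat{\beta}
  = \arg\min_{\beta}\;
    \underbrace{\sum_{i=1}^{n}\Bigl(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}}_{\text{SSR}}
  \quad\text{subject to}\quad f(\beta) \le c,
\qquad
f(\beta) =
\begin{cases}
  \sum_{j=1}^{p} \beta_j^{2}          & \text{(ridge)} \\[4pt]
  \sum_{j=1}^{p} \lvert\beta_j\rvert  & \text{(Lasso)}
\end{cases}
```

For each c there is a penalty weight \lambda \ge 0 such that the same solution minimises SSR + \lambda f(\beta). A smaller lambda corresponds to a looser constraint (larger c), which is why more variables enter the model as lambda is reduced.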