3 Analysis
We do not possess a theoretical model that explains patenting in a certain technology in a certain country based on past deployment, RD&D spending and other variables.13 While our prior belief is that deployment, RD&D spending and their interaction all have a positive effect on patenting, it is unclear to us how fast the corresponding inputs might generate innovation and whether this effect is linear or not. Consequently, we decided to rely on a data-driven approach to select the relevant variables, time lags, operations (such as the logarithm) and interactions. To select the explanatory variables included in our model we proceed in five steps.
First, we create four 'derivatives' of each of the original variables (level, log, square root and square). Then we include the first five lags in the set of explanatory variables. Third, we include all possible partial sums of consecutive lags, such as the deployment in the past 5 years, or the RD&D spending 3–6 years ago. Fourth, we include dummies for countries and years. Finally, we create all possible bilateral interaction terms between all these variables (original variables, derivatives, lags, partial sums and dummies). For example, one variable is the interaction of deployment in the last five years with the RD&D spending 3–6 years ago. This gives us more than 47,000 explanatory variables.
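The five construction steps above can be sketched for a single toy series. The data and variable names below are purely illustrative (the chapter's actual country-level deployment and RD&D series are not reproduced); the sketch only shows how derivatives, lags, partial sums and interactions are mechanically generated:

```python
import math

# Hypothetical toy series: annual deployment for one country.
deployment = [2.0, 3.0, 5.0, 8.0, 12.0, 18.0, 27.0, 40.0]

# Step 1: four 'derivatives' of the original variable.
derivatives = {
    "level": deployment,
    "log": [math.log(x) for x in deployment],
    "sqrt": [math.sqrt(x) for x in deployment],
    "square": [x ** 2 for x in deployment],
}

def lag(series, k):
    """Shift a series back by k years; early observations are undefined (None)."""
    return [None] * k + series[:-k]

# Step 2: the first five lags of every derivative.
lagged = {
    f"{name}_lag{k}": lag(s, k)
    for name, s in derivatives.items()
    for k in range(1, 6)
}

# Step 3: partial sums of consecutive lags, e.g. deployment over the
# past 5 years (lags 1..5) or RD&D spending 3-6 years ago (lags 3..6).
def partial_sum(series, a, b):
    """Sum of lags a..b of `series` at each point in time."""
    cols = [lag(series, k) for k in range(a, b + 1)]
    return [
        None if any(c[t] is None for c in cols) else sum(c[t] for c in cols)
        for t in range(len(series))
    ]

past5 = partial_sum(deployment, 1, 5)  # deployment in the past 5 years

# Step 5: bilateral interactions are elementwise products of any two columns.
def interact(x, y):
    return [None if a is None or b is None else a * b for a, b in zip(x, y)]
```

Applying steps 1–3 to every original variable, adding country and year dummies (step 4), and then forming all pairwise products (step 5) is what inflates the column count combinatorially toward the 47,000 figure.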
A standard panel regression of 28 countries times 20 years based on about 47,000 explanatory variables (which suffer from almost perfect collinearity) is obviously infeasible. To select the explanatory variables that are most useful in explaining the patenting in certain technologies we employ a penalised regression approach, the so-called 'Lasso' (see [15]).14 Basically, instead of solving an unconstrained optimisation problem (of SSR or likelihood), the Lasso solves a constrained optimisation problem with a penalty. The Lasso is a particular case of a shrinkage estimator. These are estimators that optimise over a restricted set of values for the coefficients of the variables. The penalty parameter can be chosen by the researcher, and controls how large this restricted set is. The particular form of the penalty function results in sets of different shapes. The Lasso penalty in particular results in subsets that have a corner at zero in all dimensions. The outcome is that the optimum is reached with many coefficients set exactly to 0. Hence, by its construction the Lasso performs a variable selection: the larger the penalty parameter lambda, the more restrictive the variable selection is and the smaller the set of non-zero coefficients. In addition to the variable selection, the coefficients of all non-zero variables are shrunk. While other selection mechanisms that do not apply shrinkage may be unstable because they are affected by collinearity, the Lasso overcomes this issue by construction.
13 To our knowledge, existing models like one-factor learning curves, two-factor learning curves or Cobb-Douglas patent production functions are not based on theoretical models either.
14 As patents are typically discretely scaled (i.e., 1, 2, 3, …) we base the regression on a Poisson model.