> cv.pred.all = lapply(1:dim(data)[2],
+   function(gene) {
+     y = data[-c(1), gene]
+     lasso.cv = cv.lars(y = y, x = x,
+       mode = "fraction")
+     frac = lasso.cv$index[which.min(lasso.cv$cv)]
+     predict(fit.all[[gene]], s = frac,
+       type = "coef", mode = "fraction")
+   })
> cv.pred.all[[1]]$coefficients
> cv.pred.all[[2]]$coefficients
> cv.pred.all[[3]]$coefficients
> cv.pred.all[[4]]$coefficients
(e) We can conclude that the LASSO is not selective enough when there are too few
variables. In this case (4 variables and 18 time points), the classic VAR process
inference procedure provided by the vars package is more appropriate.
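To make the last point concrete, the following is a minimal sketch of what a classic first-order VAR fit looks like when there are more time points than variables. It does not use the exercise's data or the vars package: the matrix sim, the coefficient matrix A, and the names past, present, ols.fit, and A.hat are all hypothetical and introduced only for illustration. Each variable at time t is regressed by ordinary least squares on all variables at time t - 1, which is feasible here because 17 usable time points comfortably exceed the 5 parameters per equation.

```r
# Hypothetical simulated data: 18 time points, 4 variables
# (an assumption for illustration, not the data used in the exercise).
set.seed(1)
n.time = 18
n.var = 4
A = diag(0.5, n.var)   # assumed "true" VAR(1) coefficient matrix
A[1, 2] = 0.3          # V2 at time t - 1 influences V1 at time t
sim = matrix(0, nrow = n.time, ncol = n.var,
             dimnames = list(NULL, paste0("V", 1:n.var)))
sim[1, ] = rnorm(n.var)
for (i in 2:n.time)
  sim[i, ] = A %*% sim[i - 1, ] + rnorm(n.var, sd = 0.1)

# Regress every variable at time t on all variables at time t - 1.
past = sim[-n.time, ]    # rows 1 to 17
present = sim[-1, ]      # rows 2 to 18
ols.fit = lm(present ~ past)

# coef() has one column per response; drop the intercept row and
# transpose so that A.hat[i, j] is the effect of V_j(t - 1) on V_i(t).
A.hat = t(coef(ols.fit)[-1, ])
round(A.hat, 2)
```

With only 4 variables, every coefficient is estimated without any penalty, so no variable selection is needed; this is the setting in which an unpenalized fit is a natural choice.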
3.2 Consider the arth800 data set from the GeneNet package, which we
analyzed in Sects. 3.5.2 and 3.5.3.
(a) Load the data set from the GeneNet package. The time series expression of
the 800 genes is included in a data set called arth800.expr. Investigate
its properties using the exploratory analysis techniques covered in Chap. 1.
(b) For this practical exercise, we will work on a subset of variables (one for
each gene) having a large variance. Compute the variance of each of the 800
variables, plot the various variance values in decreasing order, and create
a data set with the variables whose variance is greater than 2.
(c) Can you fit a VAR process to this data set with the usual approach?
(d) Which alternative approaches can be used to fit a VAR process from this
data set?
(e) Estimate a dynamic Bayesian network with each of the alternative
approaches presented in this chapter.
(a) > library(GeneNet)
> data(arth800)
> summary(arth800.expr)
> dim(arth800.expr)
The data set contains two sets of 11 time points, i.e., 22 rows for the 800 genes.
(b) > variance = diag(var(arth800.expr))
> plot(sort(variance, decreasing = TRUE),
+ type = "l", ylab = "Variance")
> abline(h = 2, lty = 2)
> posVar2 = which(variance > 2)
> dataVar2 = arth800.expr[, posVar2]
> dim(dataVar2)