Environmental Engineering Reference
10.10.2.2  Coefficient of Determination.  The coefficient of determination, commonly denoted as $R^2$, is defined as

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{10.139}$$

where $y_i$ is the $i$th measurement of $y$, $\hat{y}_i$ is the expected value of $y_i$, usually predicted by a linear regression equation, and $\bar{y}$ is the average value of the $N$ samples of $y$. In cases where $\hat{y}$ is estimated by linear regression from a variable $x$, it can be shown that the coefficient of determination is equal to the square of the correlation coefficient between $x$ and $y$, hence

$$R^2 = r_{xy}^2 \tag{10.140}$$

The expression given by Equation (10.139) shows that $R^2$ has the following properties:

• When the linear regression line perfectly predicts the sample data, all the residuals are zero and $R^2 = 1$.
• When the variance of the residuals is the same as the variance about the mean of the data, $R^2 = 0$.
• $R^2$ is a measure of the fraction of the sample variance explained using the linear regression line.

In general, the statistic $R^2$ or (equivalently) $r_{xy}$ should only be used as a broad indicator of the relationship between variables because of the following shortcomings:

• The values of $r_{xy}$ and $R^2$ (and the linear regression line) can be very sensitive to a single datum that is much higher or much lower than the other data points.
• The values of $r_{xy}$ and $R^2$ depend on the steepness of the slope of the relationship being studied, so $r_{xy}$ and $R^2$ are greater for a steeper line than for a line of lesser slope with the same residuals.

10.11 FUNCTIONS OF RANDOM VARIABLES

In many cases, random variables are combined to form other random variables. This combination is usually expressed in the form of a function, where the dependent variable is a function of independent variables. If one or more of the independent variables are random, then the dependent variable will be random also. For example, the mass flow rate is equal to the product of the concentration and the volumetric flow rate, so randomness in the measured concentration and/or volumetric flow rate will lead to randomness in the mass flow rate. When dealing with functions of random variables, it is usually desirable to be able to relate the probability distribution of the dependent variable to the probability distributions of the independent variables. This relationship is usually complex and can seldom be determined analytically. However, the relationship between the moments (including the mean and variance) of the dependent and independent variables can usually be approximated, and in some cases estimated exactly.

Consider a dependent variable $y$ that is a function of a set of independent random variables $\mathbf{x}$ such that

$$y = f(\mathbf{x}) \tag{10.141}$$

If the expected value of the set of independent variables is denoted by $\hat{\mathbf{x}}$, then the expected value of $y$, denoted by $\hat{y}$, is given by

$$\hat{y} = f(\hat{\mathbf{x}}) \tag{10.142}$$

The variance of $y$ will generally depend on the form of the function $f$. The most basic functional forms involve only addition, subtraction, multiplication, and division, while functions in general can be much more complex. Variance relationships for the basic arithmetic operations, as well as for some other commonly encountered functions, are given below.

10.11.1 Addition and Subtraction

The general case of addition and/or subtraction of $n$ random variables can be expressed as

$$y = a_0 + \sum_{i=1}^{n} a_i x_i \tag{10.143}$$

where $x_i$ are random variables and $a_i$ are constant coefficients. It can be shown that the mean and variance of $y$, denoted by $\mu_y$ and $\sigma_y^2$, respectively, are given by (Kottegoda and Rosso, 1997)

$$\mu_y = a_0 + \sum_{i=1}^{n} a_i \mu_i \tag{10.144}$$

$$\sigma_y^2 = \sum_{i=1}^{n} a_i^2 \sigma_i^2 + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} a_i a_j \sigma_{ij} \tag{10.145}$$

where $\mu_i$ and $\sigma_i^2$ are the mean and variance of $x_i$, respectively, and $\sigma_{ij}$ is the covariance between $x_i$ and $x_j$.
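As a rough illustration of Equations (10.144) and (10.145), the following minimal Python sketch evaluates the mean and variance of a linear combination of random variables. The function name `linear_combination_moments` and all numerical values are hypothetical and chosen only for illustration; the covariance matrix is assumed to hold the variances $\sigma_i^2$ on its diagonal and the covariances $\sigma_{ij}$ off the diagonal.

```python
def linear_combination_moments(a0, a, mu, cov):
    """Mean and variance of y = a0 + sum(a[i] * x[i]).

    a   : constant coefficients a_i
    mu  : means mu_i of the random variables x_i
    cov : covariance matrix; cov[i][i] is the variance of x_i and
          cov[i][j] is the covariance between x_i and x_j
    """
    n = len(a)
    # Eq. (10.144): the mean of a linear combination
    mean_y = a0 + sum(a[i] * mu[i] for i in range(n))
    # Eq. (10.145), first sum: variance contributions
    var_y = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    # Eq. (10.145), double sum over j != i: covariance contributions
    var_y += sum(
        a[i] * a[j] * cov[i][j]
        for i in range(n)
        for j in range(n)
        if j != i
    )
    return mean_y, var_y


# Hypothetical example: y = 1 + 2*x1 - x2
mean_y, var_y = linear_combination_moments(
    a0=1.0,
    a=[2.0, -1.0],
    mu=[3.0, 4.0],
    cov=[[4.0, 1.0], [1.0, 9.0]],
)
print(mean_y, var_y)  # 3.0 21.0
```

In this example the positive covariance between $x_1$ and $x_2$ reduces the variance of $y$ from 25 to 21, because the two coefficients have opposite signs. If the variables were uncorrelated, the double sum would vanish and only the first sum in Equation (10.145) would remain.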