8.2 Theoretical Background
A homography is a mapping between two images of a planar scene $P$. Let $p = (u, v)^\top$ represent the pixel coordinates of a 3D point $\xi \in P$ as observed in the normalized image plane of a pinhole camera. Let $\mathcal{A}$ (resp. $\mathcal{B}$) denote projective coordinates for the image plane of a camera $A$ (resp. $B$), and $\{A\}$ (resp. $\{B\}$) denote its frame of reference. A $(3 \times 3)$ homography matrix $H : \mathcal{A} \to \mathcal{B}$ defines the following mapping: $p_B = w(H, p_A)$, where

$$
w(H, p) =
\begin{pmatrix}
(h_{11} u + h_{12} v + h_{13}) / (h_{31} u + h_{32} v + h_{33}) \\
(h_{21} u + h_{22} v + h_{23}) / (h_{31} u + h_{32} v + h_{33})
\end{pmatrix}.
$$
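The mapping $w(H, p)$ above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `w` mirrors the text, while the matrix `H_example` and the point `p_A` are hypothetical values chosen for the example, not data from the chapter.

```python
import numpy as np

def w(H, p):
    """Apply a 3x3 homography H to a 2D point p = (u, v)."""
    H = np.asarray(H, dtype=float)
    u, v = p
    # Lift p to homogeneous coordinates and multiply by H.
    x = H @ np.array([u, v, 1.0])
    # The third component is the common denominator h31*u + h32*v + h33.
    if np.isclose(x[2], 0.0):
        raise ValueError("point maps to infinity (zero denominator)")
    return x[:2] / x[2]

# Hypothetical homography and point, for illustration only.
H_example = np.array([[1.0, 0.2, 3.0],
                      [0.1, 0.9, -1.0],
                      [0.0, 0.001, 1.0]])
p_A = np.array([10.0, 20.0])
p_B = w(H_example, p_A)
```

Because numerator and denominator are both linear in the entries of $H$, multiplying `H_example` by any nonzero scalar leaves `p_B` unchanged.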
The mapping is defined up to a scale factor. That is, for any scaling factor $\mu \neq 0$, $p_B = w(\mu H, p_A) = w(H, p_A)$. The Lie group $SL(3)$ is the set of real matrices $SL(3) = \{ H \in \mathbb{R}^{3 \times 3} \mid \det(H) = 1 \}$. If we suppose that the camera continuously observes the planar object, any homography can be represented by a homography matrix $H \in SL(3)$ such that

$$
H = \gamma K \left( R + \frac{t n^\top}{d} \right) K^{-1} \tag{8.1}
$$
where $K$ is the upper triangular matrix containing the camera intrinsic parameters, $R$ is the rotation matrix representing the orientation of $\{B\}$ with respect to $\{A\}$, $t$ is the translation vector of coordinates of the origin of $\{B\}$ expressed in $\{A\}$, $n$ is the normal to the planar surface $P$ expressed in $\{A\}$, $d$ is the orthogonal distance of the origin of $\{A\}$ to the planar surface, and $\gamma$ is a scaling factor:

$$
\gamma = \det\left( R + \frac{t n^\top}{d} \right)^{-\frac{1}{3}} = \left( 1 + \frac{n^\top R^\top t}{d} \right)^{-\frac{1}{3}}.
$$
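As a numerical illustration of (8.1), the following sketch builds $H$ from $K$, $R$, $t$, $n$ and $d$ and normalizes it with $\gamma$ taken as the inverse cube root of $\det(R + t n^\top / d)$, so that $\det(H) = 1$. The helper name `homography_from_pose` and all numeric values (intrinsics, rotation angle, translation, plane) are assumptions made up for the example.

```python
import numpy as np

def homography_from_pose(K, R, t, n, d):
    """Return (H, gamma) with H = gamma * K (R + t n^T / d) K^{-1}."""
    M = R + np.outer(t, n) / d
    # gamma is the inverse cube root of det(M), so that det(gamma * M) = 1.
    gamma = 1.0 / np.cbrt(np.linalg.det(M))
    H = gamma * K @ M @ np.linalg.inv(K)
    return H, gamma

# Hypothetical camera intrinsics (upper triangular).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical pose: rotation of 0.1 rad about the optical axis.
a = 0.1
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a), np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.1, -0.05, 0.2])

# Hypothetical plane: unit normal n and distance d.
n = np.array([0.0, 0.0, 1.0])
d = 2.0

H, gamma = homography_from_pose(K, R, t, n, d)
det_H = np.linalg.det(H)
gamma_closed_form = (1.0 + n @ R.T @ t / d) ** (-1.0 / 3.0)
```

With these values, `det_H` should equal 1 up to rounding, and `gamma` should agree with the closed-form expression `gamma_closed_form`.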
Correspondingly, knowing the camera intrinsic parameters matrix $K$, any full rank $3 \times 3$ matrix with unit determinant can be decomposed according to (8.1) (see [9] for a numerical decomposition and [18] for the analytical decomposition). Note that there exist two possible solutions to the decomposition. The planar surface $P$ is parametrized by

$$
P = \{ \xi \in \{A\} \mid n^\top \xi = d \}.
$$
For any two frames $\{A\}$ and $\{B\}$ whose origins lie on the same side of the planar surface $P$, we have $n^\top R^\top t > -d$ by construction, and the determinant of the associated homography satisfies $\det(H) = 1$.
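The link between the inequality and the unit determinant can be made explicit with a short derivation. This is a sketch, assuming $\det(R) = 1$, $d > 0$, $\gamma$ taken as the inverse cube root of $\det(R + t n^\top / d)$, and the matrix determinant lemma $\det(I + a b^\top) = 1 + b^\top a$:

```latex
\begin{align*}
\det\!\left( R + \frac{t n^\top}{d} \right)
  &= \det(R)\, \det\!\left( I + \frac{R^\top t\, n^\top}{d} \right)
   = 1 + \frac{n^\top R^\top t}{d}, \\
\det(H)
  &= \gamma^{3}\, \det(K)\, \det\!\left( R + \frac{t n^\top}{d} \right) \det(K^{-1})
   = \gamma^{3} \left( 1 + \frac{n^\top R^\top t}{d} \right) = 1 .
\end{align*}
```

Under these assumptions, the condition $n^\top R^\top t > -d$ is exactly what keeps $1 + n^\top R^\top t / d$ strictly positive, so $\gamma$ is a well-defined positive scaling factor.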
The map $w$ is a group action of $SL(3)$ on $\mathbb{R}^2$:

$$
w(H_1, w(H_2, p)) = w(H_1 H_2, p)
$$

where $H_1$, $H_2$ and $H_1 H_2 \in SL(3)$. The geometrical meaning of this property is that the 3D motion of the camera between views $\{A\}$ and $\{B\}$, followed by the 3D