Camera Tracking from Video
In order to composite 3D content into a video like this one, it's necessary to create an accurate representation
of the camera's movement within Blender. When you render the 3D content using Blender's virtual camera,
the placement and rotation of that content in the frame must match the placement and rotation it should
have in the live-action video.
This can be calculated from the relative 2D movement of 2D points in the video that correspond to points in the
real 3D space. Specifically, the phenomenon of parallax is used. For the same lateral camera movement, points
nearer to the camera will move farther across the screen than points more distant from the camera.
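As a rough illustration of this, consider an idealized pinhole-camera model (a minimal sketch for intuition, not Blender's actual solver): the horizontal image position of a point is proportional to X/Z, so the same lateral camera move produces a larger on-screen shift for nearer points. The focal length, depths, and shift below are made-up illustration values.

# Pinhole-camera sketch of parallax: image x-coordinate is f * X / Z.

def image_x(f, X, Z):
    """Horizontal image coordinate of a point at lateral offset X and depth Z."""
    return f * X / Z

f = 35.0             # focal length (arbitrary units)
camera_shift = 0.5   # lateral camera movement

for depth in (2.0, 10.0):  # a near point and a far point
    before = image_x(f, X=1.0, Z=depth)
    # Moving the camera right by camera_shift reduces the point's camera-space X.
    after = image_x(f, X=1.0 - camera_shift, Z=depth)
    print(f"depth {depth:>4}: on-screen shift = {abs(after - before):.2f}")

# The near point (Z = 2) shifts several times farther across the frame
# than the far point (Z = 10) for the same lateral camera movement.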
I specifically mention lateral camera movement, because certain common camera movements do not have
this characteristic of parallax. In particular, tripod pans, where the camera rotates at a fixed point, do not yield
parallax information. Imagine standing in a park full of trees. Your friend hides behind a tree where you cannot
see her. In order to catch a glimpse of your friend, you must move laterally. If you only rotate in place, you will
not see your friend. In fact, if you had friends hiding from you behind every tree in the park, you would never
find any of them by rotating like a camera on a tripod. Only lateral motion increases the spatial information
available to you.
The same is true of camera tracking. In order to re-create a 3D environment automatically, the video provided
must include parallax information, i.e., lateral movement. This isn't a major problem. If you have a shot that is
purely a tripod pan, you don't actually need to do camera tracking. Simply compositing 3D content as though
you were working with a still image will work fine. Blender can handle shots that have both lateral movement
and panning, but it does not get the parallax information necessary to re-create the space from the panning
movement.
The points used for tracking are represented in the Clip Editor as tracking markers. These are set by hand or
automatically to correspond with recognizable points on the video. When I say “recognizable,” I mean points
that Blender's computer vision pattern-recognition algorithm can identify as being the same feature from frame
to frame. Features are real-world things that the algorithm attempts to recognize. A bunch of black pixels
surrounded by a field of yellow pixels in frame 35 is likely to be the same feature as a similar bunch of black
pixels surrounded by yellow in frame 36, even if the specific pixels are different because the feature moved.
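Markers can also be placed from a script rather than by hand in the Clip Editor, since Blender's Python API exposes the same tracking data. The sketch below assumes a clip named "shot.mov" has already been loaded; the track name and the marker coordinates are placeholder values, not anything prescribed by Blender.

import bpy

# Rough sketch: create a track and place a marker through the Python API.
clip = bpy.data.movieclips["shot.mov"]
tracking = clip.tracking

# Create a new track starting at frame 1.
track = tracking.tracks.new(name="Track.hand.001", frame=1)

# Insert a marker at frame 1. Coordinates are normalized (0.0 to 1.0)
# relative to the clip, with (0, 0) at the lower-left corner.
marker = track.markers.insert_frame(1, co=(0.45, 0.6))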
There are basically two ways to go about camera tracking in Blender. You can use automatic feature detection
and then correct the many errors by hand, or you can do feature selection by hand, which should result in fewer
errors from the start but will proceed more slowly to build up a sufficiently large feature set. Both methods can
be painstaking, depending largely on the content of the video you're trying to track. I will describe using hand-
selected features.
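For reference, the automatic route can also be driven from Blender's clip operators, though the same caveat about correcting bad tracks afterward applies. This is only a minimal sketch: these operators normally run from the Clip Editor, so running them from a script may require a context override, and the default settings shown here will almost certainly need tuning for a real shot.

import bpy

# Rough sketch of the automatic workflow using Blender's clip operators.
bpy.ops.clip.detect_features()                              # place markers on detected features
bpy.ops.clip.track_markers(backwards=False, sequence=True)  # track them through the clip
bpy.ops.clip.solve_camera()                                 # reconstruct the camera motion from the tracks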
Anatomy of a Good Feature
A good feature should be recognizable in 2D. That is, it should be composed of contrasting pixels in the image.
It must also represent a specific, unique 3D point in the real world. For example, an intersection between two
overhead cables might form a single point in a 2D image, but if the cables are separated in space, the
intersection does not represent a unique 3D point. This would not be a good feature. The kind of intuition and common
sense a human being can use when selecting features to track is the advantage of selecting features by hand.
Figure 10-6 shows several more examples of good tracking points (on the left) that represent specific 3D points
and bad tracking points (on the right) that will change with the camera movement because they do not
correspond with a real 3D point. Specular highlights and curved surfaces can also be a source of bad features.
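Once a solve has been run, one way to spot features like these that did not correspond to a fixed 3D point is to look at each track's reprojection error. A sketch, assuming a solved clip named "shot.mov" and an arbitrary 2.0-pixel threshold chosen purely for illustration:

import bpy

# After solving, list tracks whose average reprojection error is high;
# these are often the bad features described above.
clip = bpy.data.movieclips["shot.mov"]
for track in clip.tracking.tracks:
    if track.average_error > 2.0:
        print(f"{track.name}: average error {track.average_error:.2f} px")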
 