Classification of Video Sources (Video Search Engines)

As we discuss video search, it is important to keep in mind that the nature and quality of video vary widely depending on the application. The value of video can be difficult to judge; we can assess it on many dimensions, from image quality to more subjective aspects such as educational, entertainment, or historic value. Cost is more quantifiable. In fact, we can think of a “production cost spectrum,” as shown in Fig. 1.1, where the level of effort or cost of production varies from almost nothing to perhaps thousands of dollars per minute of final product for broadcast television content. Major motion picture costs can run even higher, particularly if we factor in the cost of promoting the project. Clearly, this huge range in content value has significant implications for Web search engine systems: it affects the content, quality, encoding, and availability of metadata, and it affects the degree to which automated methods can be employed to generate additional metadata for indexing.

Fig. 1.1. Video production costs vary widely depending upon the application.

Webcams / Security

At the low end of the video production cost spectrum is content from automated cameras such as security cameras or Webcams. These systems typically have some level of operator control, but a single operator controls a large number of cameras, often from a remote location. As a result, they rely on automatic gain control (AGC) and are often oriented poorly with respect to available natural lighting. Optics and processing circuitry are low cost, resulting in poor image quality. Often the effective frame rate for these systems is reduced, sometimes drastically, to on the order of one image saved per minute. There is typically little or no camera motion, although some panning systems may be employed, and some views, such as those from traffic cameras, may be affected by wind, which causes undesirable camera oscillations.
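
To put the reduced frame rate in perspective, the short Python sketch below compares the number of images saved per day at one frame per minute with a full-rate feed. The 30 frames-per-second reference rate and the function name are assumptions chosen for illustration; they do not come from the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def frames_per_day(frames_per_second: float) -> int:
    """Number of frames saved in 24 hours at a constant frame rate."""
    return round(frames_per_second * SECONDS_PER_DAY)


# One image saved per minute, the extreme case described above.
reduced = frames_per_day(1 / 60)

# Assumed full-rate reference of 30 frames per second (illustrative only).
full = frames_per_day(30)

print(reduced)          # 1440
print(full)             # 2592000
print(full // reduced)  # 1800
```

Under these assumptions, a camera saving one image per minute leaves an indexing system with 1,800 times fewer frames to process than a full-rate feed.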

Video Telephony / Teleconferencing

Video telephony typically employs low cost terminal equipment, and relates to search when we consider video-mail systems. Semi-automatic systems are available for video conferencing that may include automated camera controls to follow the most active speaker, or remote and local camera control using a motorized pan, tilt, and zoom. Higher end conferencing systems feature high definition cameras and monitors, but image quality still suffers from poor room lighting conditions and lack of camera operators.

Industrial / Academic / Medical

Specialized systems for machine inspection for manufacturing quality control can run 24 hours a day, but much of the video is not stored. Video from ultrasound or other medical diagnostic equipment can be costly to produce due to equipment costs and skilled technicians or staff required. Remote sensing (satellite or high altitude reconnaissance) video may be very high resolution, and include telemetry data. Like other scientific applications such as microscopy, this video may be largely two dimensional, with little depth.

User Generated Content

Perhaps the lowest production cost for any manually created video belongs to “user generated” content, such as video captured on mobile phones or consumer-grade digital cameras. The cost of entry for these systems is minuscule; in some cases wireless service providers give users camera phones for free with a subscription. Of course, the cost per minute of video is related to the service charges, but for digital cameras, there is essentially no cost after the camera is purchased. While most users simply share or perhaps store videos in their personal media collections, some will go the next step and edit the clips into more palatable presentations. Free editing tools are available for both Windows and Mac platforms. The learning curve is short, and users can easily add transitions, titles, etc. Video editing is one of the most resource-intensive applications, but most recently purchased PCs are up to the task. We are also seeing the emergence of online editing tools, which remove the requirement of a powerful client PC since the editing is done on the server. Again, the cost per minute of produced video is low, but this assumes that the authors’ time invested in editing the video is not valued. Video blogs, or vlogs, are typically amateur-produced content published to the Web on a recurring basis, often with text commentary, and they also fall into this category.

Public, Educational, and Government (PEG) Content

In the US, local governments and community organizations such as high schools are given access to one or more cable channels to broadcast events for the benefit of the community and for educational purposes. Usually the production staff are not video professionals, and may even be students of broadcasting. The equipment may be better than consumer-grade, but it is semi-professional at best. This keeps costs low, but the combination of modest equipment and inexperienced staff results in low-quality output.

Enterprise Content

Corporations are increasingly using video as an additional means of employee communications, as well as for training and public relations purposes. The production may be outsourced or handled by a dedicated group. The content is produced using semi-pro or pro equipment (also known as “Pro A/V”), but often with a small staff serving multiple roles in the production process.

Rushes, Raw Footage

Professionally produced video relies on a formal workflow process, one stage of which involves creating several shots of each scene. The footage from this stage of the process serves as the raw material for the editing process. The quality is usually good, since professional-grade cameras and good scene lighting techniques result in low-noise images, and professional camera operators know to avoid the mistakes made by amateurs (rapid unstable camera motion, automatic gain control artifacts, etc.). There is often a 10 to 1 ratio or higher of this content to the final product.

News


National news is expensive to produce and is typically of high technical quality, but due to demanding production schedules and live coverage, as well as the lower production budgets available for local news production, some artifacts may be present in the output. In fact, in some cases broadcasters may use low quality video from low bit rate links for feeds from extremely remote locations.

Advertising


Promotional video takes a wide range of forms, from the familiar 30-second spots through one-hour infomercials to 24-hour shopping channels. In addition to marketing, public relations groups in corporations use video as an effective tool to get their message out. Archives and databases of advertising content are used for competitive analysis by corporations. While TV viewers loathe most ads, some have entertainment value, and the notion of targeted ads, or of “telescoping,” where interested users interactively delve deeper into ads of interest, may reduce the stigma of TV ads somewhat.

Episodic TV Programming

This category includes primetime entertainment programming: comedies, dramas, game shows, soap operas, etc. Within this category, there is a range of production costs, with the assumption that better programming costs more but attracts more viewers and therefore more advertising revenue. Increasingly, we find that episodic content is made available for purchase on DVD (generally released such that only previous seasons’ episodes are available). Content may be subscription-funded, publicly subsidized, or commercial, in which case the narrative flow is interrupted by commercial breaks. For most commercial TV news and episodic programming, the entire program format and sequencing is driven by the placement and duration of commercial messages.

Feature Films

Again, there is a range here, from independent (“indie”) films or documentaries, which may have a very low budget, all the way up to Hollywood movies such as Titanic, which cost $200M, or about $1M per minute. In addition to major motion pictures for theatrical release, there is a second tier with somewhat lower associated costs, such as made-for-TV movies and movies released only on DVD. It is interesting to note that “digital cinema” is being developed for digital distribution and projection of movies, but the expensive installed base of film projectors, as well as other factors, has slowed its deployment.
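
The per-minute figure follows from dividing the budget by the running time. The sketch below is a minimal illustration, assuming a running time of about 194 minutes for Titanic; the running time is not stated in the text, while the $200M budget is the figure quoted above. The function name is hypothetical.

```python
def cost_per_minute(total_cost_usd: float, running_time_min: float) -> float:
    """Average production cost per minute of final product."""
    if running_time_min <= 0:
        raise ValueError("running time must be positive")
    return total_cost_usd / running_time_min


# $200M budget from the text; 194 minutes is an assumed running time.
titanic = cost_per_minute(200_000_000, 194)
print(f"${titanic:,.0f} per minute")
```

With these inputs the result is a little over $1M per minute, consistent with the rough figure given above. The same function can be applied to any point on the production cost spectrum once a budget and running time are known.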

Content Value

Within many of these types, there is a range of purposes: to inform, to entertain, and to persuade. The content reflects on its creators as well as its intended audience, their culture, and their value systems. Creative thinkers continually strive to build novel experiences for audiences. These general classes should therefore be understood as only approximate; they are meant to give the reader a flavor of the range of video material encountered and an appreciation of the scope of the problem domain for video search engines.

Although we can estimate production costs for these content types, estimating the value for the user is more difficult. Security footage is largely valueless except in the rare instances when a criminal act is captured, and then its value can be enormous. Home video content may be quite valuable to immediate family members, but of little value to anyone else. If a person gains celebrity, then video from their childhood may be of great interest to large audiences. News archives have incalculable historic value. In general terms, we may assess a searchable video collection on these merits:

1.    production cost and perceived value of the content;

2.    size and breadth of coverage of the content archive;

3.    size of the audience interested in the content;

4.    motivation for search (entertainment, research, forensic, etc.);

5.    degree to which the content is accessible (online, either open or restricted);

6.    video quality (resolution, bit rate).
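
One way to make these six merits concrete is to record them for each collection. The sketch below is a hypothetical record type: the class names, field names, enumerations, and example values are all assumptions made for illustration and do not describe any existing system. Only the 10,000-hour airport footage scenario echoes an example used in this section.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Motivation(Enum):
    ENTERTAINMENT = "entertainment"
    RESEARCH = "research"
    FORENSIC = "forensic"


class Access(Enum):
    OPEN = "open"              # online, unrestricted
    RESTRICTED = "restricted"  # online, access controlled
    OFFLINE = "offline"        # not available online


@dataclass
class CollectionProfile:
    """Hypothetical record of the six merits for one searchable collection."""
    name: str
    cost_per_minute_usd: float     # merit 1: production cost
    perceived_value: str           # merit 1: perceived value (free-text judgment)
    archive_hours: float           # merit 2: size of the archive
    topics_covered: int            # merit 2: breadth of coverage
    audience_size: int             # merit 3: audience interested in the content
    motivations: List[Motivation]  # merit 4: why people search it
    access: Access                 # merit 5: accessibility
    resolution: Tuple[int, int]    # merit 6: width and height in pixels
    bit_rate_kbps: int             # merit 6: bit rate


# All values below are invented for illustration only.
airport_cctv = CollectionProfile(
    name="airport security footage",
    cost_per_minute_usd=0.01,
    perceived_value="low, except when an incident is captured",
    archive_hours=10_000,
    topics_covered=1,
    audience_size=50,
    motivations=[Motivation.FORENSIC],
    access=Access.RESTRICTED,
    resolution=(640, 480),
    bit_rate_kbps=256,
)

print(airport_cctv.name, airport_cctv.access.value)
```

Keeping perceived value as free text reflects the point made above that value is harder to quantify than cost; a real system would likely need a more structured representation.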

It is also important to consider the value of automated indexing systems, and here we draw a distinction between using media processing to derive information about the video contents and using manual methods to create this data. Even though it may be of great value to spot a terrorist in 10,000 hours of airport security camera footage, if there are no reliable algorithms to perform this search, then we cannot realize this potential value. Also, manually created metadata may be available to different degrees for each of these content types, either via logging production data (e.g., the text of the titles typed into a consumer video editor) or by annotating postproduction information, as is done with Major League Baseball statistics. For manually extracted data, a special-purpose database is constructed, while a search engine must derive common tags from a wide range of content sources, and currently this metadata normalization is not a fully automated process. High-value content benefits less from automated media logging or metadata extraction, while for content with lower production budgets these automated methods are more valuable, since manual labeling is not practical. Since the value of the content falls off at lower production costs, the center of the production cost spectrum, semi-professional or enterprise content, represents an area of opportunity for video systems research [Chang02].
