Watermarking Integration into Portals

INTRODUCTION

Digital watermarking has become an accepted security technology to protect media such as images, audio, video, 3-D, or even text-based documents (Cox & Miller, 2002). Watermarking algorithms embed information into media data by imperceptible changes of the media. They enable copyright or integrity protection, broadcast monitoring, and various other applications. Depending on targeted application and media type, various concepts and approaches for digital watermarking exist.

MOTIVATION

So far, watermarking algorithms have been encapsulated in individual applications. For example, before a customer downloads an audio book from an online store, the audio book is watermarked with the customer ID. This individualizes the audio book and allows illegitimate publications to be traced back to the original customer (Steinebach & Zmudzinski, 2004). Usually, such applications deal with only a single type of media, and the algorithm is tightly integrated into the application’s workflow. What is missing is the flexibility to watermark a medium of arbitrary type with a number of appropriate watermarking algorithms.

On the Web sites of watermarking algorithm developers, one can often find samples of watermarked media that demonstrate the watermark’s imperceptibility. Still, the most common question from people interested in using watermarking technology is: “Does the watermarking process degrade the quality of my media?” The best answer is to have them apply the watermarking to their own media. In many cases, however, this requires direct contact between the interested person and the watermarking developer, since the appropriate algorithm and the best configuration have to be determined. Therefore, a simple way of allowing everyone to test watermarking using their own media, whatever they are, would greatly facilitate the acceptance of watermarking.

Portals are the ideal technology for this scenario. A watermarking portal could bring together potential users and algorithm developers. Service functionality, like file upload and graphical information exchange, has long been in place, so such a portal could concentrate on the crucial issues of the watermarking workflow. However, as described above, watermarking algorithms tend not to be flexible or generic enough to allow watermarking of arbitrary media. The reason is that watermarking a certain media type requires dealing with the semantics of that media type, and the semantics of, for example, an image and an audio file differ greatly.

But still, there is a lot that watermarking algorithms have in common. When these common issues are refined into uniform interfaces, a watermarking portal could be built where developers register their compliant algorithms and the portal offers them to users, enabling them to watermark digital media of arbitrary types.

This article is structured as follows. After a short introduction to the basic concepts of digital watermarking, we will describe a set of challenges one encounters when integrating watermarking as a technology into Web applications using the Fraunhofer Watermarking-Portal as an example. The article is concluded by a summary and some remarks on future trends.

DIGITAL WATERMARKING

Digital watermarking describes the process of attaching information inseparably to a digital medium such as image, audio, video, or text to provide some form of added value (Sequeira & Kundur, 2001). The attachment of information is known as watermark embedding, and the attached information as the message (sometimes also simply the watermark). Watermarking algorithms usually slightly alter certain characteristics of the carrier medium (like the relation of energy of frequency bands in audio or the average brightness of pixel blocks in images), so that afterwards these characteristics represent the embedded information. Watermark retrieval (or detection) algorithms try to read the embedded information. In order to ensure security and confidentiality, the message is often protected by a secret key; without knowledge of this key, it is hardly possible to access, alter, or remove the message from the medium.

Watermarking is widely described as a form of communication (Cox, Miller, & McKellips, 1999), with the sender embedding the message into a carrier signal (the medium) and the receiver retrieving the message. From this point of view, watermarking algorithms have important characteristics:

Imperceptibility describes how much (or rather how little) the embedding process perceivably changes the carrier medium. Capacity describes the number of bits that can be transmitted through the watermarking process. Robustness measures how stable the embedded information is against alterations of the carrier medium. Finally, security describes how secure the embedded information is; without knowledge of a secret key, the embedded information should be neither accessible, alterable, nor removable. There are important differences between the security of classic cryptographic systems and that of watermarking systems. A prominent aspect is the fact that attackers without control over a watermark detector can never be sure whether they have removed the embedded information (Cayre, Fontaine, & Furon, 2005).

It is important to note that, in contrast to any other form of enriching the medium with information, the product of the embedding process is still a digital medium that can be consumed, transferred, and processed without restrictions.

Applications for digital watermarking range from copyright and integrity protection via broadcast monitoring to simple annotation. For copyright protection, robust watermarking algorithms are used that either identify sender (copyright holder) or receiver (buyer) of a digital medium. Integrity protection uses both fragile algorithms (embedded information is destroyed when medium is altered) and robust algorithms (robustly embedding important features of the original medium). For broadcast monitoring and annotation watermarking, robustness constraints are not so severe; algorithms with high capacity are needed in this situation.

The range of applications and the interplay of algorithm characteristics involved hint at how complex the selection of appropriate algorithms is. A portal as an intermediary between algorithm developers and potential users might lower the barriers to entry. It moves the need for selection and parameterization of algorithms from the users to the experts integrating the watermarking technology into the portal (Thiemert, Steinebach, Dittmann, & Lang, 2003).

A PORTAL APPROACH TO WATERMARKING

This section describes challenges that arise when integrating watermarking technology into portals, and their resolutions. In order to have a more graspable background, we will illustrate challenges and resolutions using an existing portal, the Fraunhofer Watermarking Portal (Fraunhofer IPSI, 2006), a Web application for watermarking personal media. The solutions presented can easily be generalized and most are applicable to any portal trying to integrate watermarking technology.

Such a portal has two aspects. From the consumer side, it is a portal to watermarking algorithms, allowing consumers to watermark their media. From the algorithm developer side, it is a portal for watermarking algorithms in the sense that arbitrary algorithms (independent of media type or implementation language) should be registrable.

CHALLENGE: GENERIC STRUCTURE

All Web applications, including portals, should have an internal structure that is as generic as possible. This rather universal prerequisite applies especially in fields that are not yet consolidated, where standards are not yet established and the full impact of the technology cannot be foreseen. Watermarking is such a field. A portal trying to integrate watermarking should therefore be able to react as flexibly as possible to the changes that will surely come. Such changes include, but are not limited to, watermarking algorithms, media processing, and analysis or production workflows.

There are many resolutions to this challenge. One established answer is the model-view-controller (MVC) paradigm (Burbeck, 1992). Many Web application frameworks have been built upon this paradigm, Struts being the most prominent example (Apache Struts). The larger and more powerful the framework, the more complicated its configuration can get. This is not ideal for integrating watermarking. We therefore propose a more lightweight variant customized for finer control of the request/response flow.

An HttpServlet within a Tomcat application server serves as the FrontController of the Web application: it accepts the request of the client (usually a Web browser), and selects and assembles the response. The Views are JavaServer Pages (JSP), and the Model consists of a large collection of classes, including those responsible for representing and storing media as well as algorithms.

Any interaction with the Web application is modeled as a Process that consists of Steps. A login Process might consist of a Step that displays a login form, which is submitted and validated. Supplying correct credentials leads to an activation Step (after the very first login) or to the welcome Step. Failing to do so leads back to the Step that displays the login form. In contrast to Struts, Views have no knowledge of the next Steps and thus only request the next Step of a Process (this is also possible in Struts, but leads to a single “action” with a complex “forward” hierarchy). Each Step has Views assigned to it.

Prior and subsequent to displaying Views, WebActions can be performed. They are named PreStepActions and PostStepActions, accordingly. It is the WebActions that define what the application is doing: they define its semantics, while Processes and Steps define its structure. PreStepActions can thus be used to make sure that all requirements for displaying a View are fulfilled. PostStepActions can be used to analyze or parse new requests stemming from the Steps they are assigned to. As in other MVC frameworks, Processes and Steps are defined by an XML document (see Table 1). WebActions are the only entities that are allowed to change the Model. Views only read from the Model and do not interpret previous requests. This strict separation of concerns also makes it possible to display nonactive Views (pure HTML or PDF documents) and still process requests and interact with the Model.
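The Process, Step, and WebAction structure described above can be illustrated with a minimal Java sketch. It is not the portal's actual code: all class, field, and view names are hypothetical, and the hard-coded password check is only a placeholder for real credential validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; the article does not show the portal's real classes.

/** A WebAction is the only entity allowed to change the model. */
interface WebAction {
    void execute(Map<String, String> request, Map<String, Object> model);
}

/** A Step bundles a View with the actions run before and after the View is shown. */
final class Step {
    final String name;
    final String view;
    final List<WebAction> preStepActions = new ArrayList<>();
    final List<WebAction> postStepActions = new ArrayList<>();

    Step(String name, String view) {
        this.name = name;
        this.view = view;
    }
}

/** A login Process: the form Step leads to the welcome Step or back to itself. */
final class LoginProcess {
    final Step loginForm = new Step("loginForm", "login.jsp");
    final Step welcome = new Step("welcome", "welcome.jsp");

    LoginProcess() {
        // PostStepAction: parse the submitted form and record the outcome in the model.
        // The fixed password is a placeholder assumption, not a real validation scheme.
        loginForm.postStepActions.add((request, model) ->
            model.put("authenticated", "secret".equals(request.get("password"))));
    }

    /** Handles the request submitted from the login form and selects the next Step. */
    Step next(Map<String, String> request, Map<String, Object> model) {
        for (WebAction action : loginForm.postStepActions) {
            action.execute(request, model);
        }
        Step target = Boolean.TRUE.equals(model.get("authenticated")) ? welcome : loginForm;
        // PreStepActions of the target run before its View is rendered.
        for (WebAction action : target.preStepActions) {
            action.execute(request, model);
        }
        return target;
    }
}
```

In this sketch the View never decides where to go next; the controller logic in `next` does, which mirrors the separation between structure (Processes and Steps) and semantics (WebActions).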

Table 1. Example process

CHALLENGE: WATERMARKING FUNCTIONALITY

As outlined in the previous section, there is no single approach to watermarking. What can be done is to discuss generic prerequisites for watermarking. This is briefly outlined from an end-users’ point of view.

Before users can watermark their digital media, the media have to be uploaded. After uploading, the media are analyzed, and crucial information, like the type and format of the media, is extracted. This information enables the portal to offer users appropriate algorithms for watermarking their media. After an algorithm and the message to be embedded have been selected, the actual embedding of the watermark into the medium can be started (see next section). Since embedding (or detection) might take anything from a few hundred milliseconds to hours, depending on the media size and the complexity of the chosen algorithm, EmbeddingJobs and DetectionJobs should be asynchronous. An EmbeddingJob finally results in a watermarked version of the original medium. Detection of a watermark works similarly, with the exception that the outcome of a DetectionJob is the embedded message, which might be text based or simply binary.
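The asynchronous nature of an EmbeddingJob can be sketched with the standard java.util.concurrent classes. The class and method names below are assumptions made for illustration, and the `embed` method is only a stand-in that appends the message bytes; it does not perform any real watermarking.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: class and method names are not taken from the portal.
final class EmbeddingJobRunner {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    /** Stand-in for a real embedder: it only appends the message so the flow can be followed. */
    static byte[] embed(byte[] medium, String message) {
        byte[] tag = message.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[medium.length + tag.length];
        System.arraycopy(medium, 0, out, 0, medium.length);
        System.arraycopy(tag, 0, out, medium.length, tag.length);
        return out;
    }

    /** Submits the job and returns immediately; the caller can poll or chain on the future. */
    CompletableFuture<byte[]> submit(byte[] medium, String message) {
        return CompletableFuture.supplyAsync(() -> embed(medium, message), workers);
    }

    /** Lets already submitted jobs finish and releases the worker threads. */
    void shutdown() {
        workers.shutdown();
    }
}
```

A request handler would call `submit` and return at once, so the user can be shown a status page while a long-running job completes in the background.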

CHALLENGE: USING ARBITRARY WATERMARKING ALGORITHMS

A major challenge is to allow portals the flexibility to process any type of media with arbitrary watermarking algorithms. This challenge concerns the generic functionality described, but primarily, it concerns the integration of watermarking algorithms, in whatever flavor they are available.

For this, we have modeled watermarking as Java classes. Each Algorithm consists of a WatermarkEmbedder and WatermarkDetector, which in turn can come in two flavors: stream-based and URI-based (files are described as URIs). WatermarkEmbedder and -Detector are interfaces that encapsulate the actual watermarking functionality, and which the algorithm developer has to implement. Table 2 shows an example.
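To make the two flavors more concrete, the following Java sketch shows what stream-based and URI-based embedder interfaces could look like. It is not the listing from Table 2: the article names the interfaces but not their methods, so every method signature here, as well as the `PassThroughEmbedder` class, is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical signatures; the article names the interfaces but not their methods.
interface WatermarkEmbedder {
    void setParameter(String name, String value);
}

/** Stream-based flavor: reads the original and writes the marked medium. */
interface StreamWatermarkEmbedder extends WatermarkEmbedder {
    void embed(InputStream original, OutputStream marked, byte[] message) throws IOException;
}

/** URI-based flavor: source and target are addressed as URIs (for example, files). */
interface UriWatermarkEmbedder extends WatermarkEmbedder {
    void embed(URI original, URI marked, byte[] message) throws IOException;
}

/** Trivial implementation that copies the stream unchanged, to show the contract only. */
final class PassThroughEmbedder implements StreamWatermarkEmbedder {
    private final Map<String, String> parameters = new HashMap<>();

    @Override
    public void setParameter(String name, String value) {
        parameters.put(name, value);
    }

    @Override
    public void embed(InputStream original, OutputStream marked, byte[] message)
            throws IOException {
        // A real algorithm would modify the data here; this sketch ignores the message.
        byte[] buffer = new byte[4096];
        int read;
        while ((read = original.read(buffer)) != -1) {
            marked.write(buffer, 0, read);
        }
    }
}
```

A detector counterpart would follow the same pattern, returning the retrieved message instead of writing a marked medium.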

Table 2. StreamWatermarkEmbedder interface

Table 3. Sample algodescription.xml

With Java-based algorithms, implementing such interfaces is relatively simple, but most watermarking algorithms are implemented in C/C++ and come either as libraries or as executable files. In order to incorporate those kinds of algorithms as well, we have created two implementations of these interfaces, the DllEmbedder and the ExeEmbedder. The DllEmbedder is based on the Java Native Interface, and all the C developer has to do is implement the corresponding header file. The ExeEmbedder first assembles all necessary parameters into a command string, which is then executed through the system’s command line. This solves the execution of (almost) arbitrary algorithms.
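The way an ExeEmbedder might assemble its command from the configured parameters can be sketched as follows. The class name, the `--name value` flag syntax, and the file arguments are invented for illustration; a real executable defines its own command-line conventions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the flag syntax and argument order are assumptions.
final class ExeCommandBuilder {
    private final String executable;
    // LinkedHashMap keeps parameters in the order in which they were set.
    private final Map<String, String> parameters = new LinkedHashMap<>();

    ExeCommandBuilder(String executable) {
        this.executable = executable;
    }

    ExeCommandBuilder set(String name, String value) {
        parameters.put(name, value);
        return this;
    }

    /** Builds one list element per argument, so no shell quoting is required. */
    List<String> build(String inputFile, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        for (Map.Entry<String, String> parameter : parameters.entrySet()) {
            command.add("--" + parameter.getKey());
            command.add(parameter.getValue());
        }
        command.add(inputFile);
        command.add(outputFile);
        return command;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command).start()`. Passing arguments as a list rather than as one concatenated string avoids quoting problems when file names or messages contain spaces.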

Execution alone is not the only challenge, however. Before an algorithm can be executed, it should be configured. In contrast to the embedding process itself, which all watermarking algorithms have in common, parameters cannot be described in such a general way; they are largely specific to each algorithm. Therefore, developers have to be able to specify which parameters their algorithms need and how they are structured. Such problems also arise in the context of benchmarking watermarking algorithms (Dittmann, Lang, & Steinebach, 2002; Kutter & Petitcolas, 1999). Finally, a portal has to be aware of the very existence of available algorithms. All of this is resolved by having algorithm developers provide an XML-based description of the algorithm (see Table 3), which is interpreted by a management class called AlgorithmManager.

The AlgorithmManager parses this description, dynamically loads the embedder and detector classes, and instantiates them (Java Reflection API). The portal integrating watermarking technology interacts with the embedder/detector interfaces only; no knowledge of concrete algorithms is necessary. The embedder/detector interfaces also include setting specific parameters using name-value pairs. After selecting an algorithm, advanced users can thus be given the choice to configure it; for novice users, default values are used. The visualization of parameters is no trivial problem, but the algodescription.xml offers a wide range of predefined data types that can be used: Choices are visualized as radio buttons, percentages as slider bars, and the text to embed as an edit form. In summary, the combination of all these elements in the algodescription.xml allows registering and dynamically instantiating algorithms (even at runtime) in any portal, displaying their configurations to users, and executing them in a generic way.
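The reflective loading step can be sketched in Java as follows. This is only an illustration under stated assumptions: the real AlgorithmManager takes class names and parameters from the algodescription.xml, whereas here they are passed in directly, and the names `ConfigurableEmbedder`, `DemoEmbedder`, and `AlgorithmRegistry` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class names would normally come from algodescription.xml.
interface ConfigurableEmbedder {
    void setParameter(String name, String value);
    String describe();
}

/** Example algorithm class that the registry knows only by its class name. */
class DemoEmbedder implements ConfigurableEmbedder {
    private final Map<String, String> parameters = new HashMap<>();

    @Override
    public void setParameter(String name, String value) {
        parameters.put(name, value);
    }

    @Override
    public String describe() {
        return "DemoEmbedder" + parameters;
    }
}

final class AlgorithmRegistry {
    private final Map<String, String> classNames = new HashMap<>();

    /** Registration can happen at runtime; only the class name is stored. */
    void register(String algorithmId, String embedderClassName) {
        classNames.put(algorithmId, embedderClassName);
    }

    /** Loads the class by name, instantiates it, and applies name-value parameters. */
    ConfigurableEmbedder instantiate(String algorithmId, Map<String, String> parameters)
            throws ReflectiveOperationException {
        String className = classNames.get(algorithmId);
        if (className == null) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithmId);
        }
        Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
        ConfigurableEmbedder embedder = (ConfigurableEmbedder) instance;
        parameters.forEach(embedder::setParameter);
        return embedder;
    }
}
```

The caller works with the `ConfigurableEmbedder` interface only, so a newly registered algorithm becomes usable without any change to the calling code.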

CHALLENGE: ALGORITHM SECURITY

The portal approach to watermarking, with its public access to the embedder and detector of an algorithm, is a severe issue for the algorithm’s security. A portal that gives anyone public access to the watermarking functionality specified in the section “Challenge: Watermarking Functionality” (like the Fraunhofer Watermarking-Portal) is a so-called oracle and enables “oracle attacks” (Cayre et al., 2005; also called sensitivity attacks) against the algorithm. In this attack against robust watermarking algorithms, the attacker repeatedly modifies the watermarked medium and checks, with the help of the public detector, whether the embedded watermark is still readable. This allows the attacker to explore weaknesses of algorithms and to fine-tune attacks against media watermarked with this algorithm, even without knowledge of the secret key. This challenge is difficult and still awaits resolution. It also applies to every algorithm that can be bought. Only first steps toward answering the challenge have been taken:

First of all, users of public portals need not know their secret key; the portal can keep it hidden from them. Attackers therefore cannot learn how the key influences the way the algorithm works. Secondly, access to certain algorithms can be restricted to privileged (i.e., trusted) users.
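These two measures can be sketched in a few lines of Java. The sketch is an assumption about how such a policy might look, not a description of the Fraunhofer portal: the class name, the in-memory storage, and the 16-byte random key are all illustrative choices.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the portal.
final class PortalSecurityPolicy {
    private final Set<String> restrictedAlgorithms = new HashSet<>();
    private final Set<String> trustedUsers = new HashSet<>();
    private final Map<String, byte[]> keysByUser = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    void restrict(String algorithmId) {
        restrictedAlgorithms.add(algorithmId);
    }

    void trust(String userId) {
        trustedUsers.add(userId);
    }

    /** Unrestricted algorithms are open to everyone; restricted ones only to trusted users. */
    boolean mayUse(String userId, String algorithmId) {
        return !restrictedAlgorithms.contains(algorithmId) || trustedUsers.contains(userId);
    }

    /** The key is created on first use and handed to embedders only, never shown to the user. */
    byte[] keyFor(String userId) {
        return keysByUser.computeIfAbsent(userId, id -> {
            byte[] key = new byte[16]; // key length is an arbitrary choice for this sketch
            random.nextBytes(key);
            return key;
        });
    }
}
```

Neither measure removes the oracle itself; they only limit who can query it and what an attacker can learn about the key.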

An alternative to the portal approach described would be to let users download the watermarking algorithms instead of uploading their media to the portal. This, however, would reduce the security of, and the trust in, the algorithms. On the one hand, reverse engineering of the watermarking algorithms and automated robustness attacks against them would become possible. On the other hand, the secret user-dependent embedding key would have to be made known to users, enabling further attacks against the watermarking security. Therefore, the portal approach is not only more convenient, but also more secure.

CONCLUSION

In this article, we have presented a set of challenges that arise when integrating watermarking as a technology into Web applications and portals. The major challenges are the definition of watermarking as a generic process and the possibility to integrate arbitrary watermarking algorithms, regardless of media type, algorithm characteristics, or implementation language. One resolution is to model the watermarking process as a set of generic interfaces, as done in the AlgorithmManager framework. The portal works with these interfaces only. All algorithm developers have to do is comply with the interfaces and describe the specifics of their algorithms in an XML file. The AlgorithmManager parses this XML description and provides instantiations of the generic interfaces to the portal. As an example of the possibilities arising from the integration of watermarking into portals, we have introduced the Fraunhofer Watermarking-Portal, a Web-based application built upon a lightweight implementation of the model-view-controller paradigm that allows generic (graphical) access to algorithms for watermarking digital media. It is the first portal where consumers can watermark their own personal media, and where developers can register their algorithms regardless of type and origin, thus making them accessible to the public.

The solutions described in this article are but a first step toward watermarking becoming a black-box technology like, for example, encryption. Since watermarking is the only security technology able to close the analogue hole (the security breach that arises from digital-analogue-digital conversion), more systems will need to integrate watermarking into their workflows. Future research will, aside from algorithm enhancement, be about simple configuration of algorithms and expert systems that recommend algorithms and certain sets of watermarking parameters for specific problems.

KEY TERMS

Algorithm Developer: A person creating watermarking algorithms in the language of the AlgorithmManager framework, the entity that registers watermarking algorithms.

Algorithm Manager: A (Java) framework specifying a generic watermarking workflow and interfaces. Also, the central class of the framework, where implementations of interfaces can be registered and accessed.

Algorithm User: A person building an application incorporating watermarking algorithms.

Digital Watermarking: Digital watermarking describes the process of attaching information inseparably to a digital medium (watermark embedding) to provide some form of added value, as well as the process of the retrieval of this information (watermark retrieval). For copyright protection, watermark algorithms in use are usually transparent (watermark is not perceivable by humans) and robust (information survives alterations of the carrier medium).

Watermark Key: Secret information needed to embed, retrieve, alter, or remove the watermark message. In contrast to a cryptographic key, which enciphers the information, a watermark key usually describes in which parts of the medium the information can be found.

Watermark Message: The attached information in digital watermarking; sometimes also the attached information already transformed into the domain of the medium to be watermarked. Watermark messages are usually binary or textual.

Watermarking Parameter: Information used for configuring watermarking algorithms. Watermarking parameters vary significantly from algorithm to algorithm, but are essential for defining a generic watermarking process.