Privacy Protection and Face Recognition (Face Recognition Applications) Part 2

Institutional Databases

Increasingly in recent years, governments and corporations have sought to harness Information Technology to improve efficiency in their provision of services, to prevent fraud and to ensure the security of citizens. Such developments have involved collecting more information and making that information more readily available, both through searching and through links between databases. Silos of information, collected for an authorized process, are readily accepted for the benefits they bring, but the public becomes more uneasy as such databases succumb to “function creep”, being used for purposes not originally intended, especially when several such databases are linked together to enable searches across multiple domains. Plans for Australian identity cards were rejected because of just such fears [17], and there was a significant backlash when retired Admiral John Poindexter conceived the “Total Information Awareness” (TIA) project [43], which aimed to gather and mine large quantities of data, of all kinds, and use these to detect and track criminals and terrorists. The Orwellian potential for such a project raised an outcry that resulted in the project being renamed the Terrorist Information Awareness project, an epithet calculated to stifle objection in post-September 11th America.

Naturally, faces are an important part of such electronic databases, allowing the verification of identity for such purposes as border control and driver licensing. Registered faces, however, provide a link between definitive, exploitable identification information (such as name, address, social security number, bank accounts, immigration status, criminal record and medical history) and the mass of images of individuals that is building up from other channels like surveillance and photo-sharing. Many authors, from Bentham [5] to the present, have expressed concern about the potential for state oppression through the exercise of extensive monitoring and through the projected impression that such monitoring is pervasive, if unknowable.

The widespread use of electronic records and their portability has led to numerous cases of records being leaked or lost, and their potential value for identity theft has made them a target for theft and hacking, from within as well as outside the controlling institution. This inadvertent exposure is a major reason for strong automatic privacy protection controls such as encryption, tight access control and image redaction even in databases where normal use would not lead to privacy intrusion.

Technology for Enabling Privacy

In recent years, a number of technological solutions have been proposed for the general problem of privacy protection in images and video, and for face privacy protection in particular. In this section, we review the principal methods being developed: intervention, redaction, and provably secret processing, together with a discussion of privacy policies and tokens for claiming or relinquishing privacy protection.

Intervention

Patel et al. [41] have proposed a system that prevents unauthorized photography by detecting cameras using their retro-reflective properties. In their detection system, a bright infra-red source is located near a camera. If the lens of another camera is pointed toward the detector, a strong retro-reflection is seen in the image, which can easily be detected automatically. When a camera is detected, a light is flashed towards it using a digital projector, spoiling any images that it may record. This unusual approach, dubbed an “anti-paparazzi” device, exploits computer vision to create a privacy-protection solution where no control can be exerted over the use of the images once recorded. As well as privacy protection, the system is envisaged for copyright protection, for instance to prevent recording of new release films in cinemas.

Visual Privacy by Redaction

Most recent work on visual privacy protection has focused on systems that modify, or redact, visual images to mask out privacy-sensitive information. Such systems typically use computer vision technology to determine privacy-sensitive regions of the image, for instance tracking moving people in a video [51], or detecting faces [22] in still or moving images. Such regions of interest are then changed in some way to prevent subsequent viewers or algorithms from extracting the privacy-sensitive information. Obscuration methods that are commonly used include blurring, masking, pixellating [27], scrambling [18], or permuting pixels [12]. Recent work has investigated the limitations of some of these. For instance, Gross et al. [25] show that simple pixellation and blurring may not be strong enough to defeat face recognition systems. They train a parrot [37] recognizer on gallery images with the same distortion as the probe and obtain markedly higher recognition rates than with a system trained on clean images. Neustaedter et al. [35] have also found that global blurring and other obscuration techniques cannot simultaneously supply both sufficient privacy and adequate information for always-on home video conferencing. Koshimizu et al. [31] have explored the acceptability of different obscuration and rerendering techniques for video surveillance.
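As an illustration of the simplest of these obscuration methods, the following is a minimal sketch of pixellation applied to a rectangular region of a grayscale image. The function name `pixellate`, the list-of-rows image representation and the `box` convention are assumptions made for this example and are not taken from any of the cited systems. As noted above, this kind of redaction may not be strong enough to defeat a recognizer trained on similarly distorted images.

```python
def pixellate(image, box, block=4):
    """Return a copy of `image` with the region `box` pixellated.

    image: list of rows of grayscale ints (0-255)
    box:   (top, left, bottom, right), with bottom/right exclusive
    block: side length in pixels of each averaged tile
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the input untouched
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            y1 = min(y0 + block, bottom)
            x1 = min(x0 + block, right)
            # Replace every pixel in the tile by the tile's mean value.
            tile = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(tile) // len(tile)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    out[y][x] = mean
    return out


# Hypothetical 8x8 test image: a diagonal gradient.
frame = [[(x + y) * 16 for x in range(8)] for y in range(8)]
redacted = pixellate(frame, box=(2, 2, 6, 6), block=2)
```

In a real system the `box` would come from a face detector or tracker rather than being fixed by hand.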

Stronger masking with greater changes to the image may have the limitation of reducing the usability of the video for its intended purpose, but rerendering [51] may alleviate this by showing computer generated images to convey important information hidden by the redaction process. One example of this would be to obscure a person’s face in an image with a computer generated face—hiding the identity yet preserving the gaze direction and expression. Two extensions of this using face modeling are described in Sect. 27.5.2.

One important aspect of redaction systems is reversibility. It may be desirable for some purposes to permanently destroy the privacy-intrusive information, but for others it may be desirable or necessary, perhaps for evidential reasons, to be able to reconstruct the original video.

When redacted information is to be presented to the user, one fundamental question is what redaction is necessary and when the redaction is to occur. In some scenarios, the system may need to guarantee that redaction happens at the earliest stage, and that unredacted data is never accessible. For such a scenario, we proposed the PrivacyCam [52], a camera with on-board redaction that behaves as a normal video camera but outputs video with privacy-sensitive areas redacted. Only one level of redaction at a time is possible when such a system is a drop-in replacement for an analogue camera. However, for a general system, it may be necessary to present both redacted and unredacted data to end users according to the task and their rights, and to allow different types and extents of redaction according to the circumstances.

In a distributed surveillance system, there are three principal locations through which the data must pass: the video processor, database and browser (or end-user application), at each of which the redaction could take place:

Browser: Here the unredacted data is delivered to the client and client software carries out the redaction and presents the redacted information to the user. This scenario means that redacted data does not need to be stored and transmitted but metadata for redaction does need to be transferred with the raw data. Since the browser is the part of a system most exposed to attack, transmitting the unredacted data there is not secure.

Fig. 27.1 Double redaction: Video is decomposed into information streams which protect privacy and are recombined when needed by a sufficiently authorized user. Information is not duplicated and sensitive information is only transmitted when authorized. Optionally an unmodified (but encrypted) copy of the video may need to be securely stored to meet requirements for evidence

Content management: The content management system can redact the information when requested for viewing, which will minimize storage requirements and allow complete flexibility, but will involve additional processing (with the same keyframe perhaps being redacted multiple times) and latency, and will impose image modification requirements on the database system. If the unredacted video is stored, unauthorized access can reveal the privacy-intrusive data.

Video analytics: The video analytics system has access to the richest information about the video activity and content, and thus can have the finest control over the redaction, but committing at encoding time allows for no post-hoc flexibility. In the other two scenarios, for instance, a set of people and objects could be chosen and obscured on-the-fly. Sending redacted and raw frames to the database imposes bandwidth and storage requirements.

Double redaction: Perhaps the most flexible and secure method is double redaction [50], in which privacy protection is applied at the earliest possible stage (ideally at the camera), and privacy-protected data flows through the system by default. Separate encrypted streams containing the private data can be transmitted in parallel to the content management system and to authorized end users, allowing the inversion of the privacy protection in controlled circumstances. The operating point of the detection system can even be changed continuously at display time according to a user’s rights, to obscure all possible detections, or only those above a certain confidence. Figure 27.1 shows an example of such a double redaction scheme, with two levels of redaction.
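The decomposition and recombination idea behind double redaction can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheme of [50]: the names `split_frame` and `recombine` are hypothetical, frames are lists of rows of grayscale values, and the private stream is kept in the clear here, whereas in the scheme described above it would be encrypted before transmission or storage.

```python
def split_frame(frame, boxes, fill=0):
    """Decompose a frame into a redacted public stream and a private stream.

    boxes: list of (top, left, bottom, right) sensitive regions,
           with bottom/right exclusive.
    Returns (public, private), where `private` holds the original pixels
    of each region. In practice this stream would be encrypted.
    """
    public = [row[:] for row in frame]
    private = []
    for (top, left, bottom, right) in boxes:
        patch = [row[left:right] for row in frame[top:bottom]]
        private.append(((top, left, bottom, right), patch))
        for y in range(top, bottom):
            for x in range(left, right):
                public[y][x] = fill  # mask the sensitive pixels
    return public, private


def recombine(public, private, authorized):
    """Invert the redaction, but only for a sufficiently authorized user."""
    out = [row[:] for row in public]
    if not authorized:
        return out  # default path: privacy-protected data only
    for (top, left, bottom, right), patch in private:
        for dy, row in enumerate(patch):
            out[top + dy][left:right] = row
    return out


frame = [[10 * y + x for x in range(5)] for y in range(4)]
public, private = split_frame(frame, boxes=[(1, 1, 3, 4)])
```

Because the sensitive pixels live only in the private stream, no information is duplicated, and the default data path never carries them.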

Several authors [28, 40] have adopted such a double redaction system and have explored options for embedding or hiding additional data streams in the redacted video data. For instance, Zhang et al. [64] store the information as a watermark, while Carrillo et al. [12] use a reversible cryptographic pixel permutation process to obscure information in a manner that can be reversed, given the right key, and that is robust to compression and transcoding of video. Li et al. transform sensitive data using the Discrete Wavelet Transform, preserving only low-frequency information, and hide the encrypted high-frequency information in the JPEG image.
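The principle of a key-dependent, reversible pixel permutation can be illustrated with a short sketch. This is not the method of [12]: the function names are hypothetical, and the permutation is derived from Python's `random` module, which is a non-cryptographic generator used here purely to show the reversibility property. A real system would derive the permutation from a cryptographically secure keyed source.

```python
import random


def _permutation(n, key):
    """Derive a key-dependent ordering of n positions (NOT cryptographically secure)."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def scramble(pixels, key):
    """Permute the pixels of a region according to the key."""
    order = _permutation(len(pixels), key)
    return [pixels[i] for i in order]


def unscramble(scrambled, key):
    """Invert scramble() when given the same key."""
    order = _permutation(len(scrambled), key)
    out = [None] * len(scrambled)
    for dst, src in enumerate(order):
        out[src] = scrambled[dst]
    return out


region = list(range(100, 116))          # 16 distinct pixel values
hidden = scramble(region, key=2024)
restored = unscramble(hidden, key=2024)
```

The scrambled region contains exactly the same pixel values as the original, so the image histogram is unchanged; only their positions depend on the key.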

Cryptographically Secure Processing

A recent development in visual privacy protection is the emergence of cryptographically secure methods for processing images. These methods establish protocols by which two parties can collaborate to process images without risk of privacy intrusion. In particular, if one party owns images and another party has an image processing algorithm, algorithms such as “Blind Vision” [3] allow certain algorithmic operations to be carried out by the second party on the first party’s data without the data itself or the algorithm being made available to the other party. Such systems have been applied to the problems of face detection and recognition, as will be discussed in Sect. 27.5.3.

Privacy Policies and Tokens

An important aspect of privacy protection systems is the policies for determining what data needs to be obscured for which users. As we have seen in Sect. 27.2, privacy systems may need to operate in different modes according to different factors including the roles, authorization and relationship of the observer and the observed. Determining privacy policies is a complex area, made more so when detection of the privacy sensitive information is not reliable.

Brassil [9] and Wickramasuriya et al. [62] explore the use of devices (detected through separate sensors) that can be used to claim privacy protection in public cameras, and Schiff et al. [49] use visual cues (hats or jackets) to designate individuals whose faces are to be redacted, or preserved from redaction.

Systems for Face Privacy Protection

In this section, we describe three approaches that have been specifically designed for privacy protection of face images.

Google Street View

As mentioned in Sect. 27.3.4, Google’s Street View and similar sites present particular privacy challenges with their vast coverage of street-level public imagery. Frome et al. [22] describe the system that they developed to address this problem.

They highlight both the scale of the problem and the challenging nature of their “in the wild” data. They use a standard sliding-window face detector that classifies each square region of each image as face or non-face. The detector is applied with two operating points, and the results are combined with a number of other features (including face color and a 3D position estimate) using a neural network to determine a final classification of the region as face or non-face. All face detections are blurred in the final imagery that is served in Google Maps. A similar system is used to detect and blur license plates.

They describe the criteria used for choosing the redaction method, namely that it should: (1) be irreversible; (2) be visually acceptable for both true faces and false positives; and (3) make it clear to the public that redaction has taken place, a requirement that precludes the use of rerendering techniques from the next section. To meet these requirements, the authors choose to redact the faces with Gaussian blurring and the addition of noise.

De-identifying Face Images

Coutaz et al. [15] described a system for preserving privacy in the CoMedi media space, which gives remote participants a sense of co-presence. The system offered shadowing and resolution-lowering redaction methods [27] for privacy protection, but also used eigenspace filtering for face redaction. In this technique, an eigenface [59] representation is constructed using a training set of face images, and faces detected in the media space video are projected into that eigenspace before rerendering. This effectively constrains the rendered face to conform to variations seen in the training set, and obscures other kinds of appearance differences. While this can protect privacy in some ways, such as hiding socially incorrect gestures or expressions, it is also shown to have limitations. The choice of the correct model and the corresponding training set is crucial. Using a mismatched model may unintentionally change the identity, pose, or expression of the face.
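The projection-and-rerendering step of eigenspace filtering can be sketched as below. The sketch assumes that a mean face and an orthonormal set of basis vectors (eigenfaces) have already been computed from a training set; the function names, the four-pixel "faces" and the basis values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def eigenspace_filter(face, mean_face, basis):
    """Project a face onto an eigenface basis and rerender it.

    face, mean_face: flattened pixel vectors of equal length
    basis: list of orthonormal eigenface vectors (assumed precomputed)
    Any variation not spanned by the basis is discarded.
    """
    centered = [p - m for p, m in zip(face, mean_face)]
    coeffs = [dot(centered, v) for v in basis]
    rendered = list(mean_face)
    for c, v in zip(coeffs, basis):
        rendered = [r + c * vi for r, vi in zip(rendered, v)]
    return rendered


# Toy data: a 4-pixel "face" and a 2-vector orthonormal basis.
mean_face = [100, 100, 100, 100]
basis = [
    [0.5, 0.5, 0.5, 0.5],      # overall brightness
    [0.5, 0.5, -0.5, -0.5],    # top-versus-bottom contrast
]
face = [110, 130, 90, 70]
rendered = eigenspace_filter(face, mean_face, basis)
```

In this toy case the difference between the first two pixels of the input is not representable in the basis, so it vanishes from the rendered output, which is the filtering effect described above.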

In several papers, Sweeney and collaborators [23-25, 36] have described a series of algorithms for de-identifying faces that extend this eigenface approach, tackling the problem of identity hiding. They use the Active Appearance Model [14] face representation, which normalizes for pose and facial expression. Their algorithms create a weighted average of faces from different people, such that the resulting face is not identifiable. This de-identification is termed k-same in that it results in a face whose chance of being correctly identified is no more than 1/k. In their more recent work [25], they use a multifactor decomposition in their face representations that reduces blending artifacts and allows the facial expression to be preserved while hiding the face identity. They also consider the application of this in a medical video database, showing patients’ responses to pain, in which facial expression, not identity, is important.
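The averaging idea behind k-same can be sketched as follows. This is a simplified illustration operating on plain feature vectors with equal weights, not the published algorithms, which work on model-based face representations. The function names and the rule for choosing the seed face are assumptions made for this example. Each output face is shared by at least k input faces, which is what limits the chance of correct identification to 1/k.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_same(faces, k):
    """Replace every face by the average of a group of at least k faces.

    faces: list of equal-length feature vectors, with len(faces) >= k
    Returns a list of de-identified vectors in the same order.
    """
    remaining = list(range(len(faces)))
    out = [None] * len(faces)
    while remaining:
        seed = remaining[0]
        # If fewer than 2k faces are left, group them all,
        # so that no group ever ends up smaller than k.
        size = len(remaining) if len(remaining) < 2 * k else k
        group = sorted(remaining,
                       key=lambda i: sq_dist(faces[i], faces[seed]))[:size]
        dims = len(faces[seed])
        avg = [sum(faces[i][d] for i in group) / len(group) for d in range(dims)]
        for i in group:
            out[i] = avg
        remaining = [i for i in remaining if i not in group]
    return out


faces = [[0, 0], [2, 0], [10, 10], [12, 10], [11, 14]]
deidentified = k_same(faces, k=2)
```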

Blind Face Recognition

As described in Sect. 27.4.3, a new field of research is cryptographically provable privacy preserving signal processing, or “Blind vision”. Recent work has applied this to face detection [4] and recognition algorithms. Erkin et al. [19] describe a secure implementation of an eigenface face recognition algorithm [59]. Their system performs the operations of projecting a face image onto the eigenvectors of “face subspace” and calculating the distances to each of the enrolled faces, without the querying party, Alice, having to reveal the query image, nor the owner of the face recognizer, Bob, having to reveal the enrolled faces. Such a secure multiparty computation can be very laborious and time consuming, with a single recognition taking 10-20 s, though speed-ups have been proposed [47].
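One common building block for computations of this kind is additively homomorphic encryption, which lets one party compute on values it cannot read. The sketch below uses a toy Paillier cryptosystem to let Bob compute an encrypted dot product between Alice's encrypted pixel values and his own weights, in the spirit of the projection step. It is not the protocol of [19]: the tiny primes offer no security, the function names are hypothetical, and values are restricted to non-negative integers, whereas a real system would need large keys, quantized signed eigenvector coefficients, and further protocol steps for the distance comparison.

```python
import math
import random


def keygen(p=10007, q=10009):
    """Toy Paillier key generation (tiny primes, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, lam, mu


def encrypt(n, m):
    """Encrypt integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, lam, mu, c):
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def encrypted_dot(n, enc_x, weights):
    """Bob's side: compute Enc(sum x_i * w_i) without seeing any x_i.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext
    to the power w multiplies its plaintext by w.
    """
    n2 = n * n
    acc = 1  # a valid encryption of 0
    for c, w in zip(enc_x, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc


# Alice holds the key pair and the query pixels; Bob holds the weights.
n, lam, mu = keygen()
alice_pixels = [3, 1, 4]
bob_weights = [2, 7, 1]
enc_pixels = [encrypt(n, v) for v in alice_pixels]    # sent to Bob
enc_result = encrypted_dot(n, enc_pixels, bob_weights)  # returned to Alice
result = decrypt(n, lam, mu, enc_result)
```

Every homomorphic operation is a modular exponentiation on large numbers, which gives some intuition for why such protocols are slow compared with processing in the clear.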

Delivering Visual Privacy

The technological tools of the previous section can help to prevent privacy intrusion from image-based applications, but as we have seen they form only part of a privacy solution, along with information security and privacy policies. To ensure that the privacy benefits are delivered effectively, two further factors must be considered: ensuring that systems are used where appropriate, and ensuring that they operate effectively when installed.

Operating Point

Video information processing systems are error prone. Perfect performance cannot be guaranteed, even under fairly benign operating conditions, and systems make two types of errors when determining image regions for redaction: missed detection (of an event or object) and false alarm (triggering when the event or object is not present). We can trade these errors off against one another, choosing an operating point with high sensitivity that has few missed detections, but many false alarms, or one with low sensitivity that has few false alarms, but more often fails to detect real events when they occur.

The problems of imperfect image processing can be minimized by selecting the appropriate system operating point. The costs of missed detection and false alarm can be quite different, as seen in Sect. 27.5.1, where not blurring a face reveals private information and blurring a non-face degrades the quality of the information provided. In a surveillance system, the operating point for privacy protection may be chosen differently than for general object detection for indexing. Given the sensitive nature of the information, it is likely that a single missed detection may reveal personal information over extended periods of time. For example, failing to detect, and thus obscure, a face in a single frame of video could allow identity information to be displayed and thus compromise the anonymity of days of aggregated track information associated with the supposedly anonymous individual. On the other hand, an occasional false alarm and unnecessary redaction may have a limited impact on the effectiveness of the installation. The operating point can be part of the access-control structure—greater authorization allows the reduction of the false alarm rate at a higher risk of compromising privacy. Additional measures such as limiting access to freeze-frame or data export functions can also reduce the risks associated with occasional failures in the system. For some applications it will be some time before algorithms are accurate enough to deliver an operating point that gives useful privacy benefits without degrading the usefulness of the data provided.
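The choice of operating point under unequal error costs can be made concrete with a small sketch. It assumes a set of detector confidence scores with ground-truth labels, and picks the detection threshold that minimizes the total cost for given per-error costs. The function name, the scores and the cost values are hypothetical and chosen only to show how the preferred threshold moves when the relative costs change.

```python
def choose_threshold(scores, labels, cost_miss, cost_false_alarm):
    """Pick the detection threshold with the lowest total error cost.

    scores: detector confidences, one per candidate region
    labels: True where the region really is a face
    A region is redacted when its score >= threshold.
    Returns (threshold, total_cost).
    """
    best = None
    for t in sorted(set(scores)):
        misses = sum(1 for s, y in zip(scores, labels) if y and s < t)
        false_alarms = sum(1 for s, y in zip(scores, labels) if not y and s >= t)
        cost = cost_miss * misses + cost_false_alarm * false_alarms
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

# Privacy protection: a missed face is far more costly than a needless blur.
privacy_point = choose_threshold(scores, labels, cost_miss=10, cost_false_alarm=1)

# Indexing: spurious detections are the more costly error.
indexing_point = choose_threshold(scores, labels, cost_miss=1, cost_false_alarm=5)
```

With these toy numbers the privacy setting selects the lower threshold (0.40) and the indexing setting the higher one (0.80), mirroring the point that the same detector may be run at different operating points for different purposes or authorization levels.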

Even with perfect detection, anonymity cannot be guaranteed. While the face is the most salient identifier in video, a number of other biometrics, such as gait or ear shape, and weak identifiers (height, pace length, skin color, clothing color) can still be preserved after face redaction. Contextual information alone may be enough to uniquely identify a person even when all identifying characteristics are obscured in the video. Obscuring biometrics and weak identifiers will nevertheless reduce the potential for privacy intrusion. These privacy-protection algorithms, even when operating imperfectly, will serve the purpose of making it harder, if not impossible, to run automatic algorithms to extract privacy-intrusive information, and of making abuses by human operators more difficult or costly.

Will Privacy Technology Be Used?

The techniques described in this chapter could be considered as optional additions to systems that display images: additions that will cost more and risk impinging on the usefulness of the systems, while the privacy protection benefits may accrue to stakeholders other than the service provider or the primary users. We must then ask why providers of image-based services will choose to bear the extra burden of implementing privacy protection technologies, even when the technologies are fast and accurate enough to be practically deployed. Clearly, in many cases companies will choose to implement them as being the “right thing” to do, out of concern for protecting privacy, and for guarding their good name. Others may be pressured by the public, shareholders or customers to apply such technologies, or be asked to do so by privacy ombudsmen. Finally, explicit legislation may be implemented to require such technologies, though creating manageable legislation for the nebulous area of privacy is extremely difficult. Existing legislation in some jurisdictions may already require the deployment of these techniques in domains such as surveillance as soon as they become feasible and commercially available.

Even when privacy protection methods are mandated, compliance and enforcement are still open to question, particularly in private systems such as medical images and surveillance. McCahill and Norris [33] estimated that nearly 80% of CCTV systems in London’s business space did not comply with current data protection legislation, which specifies privacy protection controls such as preventing unauthorized people from viewing CCTV monitors. Legislating public access to surveillance systems, as proposed by Brin [10], is one solution, but that still raises the question: are there additional video feeds that are not available for public scrutiny? A potential solution that we have proposed [52] is certification and registration of systems, along the lines of the TRUSTe system that evolved for Internet privacy. Vendors of video systems might invite certification of their privacy-protection system by some independent body. (In the US, the Federal Trade Commission Act provides the power to enforce companies’ privacy policies.) For purpose-built devices with a dedicated camera sensor (like PrivacyCam, Sect. 27.4.2), this would suffice. Individual surveillance installations could also be certified for compliance with installation and operating procedures, with a certification of the privacy protection offered by the surveillance site prominently displayed on the equipment and CCTV advisory notices. Such notices might include a site (or even camera) identification number and the URL or SMS number of the surveillance privacy registrar where the site can be looked up to confirm the certification of the surveillance system. Consumer complaints would invoke investigations by the registrar, and conscientious companies could invite voluntary inspections.

Conclusions

As cameras and networking have become cheaper and ubiquitous, there has been an explosion in the dissemination of images, for new and traditional applications from photo-sharing to surveillance and medical imaging. With this explosion, there has been a corresponding increase in the potential for privacy-intrusive uses of those images. Thus far, controls on such privacy intrusions have been very limited. We have examined how images in different domains can contain sensitive information, particularly images of faces that allow individuals to be identified. We have described ways in which that information can be obscured by redaction, based on computer vision techniques to identify regions of interest, and image processing techniques to carry out the redaction in a secure, possibly invertible, manner. Finally, we have described three particular systems that have been used to apply privacy preserving techniques to face images and explored ways in which such privacy protection techniques can be deployed and might become more widespread.