Managing and Querying Encrypted Data - Database Security: Applications and Trends

Databases Reference

In-Depth Information

cipher. Alice first generates a sequence of pseudo-random values s 1 ,...,s l us-

ing a stream cipher, where each s i is n

m bit long. For each pseudo-random

string s i , Alice computes a pseudo-random function F k c ( s i ) seeded on key k c

which generates a random m -bit sequence 5 . Using the result of F k c ( s i ), Alice

computes a n -bit sequence t i := <s i ,F k c ( s i ) > , where <a,b> denotes con-

catenation of the string a and b ). Now to encrypt the n -bit word w i , Alice

computes the XOR of w i with t i , i.e., ciphertext c i := w i ⊕

−

t i . Since, only Alice

generates the pseudo-random stream t 1 ,...,t l so no one else can decrypt c i .

Given the above representation of text document, the search mechanism

works as follows. When Alice needs to search for files that contain a word w ,

she transmits w and the key k c to the server. The server (Bob) searches for w

in the index files associated with documents by checking whether c i ⊕

w is of

the form <s,F k c ( s ) > . The server returns to Alice documents that contain

the keyword w which can then be decrypted by Alice.

The scheme described above provides secrecy if the pseudo-random func-

tion F , the stream cipher used to generate s i , and the encryption of the

document D are secure(that is, the value t i are indistinguishable from truly

random bits for any computationally bounded adversary). Essentially, the ad-

versary cannot learn content of the documents simply based on ciphertext

representation.

While the approach described above is secure, it has a fundamental limi-

tation that the adversary learns the keyword w i that the client searches for.

The search strategy allows the adversary to learn which documents contain

which keywords over time using such query logs. Furthermore, the adversary

can launch attacks by searching for words on his own without explicit autho-

rization by the user thereby learning document content.

A simple strategy to prevent server from knowing the exact search word is

to pre-encrypt each word w of the clear text separately using a deterministic

encryption algorithm E k p , where the key k p is a private key which is kept

hidden from the adversary. After this pre-encryption phase, the user has a

sequence of E -encrypted words x 1 ,...,x l . Now he post-encrypts that sequence

using the stream cipher construction as before to obtain c i := x i ⊕

t i , where

x i = E k p ( w i )and t i = <s i ,F k c ( x i ) > . During search, the client, instead of

revealing the keyword to be searched, Computes E k p ( w i ) with the server.

The proposed scheme is secure and ensures that the adversary does not

learn document content from query logs. The scheme is formalized below.

5 Pseudo-random functions: A pseudo-random function denoted as F : K F ×

} n and Y denotes

the set { 0 , 1 } m . Intuitively, a pseudo-random function is computationally indistin-

guishable from a random function - given pairs ( x i ,f ( x 1 ,k )) ,..., ( x m ,f ( x m ,k )),

an adversary cannot predict f ( x m +1 ,k ) for any x m +1 . In other words, F takes a

key k ∈ K F the set of keys, a n bit sequence x ∈ X where X is the set { 0 , 1 } n

and returns a m bit sequence y ∈ Y where Y is the set { 0 , 1 } m .

→

Y ,where K F

is the set of keys, X denotes the set

{

0 , 1

Database Security: Applications and Trends

Search WWH ::

Custom Search

Home