First Principal Components Analysis: A New Side Channel Distinguisher - Information Security and Cryptology-ICISC 2010

$V_{i,j} = f(d_i, k_j)$ with $1 \le i \le T$ and $1 \le j \le K$. For each value $V_{i,j}$, the attacker computes a hypothetical power consumption value $h_{i,j}$ based on a power consumption model. The most commonly used power models are the Hamming distance (HD) and the Hamming weight (HW) [6]. With $R$ being the number of possible values that the power consumption model can take, the traces are arranged in $X$ ($X \le R$) different partitions for each key hypothesis $k_j$. We denote these partitions as a vector $P_{k_j} = (P_{k_j,1}, P_{k_j,2}, \ldots, P_{k_j,X})$ with $1 \le j \le K$. For instance, suppose that our power consumption model is the HD and that it can take integral values from 0 to 4: $HD = \{0, 1, 2, 3, 4\} = \{HD_i\}_{i=1}^{5}$. The trivial partitioning is to associate each $HD_i$ value with one partition, so that $X = R = 5$. Another possibility is to build only $X = 3$ partitions: the first for $HD > 2$, the second for $HD = 2$ and the third for $HD < 2$. Intuitively, the more accurate the power model is, the better our description of the secret information will be. Many papers investigate new power models and techniques for trace classification [1, 21]. The optimal choice of the power consumption model, including the partitioning process, is out of the scope of this paper. In what follows, our study focuses on the Hamming distance model, as it is one of the most commonly used and often one of the most efficient.
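The partitioning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function names (`hamming_distance`, `trivial_bin`, `coarse_bin`, `partition_traces`), the 4-bit register states and the two-sample toy traces are all hypothetical, and the register states stand in for the values obtained under a single key hypothesis.

```python
from collections import defaultdict


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def trivial_bin(hd: int) -> int:
    """X = R = 5: one partition per HD value (0..4)."""
    return hd


def coarse_bin(hd: int) -> int:
    """X = 3: partition 0 for HD > 2, 1 for HD == 2, 2 for HD < 2."""
    if hd > 2:
        return 0
    if hd == 2:
        return 1
    return 2


def partition_traces(traces, hd_values, bin_of):
    """Group traces by the partition index of their hypothetical HD value."""
    partitions = defaultdict(list)
    for trace, hd in zip(traces, hd_values):
        partitions[bin_of(hd)].append(trace)
    return dict(partitions)


# Toy data (hypothetical): 4-bit register states before and after an
# operation, with one short measured trace per transition.
previous = [0b0000, 0b1111, 0b1010, 0b0110, 0b0001]
current = [0b0000, 0b0000, 0b1001, 0b0111, 0b1110]
traces = [[0.1, 0.2], [0.9, 1.1], [0.5, 0.4], [0.3, 0.2], [0.8, 1.0]]

hd_values = [hamming_distance(p, c) for p, c in zip(previous, current)]
fine = partition_traces(traces, hd_values, trivial_bin)
coarse = partition_traces(traces, hd_values, coarse_bin)
```

With this toy data the value HD = 3 never occurs, so the trivial partitioning leaves one of its five partitions empty, whereas the coarser three-way partitioning populates all of its bins. This is one practical reason to consider $X < R$ when few traces are available.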

3.2 References Computation

Once the traces are arranged in $X$ partitions for each key hypothesis $k_j$, we propose to compute for each partition a statistical trace, based on one CS and referred to as a reference. For instance, if the CS is the "mean", then the reference is the average of all traces that belong to the considered partition. The $X$ references of one key hypothesis $k_j$ are then used by PCA as criteria to highlight differences between the $X$ partitions. Note that the same CS (the mean, the variance, ...) is used for all partitions and for all key hypotheses $k_j$. Each reference is an $L$-dimensional time vector, so we have one dataset of $X$ references for each $k_j$. We denote this set by $V^{ref}_{k_j}$. In what follows, our study focuses on analysing each dataset $V^{ref}_{k_j}$ corresponding to each key hypothesis $k_j$. This analysis allows the attacker to discriminate the behavior of the secret key with respect to all other key hypotheses. Moreover, it reduces the computational complexity of the PCA step.
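As a minimal sketch of the reference computation, assuming the CS is the mean: the helper names `mean_reference` and `compute_references` and the toy partitions below are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def mean_reference(partition):
    """Average the traces of one partition sample by sample (CS = mean)."""
    return [fmean(samples) for samples in zip(*partition)]


def compute_references(partitions):
    """Return one L-dimensional reference per non-empty partition."""
    return {
        idx: mean_reference(traces)
        for idx, traces in sorted(partitions.items())
        if traces
    }


# Hypothetical partitions for one key hypothesis: X = 3, L = 4 samples.
partitions = {
    0: [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]],
    1: [[0.5, 0.5, 0.5, 0.5]],
    2: [[2.0, 4.0, 6.0, 8.0], [4.0, 4.0, 4.0, 4.0], [0.0, 1.0, 2.0, 3.0]],
}
references = compute_references(partitions)
```

Whatever the number of traces, the dataset handed to PCA contains only $X$ vectors of length $L$ per key hypothesis, which is consistent with the reduced complexity mentioned above. Swapping `fmean` for another statistic (for example a per-sample variance) would give references for a different CS.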

3.3 FPCA Distinguisher

For one key hypothesis $k_j$, PCA makes the dependencies between references more legible when the references are projected onto the new axis system formed by the principal components. PCA is used to analyze these dependencies by measuring the dispersion of the references in the new coordinate space. Indeed, the larger the eigenvalue $\lambda$ associated with an eigenvector, the greater the dispersion of the references along this eigenvector. As stated by equation (1), the total variance of one $V^{ref}_{k_j}$ is equal to the sum of all eigenvalues corresponding to all principal components:
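The relation between total variance and eigenvalues can be checked numerically. The sketch below is an illustration in assumed notation, not the paper's own equation: it treats the $X$ references as rows of a matrix, forms their sample covariance matrix (the normalisation by $X - 1$ is an assumption) and compares the sum of its eigenvalues with the summed per-sample variances. The function name `pca_eigenvalues` and the toy references are hypothetical.

```python
import numpy as np


def pca_eigenvalues(references: np.ndarray) -> np.ndarray:
    """Eigenvalues of the covariance of X references (rows) over L time
    samples (columns), sorted in descending order."""
    centered = references - references.mean(axis=0)
    # Assumed normalisation: unbiased sample covariance (divide by X - 1).
    cov = centered.T @ centered / (references.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)  # ascending order for symmetric input
    return eigvals[::-1]


# Hypothetical dataset of X = 3 references with L = 4 samples each.
refs = np.array([
    [2.0, 2.0, 2.0, 2.0],
    [0.5, 0.5, 0.5, 0.5],
    [2.0, 3.0, 4.0, 5.0],
])

eigvals = pca_eigenvalues(refs)
# Total variance: sum of the per-sample variances across the references.
total_variance = refs.var(axis=0, ddof=1).sum()
```

Here `eigvals.sum()` matches `total_variance` up to floating-point error. Because only three references are centered, at most two eigenvalues are non-zero, so the dispersion is concentrated on the first principal components.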
