$V_{i,j} = f(d_i, k_j)$ with $1 \le i \le T$ and $1 \le j \le K$. For each value $V_{i,j}$, the
attacker computes a hypothetical power consumption value $h_{i,j}$ based on a power
consumption model. The most commonly used power models are the Hamming
distance (HD) and the Hamming weight (HW) [6]. $R$ being the number of possible
values that the power consumption model could take, the traces are arranged
in $X$ ($X \le R$) different partitions for each key hypothesis $k_j$. We denote these
partitions as a vector $P_{k_j} = (P_{k_j,1}, P_{k_j,2}, \ldots, P_{k_j,X})$ with $1 \le j \le K$. For
instance, suppose that our power consumption model is the HD and that it can
take integral values from 0 to 4: $HD = \{0, 1, 2, 3, 4\} = \{HD_i\}_{i=1}^{5}$. The trivial
partitioning is to associate each $HD_i$ value to one partition. Thus $X = R = 5$.
One other possibility is to build only $X = 3$ partitions in this way: first partition
for $HD > 2$, second for $HD = 2$ and third for $HD < 2$. Intuitively, the more
accurate the used power model is, the better our description of the secret
information will be. Many papers are dealing with the investigation of new power
models and techniques for traces classification [1, 21]. The optimal choice of the
power consumption model, including the partitioning process, is out of the scope
of this paper. In what follows, our study will focus on the Hamming distance
model as it is one of the most commonly used, and often one of the most efficient.
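The two partitionings described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 4-bit `prev_state` and `next_state` values are made up, and `hamming_distance` and `partition_indices` are hypothetical helper names, not part of the original method description.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def partition_indices(h, bins):
    """Group trace indices by hypothetical power value.

    h    : hypothetical values h_{i,j} for one key hypothesis k_j
    bins : list of sets; bins[x] holds the model values assigned to partition x
    """
    parts = [[] for _ in bins]
    for i, value in enumerate(h):
        for x, allowed in enumerate(bins):
            if value in allowed:
                parts[x].append(i)
                break
    return parts


# Assumed 4-bit intermediate values (previous and new state) for six traces.
prev_state = [0b0000, 0b1111, 0b1010, 0b0001, 0b0110, 0b1111]
next_state = [0b1111, 0b1110, 0b1010, 0b0010, 0b0000, 0b0000]
h = [hamming_distance(p, n) for p, n in zip(prev_state, next_state)]

trivial = [{0}, {1}, {2}, {3}, {4}]   # X = R = 5, one partition per HD value
coarse = [{3, 4}, {2}, {0, 1}]        # X = 3: HD > 2, HD = 2, HD < 2

parts_trivial = partition_indices(h, trivial)
parts_coarse = partition_indices(h, coarse)
```

Each entry of `parts_trivial` or `parts_coarse` lists the indices of the traces that fall into the corresponding partition for one key hypothesis; a partition may be empty when no trace produces that HD value.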
3.2 References Computation
Once traces are arranged in $X$ partitions for each key hypothesis $k_j$, we propose
to compute for each partition a statistical trace based on one CS and referred
to as a reference. For instance, if the CS is the "mean", then the reference is
the average of all traces that belong to the considered partition. The $X$
references of one key hypothesis $k_j$ are then used by PCA as criteria to highlight
differences between the $X$ partitions. Note that the same CS (the mean, the
variance, ...) is used for all partitions and for all key hypotheses $k_j$. One
reference is an $L$-dimensional time vector. Thus we have one dataset of $X$
references for each $k_j$. We denote this set by $V^{ref}_{k_j}$. In what
follows, our study will focus on analysing each dataset $V^{ref}_{k_j}$ corresponding to
each key hypothesis $k_j$. This analysis will allow the attacker to discriminate the
behavior of the secret key with regard to all other key hypotheses. Moreover, it
will reduce the computational complexity of the PCA step.
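The references computation can be sketched as follows, assuming the traces are stored as a NumPy array with one trace per row. The function name `compute_references`, the toy data, and the choice to skip empty partitions are assumptions made for this illustration only.

```python
import numpy as np


def compute_references(traces, parts, cs=np.mean):
    """Return one L-dimensional reference per non-empty partition.

    traces : array of shape (T, L), one power trace per row
    parts  : list of index lists, one per partition (for one key hypothesis)
    cs     : statistic applied across the traces of a partition
             (np.mean, np.var, ...); the same one is used for all partitions
    """
    # Assumption: empty partitions are skipped, since they have no statistic.
    refs = [cs(traces[idx], axis=0) for idx in parts if len(idx) > 0]
    return np.vstack(refs)


# Toy data: T = 4 traces of L = 3 samples, split into X = 2 partitions.
traces = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0],
                   [0.0, 0.0, 6.0],
                   [2.0, 4.0, 0.0]])
parts = [[0, 1], [2, 3]]
refs = compute_references(traces, parts)   # shape (2, 3)
```

Because each key hypothesis is summarised by at most $X$ rows instead of $T$ traces, the array passed to the PCA step is small, which is consistent with the reduction in complexity mentioned above.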
3.3 FPCA Distinguisher
For one key hypothesis $k_j$, the dependencies between references are made more
legible by PCA, when the references are projected onto the new axis system
formed by the principal components. PCA is used to analyze these
dependencies by measuring the dispersion of the references in the new coordinate
space. Indeed, the larger the eigenvalue $\lambda$ associated with an
eigenvector, the greater the dispersion of the references along this eigenvector.
As stated by equation (1), the total variance of one $V^{ref}_{k_j}$ is equal to the sum
of all eigenvalues corresponding to all principal components:
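This identity can be checked numerically: the sum of the eigenvalues of a covariance matrix equals its trace, which is the sum of the per-sample variances. The sketch below assumes the sample covariance is normalized by $X - 1$ and uses random data in place of real references; `pca_eigenvalues` is a hypothetical helper name.

```python
import numpy as np


def pca_eigenvalues(refs):
    """Eigenvalues of the sample covariance of the references, largest first.

    refs : array of shape (X, L), one reference per row
    """
    centered = refs - refs.mean(axis=0)
    # Assumption: covariance normalized by X - 1 (unbiased estimator).
    cov = centered.T @ centered / (refs.shape[0] - 1)   # L x L matrix
    eigvals = np.linalg.eigvalsh(cov)                   # real, ascending order
    return eigvals[::-1]


rng = np.random.default_rng(0)
refs = rng.normal(size=(5, 8))   # stand-in for X = 5 references of L = 8 samples

eigvals = pca_eigenvalues(refs)
total_variance = refs.var(axis=0, ddof=1).sum()

# Total variance equals the sum of all eigenvalues.
assert np.isclose(eigvals.sum(), total_variance)
```

With $X$ references, at most $X - 1$ eigenvalues are non-zero after centering, so only a few principal components carry the dispersion that the distinguisher measures.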
 