Differential Power Analysis (DPA) and Template Attacks (TA) operate on a 'divide-and-conquer' (DC) basis: the dependency of the power consumption on small, 'guessable' parts of the intermediate state in the first or last round is exploited by comparing the leakage measurements with attacker predictions based on key guesses and approximate (or detailed, in the case of TA) knowledge of the functional form of the leakage. The full key is then recovered by enumerating over the highest-ranked subkeys. DC methods are computationally efficient and adapt straightforwardly to a wide range of algorithms and scenarios, requiring only minimal ('grey box') familiarity with implementation details. They are robust to noise (from measurement error or non-targeted system processes), which simply increases the number of traces needed to maintain the success rate. However, they generally discard all but a fraction of the information present in a given trace, so their overall data complexity is far from optimal and, importantly, inadequate to indicate the true extent of a device's vulnerability for evaluation purposes. This drawback is exacerbated by the sensitivity of the data complexity to the quality of the leakage predictions (or 'templates', in the case of TA).
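To make the divide-and-conquer idea concrete, here is a minimal correlation-based DPA sketch of my own (illustrative only, not from the paper): a 4-bit PRESENT S-box stands in for AES's 8-bit one to keep the table short, the simulated leakage is Hamming weight plus Gaussian noise, and each guess for one key chunk is ranked by correlating its predictions with the traces.

```python
import numpy as np

rng = np.random.default_rng(0)

# 4-bit PRESENT S-box as a small stand-in for the AES S-box (hypothetical target).
SBOX = np.array([0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2])
HW = np.array([bin(v).count("1") for v in range(16)])

TRUE_KEY = 0x9
plaintexts = rng.integers(0, 16, 2000)
# Simulated traces: Hamming weight of the S-box output plus Gaussian noise.
traces = HW[SBOX[plaintexts ^ TRUE_KEY]] + rng.normal(0.0, 1.0, 2000)

# Divide and conquer: for each guess of this one key chunk, predict the
# leakage and rank the guess by its (absolute) correlation with the traces.
scores = [abs(np.corrcoef(HW[SBOX[plaintexts ^ k]], traces)[0, 1])
          for k in range(16)]
best_guess = int(np.argmax(scores))
```

Note how each chunk of the key is attacked independently; a full attack would repeat this per byte and then enumerate over the top-ranked combinations.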

By contrast, analytical methods such as 'Simple' Power Analysis (SPA) aim to recover the whole key simultaneously by exploiting a large number of leaked intermediate values (possibly spanning multiple rounds) from just one or a small number of traces. Some methods achieve this by reducing the key candidate space and then enumerating; others by solving (e.g. via a SAT solver) a system of algebraic equations relating the intermediate values and the auxiliary side-channel information, in which the variables representing the key are treated as unknowns to be recovered. The obvious advantage of such strategies is their low data complexity. However, as well as being far more complicated to implement than DC methods, and much more dependent on detailed knowledge of the implementation, they also suffer from weak noise resistance, generally requiring 'hard' error-free classifications of all intermediate values into sufficiently small subsets of the output space.

Wouldn’t it be neat, then, to have a strategy that efficiently exploits all the useful intermediate values in a trace, just like analytical attacks, whilst requiring only ‘soft’ (probabilistic) information on those values (and thereby remaining robust to noise) just like DC attacks?

Such is the spec that Veyrat-Charvillon et al. seek to meet, and to do so they re-express the side-channel key recovery problem as a "Low Density Parity Check"-like code to be decoded. This allows them to apply Belief Propagation (BP) — an algorithm which passes messages around a factor graph representing the code until convergence is reached. As long as the factor graph is tree-like, BP is able to find reasonable approximations in reasonable time for solutions that take exponential time (in the number of variables) to compute exactly.
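As a toy illustration of the sum-product (BP) algorithm on a cycle-free graph (hypothetical factors, nothing to do with AES): leaf-to-root message passing recovers the exact marginal in time linear in the number of nodes, which brute-force enumeration of the joint confirms.

```python
import numpy as np

# Sum-product on a tiny tree factor graph: x1 -- f12 -- x2 -- f23 -- x3,
# with unary evidence u3 on x3 (binary variables, made-up factor values).
f12 = np.array([[0.9, 0.1],
                [0.1, 0.9]])          # f12[x1, x2]
f23 = np.array([[0.8, 0.2],
                [0.2, 0.8]])          # f23[x2, x3]
u3 = np.array([0.3, 0.7])             # evidence on x3

# Leaf-to-root message passing towards x1:
m_x3_to_f23 = u3                      # variable -> factor: product of other incoming messages
m_f23_to_x2 = f23 @ m_x3_to_f23       # factor -> variable: sum out x3
m_x2_to_f12 = m_f23_to_x2
m_f12_to_x1 = f12 @ m_x2_to_f12       # sum out x2

marginal_x1 = m_f12_to_x1 / m_f12_to_x1.sum()

# Brute-force enumeration of the joint gives the same marginal.
joint = np.einsum("ij,jk,k->ijk", f12, f23, u3)
brute = joint.sum(axis=(1, 2))
brute = brute / brute.sum()
```

On graphs with cycles the same message updates are simply iterated ('loopy' BP), at the cost of the exactness and convergence guarantees that the tree case enjoys.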

The factor graph comprises nodes for variables and nodes for functions, with edges connecting each function to the variables appearing in it. In the case at hand — 'decoding' an AES key using plaintexts, ciphertexts and side-channel traces — the variables *x_i* are the intermediate values handled by the device, and there are two categories of function: those corresponding to probabilistic information on unknown values derived from the observed side-channel leakages, f(x_i) = P(x_i = v | **L**); and indicator functions representing the known executed operations, thereby enforcing the correctness of AES, e.g. f(x_i, x_j, x_k) = **1**{OP(x_i, x_j) = x_k} (where OP might be, say, an XOR). The parts of the key appear as *x_i*'s which are *a priori* unknown; propagating the known information around the factor graph refines the distribution on the key, with increasing certainty towards the solution.
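For instance, a factor-to-variable message through such an XOR indicator function is just a sum over the consistent assignments. A sketch of my own over 4-bit values (an AES implementation would use 256-entry byte distributions):

```python
import numpy as np

# Sum-product message through an XOR indicator factor
# f(x_i, x_j, x_k) = 1{x_i ^ x_j = x_k}, over 4-bit values for brevity.
def xor_factor_message(p_i, p_j):
    """Message to x_k: total probability of each x_k = x_i ^ x_j."""
    m = np.zeros(16)
    for xi in range(16):
        for xj in range(16):
            m[xi ^ xj] += p_i[xi] * p_j[xj]
    return m

# Example: x_i known exactly; x_j uncertain between two values ('soft' info).
p_i = np.zeros(16); p_i[0x3] = 1.0
p_j = np.zeros(16); p_j[0x5] = 0.6; p_j[0xA] = 0.4
m_k = xor_factor_message(p_i, p_j)
# Mass lands on 0x3 ^ 0x5 = 0x6 (0.6) and 0x3 ^ 0xA = 0x9 (0.4).
```

Notice that uncertainty in the inputs propagates gracefully to the output distribution, which is exactly what 'hard'-classification analytical attacks cannot do.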

In cases where a single trace is not sufficient to identify the key (for example, because of a low signal-to-noise ratio (SNR)), additional traces can be incorporated by 'chaining together' several factor graphs with a shared key schedule (i.e. because the *x_i*'s representing the key remain the same in each run of the algorithm). This becomes quite memory intensive — 16MB of space per trace — but is an improvement on previous SPA techniques, which are unable to handle the added complexity of multiple observations (although they do benefit from repeat observations relating to the same algorithm inputs, which can be averaged to produce an enhanced SNR). A space-saving alternative would be to build and propagate the factor graph for each trace in turn, retaining the posterior distribution on the key after exploiting one trace as the prior distribution when exploiting the next. However, the authors state (in their conclusion) that this requires first adapting the method to only propagate messages towards the key schedule, something they leave for future work.
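The 'posterior as prior' idea can be sketched in miniature for a single key nibble under a hypothetical Hamming-weight leakage of key XOR plaintext (my own toy model, not the paper's method):

```python
import numpy as np

# Toy 'posterior as prior' chaining for one 4-bit key chunk: each 'trace'
# leaks the Hamming weight of key ^ plaintext plus Gaussian noise, and the
# posterior after one trace becomes the prior for the next.
rng = np.random.default_rng(1)
HW = np.array([bin(v).count("1") for v in range(16)])
TRUE_KEY = 0xB
sigma = 1.0
keys = np.arange(16)

posterior = np.full(16, 1 / 16)      # uniform prior over the key nibble
for _ in range(100):
    pt = int(rng.integers(0, 16))
    leak = HW[TRUE_KEY ^ pt] + rng.normal(0.0, sigma)
    # Gaussian likelihood of the observed leak under each key hypothesis.
    likelihood = np.exp(-(leak - HW[keys ^ pt]) ** 2 / (2 * sigma**2))
    posterior *= likelihood
    posterior /= posterior.sum()     # renormalise to avoid underflow

best = int(np.argmax(posterior))
```

Only the (small) key distribution is carried between traces, which is the memory saving; what the full method additionally needs, per the authors, is a way to funnel each trace's messages towards the key schedule before discarding the graph.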

The advantages over analytic strategies are obvious: SASCA is far more robust to low SNR, both because it can handle 'soft' information about the intermediates and because it can easily incorporate multiple traces. Moreover, the authors state that BP decoding improves on the time and memory complexity of SAT solvers and optimisers. The advantages over DC-style strategies are verified experimentally in a tightly controlled simulated attack environment based on the AES FURIOUS implementation with Hamming weight leakage, as the SNR varies. Template attacks are the appropriate benchmark, being widely recognised as the 'optimal' approach as far as DC methods go; they rely on a similar profiling phase to SASCA, although for far fewer intermediate values and requiring less detailed knowledge of the assembly code. As expected, SASCA, with its option to incorporate numerous intermediates, is experimentally shown to substantially outperform TA (in terms of the number of traces needed) at all tested SNR levels, and to succeed even with unknown inputs and outputs.
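A toy version of the template idea (again my own illustration): profiling traces with known intermediate values yield a Gaussian model per value, which then converts a fresh leak into exactly the kind of 'soft' distribution P(x = v | **L**) that SASCA consumes.

```python
import numpy as np

rng = np.random.default_rng(2)
HW = np.array([bin(v).count("1") for v in range(16)])

# Profiling phase: traces with *known* 4-bit intermediate values, from which
# a Gaussian template (mean, std) is estimated per value.
vals = rng.integers(0, 16, 5000)
prof_leaks = HW[vals] + rng.normal(0.0, 0.5, 5000)
mu = np.array([prof_leaks[vals == v].mean() for v in range(16)])
sd = np.array([prof_leaks[vals == v].std() for v in range(16)])

# Attack phase: a fresh leak for an unknown intermediate (a fixed hypothetical
# observation near Hamming weight 4, i.e. consistent with the value 0xF).
leak = 3.8
logp = -np.log(sd) - (leak - mu) ** 2 / (2 * sd**2)
p = np.exp(logp - logp.max())
p /= p.sum()                         # 'soft' distribution P(x = v | L)
```

A plain TA would feed such distributions for a handful of targeted intermediates into subkey ranking; SASCA instead feeds them, for every profiled intermediate, into the factor graph.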

In all, SASCA is an interesting, elegant proposal, which certainly goes a considerable way — theoretically, at least — towards closing the gap between DC and analytic side-channel strategies. However, its practical usefulness is not yet established: so far, it has only been tested in a carefully constructed simulated scenario, and its ready transferability beyond that scenario is by no means obvious. For one thing, the profiling phase is high-effort and requires in-depth, assembly-level knowledge of the attacked implementation; for another, BP is not guaranteed to converge when applied to 'non-tree-like' factor graphs (i.e. graphs with cycles), and the authors have not yet been able to indicate the extent to which real-world cryptographic implementations of interest are or are not suitable targets for the methodology. Ideas developed concurrently by Oren et al. and presented at CHES 2014 [link] appear to offer a more 'practical' approach to handling 'soft' side-channel information, but are only able to exploit leakage from the first encryption round and the key schedule. So the actual contribution of subsequent rounds to the performance of SASCA needs to be clarified. Fortunately, the topic is one of continuing interest to many, so all of these avenues for further investigation are likely to be explored enthusiastically…
