Advice on power side channel analysis

Hi, I need some advice on an analysis I am doing for my research. Disclosure: I don’t have much background in security. My research focuses on computer hardware. So any advice/suggestions you have for me would be much appreciated.

I am trying to carry out a power side channel analysis on the CW-NANO platform. I am trying to attack the modulus operation, which seems to be most typically implemented as a division to get the remainder. For the M0 CPU used in the CW-NANO, division is implemented in software. Looking at the assembly code, this operation is implemented as a sequence of shifts and subtracts.

In my case, the dividend a is known to the attacker already. So its just a matter of finding the divisor b to recover the remainder (i.e., r = a%b) . First, I did a test to see if the number of cycles required would vary based on the dividend (a) and the divisor (b). Here is a table of the time taken (in CPU cycles) for different combinations of a/b on the ARM-M0 CPU.

image

Its therefore pretty easy to establish the range of b based on the time taken for division. Next, I wanted to see, for the range of values which take the same amount of time, is the power trace different? So, I captured traces of 200/b and then subtracted the average of all traces and plotted them. Here is that graph:

To me, it seems like there are significant differences based on the value being used. A few caveats: 1) I plotted only powers of 10 for simplicity but ideally, I’d like this to work for all values of b and 2) I took 100 traces for each b value and they are all identical.

My question is: how should I go about ‘predicting’ the value of b given a trace? I’m not sure CPA is the right approach since there isn’t a single ‘peak’ similar to the Hamming weight. Would multi-variate CPA work? Or should I try to train some type of classification model to see if that can predict the value? Thank you!

Hi,

I don’t think CPA makes any sense here in general. What are you expecting power consumption to be linear with?

As you’ve said, this modulus operation isn’t constant time, so maybe you could do some sort of timing attack if you could vary a.

Alex

1 Like

I think you could still try DPA/CPA, but probably in a way somewhat too complicated for this case. Assuming a is known, you can make bit-wise guesses on the bits in b and try to compute correlation with the two hypothetical values of the intermediate data of the division. You’d have to do it bitwise, recover one bit at a time, after a bit recovery you’d do another CPA attack on the next bit of b and so on. But once you make a wrong guess on the bits all following calculation is done on rubbish data. So yeah, might be difficult.

But the SPA style attack looks more promising. I’d plot more values of b, translate them to sequences of shifts/subtractions and some patterns may pop up.

2 Likes

Hi, thank you both! I did try CPA but I wasn’t able to get it to work. In fact, a much simpler solution ended up working. I just took the traces I showed above and trained a Decision tree classifier on them. Because the traces for a given value of b are identical each time, with a large enough classifier, I am getting 100% accuracy. So unless something changes, that solution works for my scenario!