Implementing new model - AES FPGA CPA attack

Hi all,

I’m trying to attack a specific AES engine that I have on my Artix-7 FPGA. I’m using the PA_HW_CW305_1-Attacking_AES_on_an_FPGA Jupyter tutorial, but instead of

leak_model = cwa.leakage_models.last_round_state_diff

I’m trying to implement a new model that fits my engine, which performs two rounds per clock cycle instead of one. If I understand correctly, this means the HD last-round diff model shouldn’t work, because it assumes the measured state transition is between the 9th and 10th rounds. So what I should do is implement an HD model between the 8th round and the 10th.

  1. Is that correct so far?
  2. My model didn’t work and I’m trying to figure out why; maybe I’m implementing it wrong. This is my code. If you could review it and give me some advice, that would be very much appreciated.

The code:

import chipwhisperer.analyzer as cwa

class Round8Round10StateDiff(cwa.AESLeakageHelper):
    name = 'HD: AES Round8 out to Round10 out (last round out) State diff'
    def leakage(self, pt, ct, key, bnum):
        key9 = self.key_schedule_rounds(key, 0, 9)
        key10 = self.key_schedule_rounds(key, 0, 10)
        st10 = ct
        state = [ct[i] ^ key10[i] for i in range(0, 16)] # inv Add round key round 10
        state = self.inv_shiftrows(state)  # inv shift rows round 10
        state = self.inv_subbytes(state)   # inv sub bytes round 10
        state = [state[i] ^ key9[i] for i in range(0, 16)] # inv Add round key round 9 (applied to state, not ct)
        state = self.inv_mixcolumns(state) # inv mix columns round 9
        state = self.inv_shiftrows(state)  # inv shift rows round 9
        state = self.inv_subbytes(state)   # inv sub bytes round 9
        st8 = state
        return (st8[bnum] ^ st10[bnum])

leak_model = cwa.leakage_models.new_model(Round8Round10StateDiff)


Hi Ohad,

Analyzer doesn’t actually support CPA attacks across multiple bytes. Each byte of the key is guessed individually, and the rest of the key bytes are just 0 by default. This works well for most standard models, since each byte can be found independently, but it won’t work for this attack because the state bytes are combined via mixcolumns. To attack multiple bytes, you’ll either need to modify chipwhisperer/analyzer/attacks/cpa_algorithms/ to guess across multiple bytes, or write your own attack algorithm.
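
To illustrate the mixcolumns point, here is a minimal, self-contained sketch (plain Python, not ChipWhisperer code; the helper names `xtime`, `gmul` and `inv_mixcolumn` are just for this example). It applies the AES inverse MixColumns step to a single column and shows that changing one input byte changes all four output bytes, which is why a single-byte key guess is no longer enough once the model crosses a mixcolumns step.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) using the AES polynomial 0x11B."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF

def gmul(a, b):
    """Multiply two bytes in GF(2^8) (shift-and-add)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def inv_mixcolumn(col):
    """Apply AES InvMixColumns to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        gmul(a0, 14) ^ gmul(a1, 11) ^ gmul(a2, 13) ^ gmul(a3, 9),
        gmul(a0, 9) ^ gmul(a1, 14) ^ gmul(a2, 11) ^ gmul(a3, 13),
        gmul(a0, 13) ^ gmul(a1, 9) ^ gmul(a2, 14) ^ gmul(a3, 11),
        gmul(a0, 11) ^ gmul(a1, 13) ^ gmul(a2, 9) ^ gmul(a3, 14),
    ]

# Flip a single input byte and look at which output bytes change.
base = [0x00, 0x00, 0x00, 0x00]
changed = [0x01, 0x00, 0x00, 0x00]
diff = [x ^ y for x, y in zip(inv_mixcolumn(base), inv_mixcolumn(changed))]
print([hex(d) for d in diff])  # every entry is non-zero
```

Since every output byte of a column depends on all four input bytes, each byte of the round-8 state in the model above depends on four bytes of the round key at once.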

That being said, even with the above modifications, the key search space is much larger (4*2^32 guesses instead of 16*2^8), so the attack will be much harder and probably not feasible.
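
For a sense of scale, the two figures can be worked out directly. This is only arithmetic on the numbers quoted above, assuming 16 independent single-byte guesses for the standard model versus 4 column-wide guesses of 32 bits each for the two-round model; the variable names are just for this example.

```python
# Standard per-byte CPA: 16 key bytes, 2^8 guesses each.
per_byte_guesses = 16 * 2**8

# Column-wide guessing: 4 columns, 2^32 guesses each.
per_column_guesses = 4 * 2**32

ratio = per_column_guesses // per_byte_guesses

print(per_byte_guesses)    # 4096
print(per_column_guesses)  # 17179869184
print(ratio)               # 4194304, i.e. 2^22 times more guesses
```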

EDIT: Actually, thinking about it more, the search space will be even worse since the bytes of the key are combined during the key scheduling. There might be some sort of vulnerability with that, but that’s well beyond my cryptographic capabilities.


One thought, just in case: are you actually doing two complete rounds in a single clock cycle, or have you pipelined the rounds, e.g. with two instances each doing one round/cycle? The latter is what I’m used to seeing, and if that’s the case, then the standard model could work (albeit with more traces due to the noise contributed by the second parallel instance of round logic).

Thank you for your answers guys.

Alex - I understand your point about the mixcolumns, but I saw on the CW GitHub that there is a model that uses mixcolumns:

class Round1Round2StateDiff_KeyMix(AESLeakageHelper):
    name = 'HD: AES Round1/Round2 State diff for key addition'
    def leakage(self, pt, ct, key, bnum):
        state = [pt[i] ^ key[i] for i in range(0, 16)]
        state1 = state[:]
        state = self.subbytes(state)
        state = self.shiftrows(state)
        state = self.mixcolumns(state)

        key2 = self.key_schedule_rounds(key, 0, 1)
        state = [state[i] ^ key2[i] for i in range(0, 16)]

        return state[bnum] ^ state1[bnum]

What is the difference? What am I missing? Doesn’t this model work?

Jean-Pierre - this engine is really doing 2 rounds per cycle, no pipelined rounds.

Thanks again,

Hi Ohad,

There are a few models in there that are definitely non-functional. Anything in there that uses more than one byte of the key won’t work. There are some comments/exceptions in there indicating that there were plans to allow an “attack” with a fully known key (basically you know every byte of the key except the one that you’re guessing), in which case you can easily perform the mixcolumns and key scheduling operations, but I don’t think this was ever actually implemented.
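
As a rough sketch of what that “known key except one byte” idea could look like (this is not something Analyzer provides; `known_key_candidates` is a hypothetical helper and the key is a placeholder), you would build 256 full-key candidates that differ only in the byte being guessed. Each full candidate could then be run through the key schedule and mixcolumns, since all 16 bytes are defined.

```python
def known_key_candidates(known_key, bnum):
    """Yield (guess, full_key) pairs where only byte `bnum` is replaced."""
    for guess in range(256):
        candidate = list(known_key)
        candidate[bnum] = guess
        yield guess, candidate

# Placeholder 16-byte key, for illustration only.
known_key = list(range(16))
candidates = list(known_key_candidates(known_key, bnum=5))
print(len(candidates))  # 256
```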


Thank you for this clarification.