How is it possible for the MixColumns step to reveal the state of the inverse S-box?

I revisited a CPA I performed on the ATSAMD21 and can’t figure out something fundamental, so I was hoping for some guidance. The target performs AES-128 decryption.
Basically, when I performed the CPA on the InvSubBytesAndXOR step, I was getting very low PGE on the actual target. However, when I performed it on the InvMixColumns step, the PGE was very high and I was able to find the solution without knowing the key in advance.
What I can’t figure out is why this would be. In inverse AES, by the time InvMixColumns runs, the block has been XORed with round 9’s key, not just round 10’s. I confirmed on my target that I’m correctly identifying the various inverse AES steps in the power trace. I also found code that I feel is likely the source they used.

All I can think of is that some sort of cache write-back is causing this, but that still doesn’t make sense, as that memory block is overwritten by the InvMixColumns step.
Any thoughts? Here is the code. I marked with “<----” where my CPAs took place. Take a look at the comments in InvMixColumns for where the CPA worked.

Thank you in advance.

void InvCipher( unsigned char * block, unsigned char * expandedKey )
{
	unsigned char round = ROUNDS-1;
	int count;
	expandedKey += BLOCKSIZE * ROUNDS;
	XORBytes( block, expandedKey, 16 );
	expandedKey -= BLOCKSIZE;
	do {
		InvShiftRows( block );
		InvSubBytesAndXOR( block, expandedKey, 16 );  // <---- Low PGE
		expandedKey -= BLOCKSIZE;
		InvMixColumns( block );    // <---- Very High PGE, see InvMixColumns for detail
	} while( --round );
	InvShiftRows( block );
	InvSubBytesAndXOR( block, expandedKey, 16 );
}

The combined InvSubBytes and round key step:

void InvSubBytesAndXOR( unsigned char * bytes, unsigned char * key, unsigned char count )
{
	do {
		// *bytes = sBoxInv[ *bytes ] ^ *key; // Inverse substitute every byte in state and add key.
		*bytes = block2[ *bytes ] ^ *key; 		// Use block2 directly. Increases speed.
		bytes++;	// Advance to the next state byte.
		key++;		// Advance to the next round key byte.
	} while( --count );
}


void InvMixColumns( unsigned char * state )
{
	// <---- Align here and I get Key Round 10 bytes 0, 13, 10, 7 (InvShiftRows order)
	InvMixColumn( state + 0*4 );
	// <---- Align here and I get Key Round 10 bytes 4, 1, 14, 11
	InvMixColumn( state + 1*4 );
	// <---- Align here and I get Key Round 10 bytes 8, 5, 2, 15
	InvMixColumn( state + 2*4 );
	// <---- Align here and I get Key Round 10 bytes 12, 9, 6, 3
	InvMixColumn( state + 3*4 );
}

void InvMixColumn( unsigned char * column )
{
	unsigned char r0, r1, r2, r3;
	// Each result starts as the XOR of the other three column bytes.
	r0 = column[1] ^ column[2] ^ column[3];
	r1 = column[0] ^ column[2] ^ column[3];
	r2 = column[0] ^ column[1] ^ column[3];
	r3 = column[0] ^ column[1] ^ column[2];
	// Double every byte in GF(2^8): column now holds 2 times the input.
	column[0] = (column[0] << 1) ^ (column[0] & 0x80 ? BPOLY : 0);
	column[1] = (column[1] << 1) ^ (column[1] & 0x80 ? BPOLY : 0);
	column[2] = (column[2] << 1) ^ (column[2] & 0x80 ? BPOLY : 0);
	column[3] = (column[3] << 1) ^ (column[3] & 0x80 ? BPOLY : 0);
	r0 ^= column[0] ^ column[1];
	r1 ^= column[1] ^ column[2];
	r2 ^= column[2] ^ column[3];
	r3 ^= column[0] ^ column[3];
	// Double again: column now holds 4 times the input.
	column[0] = (column[0] << 1) ^ (column[0] & 0x80 ? BPOLY : 0);
	column[1] = (column[1] << 1) ^ (column[1] & 0x80 ? BPOLY : 0);
	column[2] = (column[2] << 1) ^ (column[2] & 0x80 ? BPOLY : 0);
	column[3] = (column[3] << 1) ^ (column[3] & 0x80 ? BPOLY : 0);
	r0 ^= column[0] ^ column[2];
	r1 ^= column[1] ^ column[3];
	r2 ^= column[0] ^ column[2];
	r3 ^= column[1] ^ column[3];
	// Double a third time: column now holds 8 times the input.
	column[0] = (column[0] << 1) ^ (column[0] & 0x80 ? BPOLY : 0);
	column[1] = (column[1] << 1) ^ (column[1] & 0x80 ? BPOLY : 0);
	column[2] = (column[2] << 1) ^ (column[2] & 0x80 ? BPOLY : 0);
	column[3] = (column[3] << 1) ^ (column[3] & 0x80 ? BPOLY : 0);
	// column[0] becomes 8 times the XOR of all four inputs, a term shared by every output.
	column[0] ^= column[1] ^ column[2] ^ column[3];
	r0 ^= column[0];
	r1 ^= column[0];
	r2 ^= column[0];
	r3 ^= column[0];
	// Write the results back into the state.
	column[0] = r0;
	column[1] = r1;
	column[2] = r2;
	column[3] = r3;
}
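
The byte orders noted in the InvMixColumns comments follow from InvShiftRows. As a minimal Python sketch (the helper names are made up for illustration, and it assumes the usual column-major AES state layout, which is what the state + n*4 column pointers suggest), the pre-shift index of the byte that lands in each position of a column can be computed like this:

```python
def inv_shift_rows_source(column, row):
    """Index, before InvShiftRows, of the byte that ends up at state[4*column + row].

    Assumes a column-major state: byte index = 4*column + row.
    InvShiftRows rotates row r to the right by r columns.
    """
    return 4 * ((column - row) % 4) + row


def source_bytes_for_column(column):
    """Pre-shift indices of the four bytes one InvMixColumn call works on."""
    return [inv_shift_rows_source(column, row) for row in range(4)]


if __name__ == "__main__":
    for column in range(4):
        print(f"InvMixColumn(state + {column}*4): bytes {source_bytes_for_column(column)}")
```

For columns 0 to 3 this gives 0, 13, 10, 7; then 4, 1, 14, 11; then 8, 5, 2, 15; then 12, 9, 6, 3, which matches the four alignment points marked in the code.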

I am somewhat confused by this question, so let me ask you more questions rather than answering :slight_smile:

You have AES-128 decryption and you are trying to recover the round key of round 10 by key search, right?

What does PGE stand for?

Which intermediate data are you trying to exploit the leakage of? Is it from InvSubBytesAndXOR or from InvMixColumns? At that step of the decryption, not only the round 10 key has been used but also the round 9 key. I don’t see how a key search on this data can be performed.
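
To make the role of the round 9 key concrete: ignoring the InvShiftRows reordering, the byte that InvSubBytesAndXOR stores is InvSBox(ct ^ k10) ^ k9, so a model built on the inverse S-box output differs from the stored byte by a fixed XOR. The Python sketch below is only an illustration under a pure Hamming-weight leakage assumption with uniformly distributed bytes. The function names and the example k9 values are hypothetical and not taken from the target. It computes how strongly HW(x) correlates with HW(x ^ k9) for a given k9 byte.

```python
import math


def hamming_weight(value):
    """Number of set bits in an 8-bit value."""
    return bin(value & 0xFF).count("1")


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def model_vs_xored_state(k9_byte):
    """Correlation between HW(x) and HW(x ^ k9_byte) over every byte value x.

    x stands in for an inverse S-box output byte; x ^ k9_byte is the byte
    that InvSubBytesAndXOR stores and InvMixColumn later loads.
    """
    predicted = [hamming_weight(x) for x in range(256)]
    stored = [hamming_weight(x ^ k9_byte) for x in range(256)]
    return pearson(predicted, stored)


if __name__ == "__main__":
    # Illustrative values only, not real key bytes.
    for k9_byte in (0x00, 0x01, 0x0F, 0xFF):
        print(f"k9 byte 0x{k9_byte:02X}: {model_vs_xored_state(k9_byte):+.2f}")
```

Under those assumptions the correlation works out to 1 - HW(k9)/4, so it vanishes only when the round 9 key byte has exactly four bits set. Whether this accounts for what was measured on the SAMD21 is a separate question that the sketch does not answer.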

Or do you mean you are targeting some intermediate data before the round key of round 9 is used, while you know the interval you target in the trace corresponds to the InvMixColumns?

I’ve already recovered round 10’s key using a CPA; I’m just looking for clarification as to why my approach actually gave better PGE. PGE is Partial Guessing Entropy and is a sign of how well your guess stands out in relation to other guesses. In the ChipWhisperer suite, a high PGE difference shows there is a pretty good chance this number is the key’s value.

From the code posted, InvSubBytesAndXOR is a combined InverseSubByte and Add Round Key step of the host I attacked. I was not getting a strong PGE from performing a CPA on this step, but got a huge PGE from attacking the mixcolumns step, thus extracting round 10’s keys.

My question is how can this be?
For more about this attack, I have written it up here:


Interesting write up! It’s always good to see someone make the jump from the target boards, where we take care of all the hardware setup to other devices, especially when you have to deal with additional complexities like triggering, trace realignment, and bypassing internal regulators :slight_smile: .

It’s definitely super weird that you see leakage on the MixColumns step and not on the XOR. What sort of leakage model are you using for your analysis? In addition, how did you assign parts of the power trace to the AES operations?

Also, a small clarification: I believe you’re mixing up correlation and PGE. Correlation is the number we use to rank our guesses, so a high correlation means a good key guess. If you’ve got a big difference in correlation between your best and second-best guess, that’s a good way to tell you’ve got the correct key. PGE, on the other hand, is just how many wrong key guesses are ranked above the correct one.
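
As a small illustration of that distinction, here is a Python sketch that takes one correlation score per key guess and reports both numbers. The function names are made up, the scores in the demo are invented values rather than measurements, and ranking by absolute correlation is an assumption about how the guesses are ordered.

```python
def rank_guesses(correlations):
    """Key guesses ordered from strongest to weakest absolute correlation."""
    return sorted(
        range(len(correlations)),
        key=lambda guess: abs(correlations[guess]),
        reverse=True,
    )


def guess_rank(correlations, correct_key):
    """How many wrong guesses are ranked above the correct key (0 = ranked first)."""
    return rank_guesses(correlations).index(correct_key)


def correlation_margin(correlations):
    """Gap in absolute correlation between the best and second-best guess."""
    ranked = rank_guesses(correlations)
    return abs(correlations[ranked[0]]) - abs(correlations[ranked[1]])


if __name__ == "__main__":
    # Invented scores for one key byte: a flat floor with two stand-out guesses.
    scores = [0.02] * 256
    scores[0x2B] = 0.41
    scores[0x7E] = 0.09
    print("rank if the key byte is 0x2B:", guess_rank(scores, 0x2B))
    print("rank if the key byte is 0x7E:", guess_rank(scores, 0x7E))
    print("best vs. second-best margin:", round(correlation_margin(scores), 2))
```

The rank needs the correct key to be known in advance, while the margin can be read off an attack on an unknown key, which is why the margin is the useful indicator in a real attack.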

Alright, I think I understand the question better now after reading the blog post (nice work btw!). You know which part of the trace belongs to which processing step (same question as Alex: how?), and you are targeting the inverse S-box to recover the round 10 subkey. You get better results when performing the attack on what should be the InvMixColumns interval than when you target what should be the InvSubBytesAndXOR interval. Right?

Well, that seems strange indeed. But as you now know the key, you can run CPA on various intermediate data at the start of the decryption. By investigating which data leaks in which interval, you may get more detailed insight into what happens in each part of the trace.

My first question would be: what leakage can you see in the interval you identify as InvMixColumn? There should be the inverse S-box leakage, as you exploited it. Do you also see the actual InvMixColumn leakage? Do you see the InvMixColumn leakage somewhere later in the trace as well?
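
One possible way to run that known-key check is sketched below in Python. Everything here is an assumption for illustration: the function names, the key bytes, and the simulated two-sample traces in the demo are made up, and a plain Hamming-weight model is used. For one state byte it computes the intermediates ct ^ k10, InvSBox(ct ^ k10), and InvSBox(ct ^ k10) ^ k9, then correlates each with every sample to show which value leaks where.

```python
import math
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a = ((a << 1) & 0xFF) ^ (0x1B if a & 0x80 else 0)
    return result


def build_inv_sbox():
    """Derive the AES inverse S-box from the S-box definition."""
    inv_sbox = [0] * 256
    for x in range(256):
        # Multiplicative inverse in GF(2^8); 0 maps to 0 by convention.
        inverse = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inverse
        for shift in range(1, 5):
            s ^= ((inverse << shift) | (inverse >> (8 - shift))) & 0xFF
        inv_sbox[s ^ 0x63] = x
    return inv_sbox


INV_SBOX = build_inv_sbox()


def hamming_weight(value):
    return bin(value & 0xFF).count("1")


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def known_key_intermediates(ct_byte, k10_byte, k9_byte):
    """Values one state byte takes at the start of decryption (byte order ignored)."""
    after_first_xor = ct_byte ^ k10_byte
    after_inv_sbox = INV_SBOX[after_first_xor]
    return {
        "ct^k10": after_first_xor,
        "InvSBox": after_inv_sbox,
        "InvSBox^k9": after_inv_sbox ^ k9_byte,
    }


def leakage_profile(ct_bytes, traces, k10_byte, k9_byte):
    """Correlate the Hamming weight of each intermediate with every trace sample."""
    per_trace = [known_key_intermediates(ct, k10_byte, k9_byte) for ct in ct_bytes]
    profile = {}
    for name in per_trace[0]:
        model = [hamming_weight(values[name]) for values in per_trace]
        profile[name] = [
            pearson(model, [trace[s] for trace in traces])
            for s in range(len(traces[0]))
        ]
    return profile


if __name__ == "__main__":
    rng = random.Random(0)
    k10_byte, k9_byte = 0x3C, 0x81  # made-up key bytes
    ct_bytes = [rng.randrange(256) for _ in range(1000)]
    # Simulated traces: sample 0 leaks the S-box output, sample 1 the stored byte.
    traces = []
    for ct in ct_bytes:
        v = known_key_intermediates(ct, k10_byte, k9_byte)
        traces.append([
            hamming_weight(v["InvSBox"]) + rng.gauss(0, 1),
            hamming_weight(v["InvSBox^k9"]) + rng.gauss(0, 1),
        ])
    for name, row in leakage_profile(ct_bytes, traces, k10_byte, k9_byte).items():
        print(name, [round(c, 2) for c in row])
```

On real captures, traces would hold the measured samples and the profile would be plotted over the whole window, so that the InvSubBytesAndXOR and InvMixColumns intervals can be compared directly.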

First off, thank you for all the hard work you’ve put into the ChipWhisperer. I really appreciate it.
You are right about PGE. It’s the label on the correlation in my sample, and I’ve always just called it that. I’ll fix it in my article too.

For the article, I was explaining inverse AES for completeness and realized my folly. However, I was getting better correlation from the mix columns step and not the inverse S-box. It’s baffling. If I remember correctly, I got about 4 to 5 wrong values attacking the inverse S-box. Once I aligned on the mix columns step and attacked that, I was able to successfully get all of the values but one, which I ended up brute forcing. I had to align on each of the four 4-byte InvMixColumn operations, but the results were clear as day.

For the article, I wanted to get some better traces and double-check my work, so I patched the bootloader to pulse a GPIO around both InvSubBytesAndXOR calls, which confirmed that I was correctly labeling all stages of the inverse AES. Below are the envelopes around the InvSubBytesAndXOR function for your viewing. When I aligned on the following InvMixColumns function, which obviously has a vastly different power trace, the key extraction was quick and the correlation differentials were high.
I’m using leak_model = cwa.leakage_models.inverse_sbox_output, straight out of the 256-bit bootloader example.
Honestly, I didn’t want to spend a lot of time setting the whole workbench up again, but I figured maybe someone has seen this before, or perhaps it’s something for others to look into. I’ll revisit it as time permits, but it definitely doesn’t make sense.
It may be easier to try this on a different Atmel target with a UFO board. The code is available through Atmel START, but I feel the vendor tweaked it a bit for the ARM M0. A GitHub search for InvSubBytesAndXOR seems to show the same algorithm too.
I have some of my original writeup, from before I realized this issue, located here: It details a portion of the mix columns step where I realigned to CPA those 4-byte key groups. My power trace contained only a small portion of the InvMixColumns function; a plot of one I used for the CPA is below. I probably still have the mix column trace data and the extracted keys if NewAE wants them. Of course, they are not for distribution. Direct contact information is on my website.

[Images: scope_invsbox_detail, scope_invsbox_location]

My bet would be some optimization related to the compiler options, especially as I see the loop and the naive implementation.

Thank you for looking into this. I’m actually working from the disassembly, but I identified the C library it was built from. In the InvMixColumn/InvMixColumns step, there are no loads from a past state. The C code is fairly explicit and gets converted to ARM assembly really well. I just double-checked.

I’ll have to do fairly extensive work to actually read the data off the original device. I obtained the bootloader’s firmware by decrypting the bootloader’s update file and then programming it onto a secondary chip. I think my next course of action is to patch the app to allow me to read the real device and ensure the real bootloader is not any different. I guess it’s possible it has some debug code turned on in the mix columns stage.

The following is the disassembly. The only data passed in is the current *digest via R0.
Thank you again!

ROM:00000412 @ unsigned __int8 *__fastcall InvMixColumns(unsigned __int8 *state)
ROM:00000412 InvMixColumns: @ CODE XREF: InvCipher+28↓p
ROM:00000412 PUSH {R4,LR}
ROM:00000414 MOVS R4, R0
ROM:00000416 BL InvMixColumn
ROM:0000041A ADDS R0, R4, #4
ROM:0000041C BL InvMixColumn
ROM:00000420 MOVS R0, R4
ROM:00000422 ADDS R0, #8
ROM:00000424 BL InvMixColumn
ROM:00000428 ADDS R4, #0xC
ROM:0000042A MOVS R0, R4
ROM:0000042C BL InvMixColumn
ROM:00000430 POP {R4,PC}
ROM:00000430 @ End of function InvMixColumns

ROM:000001A8 InvMixColumn: @ CODE XREF: InvMixColumns
ROM:000001A8 @ InvMixColumns
ROM:000001A8 var_28 = -0x28
ROM:000001A8 var_27 = -0x27
ROM:000001A8 var_26 = -0x26
ROM:000001A8 var_25 = -0x25
ROM:000001A8 var_24 = -0x24
ROM:000001A8 var_20 = -0x20
ROM:000001A8 var_1C = -0x1C
ROM:000001A8 var_18 = -0x18
ROM:000001A8 PUSH {R4-R7,LR}
ROM:000001AA SUB SP, SP, #0x14
ROM:000001AC LDRB R1, [R0,#3]
ROM:000001AE MOV R2, SP
ROM:000001B0 STRB R1, [R2,#0x28+var_27]
ROM:000001B2 LDRB R5, [R0,#2]
ROM:000001B4 LDRB R1, [R0,#1]
ROM:000001B6 STRB R1, [R2,#0x28+var_26]
ROM:000001B8 MOV R1, SP
ROM:000001BA LDRB R2, [R1,#0x28+var_26]
ROM:000001BC EORS R2, R5
ROM:000001BE LDRB R1, [R1,#0x28+var_27]
ROM:000001C0 EORS R1, R2
ROM:000001C2 LDRB R2, [R0]
ROM:000001C4 MOVS R3, R5
ROM:000001C6 EORS R3, R2
ROM:000001C8 MOV R4, SP
ROM:000001CA LDRB R4, [R4,#0x28+var_27]
ROM:000001CC EORS R4, R3
ROM:000001CE MOV R3, SP
ROM:000001D0 LDRB R3, [R3,#0x28+var_26]
ROM:000001D2 EORS R3, R2
ROM:000001D4 MOV R6, SP
ROM:000001D6 LDRB R6, [R6,#0x28+var_27]
ROM:000001D8 EORS R6, R3
ROM:000001DA STR R6, [SP,#0x28+var_18]
ROM:000001DC EORS R3, R5
ROM:000001DE STR R3, [SP,#0x28+var_20]
ROM:000001E0 MOVS R3, #0
ROM:000001E2 LSLS R6, R2, #0x18
ROM:000001E4 BPL loc_1EE
ROM:000001E6 MOVS R6, #0x1B
ROM:000001E8 MOV R7, SP
ROM:000001EA STRB R6, [R7,#0x28+var_25]
ROM:000001EC B loc_1F2
ROM:000001EE @ ---------------------------------------------------------------------------
ROM:000001EE loc_1EE: @ CODE XREF: InvMixColumn+3C↑j
ROM:000001EE MOV R6, SP
ROM:000001F0 STRB R3, [R6,#0x28+var_25]
ROM:000001F2 loc_1F2: @ CODE XREF: InvMixColumn+44↑j
ROM:000001F2 LSLS R2, R2, #1
ROM:000001F4 MOV R6, SP
ROM:000001F6 LDRB R6, [R6,#0x28+var_25]
ROM:000001F8 EORS R6, R2
ROM:000001FA MOV R2, SP
ROM:000001FC STRB R6, [R2,#0x28+var_28]
ROM:000001FE LDRB R2, [R2,#0x28+var_28]
ROM:00000200 STRB R2, [R0]
ROM:00000202 MOV R2, SP
ROM:00000204 LDRB R2, [R2,#0x28+var_26]
ROM:00000206 LSLS R2, R2, #0x18
ROM:00000208 BPL loc_20E
ROM:0000020A MOVS R2, #0x1B
ROM:0000020C B loc_210
ROM:0000020E @ ---------------------------------------------------------------------------
ROM:0000020E loc_20E: @ CODE XREF: InvMixColumn+60↑j
ROM:0000020E MOVS R2, #0
ROM:00000210 loc_210: @ CODE XREF: InvMixColumn+64↑j
ROM:00000210 MOV R6, SP
ROM:00000212 LDRB R6, [R6,#0x28+var_26]
ROM:00000214 LSLS R6, R6, #1
ROM:00000216 EORS R2, R6
ROM:00000218 MOV R6, SP
ROM:0000021A STRB R2, [R6,#0x28+var_25]
ROM:0000021C MOV R2, SP
ROM:0000021E LDRB R2, [R2,#0x28+var_25]
ROM:00000220 STRB R2, [R0,#1]
ROM:00000222 LSLS R2, R5, #0x18
ROM:00000224 BPL loc_22A
ROM:00000226 MOVS R2, #0x1B
ROM:00000228 B loc_22C
ROM:0000022A @ ---------------------------------------------------------------------------
ROM:0000022A loc_22A: @ CODE XREF: InvMixColumn+7C↑j
ROM:0000022A MOVS R2, #0
ROM:0000022C loc_22C: @ CODE XREF: InvMixColumn+80↑j
ROM:0000022C LSLS R5, R5, #1
ROM:0000022E EORS R2, R5
ROM:00000230 STRB R2, [R0,#2]
ROM:00000232 MOV R5, SP
ROM:00000234 LDRB R5, [R5,#0x28+var_27]
ROM:00000236 LSLS R5, R5, #0x18
ROM:00000238 BPL loc_23E
ROM:0000023A MOVS R5, #0x1B
ROM:0000023C B loc_240
ROM:0000023E @ ---------------------------------------------------------------------------
ROM:0000023E loc_23E: @ CODE XREF: InvMixColumn+90↑j
ROM:0000023E MOVS R5, #0
ROM:00000240 loc_240: @ CODE XREF: InvMixColumn+94↑j
ROM:00000240 LDRB R6, [R6,#1]
ROM:00000242 LSLS R6, R6, #1
ROM:00000244 EORS R5, R6
ROM:00000246 MOV R6, SP
ROM:00000248 STRB R5, [R6,#0x28+var_26]
ROM:0000024A MOV R5, SP
ROM:0000024C LDRB R5, [R5,#0x28+var_26]
ROM:0000024E STRB R5, [R0,#3]
ROM:00000250 MOV R5, SP
ROM:00000252 LDRB R5, [R5,#0x28+var_28]
ROM:00000254 EORS R5, R1
ROM:00000256 MOV R1, SP
ROM:00000258 LDRB R1, [R1,#0x28+var_25]
ROM:0000025A EORS R1, R5
ROM:0000025C STR R1, [SP,#0x28+var_24]
ROM:0000025E MOV R1, SP
ROM:00000260 LDRB R1, [R1,#0x28+var_25]
ROM:00000262 EORS R1, R4
ROM:00000264 EORS R1, R2
ROM:00000266 UXTB R1, R1
ROM:00000268 STR R1, [SP,#0x28+var_1C]
ROM:0000026A LDR R4, [SP,#0x28+var_18]
ROM:0000026C EORS R4, R2
ROM:0000026E MOV R1, SP
ROM:00000270 LDRB R1, [R1,#0x28+var_26]
ROM:00000272 EORS R1, R4
ROM:00000274 UXTB R1, R1
ROM:00000276 LDR R4, [SP,#0x28+var_20]
ROM:00000278 MOV R5, SP
ROM:0000027A LDRB R5, [R5,#0x28+var_28]
ROM:0000027C EORS R5, R4
ROM:0000027E MOV R4, SP
ROM:00000280 LDRB R6, [R4,#0x28+var_26]
ROM:00000282 EORS R6, R5
ROM:00000284 UXTB R6, R6
ROM:00000286 LDRB R4, [R4,#0x28+var_28]
ROM:00000288 LSLS R4, R4, #0x18
ROM:0000028A BPL loc_290
ROM:0000028C MOVS R4, #0x1B
ROM:0000028E B loc_292
ROM:00000290 @ ---------------------------------------------------------------------------
ROM:00000290 loc_290: @ CODE XREF: InvMixColumn+E2↑j
ROM:00000290 MOVS R4, #0
ROM:00000292 loc_292: @ CODE XREF: InvMixColumn+E6↑j
ROM:00000292 MOV R5, SP
ROM:00000294 LDRB R5, [R5,#0x28+var_28]
ROM:00000296 LSLS R5, R5, #1
ROM:00000298 EORS R4, R5
ROM:0000029A MOV R5, SP
ROM:0000029C STRB R4, [R5,#0x28+var_27]
ROM:0000029E MOV R4, SP
ROM:000002A0 LDRB R4, [R4,#0x28+var_27]
ROM:000002A2 STRB R4, [R0]
ROM:000002A4 MOV R4, SP
ROM:000002A6 LDRB R4, [R4,#0x28+var_25]
ROM:000002A8 LSLS R4, R4, #0x18
ROM:000002AA BPL loc_2B0
ROM:000002AC MOVS R4, #0x1B
ROM:000002AE B loc_2B2
ROM:000002B0 @ ---------------------------------------------------------------------------
ROM:000002B0 loc_2B0: @ CODE XREF: InvMixColumn+102↑j
ROM:000002B0 MOVS R4, #0
ROM:000002B2 loc_2B2: @ CODE XREF: InvMixColumn+106↑j
ROM:000002B2 LDRB R5, [R5,#3]
ROM:000002B4 LSLS R5, R5, #1
ROM:000002B6 EORS R4, R5
ROM:000002B8 MOV R5, SP
ROM:000002BA STRB R4, [R5,#0x28+var_28]
ROM:000002BC MOV R4, SP
ROM:000002BE LDRB R4, [R4,#0x28+var_28]
ROM:000002C0 STRB R4, [R0,#1]
ROM:000002C2 LSLS R4, R2, #0x18
ROM:000002C4 BPL loc_2CA
ROM:000002C6 MOVS R4, #0x1B
ROM:000002C8 B loc_2CC
ROM:000002CA @ ---------------------------------------------------------------------------
ROM:000002CA loc_2CA: @ CODE XREF: InvMixColumn+11C↑j
ROM:000002CA MOVS R4, #0
ROM:000002CC loc_2CC: @ CODE XREF: InvMixColumn+120↑j
ROM:000002CC LSLS R2, R2, #1
ROM:000002CE EORS R4, R2
ROM:000002D0 STRB R4, [R0,#2]
ROM:000002D2 MOV R2, SP
ROM:000002D4 LDRB R2, [R2,#0x28+var_26]
ROM:000002D6 LSLS R2, R2, #0x18
ROM:000002D8 BPL loc_2DE
ROM:000002DA MOVS R5, #0x1B
ROM:000002DC B loc_2E0
ROM:000002DE @ ---------------------------------------------------------------------------
ROM:000002DE loc_2DE: @ CODE XREF: InvMixColumn+130↑j
ROM:000002DE MOVS R5, #0
ROM:000002E0 loc_2E0: @ CODE XREF: InvMixColumn+134↑j
ROM:000002E0 MOV R2, SP
ROM:000002E2 LDRB R2, [R2,#0x28+var_26]
ROM:000002E4 LSLS R2, R2, #1
ROM:000002E6 EORS R5, R2
ROM:000002E8 STRB R5, [R0,#3]
ROM:000002EA LDR R2, [SP,#0x28+var_24]
ROM:000002EC MOV R7, SP
ROM:000002EE LDRB R7, [R7,#0x28+var_27]
ROM:000002F0 EORS R7, R2
ROM:000002F2 EORS R7, R4
ROM:000002F4 UXTB R7, R7
ROM:000002F6 STR R7, [SP,#0x28+var_20]
ROM:000002F8 LDR R2, [SP,#0x28+var_1C]
ROM:000002FA MOV R7, SP
ROM:000002FC LDRB R7, [R7,#0x28+var_28]
ROM:000002FE EORS R7, R2
ROM:00000300 EORS R7, R5
ROM:00000302 UXTB R7, R7
ROM:00000304 STR R7, [SP,#0x28+var_24]
ROM:00000306 MOV R2, SP
ROM:00000308 LDRB R2, [R2,#0x28+var_27]
ROM:0000030A EORS R2, R1
ROM:0000030C EORS R2, R4
ROM:0000030E UXTB R2, R2
ROM:00000310 MOV R1, SP
ROM:00000312 LDRB R1, [R1,#0x28+var_28]
ROM:00000314 EORS R1, R6
ROM:00000316 EORS R1, R5
ROM:00000318 UXTB R1, R1
ROM:0000031A MOV R6, SP
ROM:0000031C LDRB R6, [R6,#0x28+var_27]
ROM:0000031E LSLS R6, R6, #0x18
ROM:00000320 BPL loc_326
ROM:00000322 MOVS R6, #0x1B
ROM:00000324 B loc_328
ROM:00000326 @ ---------------------------------------------------------------------------
ROM:00000326 loc_326: @ CODE XREF: InvMixColumn+178↑j
ROM:00000326 MOVS R6, #0
ROM:00000328 loc_328: @ CODE XREF: InvMixColumn+17C↑j
ROM:00000328 MOV R7, SP
ROM:0000032A LDRB R7, [R7,#0x28+var_27]
ROM:0000032C LSLS R7, R7, #1
ROM:0000032E EORS R6, R7
ROM:00000330 MOV R7, SP
ROM:00000332 STRB R6, [R7,#0x28+var_26]
ROM:00000334 MOV R6, SP
ROM:00000336 LDRB R6, [R6,#0x28+var_26]
ROM:00000338 STRB R6, [R0]
ROM:0000033A MOV R6, SP
ROM:0000033C LDRB R6, [R6,#0x28+var_28]
ROM:0000033E LSLS R6, R6, #0x18
ROM:00000340 BPL loc_346
ROM:00000342 MOVS R6, #0x1B
ROM:00000344 B loc_348
ROM:00000346 @ ---------------------------------------------------------------------------
ROM:00000346 loc_346: @ CODE XREF: InvMixColumn+198↑j
ROM:00000346 MOVS R6, #0
ROM:00000348 loc_348: @ CODE XREF: InvMixColumn+19C↑j
ROM:00000348 LDRB R7, [R7]
ROM:0000034A LSLS R7, R7, #1
ROM:0000034C EORS R6, R7
ROM:0000034E MOV R7, SP
ROM:00000350 STRB R6, [R7,#0x28+var_27]
ROM:00000352 MOV R6, SP
ROM:00000354 LDRB R6, [R6,#0x28+var_27]
ROM:00000356 STRB R6, [R0,#1]
ROM:00000358 LSLS R6, R4, #0x18
ROM:0000035A BPL loc_360
ROM:0000035C MOVS R6, #0x1B
ROM:0000035E B loc_362
ROM:00000360 @ ---------------------------------------------------------------------------
ROM:00000360 loc_360: @ CODE XREF: InvMixColumn+1B2↑j
ROM:00000360 MOVS R6, #0
ROM:00000362 loc_362: @ CODE XREF: InvMixColumn+1B6↑j
ROM:00000362 LSLS R4, R4, #1
ROM:00000364 EORS R6, R4
ROM:00000366 MOV R4, SP
ROM:00000368 STRB R6, [R4,#0x28+var_28]
ROM:0000036A LDRB R4, [R4,#0x28+var_28]
ROM:0000036C STRB R4, [R0,#2]
ROM:0000036E LSLS R4, R5, #0x18
ROM:00000370 BPL loc_374
ROM:00000372 MOVS R3, #0x1B
ROM:00000374 loc_374: @ CODE XREF: InvMixColumn+1C8↑j
ROM:00000374 LSLS R4, R5, #1
ROM:00000376 EORS R3, R4
ROM:00000378 STRB R3, [R0,#3]
ROM:0000037A MOV R4, SP
ROM:0000037C LDRB R4, [R4,#0x28+var_26]
ROM:0000037E MOV R5, SP
ROM:00000380 LDRB R5, [R5,#0x28+var_27]
ROM:00000382 MOV R6, SP
ROM:00000384 LDRB R6, [R6,#0x28+var_28]
ROM:00000386 EORS R6, R5
ROM:00000388 EORS R3, R6
ROM:0000038A EORS R3, R4
ROM:0000038C STRB R3, [R0]
ROM:0000038E LDRB R3, [R0]
ROM:00000390 LDR R4, [SP,#0x28+var_24]
ROM:00000392 EORS R4, R3
ROM:00000394 UXTB R4, R4
ROM:00000396 EORS R2, R3
ROM:00000398 EORS R1, R3
ROM:0000039A LDR R5, [SP,#0x28+var_20]
ROM:0000039C EORS R5, R3
ROM:0000039E STRB R5, [R0]
ROM:000003A0 STRB R4, [R0,#1]
ROM:000003A2 STRB R2, [R0,#2]
ROM:000003A4 STRB R1, [R0,#3]
ROM:000003A6 ADD SP, SP, #0x14
ROM:000003A8 POP {R4-R7,PC}

This wouldn’t be the first time that a CPA attack deviated from theory :slight_smile:

The S-box load should be completely separate from the following key XOR, since I don’t think the M0 can use a memory load as an operand for XOR.

Really, the CPA results you’re seeing seem to be completely at odds with what you’d expect from the assembly.

I finally had a chance to extract the original bootloader from the target, and it is a binary match with the bootloader I tested the stages with. Therefore, either the target somehow leaks during the mix columns stage, or perhaps there is some odd caching leakage on the SAMD21.
I’m not really sure what to do next unless I want to turn this into a research project.