CW template attack

Hi!

I’m using CW Lite and trying to run an AES-128 template attack on the UFO STM32F3 target. I’ve been following the tutorial that now lives in the jupyter/archive folder, so I guess it’s no longer officially supported, but I figured I’d give it a try. It works well out of the box when I attack the Hamming weight of the SBox output: I get the correct first sub-key with around 8 attack traces using 10 000 template traces, and that drops to perhaps 4 attack traces using 100 000 template traces.

Then I tried the suggestion in the tutorial and attacked the sub-key value itself, and the results don’t look good. For the template I collect 100 000 traces with random key and random plaintext, and for the attack I collect traces with a fixed key and random plaintext. I use 5 POIs with a 5-sample “guard band” around each one (so samples immediately next to a POI are ignored). I run the attack with 30 traces and, after each trace, print the ten highest-ranked key guesses, with the best guess on the right:

[ 76 36 66 44 32 68 64 72 104 8]
[ 10 100 68 76 65 72 36 8 104 44]
[ 73 38 72 69 44 104 74 108 98 100]
[ 69 108 76 38 74 44 104 72 100 98]
[ 72 76 38 69 44 74 108 104 100 98]
[ 76 69 108 104 72 74 44 38 100 98]
[ 74 12 76 104 44 4 38 72 98 100]
[ 69 12 76 104 44 38 4 98 72 100]
[ 69 12 76 104 44 38 98 4 72 100]
[ 12 69 4 76 38 104 72 98 44 100]
[ 74 12 76 4 38 98 72 104 44 100]
[108 69 74 76 72 38 100 104 98 44]
[ 74 108 72 69 76 38 100 98 104 44]
[ 74 108 72 69 76 38 98 100 104 44]
[108 74 72 69 76 38 100 98 104 44]
[108 74 69 72 38 98 76 100 104 44]
[ 12 74 69 38 98 72 76 100 104 44]
[ 65 12 98 69 72 38 100 104 76 44]
[ 74 12 69 98 38 72 100 104 76 44]
[ 65 74 72 69 98 38 100 76 104 44]
[108 72 74 100 38 69 98 76 104 44]
[108 72 100 74 38 98 69 76 104 44]
[ 65 100 72 38 74 69 98 76 104 44]
[ 65 72 100 38 74 69 98 76 104 44]
[ 72 108 100 38 74 76 69 98 104 44]
[ 72 100 108 38 74 76 69 98 104 44]
[ 65 100 108 38 74 76 98 69 104 44]
[100 76 197 38 74 108 98 104 69 44]
[100 197 76 38 74 108 98 104 69 44]
[ 65 74 197 76 38 98 108 104 69 44]

The correct sub-key should be the well-known CW default value 43 (0x2B), which never shows up in the top 10. I understand that this model needs many more template traces than the HW model, but should I expect it to be this bad? How many traces should I expect to collect for the template build?
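
To put the trace counts in perspective, here is a rough back-of-the-envelope sketch of how many template traces land in each class for the two models. It assumes the class label (the random key byte, or the SBox output for the HW model) is uniformly distributed over 0..255, so the numbers are expected values and not measurements from my captures.

```python
from math import comb

n_template = 100_000  # template traces, as in the capture described above

# Value model: 256 classes, each hit with probability 1/256 (assumed uniform).
per_value_class = n_template / 256

# Hamming-weight model: 9 classes; class w covers comb(8, w) of the 256 byte values.
per_hw_class = {w: n_template * comb(8, w) / 256 for w in range(9)}

print(f"value model: ~{per_value_class:.0f} traces per class")
for w, n in per_hw_class.items():
    print(f"HW = {w}: ~{n:.0f} traces")
```

So each of the 256 value templates is built from roughly 390 traces. That is about what the two extreme HW classes (0 and 8) get, while the middle HW classes get tens of thousands.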

As for the code, it’s mostly copied and pasted from the CW tutorial, with some changes to adapt it to attacking the sub-key value itself. I may have made a silly Python mistake along the way, but I’ve gone over it several times and can’t find one:

import numpy as np
from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt
import chipwhisperer as cw

def cov(x, y):
    # Find the covariance between two 1D lists (x and y).
    # Note that var(x) = cov(x, x)
    return np.cov(x, y)[0][1]


#%%

project_template = cw.open_project("C:/CW_tests/Captures/STM32F303_rKEY_rPT_t100000_s5000.cwp")

tempTraces = np.empty([len(project_template.traces), len(project_template.waves[0])])
tempTraces[:,:] = project_template.waves[:][:]
tempTraces_list = tempTraces.tolist() # convert to list, convenient for sorting later on

tempPText = np.empty([len(project_template.textins), len(project_template.textins[0])])
tempPText[:,:] = project_template.textins[:][:]
tempPText = np.uint8(tempPText)

# ct = np.empty([len(project_template.textouts), len(project_template.textouts[0])])
# ct[:,:] = project_template.textouts[:][:]
# ct = np.uint8(ct)

tempKey = np.empty([len(project_template.keys), len(project_template.keys[0])])
tempKey[:,:] = project_template.keys[:][:]
tempKey = np.uint8(tempKey)


#%%
# Make 256 blank lists - one for each subkey
tempTracesSubKey = [[] for _ in range(256)]
subKey_to_find = 0  # index of the key byte to attack (0 = first sub-key)

# Group traces by the value of the targeted key byte
for i in range(len(tempTraces_list)):
    SubKey = tempKey[i][subKey_to_find]
    tempTracesSubKey[SubKey].append(tempTraces_list[i])

# Convert to numpy arrays
tempTracesSubKey=[np.array(tempTracesSubKey[SubKey]) for SubKey in range(256)]

#%%
# Find averages
tempMeans = np.zeros((256, len(tempTraces[0])))
for i in range(256):
    tempMeans[i] = np.average(tempTracesSubKey[i], 0) 
    
# Find sum of differences
tempSumDiff = np.zeros(len(tempTraces[0]))
for i in range(256):
    for j in range(i):
        tempSumDiff += np.abs(tempMeans[i] - tempMeans[j])

plt.plot(tempSumDiff)
plt.grid()
plt.show()

#%%
# Find POIs
POIs = []
numPOIs = 5
POIspacing = 5
for i in range(numPOIs):
    # Find the max
    nextPOI = tempSumDiff.argmax()
    POIs.append(nextPOI)
    
    # Make sure we don't pick a nearby value
    poiMin = max(0, nextPOI - POIspacing)
    # +1 because range() excludes its end point; this keeps the guard band symmetric
    poiMax = min(nextPOI + POIspacing + 1, len(tempSumDiff))
    for j in range(poiMin, poiMax):
        tempSumDiff[j] = 0
    
print(POIs)

#%%
# Fill up mean and covariance matrix for each sub-key
meanMatrix = np.zeros((256, numPOIs))
covMatrix  = np.zeros((256, numPOIs, numPOIs))
for SubKey in range(256):
    for i in range(numPOIs):
        # Fill in mean
        meanMatrix[SubKey][i] = tempMeans[SubKey][POIs[i]]
        for j in range(numPOIs):
            x = tempTracesSubKey[SubKey][:,POIs[i]]
            y = tempTracesSubKey[SubKey][:,POIs[j]]
            covMatrix[SubKey,i,j] = cov(x, y)
        
# print(meanMatrix)
# print(covMatrix[0])

#%%
# Load attack traces

project_attack = cw.open_project("C:/CW_tests/Captures/STM32F303_fKEY_rPT_t10000_s5000.cwp")

atkTraces = np.empty([len(project_attack.traces), len(project_attack.waves[0])])
atkTraces[:,:] = project_attack.waves[:][:]

atkPText = np.empty([len(project_attack.textins), len(project_attack.textins[0])])
atkPText[:,:] = project_attack.textins[:][:]
atkPText = np.uint8(atkPText)

# ct = np.empty([len(project_attack.textouts), len(project_attack.textouts[0])])
# ct[:,:] = project_attack.textouts[:][:]
# ct = np.uint8(ct)

atkKey = np.empty([len(project_attack.keys), len(project_attack.keys[0])])
atkKey[:,:] = project_attack.keys[:][:]
atkKey = np.uint8(atkKey)

# Use only a subset of the available traces
numTracesAtk = 30
atkTraces = atkTraces[0:numTracesAtk, :]
atkPText = atkPText[0:numTracesAtk, :]
# ct = ct[0:numTracesAtk, :]
atkKey = atkKey[0:numTracesAtk, :]

print(atkKey[0])

#%%

# Attack
# Running total of log P_k
P_k = np.zeros(256)
# for j in range(len(atkTraces)):
for j in range(numTracesAtk):
    # Grab key points and put them in a small matrix
    a = [atkTraces[j][POIs[i]] for i in range(len(POIs))]
    
    # Test each key
    for k in range(256):
        # Find p_{k,j}
        rv = multivariate_normal(meanMatrix[k], covMatrix[k])
        p_kj = rv.logpdf(a)
   
        # Add it to running total
        P_k[k] += p_kj

    # Print our top 10 results so far
    # Best match on the right
    print(P_k.argsort()[-10:])
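
Since the correct key is known here, printing its rank after each trace may be easier to read than the top-10 list. The helper below is a minimal sketch of that idea: `key_rank` is a hypothetical name, not part of ChipWhisperer, and the toy scores are random stand-ins for `P_k`, not real data.

```python
import numpy as np

def key_rank(log_scores, true_key):
    """Rank of true_key among all guesses (0 = most likely guess)."""
    log_scores = np.asarray(log_scores)
    # Count the guesses whose accumulated log-likelihood is strictly higher.
    return int(np.sum(log_scores > log_scores[true_key]))

# Toy stand-in for P_k: 256 random scores, with guess 43 forced to the top.
rng = np.random.default_rng(0)
toy_scores = rng.normal(size=256)
toy_scores[43] = toy_scores.max() + 1.0

print(key_rank(toy_scores, 43))  # 0, because guess 43 has the highest score
```

In the attack loop above, this would be called as `key_rank(P_k, atkKey[0][subKey_to_find])` after each trace. A rank that drifts towards 0 as traces are added would suggest the templates carry some information, even while the correct key is still outside the top 10.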