What is the type of the output of the Neural Network attack?

I have tried to test the profiling attack with a neural network, following this tutorial: wiki.newae.com/Profiling_Attack … l_Networks. I understand that we must use the neural network to classify our inputs, which will give us our known key values or Hamming weights to complete the attack.

1/ My first question is about the difference between having the known key values as the attack result versus the Hamming weights as the attack result.

The result I get after trying this attack is:

[ 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[[ 0.00185322] [ 0.04963187] [ 0.13283786] [ 0.23381804] [ 0.22884202] [ 0.18269986] [ 0.13702157] [ 0.03021828] [ 0.00238221]]

The output type of this attack was not clear to me; I don’t understand what kind of information I have.
Could anyone help me, please? Thanks in advance.

Hi Marita,

This was really just a fun idea I had - I have no guarantee that the attack actually works!

I think you’re asking about two different methods for training the network. Looking at one intermediate byte of the AES operation, you could either train the neural net to look for the byte’s value (0x00 to 0xFF) or its Hamming weight (0 to 8). The first option will be harder to train because the power differences will be very small, but the template attack afterwards should require fewer traces.
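
As a rough sketch of the two labelling options (this isn't the tutorial code; intermediate is just a placeholder array holding the attacked AES byte for each trace):

[code]import numpy as np

# Placeholder: the intermediate byte (e.g. sbox output) for each trace
intermediate = np.random.randint(0, 256, size=1000)

# Option 1: classify the byte value itself -> 256 classes, harder to train
labels_value = intermediate.copy()

# Option 2: classify the Hamming weight -> only 9 classes (0..8)
hw_table = np.array([bin(x).count("1") for x in range(256)])
labels_hw = hw_table[intermediate][/code]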

I think the output you’re seeing is:

  • the expected Hamming weight of the byte, encoded as a one-hot vector (here, HW = 4), and
  • the neural network’s output probabilities for each Hamming weight (here, the network thinks HW = 3 is most likely)

Dear gdeon
Thank you very much for your answer.
I have found many articles describing AES attacks based on machine learning, so I found your tutorial very interesting. But I still have some questions.
1/ I don’t understand how you are building the training dataset, or how you are making the classifier.
I don’t understand these lines in the code:

output[i, tempHW[i]] = 1 # I don’t understand this line

What is the meaning of this expression: [inputRange]*len(POIs)?

[code]# 3: Build training dataset
output = np.zeros([len(tempTraces), 9])  # One-hot signals
for i in range(numTraces):
    output[i, tempHW[i]] = 1

# 4: Try to make classifier
inputRange = [-1, 1]
POIs = [1397, 1345, 1364, 1353, 2113]
net = Net([inputRange]*len(POIs), 20, 9)[/code]

2/ How do you know that the Hamming weight of the byte is 4, and how do you know that the most likely HW for this same byte is 3?

“Hamming weight of the byte (here, HW = 4), and Neural network output probabilities for each Hamming weight (so the network thinks HW = 3 is most likely)”
Do you take this vector [ 0. 0. 0. 0. 1. 0. 0. 0. 0.], where the index of the 1 is 4, and conclude from that that the HW is 4? And when you say the most likely HW is 3, is it because that HW has the highest probability in the output vector, which looks like this:

[[ 0.00185322] [ 0.04963187] [ 0.13283786] [ 0.23381804] [ 0.22884202] [ 0.18269986] [ 0.13702157] [ 0.03021828] [ 0.00238221]]
3/ My third question: I have 499998 traces. Must I pass all of the traces as the number of epochs, like this:

net.train_many(tempTraces[0:numTraining, POIs], output[0:numTraining], 0.01, 499998, 0.001*numTraining, True)

Or must I split them into groups?

I would be very grateful if you could help me please.

I’m really struggling to remember all of the details, but I’ll do my best to answer these…

  • To make the training set, I split the traces into two groups. The first group (in the code right now, 50% of the traces) is used to train the neural net, and the second group (the remaining traces) is used to check the accuracy of the net. There’s a short sketch of this split and of the one-hot labels after this list.
  • The net’s output is a set of 9 probabilities, which represent how likely it is that the sbox-output Hamming weight is 0,
    1, 2, …, or 8. To make the training set for supervised learning, I made the expected output for each trace, which is all 0s except for a single 1 in the correct position. This is the line output[i, tempHW[i]] = 1
  • I found that training was taking a long time, making it hard to debug, so I tried cutting the traces down to 5 points of interest (which I calculated outside of this code). This means that the neural net expects 5 numbers between -1 and 1. This set of inputs is passed to the neural net via net = Net([inputRange]*len(POIs), 20, 9) (the second sketch below shows what that first argument expands to).
  • I know the expected Hamming weight for every trace because this is the Hamming weight of the SubBytes output, byte 0 - this is easy to calculate. The “most likely HW” is the one that the network indicates as most probable - in the example you posted, 0.23381804 is the highest probability, so HW=3 is the network’s guess.
  • In the train_many function, the entire data set is used to train the network. For us, that means numTraining different examples are passed to the network. However, this isn’t enough to fully train the network - it usually needs to see the same data multiple times to converge to a fully working network. The epochs value in train_many repeats the training for you - in my case, I ran it 2000 times. You probably don’t need to run all of the traces 499998 times. Maybe numTraining should be around 250000 traces for you (but this still sounds like overkill to me…).
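
In case a concrete example helps, here’s a minimal sketch of the train/test split and the one-hot labels. The names follow the tutorial code (tempTraces, tempHW, numTraining), but the sizes and data here are just placeholders:

[code]import numpy as np

# Placeholder data standing in for the tutorial's arrays
numTraces = 1000                                   # total number of traces (placeholder)
tempTraces = np.random.rand(numTraces, 3000)       # hypothetical power traces
tempHW = np.random.randint(0, 9, size=numTraces)   # HW of the sbox output, per trace

# Split: the first half trains the net, the second half checks its accuracy
numTraining = numTraces // 2

# One-hot labels: row i is all zeros except a 1 at index tempHW[i]
output = np.zeros([numTraces, 9])
for i in range(numTraces):
    output[i, tempHW[i]] = 1

# e.g. if tempHW[0] == 4, then output[0] is [0 0 0 0 1 0 0 0 0][/code]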
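And a small sketch of the [inputRange]*len(POIs) expression. Net itself is the class from the tutorial, so I’ve only shown the argument being built, not the class:

[code]inputRange = [-1, 1]
POIs = [1397, 1345, 1364, 1353, 2113]

# Repeating [-1, 1] once per point of interest tells the net to expect
# 5 inputs, each scaled to the range -1..1:
print([inputRange] * len(POIs))
# -> [[-1, 1], [-1, 1], [-1, 1], [-1, 1], [-1, 1]]

# So Net([inputRange]*len(POIs), 20, 9) is a net with 5 inputs,
# 20 hidden units, and 9 outputs (one per Hamming weight 0..8).[/code]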

I don’t have much experience with machine learning - hopefully you can find a network/training setup that works better than mine did! Keep me posted.