Using HW encryption on STM32F4 target

Hi,

I bought an STM32F415RGT6 chip that I soldered on a blank STM32F target board.
When I compile the firmware for this target it works, although I have to go through OpenOCD because the STM32 programmer throws an exception while programming (it enters bootloader mode but fails later). I can capture without any error, so the board seems to work.
But when I look at the code, it uses the TinyAES128 implementation and not the hardware acceleration. Is anyone experienced in using the STM32F CRYP accelerator?
So far I have created a new target, CW308T_STM32F415, which sets the HWCRYPTO flag with the stm32f4 HAL, and I added the 3 required functions to stm32f4_hal.c (HW_AES128_Init, HW_AES128_LoadKey and HW_AES128_Enc). But the code I wrote doesn’t work (the Init function reboots the CPU).

Replying to myself.
Implementation done and working: github.com/jmichelp/chipwhisper … bd13bcc321

I wonder if you can tell me what frequency you were running the STM32F4 at.

I thought the ChipWhisperer might not be able to sample fast enough to analyse this device?

The frequency has to match the one you set when you compile the firmware because otherwise your UART communication will not work.
I typically stick to the Chipwhisperer default frequency of 7.37MHz or I just round it up to 8MHz.
This is of course when you do synchronous measurement (i.e. ChipWhisperer provides the clock to the chip and the sampling is based on this exact same clock). If the chip comes with its own clock and you cannot sync on it, ChipWhisperer can still sample at around 100 MS/s, so you can attack chips with clocks up to 25 MHz (theoretically 50 MHz at the Shannon-Nyquist limit, but the closer you get to this limit the less reliable your attack may be).
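
The rule of thumb above can be put into numbers. This is only a sketch: the 100 MS/s figure and the factor of 4 samples per clock cycle are the ones quoted in this post, while the function name `max_target_clock_hz` is made up for illustration.

```python
def max_target_clock_hz(sample_rate_hz, samples_per_cycle=4):
    """Highest target clock that still gets `samples_per_cycle` samples per clock period."""
    if samples_per_cycle <= 0:
        raise ValueError("samples_per_cycle must be positive")
    return sample_rate_hz / samples_per_cycle


ADC_RATE_HZ = 100_000_000  # roughly 100 MS/s, the figure quoted above

practical_limit = max_target_clock_hz(ADC_RATE_HZ)    # 4 samples per clock cycle
nyquist_limit = max_target_clock_hz(ADC_RATE_HZ, 2)   # theoretical 2 samples per cycle

print(f"practical: {practical_limit / 1e6:.0f} MHz, Nyquist: {nyquist_limit / 1e6:.0f} MHz")
# prints "practical: 25 MHz, Nyquist: 50 MHz"
```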

Ah OK

That’s a problem for me because the incoming data arrives via USB, and I have to run the CPU at its normal frequency for that transfer.

So I’d need to change the clock frequency after the block transfer of incoming data, and then switch back after the decrypt function.

I’m not sure if the USB bus will drop out if I do that; I suspect it will :frowning:

Well, I don’t know exactly what you’re trying to do here but USB shouldn’t be an issue.
If your chip needs to run at 24 MHz or 48 MHz to work correctly, so be it. Either you configure ChipWhisperer to produce that clock frequency and feed it to the chip, or you have a crystal on your board and you feed its clock back to the ChipWhisperer to sync on it.
Don’t change the clock frequency. Like I said, the clock frequency is defined at compile time.
Or maybe I haven’t understood what you’re trying to do.

I am not actually using an STM32F4; I’m using a comparable chip by NXP (aka Freescale) from their K22 series.

But I don’t think there is much documentation on using ChipWhisperer with NXP devices, and my general concern was whether the ChipWhisperer is able to sample the power fast enough to handle a 120 MHz CPU.

E.g. normally the STM32F4xx uses an 8 or 16 MHz clock (or similar values) and the internal PLL multiplies that up to either 168 or 180 MHz etc., depending on the specific F4 device.

So although the ChipWhisperer is providing the external clock, the actual instruction clock is much faster, and it’s likely to be executing at least 16 instructions for every clock cycle generated by ChipWhisperer.
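
To put rough numbers on that, here is a small sketch. The clock pairs are just the example values mentioned above, not a statement about any particular board's PLL configuration, and the function name is hypothetical. It also counts core clock cycles rather than instructions, since not every instruction takes exactly one cycle.

```python
def core_cycles_per_external_cycle(external_clock_hz, core_clock_hz):
    """How many core (PLL output) cycles elapse during one external clock cycle."""
    if external_clock_hz <= 0:
        raise ValueError("external_clock_hz must be positive")
    return core_clock_hz / external_clock_hz


# (external clock, core clock) pairs in MHz, taken from the examples above.
examples_mhz = [(8, 168), (16, 180)]

ratios = [
    core_cycles_per_external_cycle(ext * 1_000_000, core * 1_000_000)
    for ext, core in examples_mhz
]

for (ext, core), ratio in zip(examples_mhz, ratios):
    print(f"{ext} MHz external -> {core} MHz core: {ratio} core cycles per external cycle")
```

If the scope samples only a few times per external clock cycle, several core cycles end up folded into each sample, which is the concern here.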

The STM32F4 derives its USB clock from its main clock, so you can’t adjust the external clock frequency without changing the PLL settings, and of course that’s not possible because the memory-mapped registers are locked.

So if the ChipWhisperer needs to run the MCU on a slower clock than the firmware is intended to run, then the USB host also needs to be slowed down to match.

I guess I can use another STM32F4 as a USB host, but I was hoping not to need to write host code in order to get the ChipWhisperer to analyze the current.
However, it looks increasingly likely that I’ll need to do that :frowning:

The main limitation with regard to clock frequency comes from the maximum sampling rate of the on-board scope on ChipWhisperer (aka OpenADC). It’s always a trade-off: to keep CW hardware affordable you need to make some compromises.
So for higher-end targets, you need a better scope. Fortunately, the CW software stack is compatible with that and you can use a Picoscope or any other VISA-compatible scope (once I have some time to finish the pull request with a working VISA plugin… cough cough) in order to lift that limitation.

All you need is a sampling rate that is at least 4 times the target clock speed. So in your case, any modern DSO should be able to meet the required 480 MS/s. You can even take advantage of the oversampling to apply a low-pass FIR filter and reduce the noise. You also need to pay attention to the sample buffer of the scope: the DSO might not have enough memory to capture the full AES encryption/decryption in one shot. But because you typically attack either the first or the last round of AES, you can simply capture the round you are targeting.
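
As a rough illustration of the two ideas above (the 4x rule and low-pass filtering an oversampled trace), here is a minimal sketch. It uses a plain moving average, which is the simplest possible FIR low-pass filter and only a stand-in for a properly designed one; the function names and the toy trace are made up for the example.

```python
def required_sample_rate_hz(target_clock_hz, samples_per_cycle=4):
    """Sampling rate needed to get `samples_per_cycle` samples per target clock cycle."""
    return target_clock_hz * samples_per_cycle


def moving_average_fir(trace, taps):
    """Boxcar FIR low-pass: each output is the mean of `taps` consecutive samples."""
    if not 1 <= taps <= len(trace):
        raise ValueError("taps must be between 1 and len(trace)")
    window_sum = sum(trace[:taps])
    filtered = [window_sum / taps]
    # Slide the window one sample at a time, updating the running sum.
    for i in range(taps, len(trace)):
        window_sum += trace[i] - trace[i - taps]
        filtered.append(window_sum / taps)
    return filtered


print(required_sample_rate_hz(120_000_000))  # 480000000, i.e. 480 MS/s

noisy = [1, 3, 1, 3, 1, 3]  # toy trace: constant level of 2 plus alternating noise
print(moving_average_fir(noisy, taps=2))  # [2.0, 2.0, 2.0, 2.0, 2.0]
```

With real traces the number of taps is a trade-off: more taps remove more noise but also smear out the short transients you are trying to measure.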

The target clock speed is 120 MHz internally to the MK22, so it’s going to require a PicoScope 6000 series, which is over $4000 on Amazon (amazon.com/Oscilloscope-Pic … B01N0BAV80)

This is well outside my budget.

I’ll need to find an alternative solution to the sampling.

Specifically, dropping the clock rate on the MK22 right down, e.g. to 1 MHz, so that I can still use the ChipWhisperer on-board scope.

But this will require writing dedicated USB host firmware, which runs at perhaps 4.8 MHz instead of 48 MHz.

I think you’re mixing up the sample rate and the bandwidth here.
For a clock at 120 MHz, you need a scope that can do 480MS/s (sampling rate). Even a Picoscope 2000 does 1GS/s.

The bandwidth is the maximum frequency that the analog frontend of the scope can handle without distortion. So ideally in your case you want it to be slightly above the clock frequency. A 200 MHz bandwidth scope is pretty common nowadays.
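
The two requirements are independent, so it can help to check them separately. A small sketch, assuming the same 4x sampling rule of thumb and "bandwidth at least the clock frequency" as the criteria; the function name `check_scope` and the example scope figures are illustrative only, not the specs of any particular model.

```python
def check_scope(target_clock_hz, sample_rate_hz, bandwidth_hz, samples_per_cycle=4):
    """Check sample rate and analog bandwidth against the target clock separately."""
    return {
        "sample_rate_ok": sample_rate_hz >= samples_per_cycle * target_clock_hz,
        "bandwidth_ok": bandwidth_hz >= target_clock_hz,
    }


TARGET_CLOCK_HZ = 120_000_000

# 1 GS/s with a 200 MHz front end: both criteria are met.
wide_frontend = check_scope(TARGET_CLOCK_HZ, 1_000_000_000, 200_000_000)

# Same sampling rate but only a 100 MHz front end: sampling is fine, bandwidth is not.
narrow_frontend = check_scope(TARGET_CLOCK_HZ, 1_000_000_000, 100_000_000)

print(wide_frontend)
print(narrow_frontend)
```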

Thanks for the information.

I’m aware of sample rates vs resolvable frequency. I have a 1 GS/s Rigol scope, which is rated at 100 MHz, as the norm is at least 10-fold oversampling.

The Picoscope 2000 is at best the same performance as my Rigol, but possibly has better USB interfacing.

So perhaps my Rigol would be fast enough if I can work out its USB interface (however, I don’t know if it’s just USB control rather than USB data transfer, as I generally don’t use it as a PC oscilloscope; I use it as a bench scope).

Either way, I’d have expected to need a 200 MHz (2 GS/s) scope to be able to resolve 120 MHz, especially as the scope needs to detect the transient current when the bits change state in the MCU.

I think I’ll need to do more analysis on how I can interface to the MK22 and also the Rigol before I proceed.

Or alternatively, reduce the clock speed on the MK22 to a much lower value, e.g. 100 kHz, so that the internal scope on the ChipWhisperer would be fast enough.
But that would require me to build a slow-speed USB host interface, probably using an STM32F4 that is also clocked at a low frequency by the ChipWhisperer, to send the encrypted data to the MK22 via USB.

Then one of my previous messages applies: I also have a Rigol (DS4054) and I have work in progress to support Rigol scopes in ChipWhisperer. No matter how you connect it to a computer (GPIB, Ethernet, USB), the software interface is the same and is called VISA (from National Instruments). There’s a broken VISA plugin in ChipWhisperer that I fixed, but I still have to implement the commands for Rigol scopes and test everything.
If you give me the model of your scope, I can make sure I add it too :slight_smile:

I have an older series DS1102D Rigol.

It’s the 100 MHz version (not a modified 50 MHz unit) and has the external 16-channel logic analyser.
I don’t use the logic analyser unit any more, because I have a 100 MHz Saleae Logic 16 (the old version, which does 1 channel at 100 MHz or 2 channels at 50 MHz etc.)

BTW, I’m fairly knowledgeable on STM32 devices, as I’m the maintainer/admin of the community Arduino Core for STM32 and I run www.stm32duino.com
I have a blog: youtube.com/user/synergie8/videos


Try this when you compile the firmware:

make PLATFORM=CW308_STM32F4 CRYPTO_TARGET=HWAES

As I got tired of compiling firmware and remembering which options I had used, I created a GitHub repository with pre-compiled firmware: github.com/jmichelp/chipwhisper … serial-aes

Thanks so much jmichel, your repo is fantastic