Countermeasure against AES?

NewDwarf · September 9, 2024, 4:49am

…not sure this is a countermeasure against SCA, but it is better to ask here.
Below is a plot of a set of 10 traces.

The crypto core is clocked at 24 MHz. Internally, the PLL multiplies the clock to ~166 MHz (according to the spec).
I use scope.clock.clkgen_freq = 24000000 with the CW Lite to run the crypto core.
So the target core and the CW Lite are synchronized in terms of clocking.
The CW Lite’s ADC is clocked at 96 MHz (4 * 24 MHz).

The scope.adc.trig_count value differs from trace to trace, which suggests that the HW AES also takes a different amount of time on each run.
To keep only the traces with the same trig_count value (252), I use this loop:

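import chipwhisperer as cw
from tqdm import tqdm

# scope, target, ktp and project are set up earlier in the script.
N = 20000            # number of traces to keep
TRIG_COUNT = 252     # keep only captures whose trigger stayed high this many ADC cycles

for i in tqdm(range(N)):
    # Retry with a fresh key/text pair until a capture has the wanted length.
    while True:
        key, text = ktp.next()
        trace = cw.capture_trace(scope, target, text, key)
        if trace is None:  # capture failed or timed out
            continue
        if scope.adc.trig_count != TRIG_COUNT:
            continue
        project.traces.append(trace)
        break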

which gives the picture I shared.
We can see that the traces are not well aligned.
Is this an SCA countermeasure based on a drifting PLL frequency (on the target core), or is it instability in the 24 MHz clock generated by the CW Lite?

jpthibault · September 9, 2024, 3:18pm

If the target is running at 166 MHz and you are undersampling at 96 MHz, don’t expect great results, even if both clocks share a common source.
scope.adc.trig_count is measured by the ADC sampling clock; in this case I would expect it to have a variation of +/- [1,2,3] cycles if the target is running in constant time.
No PLL is perfectly stable; all PLL-generated clocks will have both jitter and wander. For side-channel attack measurements, this is normally dealt with by oversampling.
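
To put rough numbers on the undersampling, here is a small arithmetic sketch. It assumes the nominal figures quoted in this thread (96 MHz ADC clock, ~166 MHz target clock, trig_count of 252); the real target frequency is only approximate, so treat the results as estimates.

```python
# Nominal figures quoted in this thread; the 166 MHz value is approximate.
ADC_HZ = 96e6        # CW Lite ADC clock (4 * 24 MHz)
TARGET_HZ = 166e6    # target core clock after its internal PLL (assumed nominal)
TRIG_COUNT = 252     # ADC cycles during which the trigger was high

# How many ADC samples land in one target clock cycle (below 1 means undersampling).
samples_per_target_cycle = ADC_HZ / TARGET_HZ

# Length of the trigger window in seconds, and in target clock cycles.
trigger_seconds = TRIG_COUNT / ADC_HZ
target_cycles = trigger_seconds * TARGET_HZ

print(f"ADC samples per target clock cycle: {samples_per_target_cycle:.3f}")
print(f"Trigger duration: {trigger_seconds * 1e6:.3f} us")
print(f"Approximate target cycles in that window: {target_cycles:.1f}")
```

With these assumptions there is well under one ADC sample per target clock cycle, so a single ADC cycle of trig_count variation already spans more than one target cycle.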

NewDwarf · September 9, 2024, 3:38pm

Yes, I realize this. I tried the CW Husky, but it gave worse results with exactly the same script:

So the CW Lite is the option for debugging purposes.

It has a range of ~250-290 cycles. That looks like a pretty large spread.
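
A minimal sketch of how the spread of trig_count values could be summarized, assuming the value of scope.adc.trig_count has been appended to a plain list after each capture. The function name and the sample values below are made up for illustration; they are not measurements.

```python
from collections import Counter

def summarize_trig_counts(trig_counts):
    """Summarize a list of per-capture trig_count values (hypothetical helper)."""
    counts = Counter(trig_counts)
    mode_value, mode_hits = counts.most_common(1)[0]
    return {
        "min": min(counts),
        "max": max(counts),
        "mode": mode_value,                           # most frequent trig_count
        "mode_share": mode_hits / len(trig_counts),   # fraction of captures at the mode
    }

# Illustrative values only, not real data.
example_counts = [252, 251, 252, 255, 290, 250, 252]
summary = summarize_trig_counts(example_counts)
print(summary)
```

The mode share gives a rough idea of how many captures would be thrown away when filtering on a single trig_count value.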

Maybe the PLL’s VCO reference voltage is randomly changed to add “noise” to the output frequency that drives the crypto core?

BTW, did you measure the stability of the clock generated by the CW Lite (especially at a relatively high frequency such as 24 MHz)? How does its stability compare with that of a crystal oscillator?

jpthibault · September 9, 2024, 4:00pm

No we haven’t – it’s not a concern for its intended use (synchronous sampling).
In the CW-lite, all clocks are generated by the Xilinx Spartan6 DCM; you’ll find its specs in the Xilinx documentation.

NewDwarf · September 9, 2024, 4:17pm

Yes, I agree with this conclusion.
But, in theory, the x4 frequency multiplier and the ADC can introduce some instability that becomes visible at a high frequency (96 MHz).
I can try measuring the 24 MHz clock produced by the CW Lite (sampling it at 96 MHz) to check the accuracy of the captured data.
If that data is perfectly aligned, it would be evidence of a drifting frequency on the target board (probably as a countermeasure against SCA).

NewDwarf · September 9, 2024, 6:13pm

I made other changes. It is known that sampling works well at least at 24 MHz. So I just used

scope.clock.clkgen_freq = 24000000
scope.clock.adc_src = "clkgen_x1"

and got this picture

which almost confirms the idea of a drifting clock on the target device.
One possible way to collect synchronized traces is to compare several points in a reference trace against each new incoming trace and pick the ones that match.
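
A minimal sketch of that point-comparison filter, assuming traces are NumPy arrays of equal length. The function names, the checked sample indices and the tolerance are all hypothetical and would need tuning against real captures; the demo data is synthetic.

```python
import numpy as np

def matches_reference(trace, reference, points, tolerance):
    """True if trace is within tolerance of reference at every checked index."""
    trace = np.asarray(trace, dtype=float)
    reference = np.asarray(reference, dtype=float)
    idx = np.asarray(points, dtype=int)
    return bool(np.all(np.abs(trace[idx] - reference[idx]) <= tolerance))

def select_matching(traces, reference, points, tolerance):
    """Return the positions of the traces that match the reference."""
    return [i for i, t in enumerate(traces)
            if matches_reference(t, reference, points, tolerance)]

# Synthetic demo: one slightly offset trace, one shifted by 5 samples, one exact copy.
reference = np.sin(np.linspace(0, 8 * np.pi, 400))
traces = [reference + 0.01, np.roll(reference, 5), reference.copy()]
points = [50, 120, 200, 310]   # sample indices to compare (arbitrary choice)
kept = select_matching(traces, reference, points, tolerance=0.05)
print(kept)
```

In a capture loop, the same check could sit next to the trig_count test so that only traces matching the reference are appended to the project. Note that this only discards misaligned traces; it does not resynchronize them.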