…not sure this is a countermeasure against SCA, but it is better to ask here.
Below is a picture of a set of 10 traces.
The crypto core is clocked at 24 MHz. Internally, a PLL multiplies the clock to ~166 MHz (according to the spec).
I use scope.clock.clkgen_freq = 24000000 on the CW Lite to clock the crypto core,
so the target core and the CW Lite are synchronized in terms of clocking.
The CW Lite’s ADC is clocked at 96 MHz (4 × 24 MHz).
The scope.adc.trig_count value differs from trace to trace, which tells us that the HW AES also takes a different amount of time on each run.
To keep only the traces with the same trig_count value (252), I use this loop:
N = 20000
for i in tqdm(range(N)):
    while True:
        key, text = ktp.next()
        trace = cw.capture_trace(scope, target, text, key)
        if trace is None:
            continue
        if scope.adc.trig_count != 252:
            continue
        project.traces.append(trace)
        break
which gives the picture I shared.
The traces are clearly not well aligned.
Is this an SCA countermeasure based on a drifting PLL frequency (on the target core), or is it instability of the 24 MHz clock generated by the CW Lite?
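One way to put a number on the misalignment is to estimate, for each trace, the shift against a reference trace by cross-correlation. The sketch below is only an illustration on synthetic data; `estimate_delay` and `max_delay` are hypothetical names, not part of the ChipWhisperer API.

```python
import numpy as np

def estimate_delay(reference, trace, max_delay=50):
    """Return d such that trace[i + d] best matches reference[i]."""
    ref = np.asarray(reference, dtype=float)
    trc = np.asarray(trace, dtype=float)
    ref = ref - ref.mean()
    trc = trc - trc.mean()
    n = min(len(ref), len(trc))
    best_delay, best_score = 0, -np.inf
    for d in range(-max_delay, max_delay + 1):
        # Compare only the overlapping parts of the two traces.
        if d >= 0:
            a, b = ref[:n - d], trc[d:n]
        else:
            a, b = ref[-d:n], trc[:n + d]
        score = np.dot(a, b) / len(a)
        if score > best_score:
            best_delay, best_score = d, score
    return best_delay

# Synthetic check: a noisy copy of the reference, delayed by 7 samples.
rng = np.random.default_rng(0)
reference = rng.standard_normal(500)
delayed = np.roll(reference, 7) + 0.05 * rng.standard_normal(500)
print(estimate_delay(reference, delayed))
```

Running this on a window near the start and a window near the end of the same trace may help separate the two hypotheses: a delay that stays constant looks like a trigger offset, while a delay that grows along the trace suggests the target is running at a slightly different clock rate.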
Yes, I realize this. I tried the CW Husky, but it gave worse results with exactly the same script:
So the CW Lite is the better option for debugging purposes.
It ranges from about 250 to 290 cycles. That looks like a pretty large spread.
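To get a feeling for what these counts mean in time, here is a small conversion sketch. It assumes that trig_count is counted in ADC clock cycles (96 MHz in this setup) while the trigger is high; the function and constant names are made up for the example.

```python
ADC_FREQ_HZ = 96_000_000     # CW Lite ADC clock (4 x 24 MHz), as stated above
TARGET_FREQ_HZ = 24_000_000  # clock fed to the target

def trig_count_to_time(trig_count,
                       adc_freq_hz=ADC_FREQ_HZ,
                       target_freq_hz=TARGET_FREQ_HZ):
    """Convert a trigger-high count (assumed to be in ADC cycles) to
    seconds and to cycles of the external target clock."""
    seconds = trig_count / adc_freq_hz
    target_cycles = trig_count * target_freq_hz / adc_freq_hz
    return seconds, target_cycles

for count in (250, 252, 290):
    seconds, cycles = trig_count_to_time(count)
    print(f"trig_count={count}: {seconds * 1e6:.3f} us, "
          f"{cycles:.1f} cycles at 24 MHz")
```

Under that assumption, 252 corresponds to 2.625 us, or 63 cycles of the 24 MHz input clock, and the 250 to 290 range is a spread of 10 input clock cycles. If the ~166 MHz internal figure holds, 2.625 us is roughly 436 internal cycles.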
Maybe the PLL’s VCO reference voltage is changed randomly to add “noise” to the output frequency that drives the crypto core?
BTW, did you measure the stability of the clock generated by the CW Lite (especially at a relatively high frequency such as 24 MHz)? How does its stability compare with that of a crystal oscillator?
No, we haven’t – it’s not a concern for its intended use (synchronous sampling).
In the CW-lite, all clocks are generated by the Xilinx Spartan-6 DCM; you’ll find its specs in the Xilinx documentation.
Yes, I agree with this conclusion.
But in theory, the x4 frequency multiplier and the ADC can introduce some instability that becomes visible at high frequency (96 MHz).
I can try to measure the 24 MHz clock produced by the CW Lite (by sampling it at 96 MHz) to check the accuracy of the captured data.
If that data is perfectly aligned, it would be evidence of a drifting frequency on the target board (probably as a countermeasure against SCA).
I made another change. Sampling is known to work well at 24 MHz at least, so I just used
scope.clock.clkgen_freq = 24000000
scope.clock.adc_src = "clkgen_x1"
and got this picture,
which almost confirms the idea of a drifting clock on the target device.
One possible way to collect synchronized traces is to compare several points of a reference trace against each new incoming trace and pick only the traces that match.
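A minimal sketch of that idea, assuming the traces are available as NumPy arrays. `matches_reference`, `points` and `tolerance` are hypothetical names chosen for the example, and the sample indices and tolerance would have to be tuned on real captures (points on steep edges discriminate best).

```python
import numpy as np

def matches_reference(reference, trace, points, tolerance):
    """True if trace agrees with reference at every chosen sample index."""
    ref = np.asarray(reference, dtype=float)
    trc = np.asarray(trace, dtype=float)
    idx = np.asarray(points)
    return bool(np.all(np.abs(ref[idx] - trc[idx]) <= tolerance))

# Synthetic check: one well-aligned trace and one shifted by 5 samples.
t = np.arange(200)
reference = np.sin(2 * np.pi * t / 40)
aligned = reference + 0.01
shifted = np.roll(reference, 5)
points = [0, 20, 40, 60]  # zero crossings, where the slope is steepest

print(matches_reference(reference, aligned, points, tolerance=0.1))
print(matches_reference(reference, shifted, points, tolerance=0.1))
```

In the capture loop, this check could sit next to the trig_count filter, for example by testing `trace.wave` against a stored reference waveform before calling project.traces.append(trace). Expect it to throw away a large share of the captures if the target clock really drifts.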