I’m doing some fixed-vs-random TVLA with the CW-lite and CW305. Currently, the loop looks something like:
- Prep input, write to registers on target,
- Trigger crypto and capture,
- Read back trace,
and I periodically check the output and write some traces to file. However, this is pretty slow: gathering a couple million traces takes a day or two. I assume communication with the PC is the bottleneck. This is OK for smaller units (e.g. I can put a bunch of parallel copies of the UUT on the target), but not for testing larger, higher-order masked implementations.
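For concreteness, here is a minimal sketch of that kind of loop with the hardware replaced by a simulated target, so it runs anywhere. Everything named here (`capture_trace`, `WelchAccumulator`, `run_tvla`, the 8-sample trace with a leak at sample 3) is hypothetical and only illustrates the flow; it is not ChipWhisperer API. It also keeps running per-sample statistics instead of storing traces, which is one way to avoid the write-to-file step.

```python
import math
import random


class WelchAccumulator:
    """Running per-sample mean and variance (Welford), so traces need not be stored."""

    def __init__(self, n_samples):
        self.n = 0
        self.mean = [0.0] * n_samples
        self.m2 = [0.0] * n_samples

    def update(self, trace):
        self.n += 1
        for i, x in enumerate(trace):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        # Unbiased sample variance; requires at least two traces.
        return [m / (self.n - 1) for m in self.m2]


def welch_t(a, b):
    """Per-sample Welch t-statistic between two accumulated trace sets."""
    out = []
    for ma, mb, va, vb in zip(a.mean, b.mean, a.variance(), b.variance()):
        denom = math.sqrt(va / a.n + vb / b.n)
        out.append((ma - mb) / denom if denom > 0 else 0.0)
    return out


def capture_trace(data, rng, n_samples=8):
    """Simulated stand-in for 'write input, trigger, read trace'.

    Assumption for illustration: sample 3 leaks the Hamming weight of the input.
    """
    trace = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    trace[3] += 0.5 * bin(data).count("1")
    return trace


def run_tvla(n_traces, fixed_input=0x00, seed=1):
    rng = random.Random(seed)
    fixed, rand = WelchAccumulator(8), WelchAccumulator(8)
    for _ in range(n_traces):
        # Randomly interleave the fixed and random classes.
        if rng.getrandbits(1):
            fixed.update(capture_trace(fixed_input, rng))
        else:
            rand.update(capture_trace(rng.getrandbits(8), rng))
    return welch_t(fixed, rand)
```

On real hardware, `capture_trace` would be the part that talks to the scope and target, and that is where I suspect the per-trace overhead goes.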
The “best” solution is probably to shift input generation to some other hardware that communicates faster with the target (and whose power supply is sufficiently separated from the target’s). For example, write the fixed input once and let this additional hardware generate the randomness, split the input into shares, and so on. There was at least one thread discussing something like this: CW305 Improved acquisition rate.
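To make that idea concrete, below is a rough software model of what such a generator might do, assuming it is driven by a seeded PRNG so the PC can replay the same sequence afterwards instead of receiving every input. The names (`Xorshift64`, `generate_inputs`) and the choice of a xorshift generator are my own placeholders; a real design would likely want a stronger source for the masks.

```python
from functools import reduce
from operator import xor

MASK64 = (1 << 64) - 1


class Xorshift64:
    """Small xorshift PRNG (shifts 13, 7, 17); placeholder, not cryptographically secure."""

    def __init__(self, seed):
        if seed & MASK64 == 0:
            raise ValueError("seed must be non-zero")
        self.state = seed & MASK64

    def next(self):
        x = self.state
        x ^= (x << 13) & MASK64
        x ^= x >> 7
        x ^= (x << 17) & MASK64
        self.state = x
        return x


def generate_inputs(seed, fixed_input, n_traces, n_shares, width=64):
    """Yield (is_fixed, value, shares) per trace; same seed gives the same sequence."""
    prng = Xorshift64(seed)
    mask = (1 << width) - 1
    for _ in range(n_traces):
        is_fixed = prng.next() & 1
        # Always draw a candidate so PRNG consumption is identical for both classes.
        candidate = prng.next() & mask
        value = fixed_input if is_fixed else candidate
        # Boolean sharing: n_shares - 1 random shares, last one makes the XOR equal value.
        shares = [prng.next() & mask for _ in range(n_shares - 1)]
        shares.append(reduce(xor, shares, value))
        yield bool(is_fixed), value, shares
```

With something like this, the PC would only need to send the seed and the fixed input once, then rerun `generate_inputs` offline to label each trace as fixed or random.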
A few questions:
- What can I do short of the above to increase the rate of acquisition? Any tips or tricks?
- What’s the simplest way to implement something like the above?
- How do people get ~100M traces in a reasonable amount of time? Do they have parallel setups running for a week or two?