Hi! This is my first question here and I am new to this.
By inspecting some of the code in the target or victim in GitHub/newaetech/chipwhisperer I get the impression that the code runs as an interrupt, triggered by the input vector coming in that way, over a serial channel or USB. To me this looks rather repeatable.
What if the input vector were sent to another task, as in a (preemptive) operating (run-time) system, or in code written in a language that supports tasks/processes? Could the code trigger other task(s) to do something "badly random", just to introduce noise while the encryption is being done?
I could not find any discussion of this, neither here nor in the wiki. But it would be easy to miss out on such matters just by searching and poking around.
The victim programs are constructed that way to make them as easy as possible to crack. That's why there is a trigger flag, only a single thread running, and a low-speed clock from an external source.
Noise removal and trace alignment are part of the pre-processing stage.
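To make "trace alignment" a bit more concrete, here is a minimal sketch in plain Python. It is not ChipWhisperer code; the helper names (`shift`, `best_lag`, `align`) and the synthetic data are made up for illustration. It assumes the jitter is a pure delay of a few samples and undoes it by picking the lag that best matches a reference trace.

```python
import random

def shift(trace, k):
    """Delay a trace by k samples (k >= 0), padding the start with zeros."""
    return [0.0] * k + trace[:len(trace) - k]

def best_lag(reference, trace, max_lag):
    """Return the lag in 0..max_lag where the trace best matches the reference."""
    def score(lag):
        # Dot product of the reference with the trace advanced by `lag` samples.
        return sum(r * t for r, t in zip(reference, trace[lag:]))
    return max(range(max_lag + 1), key=score)

def align(reference, trace, max_lag):
    """Advance the trace by its estimated lag, padding the end with zeros."""
    lag = best_lag(reference, trace, max_lag)
    return trace[lag:] + [0.0] * lag

rng = random.Random(1)
# Hypothetical reference trace: 200 samples with a distinctive shape.
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
# Simulated capture: the same shape, 7 samples late, plus a little noise.
jittered = [s + rng.gauss(0.0, 0.2) for s in shift(reference, 7)]

estimated = best_lag(reference, jittered, max_lag=20)
aligned = align(reference, jittered, max_lag=20)
print("estimated delay:", estimated)
```

Real jitter is usually messier than a single fixed delay (clock drift, inserted dummy operations), so treat this only as the simplest case.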
I'll try to answer the questions from your blog:
Secure boot is there to make sure that only you can run code on that CPU. For example: you are a games console manufacturer, so you don't want my code to run on your console.
It really depends on how the silicon is designed, see points 4, 5, 6 below.
Yes, pipelines/caches/speculative execution can impact your traces.
5, 6 It depends on the mental model. There are models for multi-cycle implementations like a regular CPU, or single-cycle implementations like FPGAs or dedicated accelerators. You can also create a new model if you wish. The mental model describes how the real hardware is built. You normally attack data lines, so what matters is how they are implemented in silicon: are they pre-charged or not? This impacts your traces.
Thanks for your answer! OK, so the example code is for educational purposes, not meant to be as difficult as possible to crack.
Thanks for answering my questions in the blog note! I didn't expect that.
Secure boot. OK, it's about controlling and assuring that a binary image is allowed to run on that device. I must admit, I have written bootloaders where I did not check this, only the checksum and another field or two, simply because I saw no other use for it. But then, times have changed.
I assume that when you say "impact your traces" you mean "making it more or less easy to crack".
4 to 6. You say "you normally attack data lines". This is what I understood the least when reading up for the blog note. A NewAE document talked about an internal data bus (I think). What you said here was "attack data lines", with no mention of internal, but I think that is what you meant? And that "attack" does not mean doing something to the physical lines, but "attacking the problem" of finding out how the lines go [2]
I am going to post a question about this on the XCore Exchange forum as well (here). That architecture schedules cycles out to tasks over "logical cores", so if I had a task on each of those 8 cores and only one of them was doing encryption, then the 7 others might jam quite a lot, provided they were set up to do something badly random. This is new to me; I haven't thought about any architecture in this context, but XCORE is the one I have been working with recently (plus some FPGA). (Standard disclaimer about my relationship with XMOS)
To add a bit more to this, as 31415 mentioned, the ChipWhisperer tutorials are constructed to highlight the basic principles of the side-channel attack.
Adding noise or time jitter can complicate the attack, but in most cases doesn't stop it. Noise can be defeated by averaging several measurements. Time jitter can be removed through various means. These are all obstacles to the attack, and depending on their intensity they can make the attack much more difficult, or even force the attacker to use a different / more efficient approach.
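As a rough illustration of the averaging point, here is a small sketch in plain Python with synthetic data (not ChipWhisperer code; the pulse shape, the noise level and the function names are all assumptions made up for the example). It repeats the same "measurement" many times with independent Gaussian noise and compares the error of a single trace with that of the average.

```python
import random

def make_trace(signal, noise_sigma, rng):
    """One simulated capture: the true signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sigma) for s in signal]

def average_traces(traces):
    """Sample-by-sample mean of equally long, already aligned traces."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]

def rms_error(trace, signal):
    """Root-mean-square difference between a trace and the true signal."""
    return (sum((t - s) ** 2 for t, s in zip(trace, signal)) / len(signal)) ** 0.5

rng = random.Random(0)
# Hypothetical "true" power signal: a small pulse buried in noise.
signal = [1.0 if 40 <= i < 60 else 0.0 for i in range(200)]

single = make_trace(signal, 2.0, rng)
averaged = average_traces([make_trace(signal, 2.0, rng) for _ in range(400)])

err_single = rms_error(single, signal)
err_avg = rms_error(averaged, signal)
print("single trace error:", round(err_single, 3))
print("400-trace average error:", round(err_avg, 3))
```

For independent noise the error of the average shrinks roughly with the square root of the number of traces, so 400 traces give about a 20x improvement here. This only works if the traces are aligned first and the noise is uncorrelated with the data being processed.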
It's absolutely possible to study all this on the ChipWhisperer platform! You can use our examples as a starting point, and take them wherever you want.
Thanks, @jpthibault. Very interesting. I may not personally have the time to go further with a ChipWhisperer platform (I have too much left to do on the IceCore FPGA board, and even if I am retired, there are only 24h per day), but I will make sure that this matter is conveyed whenever relevant. ChipWhisperer + an XCORE (or future xcore.ai) board is something I would love to work with.
Sorry for the inconvenience; I wrote my post with many shortcuts.
By "impact your traces" I meant "impact your power traces".
By "attack data lines" I meant "exploit the physics of the data lines", for example. And yes, internal (on-silicon) data lines.
I don't like yes/no answers. And I have no problem being a little private. I stumbled across this theme and thought it interesting from an academic point of view. I am retired, but I do try to learn by blogging, writing articles and doing some hands-on work. I am also affiliated with real-time and concurrency at the university here. So, I guess the answer is no and yes! No for anything product-like, yes to learn and blog about. But I must admit life is too short to learn very much at all, and I have tried to specialise in XMOS processors, partly because I have a history with occam and transputers from the nineties. It changed my life, and I wanted to try to hang onto that feeling.
Hmm, if you are searching for countermeasures you need to find something that is reliable and works all the time. I wouldn't bet my life on task scheduling, because it's too unpredictable. What happens if the other tasks sleep? Or the other cores are in reset or in an error state?
Again, yes and no! Yes generally (for Linux, for example), no for the XCORE architecture. Or rather, that's the thread I'd like to investigate. This may be shot down later, but it's fun to fly now:
Aside: XMOS architecture
(Standard disclaimer). The xCORE architecture is a development from the transputer, which had occam as its native language and a cooperative scheduler in microcode. xCORE is much more advanced, and even more is now in hardware blocks, like timers and chanends. xC is the accompanying language; it supports tasks, synchronous channels, and interfaces with a defined pattern for client/server and asynchronous messaging. But C and C++ may be used, and they now say that FreeRTOS will be available soon.
But the exciting thing in this context is the scheduling. A slice would have 8 logical cores. If I have only one task it would get all the cycles; if I have 8 and each uses a core, they would each get 1/8 of the cycles. This means that over 8 cycles they would be scheduled on a cycle basis, one cycle for each. At least that's how I have understood it, because this is the mechanism they use for the timeout pragma, where the compiler will raise an error if some deadline has not been met. Even if it looks nondeterministic from the outside, the basic mechanism is fully deterministic, unless it is programmed for nondeterminism with random. A task could have a core alone, or several tasks could be multiplexed on one core (their conditional selects (waiting on events) will be joined by the compiler/mapper when that is allowed for a particular task type, of which there are 3).
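A toy model of the round-robin picture above, in plain Python. This models only my description for the case of 8 busy tasks, not the actual xCORE hardware, and it says nothing about what happens when fewer tasks are active. The task names and the `issue_slots` helper are made up.

```python
from collections import Counter
from itertools import cycle, islice

def issue_slots(active_tasks, n_cycles):
    """Return which task owns each of the first n_cycles issue slots (strict round-robin)."""
    return list(islice(cycle(active_tasks), n_cycles))

# Hypothetical setup: one encryption task and seven "noise" tasks, one per logical core.
tasks = ["encrypt"] + [f"noise{i}" for i in range(1, 8)]
slots = issue_slots(tasks, 64)

share = Counter(slots)
encrypt_cycles = [c for c, task in enumerate(slots) if task == "encrypt"]

print(share["encrypt"] / len(slots))  # 0.125, i.e. 1/8 of the cycles
print(encrypt_cycles[:4])             # [0, 8, 16, 24]
```

The point of the sketch is that the encryption task always lands on the same slot positions, so in this model the timing of the encryption itself stays repeatable; what the other 7 tasks do in their slots is what would vary.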
But in theory I think that if I have one task handling the encryption to be spied upon, I could let the other cores do something "very random" while that encryption is being done. The encryption task could even communicate (always basically synchronous, or sleep on 10ns ticks) and make it even worse. Plus, the encryption task could even share a core with other tasks. One could even run several encryption tasks concurrently, stepping on each other's feet in some terrible way. In any case, it's NOT as if the encryption would have the processor's cycles to itself for its full duration.
You will find loads of pages about this on my blog, but I have tried to sketch the big picture above.
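To get a feeling for how much such "jamming" might help, here is a toy simulation in plain Python. It is a sketch under strong assumptions: power is modelled as the Hamming weight of the byte the encryption task handles, plus the Hamming weight of one random byte per noise core, all added together in the same sample. Nothing here is measured on real hardware, and all names are made up.

```python
import random

def hamming_weight(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def leak_correlation(n_traces, n_noise_cores, rng):
    """Correlation between the secret-dependent leak and the total simulated power."""
    leak = [hamming_weight(rng.randrange(256)) for _ in range(n_traces)]
    power = [
        l + sum(hamming_weight(rng.randrange(256)) for _ in range(n_noise_cores))
        for l in leak
    ]
    return pearson(leak, power)

rng = random.Random(42)
r_alone = leak_correlation(5000, 0, rng)   # encryption task running alone
r_jammed = leak_correlation(5000, 7, rng)  # seven other cores doing random work
print("no noise cores:", round(r_alone, 3))
print("7 noise cores:", round(r_jammed, 3))
```

In this model the correlation drops to about 1/sqrt(8), roughly 0.35, with seven noise cores. That makes the leak harder to see but does not remove it, which fits the remark above that noise can be averaged away with more measurements.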