Clarification on trig_count and how to know when AES execution finishes

I did that, but the results still don't make sense: even a simple for loop gives a different trig_count from run to run, and I don't understand why.

If you have an older version of CW you may be running into this issue:

Simply use the workaround as described there.
You can also measure the trigger length using some other logic analyzer.

Hello,

I checked and found that there are some nested loops in my code that do not run in constant time; they are O(n) or O(n^2). But my question is: why would the number of clock cycles (the trigger count) differ even though I am executing the same example with the same inputs?

Thanks

#include <math.h>      /* sqrt() */
#include "hal.h"       /* ChipWhisperer HAL: platform_init(), init_uart(),
                          trigger_setup(), trigger_high(), trigger_low() */

int main(void) {
    platform_init();
    init_uart();
    trigger_setup();

    char passwd[2];
    my_read(passwd, 2);      /* project helper that reads the input over UART */

    trigger_high();

    int sum = 81;
    volatile int res = 0;    /* volatile so the loop is not optimized away */
    int n = 10;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            /* do some operation */
            res += sqrt(sum);
        }
    }

    trigger_low();

    return 0;
}
I tested this code, which doesn't run in constant time, and the trigger count was still constant. Can you please advise?

As I said previously, there are many reasons why code may not be constant time, even if it looks like it should be. A lot has been written on this subject; here are some good starting points:
https://www.bearssl.org/constanttime.html
https://www.bearssl.org/ctmul.html
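
To make the idea concrete, here is a minimal sketch (my own illustration, not code taken from those pages) of the transformation they describe: replacing a branch on secret data with a branchless select, so that the same instruction sequence executes regardless of the value.

#include <stdint.h>

/* Branching select: the taken/not-taken path depends on the secret bit,
   so the executed instruction sequence (and timing) can vary with the data. */
static uint32_t select_branch(uint32_t secret_bit, uint32_t a, uint32_t b) {
    if (secret_bit)
        return a;
    return b;
}

/* Branchless select: build an all-ones or all-zeros mask from the bit and
   combine both inputs, so the same instructions run for either value. */
static uint32_t select_ct(uint32_t secret_bit, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)0 - (secret_bit & 1u); /* 0x00000000 or 0xFFFFFFFF */
    return (a & mask) | (b & ~mask);
}

Note that even branchless-looking source can be compiled back into branches, and some instructions (multiplication, division, certain memory accesses) have data-dependent latency on some cores, which is exactly why measuring with trig_count as you are doing is still worthwhile.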

extern "C" {
#include <math.h>
#include <stdint.h>
#include <limits.h>

}
namespace FAST_INFERENCE {

static constexpr double layer_1_weight[8][784] = {{0.010481378063559532, -0.20006994903087616, 0.16964876651763916, -0.4134269654750824, 0.599524974822998, -0.2863198518753052, -0.525934636592865, -0.5960907340049744, -0.4246852397918701, -0.6001853346824646, 0.4880187511444092, -0.32405415177345276, -0.5035948753356934, -0.36805260181427, 0.4378497898578644, -0.590733528137207}, {-0.2780782878398895, -0.38749828934669495, 0.018045658245682716, -0.1361837387084961, -0.38750898838043213, 0.4055153429508209, -0.3692069351673126, -0.10092785209417343, 0.4716685116291046, 0.6154587864875793, -0.3489436209201813, -0.6461821794509888, 0.08582321554422379, 0.4470161497592926, -0.22909116744995117, -0.1495954543352127}, {0.025159927085042, 0.6662085056304932, -0.17330637574195862, -0.3318615257740021, -0.33116844296455383, -0.31469303369522095, 0.5019829869270325, 0.23292112350463867, -0.46398624777793884, -0.3344375491142273, -0.3197980225086212, -0.09608247131109238, 0.2707573175430298, -0.7146569490432739, -0.039474453777074814, -0.29152193665504456}, {-0.08383703231811523, -0.4569569528102875, 0.3984542489051819, -0.4955242872238159, -0.22933314740657806, 0.4448719024658203, -0.2325098216533661, 0.5631474852561951, -0.36670613288879395, -0.2718982994556427, -0.6486614942550659, -0.4166144132614136, 0.23851972818374634, -0.41076013445854187, 0.4563061594963074, -0.29672327637672424}};
// arrays deleted to fit post
static constexpr double layer_7_bias[10] = {-0.2741142213344574, 0.2411513477563858, 0.2006010264158249, -0.04932883009314537, -0.018418777734041214, -0.021283414214849472, 0.38660189509391785, -0.05926525220274925, 0.03319782391190529, 0.007607690989971161};

static double layer_1_output[8];
static double layer_2_output[8];
static double layer_3_output[8];
static double layer_4_output[16];
static double layer_5_output[16];
static double layer_6_output[16];
static double layer_7_output[10];
static double layer_8_output[10];


void predict_SimpleCNN(double const * const x, double * pred) {
    auto layer_0_output = x;
	
    // Layer 1: Gemm
    for (int d = 0; d < 8; d++) {
      layer_1_output[d] = layer_1_bias[d];
    }
    **for (int d = 0; d < 8; d++) {**
**      for (int i = 0; i < 784; i++) {**
**       // layer_1_output[d] += layer_1_weight[d][i] * layer_0_output[i];**
**      }**
**    }**

   // Layer 2: BatchNormalization
    for (int d = 0; d < 8; d++) {
      layer_2_output[d] = layer_1_output[d] * layer_2_scale[d] + layer_2_bias[d];
    }

    // Layer 3: Relu
    for (int d = 0; d < 8; d++) {
      layer_3_output[d] = layer_2_output[d] >= 0 ? layer_2_output[d] : 0;
    }

    // Layer 4: Gemm
    for (int d = 0; d < 16; d++) {
      layer_4_output[d] = layer_4_bias[d];
    }
    // Layer 5: BatchNormalization
    for (int d = 0; d < 16; d++) {
      layer_5_output[d] = layer_4_output[d] * layer_5_scale[d] + layer_5_bias[d];
    }

    // Layer 6: Relu
    for (int d = 0; d < 16; d++) {
      layer_6_output[d] = layer_5_output[d] >= 0 ? layer_5_output[d] : 0;
    }

}


}

Thank you for the resources. I went through the first one, but I am not sure whether the bold part of the code (which is what is causing the fluctuations in the trigger count) is a memory issue. Can you please advise? P.S. I deleted some elements of the arrays to be able to post this.

The loop marked with ** is the one causing the fluctuations.

Ok so your non-constant time code involves multiplication. You should read through the second link that I provided above. For example:

When a CPU has non-constant-time multiplication opcodes, the execution time depends on the size (as integers) of one or both of the operands. For instance, on the ARM Cortex-M3, the umull opcode takes 3 to 5 cycles to complete, the “short” counts (3 or 4) being taken only if both operands are numerically less than 65536.
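
If you want to check whether this applies to your target, a quick experiment (a sketch only; it assumes the same trigger_high()/trigger_low() HAL calls used in your firmware) is to run the same multiply loop once with small operands and once with large ones, and compare the two trig_count readings:

#include <stdint.h>
#include "hal.h"   /* trigger_high()/trigger_low(), assumed as in your firmware */

/* volatile stops the compiler from folding the multiplications away */
static volatile uint32_t op_a, op_b;
static volatile uint64_t sink;

void time_multiply(uint32_t a, uint32_t b) {
    op_a = a;
    op_b = b;
    trigger_high();
    for (int i = 0; i < 100; i++) {
        sink += (uint64_t)op_a * op_b;   /* 32x32->64 multiply, umull on 32-bit ARM */
    }
    trigger_low();
}

Call it once as time_multiply(100, 200) (both operands below 65536) and once as time_multiply(0x12345678, 0x9ABCDEF0); on a core with a variable-latency multiplier such as the Cortex-M3 the two trig_count values should differ, while on a core with a single-cycle multiplier they should match.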

Yes, I read it, and when I removed the multiplication the problem was still there,

so I guessed it is not the multiplication.

The scope trigger count counts the number of clock cycles during which the trigger is seen high, and since I am running the same code with the same inputs every time, the cycle count should not be changing.