brebisson wrote:
hello,
I thought that a LOAD only took 1 cycle if you did not use the loaded value just after...
i.e.:
LDR R0, [R1]
ADD R2, R2, R4
ADD R2, R2, R0
takes 3 cycles because the end of the LDR is able to run in "parallel" with the ADD R2, R2, R4
while
LDR R0, [R1]
ADD R2, R2, R0
ADD R2, R2, R4
is 4 cycles because we need to wait 1 cycle for the end of the LDR before the ADD R2, R2, R0?
Also, I thought that a non-executed loop was only 1 cycle, as the ARM assumes that a branch is not taken, and that the 3 cycles for the loop applied only in the case where the loop was executed.
can anyone confirm?
regards, cyrille
That is true of ARM9, but not ARM7. In ARM7, the load is always three cycles (ignoring any delays imposed by the memory system). On the first cycle it calculates the address for the load; on the second it actually performs the read; and because it doesn't get the data back until the end of that cycle, it takes a further cycle to update the register. ARM9 has two additional pipeline stages, which allow these cycles to be hidden if (as you say) the loaded data is not used by the following instruction.
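To put numbers on the two sequences from the question, here is a rough Python sketch of the counting rules described above. It is only a simplified model: it ignores memory wait states and pipeline fill, and the tuple encoding and the function names (cycles_arm7, cycles_arm9) are made up for this example rather than taken from any ARM tool.

```python
# Simplified cycle model; ignores memory wait states and pipeline fill.
# Each instruction is a (mnemonic, destination, sources) tuple. This
# encoding is hypothetical and exists only for this illustration.

def cycles_arm7(program):
    # ARM7: an LDR always costs 3 cycles, a data-processing op costs 1.
    return sum(3 if op == "LDR" else 1 for op, _dst, _srcs in program)

def cycles_arm9(program):
    # ARM9: an LDR costs 1 cycle, plus a 1-cycle stall if the very next
    # instruction reads the register that was just loaded.
    total = 0
    for i, (op, dst, _srcs) in enumerate(program):
        total += 1
        if op == "LDR" and i + 1 < len(program) and dst in program[i + 1][2]:
            total += 1
    return total

# Loaded value used one instruction later (first sequence in the question).
scheduled = [
    ("LDR", "R0", ("R1",)),
    ("ADD", "R2", ("R2", "R4")),
    ("ADD", "R2", ("R2", "R0")),
]

# Loaded value used immediately (second sequence in the question).
dependent = [
    ("LDR", "R0", ("R1",)),
    ("ADD", "R2", ("R2", "R0")),
    ("ADD", "R2", ("R2", "R4")),
]

print(cycles_arm7(scheduled), cycles_arm7(dependent))  # 5 5
print(cycles_arm9(scheduled), cycles_arm9(dependent))  # 3 4
```

Under this model, reordering the ADDs makes no difference on ARM7 (5 cycles either way), while on ARM9 it saves the one stall cycle (3 versus 4).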
It is correct that for both ARM7 and ARM9, a non-taken branch is 1 cycle and a taken branch is 3 cycles.
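As a quick illustration of what that means for a loop, the sketch below adds up the branch cycles alone, assuming a loop with a single conditional backward branch at the bottom (taken on every pass except the last). The function name loop_branch_cycles is hypothetical, and the loop body itself is not counted.

```python
def loop_branch_cycles(iterations, taken=3, not_taken=1):
    # The closing branch is taken (iterations - 1) times and falls
    # through once, on the final pass.
    if iterations < 1:
        raise ValueError("the loop body must run at least once")
    return (iterations - 1) * taken + not_taken

print(loop_branch_cycles(1))   # 1: the branch falls through immediately
print(loop_branch_cycles(10))  # 28: 9 taken branches plus 1 not taken
```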
Riveywood