L293D

__delay_cycles


First, thank you pabigot for doing the experiment, and thank you Lyon for suggesting an alternative instruction sequence.

Since I am wrestling with what I think is a timing problem in a library, I would have tried some of this out,
but I have some outside timing constraints at the moment, and no time to tinker with electronics for a bit.

 

Since you asked for predictions:

I do not expect the "The universe is borked" result - even at 16MHz it is going to take some time to fetch those instructions, so at some point a sequence of NOPs should exhibit a performance difference.

 

I would expect one of the other results (Null, learn something, surprise me).  I haven't studied the architecture enough to have an opinion about which one.

 

However, I would also be cautious about expecting that result to be stable/portable.

(It may be true for this version of the Cortex-M4, but will it change on the 129x, or on some future Tiva, or on another manufacturer's processor?)

So even if it works today, I would at least put a big caveat comment on any code where I used it, warning that it could cause portability problems or just stop working with a later revision of the processor.  (Or better yet, I would try to find something that is officially supposed to take time.)

 

On the paranoid end - what does the manual say about MOV r8, r8?  Is it defined to have any effect that forces some action to happen, or could a really smart CPU throw that out as well?  (If it is not defined as having some effect, e.g. changing flags, then another possible prediction is that some of the MOV r8, r8 instructions get tossed out.)

 

I have been sort of thinking that as architectures get more devious, they should have something like a WAIT n instruction, guaranteed to make the following instruction start at least n clock cycles after the previous one.  (Now maybe that should be a compiler directive, and let the compiler be clever, or maybe an actual processor instruction.)
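Lacking a WAIT n instruction, one way to get "at least n cycles" from something that is officially supposed to take time is to poll a free-running counter. The following is only a sketch under assumptions: wait_cycles, cycle_source_fn and fake_counter are hypothetical names, not from any vendor library, and the counter is passed in as a callback so the logic can be tried on a host. On a Cortex-M4 one candidate source would be the DWT cycle counter, if the part implements it and it has been enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns a free-running 32-bit tick count. */
typedef uint32_t (*cycle_source_fn)(void);

/* Busy-wait until at least n counter ticks have elapsed since entry.
 * The unsigned subtraction stays correct across one counter wraparound. */
void wait_cycles(cycle_source_fn now, uint32_t n)
{
    assert(now != NULL);
    uint32_t start = now();
    while ((uint32_t)(now() - start) < n) {
        /* spin */
    }
}

/* Host-side stand-in for a hardware counter (an assumption for
 * illustration): it advances by one tick on every read. */
uint32_t fake_ticks;
uint32_t fake_counter(void) { return fake_ticks++; }
```

This only bounds the delay from below: the call overhead and the polling loop add cycles of their own, so it suits "no sooner than n" requirements rather than exact timing.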


The code is here; the analysis is here.

 

The resolution of the prediction is that I learned several things (among them, that MOV R8, R8 takes two cycles), but the null hypothesis holds.

 

The conclusion is:

Don’t muck about trying to be clever: for a one-cycle delay just use __NOP(), the ARM CMSIS standard spelling for an inline function that emits the NOP instruction. Where it has an effect, it’s a one-cycle effect. Where it doesn’t, other instructions don’t behave any better.
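As a rough sketch of what that looks like in code: on a CMSIS target __NOP() comes from the device header. The DELAY_NOP wrapper and delay_nops helper below are hypothetical names introduced for illustration, with a GCC/Clang inline-asm fallback assumed for builds without CMSIS. The volatile qualifier only stops the compiler from discarding the instruction; whether the core spends a cycle on it is still up to the hardware.

```c
/* Use the CMSIS spelling when it is available as a macro; otherwise fall
 * back to GCC/Clang inline assembly (an assumption about the toolchain). */
#if defined(__NOP)
#define DELAY_NOP() __NOP()
#else
#define DELAY_NOP() __asm__ volatile ("nop")
#endif

/* Issue n NOPs in a loop and return how many were issued.
 * The loop's own increment, compare and branch add cycles, so n is a
 * lower bound on the delay in instructions, not an exact cycle count. */
unsigned delay_nops(unsigned n)
{
    unsigned issued = 0;
    for (unsigned i = 0; i < n; ++i) {
        DELAY_NOP();   /* volatile asm: the compiler must keep it */
        ++issued;
    }
    return issued;
}
```

For a single-cycle pad, a bare DELAY_NOP() at the call site avoids the loop overhead entirely.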



This is why I object to attempts to be clever: any instruction that has no effect could in principle be removed too. If there is any instruction that will do what you describe it's going to be what the vendor specifies as NOP, because there's a boatload of code and years of developer experience that expects NOP in an instruction stream to produce the minimum delay that can be expressed. Sometimes that delay might fall into a stalled pipeline stage or get dropped after decode and so not extend the duration of an unfinished instruction. That's something experienced coders will be comfortable with and will account for if necessary by adding more NOPs. It's no reason to consider searching for another instruction to use instead.
