43oh

Delay function with one cycle granularity



CCS has an intrinsic delay, __delay_cycles(), but its argument must be a compile-time constant. So you can do this:

__delay_cycles(1234);

but not this:

unsigned x = 1234;
__delay_cycles(x);

 

The usual way to do a simple delay is something like this:

unsigned x = 1234;
while(--x);

in assembly:

mov #1234, R12
dec R12
jne $ - 2

That has three cycle granularity. One cycle for the dec, and two cycles for the conditional jump.

 

To make a delay with one cycle granularity, two techniques can be combined. The first is to make the simple delay loop decrement the count by the number of cycles in the loop. The loop takes three cycles, so the obvious code would be:

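sub #3, R12
jne $ - 4

This will not work because subtracting 3 from R12 takes two cycles rather than the one cycle used by dec R12. (The instruction also carries the constant in a second word, so the jump has to go back four bytes rather than two.) So the obvious fix would be to subtract four because the loop now requires four cycles: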

sub #4, R12
jne $ - 2

This will not work either because subtracting 4 from R12 takes only one cycle! The constants -1, 0, 1, 2, 4, and 8 come from the MSP430 constant generator, so an instruction that uses one of them needs no extra immediate word and is one cycle shorter. The solution is:

sub #4, R12
nop
jne $ - 4

The loop takes four cycles and subtracts four every iteration. Perfect.
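
The cycle arithmetic behind these loop variants can be sketched in C. This is only a host-side model, not code for the target: the function names are hypothetical, and the costs (one cycle for a constant-generator immediate, two cycles for any other immediate, one per nop, two for the jump) are the figures quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the immediate is one of the constant-generator values. */
bool is_cg_constant(int value)
{
    return value == -1 || value == 0 || value == 1 ||
           value == 2  || value == 4 || value == 8;
}

/* Cycles for "sub #value, Rn": 1 with the constant generator, 2 otherwise. */
int sub_imm_cycles(int value)
{
    return is_cg_constant(value) ? 1 : 2;
}

/* Cycles per iteration of: sub #step, R12 / <nops> / jne */
int loop_cycles(int step, int nops)
{
    return sub_imm_cycles(step) + nops + 2;
}

void check_loop_variants(void)
{
    assert(loop_cycles(1, 0) == 3);  /* dec R12: 3 cycles, counts 1 */
    assert(loop_cycles(3, 0) == 4);  /* sub #3: 4 cycles, counts 3, mismatch */
    assert(loop_cycles(4, 0) == 3);  /* sub #4: 3 cycles, counts 4, mismatch */
    assert(loop_cycles(4, 1) == 4);  /* sub #4 + nop: 4 cycles, counts 4 */
}
```

Only the last variant spends exactly as many cycles per iteration as it subtracts from the count.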

 

One problem with this loop is that it will only terminate if the initial value is a multiple of four. The fix for this is to change the jne (jump if not equal / jump if not zero) to jc (jump if carry). The loop will then terminate when R12 underflows.

sub #4, R12
nop
jc $ - 4

 

When the loop terminates the value in R12 will be -1, -2, -3, or -4. This encodes the remainder of dividing the count by four: a remainder of 3 leaves -1 and a remainder of 0 leaves -4. It is the negative of the number of cycles that have to be skipped to make the loop exactly the right duration.
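
A small C model of the jc loop makes the termination and the remainder easy to check on a PC. It is a sketch only: run_loop and loop_result are hypothetical names, and the carry rule assumed is the MSP430 one, where SUB sets carry when no borrow occurs.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned long cycles;  /* cycles spent in the loop */
    int remainder;         /* final R12 read as a signed value, -4..-1 */
} loop_result;

/* Model of: sub #4, R12 / nop / jc $ - 4 */
loop_result run_loop(uint16_t r12)
{
    loop_result res = { 0, 0 };
    int carry;

    do {
        carry = (r12 >= 4);         /* carry set means no borrow */
        r12 = (uint16_t)(r12 - 4);  /* wraps to 0xFFFC..0xFFFF on the last pass */
        res.cycles += 4;            /* sub (1) + nop (1) + jc (2) */
    } while (carry);

    res.remainder = (int)((long)r12 - 65536L);
    return res;
}

void check_run_loop(void)
{
    assert(run_loop(0).remainder == -4 && run_loop(0).cycles == 4);
    assert(run_loop(3).remainder == -1 && run_loop(3).cycles == 4);
    assert(run_loop(4).remainder == -4 && run_loop(4).cycles == 8);
    assert(run_loop(7).remainder == -1 && run_loop(7).cycles == 8);
}
```

Every start value terminates, and counts that differ only in their low two bits spend the same time in the loop, which is why the remainder has to be handled separately.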

A variable delay of one cycle granularity can be done with a computed jump into a series of instructions that each execute in one cycle. This is not practical for a wide range of delay, but is perfect for handling the remainder of the four cycle loop. The MSP430 nop instruction takes one cycle, so...

add R12, PC ; add R12 to the program counter to skip over some of the nops
nop
nop
nop
nop

 

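Each nop is two bytes, so the remainder has to be turned into a positive count and multiplied by two. inv takes the one's complement, which maps -1, -2, -3, -4 to 0, 1, 2, 3 (one less than the true negation), so at most three nops are skipped: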

inv R12
rla R12
add R12, PC
nop
nop
nop
nop
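
The mapping from loop remainder to jump offset can be sketched in C as follows. The function name jump_offset is hypothetical; the two steps mirror inv and rla on a 16-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset added to PC after: inv R12 / rla R12 */
uint16_t jump_offset(uint16_t r12)
{
    r12 = (uint16_t)~r12;        /* inv: -1..-4 becomes 0..3 */
    r12 = (uint16_t)(r12 << 1);  /* rla: times two, each nop is 2 bytes */
    return r12;
}

void check_jump_offset(void)
{
    assert(jump_offset(0xFFFF) == 0);  /* R12 = -1: skip no nops */
    assert(jump_offset(0xFFFE) == 2);  /* R12 = -2: skip one nop */
    assert(jump_offset(0xFFFD) == 4);  /* R12 = -3: skip two nops */
    assert(jump_offset(0xFFFC) == 6);  /* R12 = -4: skip three nops */
}
```

A larger remainder therefore skips fewer nops and so adds more single-cycle padding.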

 

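A constant can be subtracted from the desired cycle count to compensate for loop overhead (call, return, computed jump). The requested count must be at least as large as that constant, otherwise the subtraction wraps around and the delay becomes very long.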

 

The complete code: delay.asm

          .cdecls C, LIST, "msp430g2211.h"  ; Include device header file

           .text
           .global delay_cycles

delay_cycles
       sub     #20, R12                ; Adjust for call/return/jump overhead
       sub     #4, R12                 ; Loop: subtract four, four cycles per iteration
       nop                             ;
       jc      $ - 4                   ; Repeat until R12 underflows (carry clear)
       inv     R12                     ; Complement the remainder: -1..-4 becomes 0..3
       rla     R12                     ; Multiply by two (each nop is two bytes)
       add     R12, PC                 ; One cycle granularity computed jump: skip 0 to 3 nops
       nop                             ;
       nop                             ;
       nop                             ;
       ret                             ; Return
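
The overhead constant of 20 can be sanity-checked with a cycle-counting model in C. This is a sketch under assumptions that the post does not state: the per-instruction costs below are assumed values for the original MSP430 CPU and should be checked against the family user's guide, and the caller is assumed to load the argument with a two-cycle mov #n, R12. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cycle costs (not taken from the post). */
enum {
    CYC_MOV_IMM  = 2,  /* mov #n, R12 in the caller */
    CYC_CALL_IMM = 5,  /* call #delay_cycles */
    CYC_SUB_IMM  = 2,  /* sub #20, R12 */
    CYC_LOOP     = 4,  /* sub #4 (1) + nop (1) + jc (2) */
    CYC_INV      = 1,  /* inv R12 */
    CYC_RLA      = 1,  /* rla R12 */
    CYC_ADD_PC   = 2,  /* add R12, PC */
    CYC_NOP      = 1,  /* each nop that is not skipped */
    CYC_RET      = 3   /* ret */
};

/* Total cycles for a call to delay_cycles(n), under the costs above. */
unsigned long model_delay_cycles(uint16_t n)
{
    unsigned long cycles = CYC_MOV_IMM + CYC_CALL_IMM;
    uint16_t r12 = n;
    int carry;

    r12 = (uint16_t)(r12 - 20);
    cycles += CYC_SUB_IMM;
    do {
        carry = (r12 >= 4);              /* carry set means no borrow */
        r12 = (uint16_t)(r12 - 4);
        cycles += CYC_LOOP;
    } while (carry);
    r12 = (uint16_t)~r12;                /* 0..3 */
    cycles += CYC_INV;
    r12 = (uint16_t)(r12 << 1);          /* 0, 2, 4 or 6 bytes */
    cycles += CYC_RLA;
    cycles += CYC_ADD_PC;
    cycles += (unsigned long)(3 - r12 / 2) * CYC_NOP;
    cycles += CYC_RET;
    return cycles;
}

void check_model(void)
{
    unsigned long n;

    for (n = 20; n <= 1000; n++)
        assert(model_delay_cycles((uint16_t)n) == n);
    assert(model_delay_cycles(19) == 65555UL);  /* wraps: 19 + 65536 */
}
```

Under these assumptions the model returns exactly n for every n from 20 upward, and a request below 20 wraps around to a delay 65536 cycles longer than asked for.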

 

A simple test:

#include <msp430.h>                 // WDTCTL, WDTPW, WDTHOLD

void delay_cycles(unsigned);

int main(void)
{
   WDTCTL = WDTPW + WDTHOLD;

   delay_cycles(20);
   delay_cycles(21);
   delay_cycles(22);
   delay_cycles(23);
   delay_cycles(24);
   delay_cycles(25);
   delay_cycles(26);
   delay_cycles(27);
   delay_cycles(28);
   delay_cycles(29);
   delay_cycles(30);
   delay_cycles(31);
}

 

It is possible to use addition rather than subtraction in the loop. This is more efficient on some microcontrollers, but not the MSP430. It may be advantageous in some special applications.

delay_cycles
       inv     R12
       add     #20, R12
       add     #4, R12
       nop
       jnc     $ - 4
       rla     R12
       add     R12, PC
       nop
       nop
       nop
       ret
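
A C model of the addition variant, using the same kind of cycle accounting, suggests it has the same timing as the subtraction version. As before this is only a sketch: the cycle costs in the comments are assumed values for the original MSP430 CPU (not stated in the post), the caller is assumed to spend two cycles on mov #n, R12, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Total cycles for the addition-based delay_cycles(n), assumed costs. */
unsigned long model_delay_cycles_add(uint16_t n)
{
    unsigned long cycles = 2 + 5;        /* mov #n, R12 + call #delay_cycles */
    uint16_t r12 = (uint16_t)~n;         /* inv R12 */
    uint32_t sum;
    int carry;

    cycles += 1;                         /* inv */
    r12 = (uint16_t)(r12 + 20);          /* add #20, R12 */
    cycles += 2;
    do {
        sum = (uint32_t)r12 + 4u;        /* add #4, R12 */
        carry = (sum > 0xFFFFu);         /* carry out of bit 15 */
        r12 = (uint16_t)sum;
        cycles += 4;                     /* add (1) + nop (1) + jnc (2) */
    } while (!carry);                    /* jnc: loop while carry is clear */
    r12 = (uint16_t)(r12 << 1);          /* rla R12: 0, 2, 4 or 6 bytes */
    cycles += 1;
    cycles += 2;                         /* add R12, PC */
    cycles += (unsigned long)(3 - r12 / 2);  /* nops not skipped */
    cycles += 3;                         /* ret */
    return cycles;
}

void check_model_add(void)
{
    unsigned long n;

    for (n = 20; n <= 1000; n++)
        assert(model_delay_cycles_add((uint16_t)n) == n);
}
```

Here the inv moves to the front and the loop counts up until it overflows, so the value left in R12 is already the 0 to 3 skip count and only needs doubling.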


That's because it is an intrinsic (in GCC at least, and very likely in CCS too): the compiler inserts the appropriate instructions on the fly as it encounters the call, so it's not a "function" in the traditional sense.

 

If you are curious about a (possible) implementation, I suggest you look at the one in MSPGCC (as it is available to study, while CCS isn't, so not much of a choice :)), which will be around here. The actual code piece inserted by the compiler will probably be in msp430.md, but frankly I've given up searching after the gazillionth indirection :D.

 

You can also try compiling a simple piece of code (with CCS), then disassemble it.


It is explained in SLAU132E. Code is dynamically generated so there is no source code for it (other than the source code of the C/C++ compiler itself).

 

The advantage of an intrinsic is that it can go down to 1 cycle. The disadvantage is possible code bloat, because code is generated for each use of __delay_cycles(). Using a delay function when possible may reduce code size relative to the intrinsic.



 

Yeah, I noticed it was dynamic by looking at the generated code. But, I also thought I found it off by a couple cycles once or twice. Guess I need to take a closer look.

 

Thanks for the speedy reply.... Regards...

 

 

The __delay_cycles() intrinsic function does appear to be "cycle accurate". Great news! Thank you...

  • 5 months later...

Also callable from C as

 

#include <stdint.h>   /* uint16_t */

void delay_cycles(uint16_t num);

int main(void)
{
  uint16_t foo = 123;
  delay_cycles(foo);
}

 

msp430-gcc -Os -mmcu=msp430g2231 -c main.c -o main.c.o
msp430-gcc -Os -mmcu=msp430g2231 -c delay.s -o delay.s.o
msp430-gcc -mmcu=msp430g2231 -Wl,--sort-common  -o main.elf  main.c.o  delay.s.o

 

Thanks, Rick :thumbup: :clap:

