43oh

# Delay function with one cycle granularity


CCS has an intrinsic delay, but it requires a constant value. So you can do this:

```
__delay_cycles(1234);
```

but not this:

```
unsigned x = 1234;
__delay_cycles(x);
```

The usual way to do a simple delay is something like this:

```
unsigned x = 1234;
while(--x);
```

in assembly:

```
mov #1234, R12
dec R12
jne $ - 2
```

That has three cycle granularity. One cycle for the dec, and two cycles for the conditional jump.

To make a delay with one cycle granularity, two techniques can be combined. The first is to make the simple delay loop decrement the count by the number of cycles in the loop. The loop takes three cycles, so the obvious code would be:

```
sub #3, R12
jne $ - 2
```

This will not work because subtracting 3 from R12 takes two cycles rather than the one cycle used by dec R12. So the obvious fix would be to subtract four because the loop now requires four cycles:

```
sub #4, R12
jne $ - 2
```

This will not work either because subtracting 4 from R12 takes only one cycle! Whenever one of the constants -1, 0, 1, 2, 4, or 8 is used as an immediate operand, a compact instruction form (the MSP430 constant generator) can be used that is one cycle shorter. The solution is:

```
sub #4, R12
nop
jne $ - 4
```

The loop takes four cycles and subtracts four every iteration. Perfect.

One problem with this loop is that it will only terminate if the initial value is a multiple of four. The fix for this is to change the jne (jump if not equal / jump if not zero) to jc (jump if carry). The loop will then terminate when R12 underflows.

```
sub #4, R12
nop
jc $ - 4
```

When the loop terminates the value in R12 will be -1, -2, -3, or -4. This is the remainder of division - the modulus. It is the negative of the number of cycles that have to be skipped to make the loop exactly the right duration.

A variable delay of one cycle granularity can be done with a computed jump into a series of instructions that each execute in one cycle. This is not practical for a wide range of delays, but is perfect for handling the remainder of the four cycle loop. The MSP430 nop instruction takes one cycle, so...

```
add R12, PC ; add R12 to the program counter to skip over some of the nops
nop
nop
nop
nop
```

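Each nop is two bytes, so the remainder has to be turned into a byte offset before it is added to the PC. inv takes the one's complement, which maps -1..-4 to 0..3, and rla multiplies that by two:

```
inv R12
rla R12
add R12, PC
nop
nop
nop
nop
```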

A constant can be subtracted from the desired cycle count to compensate for the fixed overhead (call, return, computed jump).

The complete code: delay.asm

```
            .cdecls C, LIST, "msp430g2211.h"  ; Include device header file

            .text
            .global delay_cycles

delay_cycles
            sub     #4, R12                 ; Subtract four w/ four cycles per iteration
            nop                             ;
            jc      $ - 4                   ;
            inv     R12                     ; Negate the remainder
            rla     R12                     ; Multiply by two
            add     R12, PC                 ; One cycle granularity computed jump
            nop                             ;
            nop                             ;
            nop                             ;
            ret                             ; Return
```

A simple test:

```
#include <msp430.h>

void delay_cycles(unsigned);

int main(void)
{
    WDTCTL = WDTPW + WDTHOLD;   // Stop the watchdog timer

    delay_cycles(20);
    delay_cycles(21);
    delay_cycles(22);
    delay_cycles(23);
    delay_cycles(24);
    delay_cycles(25);
    delay_cycles(26);
    delay_cycles(27);
    delay_cycles(28);
    delay_cycles(29);
    delay_cycles(30);
    delay_cycles(31);
}
```

It is possible to use addition rather than subtraction in the loop. This is more efficient on some microcontrollers, but not the MSP430. It may be advantageous in some special applications.

```
delay_cycles
            inv     R12
            add     #4, R12
            nop
            jnc     $ - 4
            rla     R12
            add     R12, PC
            nop
            nop
            nop
            ret
```

##### Share on other sites

Careful if you use a small delay: the initial subtractions can cause an underflow.

I tried this code in some templates which auto-calculated a 2 cycle delay (2 microseconds on a 1 MHz clock), which underflowed to 64k cycles of wait time.

##### Share on other sites

The minimum is 20 cycles as shown in the sample code. The overhead is 20, so it is not possible to do less. Lower numbers will give 65536 + N cycles.

##### Share on other sites

oPossum,

In your explorations, did you come across the MSP430 __delay_cycles() function listing (in CCSv4)? If so, where is it, please?

TIA, Mike

##### Share on other sites

Mac, I was looking for that too at some point but I was not able to find the source code, just the header file.

##### Share on other sites

That's because it is an intrinsic (in GCC at least, and very likely in CCS too): the compiler inserts the appropriate instructions on the fly as it encounters the call, so it's not a "function" in the traditional sense.

If you are curious about a (possible) implementation, I suggest you look at the one in MSPGCC (its source is available to study, while CCS's isn't, so there is not much of a choice), which will be around here. The actual code inserted by the compiler will probably be in msp430.md, but frankly I gave up searching after the gazillionth indirection.

You can also try compiling a simple piece of code (with CCS), then disassemble it.

##### Share on other sites

It is explained in SLAU132E. Code is dynamically generated so there is no source code for it (other than the source code of the C/C++ compiler itself).

The advantage of an intrinsic is that it can go down to 1 cycle. The disadvantage is possible code bloat, because code is generated for each use of __delay_cycles(). Using a delay function where possible may reduce code size relative to the intrinsic.

##### Share on other sites

Yeah, I noticed it was dynamic by looking at the generated code. But, I also thought I found it off by a couple cycles once or twice. Guess I need to take a closer look.

Thanks for the speedy reply.... Regards...

The __delay_cycles() intrinsic function does appear to be "cycle accurate". Great news! Thank you...

##### Share on other sites
• 5 months later...

Also callable from C as

```
#include <stdint.h>

void delay_cycles(uint16_t num);

int main(void)
{
    uint16_t foo = 123;
    delay_cycles(foo);
}
```

```
msp430-gcc -Os -mmcu=msp430g2231 -c main.c -o main.c.o
msp430-gcc -Os -mmcu=msp430g2231 -c delay.s -o delay.s.o
msp430-gcc -mmcu=msp430g2231 -Wl,--sort-common  -o main.elf  main.c.o  delay.s.o
```

Thanks, Rick
