43oh

MSP432 toggle a bit using bit banding?



Hi everybody,

I'm fooling around with the MSP432 LaunchPad.

Does anybody know how to use bit-banding properly and elegantly?

I could not find much on the web or in the examples...

This does not work, and does not even produce an error message:

    // this does what you expect, a short pulse
    P1OUT |= BIT0;
    P1OUT &= ~BIT0;

    // this does nothing
    BITBAND_PERI(P1OUT,BIT0)=1;
    BITBAND_PERI(P1OUT,BIT0)=0;

Does anybody know what I am doing wrong here?

 


Thanks RobG!

Of course this works now. Stupid me.
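
For anyone landing here later: the working snippet further down this thread passes the bit number 0 as the second argument, where the first attempt passed the mask BIT0 (value 0x01). Here is a minimal host-side sketch of why that matters. It assumes BITBAND_PERI follows the standard Cortex-M bit-band alias mapping and that P1OUT sits at 0x40004C02; both are assumptions to check against your device header, and `bitband_alias` is a hypothetical helper, not part of any TI header.

```c
#include <assert.h>
#include <stdint.h>

/* Standard Cortex-M peripheral bit-band constants. */
#define PERIPH_BASE_ADDR   0x40000000u  /* start of the peripheral region */
#define BITBAND_ALIAS_BASE 0x42000000u  /* start of the peripheral bit-band alias */

/* Assumed address of P1OUT on the MSP432P401R; check the device header. */
#define P1OUT_ADDR_ASSUMED 0x40004C02u

/* Hypothetical helper: alias address of bit `bit` of the byte at `addr`.
 * Every bit of the 1 MB bit-band region gets its own 32-bit alias word. */
static uint32_t bitband_alias(uint32_t addr, uint32_t bit)
{
    assert(addr >= PERIPH_BASE_ADDR && addr < PERIPH_BASE_ADDR + 0x100000u);
    return BITBAND_ALIAS_BASE + (addr - PERIPH_BASE_ADDR) * 32u + bit * 4u;
}
```

Under those assumptions, `bitband_alias(P1OUT_ADDR_ASSUMED, 0)` is 0x42098040 (bit 0, i.e. P1.0), while passing 1 (the value of BIT0) gives 0x42098044, which is the alias of bit 1. That would explain why the first attempt compiled cleanly but did nothing visible on P1.0.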

 

Strange, though: performance-wise, bit-banding does not seem to help at all.

This code, built around the example msp432p401_cs_03 ("configuration for 48 MHz"):

 

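    volatile uint32_t i;
    P1DIR |= BIT0;                    // P1.0 as output
    while (1)
    {
        P1OUT |= BIT0;                // first pulse: read-modify-write
        P1OUT &= ~BIT0;
        BITBAND_PERI(P1OUT,0)=1;      // second pulse: bit-band alias write
        BITBAND_PERI(P1OUT,0)=0;
        for(i=0;i<2;i++);             // short gap before the next pair of pulses
    }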

 

results in this signal:

 

post-37272-0-05283200-1434305555_thumb.png

 

- the first pulse takes exactly the same time as the second one

- much slower than I expected: 330 ns per pulse, which is about 16 clock cycles at 48 MHz

- yes, the device runs at 48 MHz; I checked at P4.3 and measured around 7.5 mA supply current
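
For reference, the cycle count follows directly from the clock period (one cycle at 48 MHz is about 20.8 ns). A minimal sketch of the arithmetic; `ns_to_cycles` is a hypothetical helper name used only for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: convert a duration in nanoseconds into CPU clock
 * cycles, rounded to the nearest whole cycle. */
static uint32_t ns_to_cycles(uint32_t duration_ns, uint32_t clock_hz)
{
    /* cycles = time * frequency; 64-bit math avoids overflow, and adding
     * half the divisor rounds to nearest instead of truncating. */
    uint64_t product = (uint64_t)duration_ns * clock_hz;
    return (uint32_t)((product + 500000000ULL) / 1000000000ULL);
}
```

`ns_to_cycles(330, 48000000)` gives 16 (15.84 before rounding).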



I got caught out by this too. Looking at the TRM I got the impression that bit-banding is good for performance, but in general that's not the case.

 

I found that a bit-band write took at least as many cycles as the equivalent (interruptible) read-modify-write instruction sequence. I think it basically just implements the RMW sequence inside the bus controller, so the CPU kicks it off with a single instruction and then waits until it finishes.

 

Bit-banding has two clear advantages over the RMW sequence - it's non-interruptible and it occupies less space in flash memory.

 

If you don't need those advantages it may be possible to optimise for speed by avoiding bit-banding. For example, if you're toggling a GPIO pin then maybe you don't care what value the other pins in the same PxOUT register are set to. In that case you can just write the whole register in a single instruction. Perhaps you do care what the other pins are set to, but know that they have a fixed (but unknown) value during the toggling; then you can do the read once and just modify/write on each toggle.
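
A minimal sketch of that last idea (read once, then only modify/write), using a plain variable as a stand-in for P1OUT so it can run on a host. The names `fake_p1out`, `PIN0_MASK` and `pulse_pin0` are made up for illustration, and the approach is only valid while nothing else (an interrupt handler, for instance) changes the other bits of the register:

```c
#include <stdint.h>

/* Stand-in for a memory-mapped port register such as P1OUT. */
static volatile uint8_t fake_p1out;

#define PIN0_MASK 0x01u

/* Hypothetical helper: pulse pin 0 `count` times with a single register read. */
static void pulse_pin0(volatile uint8_t *port, unsigned count)
{
    /* Read once and precompute both register images. */
    uint8_t low  = (uint8_t)(*port & ~PIN0_MASK);
    uint8_t high = (uint8_t)(low | PIN0_MASK);

    while (count--) {
        *port = high;   /* plain store, no read-modify-write */
        *port = low;
    }
}
```

On the target you would pass the address of the real register, e.g. `pulse_pin0(&P1OUT, 4)`, assuming P1OUT is declared as an 8-bit volatile register in your header. Whether this is actually faster on the MSP432 would need to be measured.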

  • 2 months later...

If you look at the disassembly, you can see that it sets R1 to either 1 or 0 every time before it writes to the bit-band region.

That must waste double the time. Strange that the Cortex-M4 doesn't have a built-in BME, so that you could use XOR etc. on the bit-band region (e.g. decorated store operations).

A quick and dirty test shows that if you keep 1 in R1 and 0 in R2 you double the rate, if the hardware can keep up, that is.

I tried declaring two uint8_t variables inside the function so that the compiler would keep the values in R1 and R2, but no go: it used the stack.
So a BITBAND_PERI toggle function needs to be created that does use two registers.

You can trick the compiler into using registers by calling a function, but that depends on the optimization level, so I force medium.

And when you take this into account, it should go even faster (if it applies to this type of STR):
"Neighboring load and store single instructions can pipeline their address and data phases. This enables these instructions to complete in a single execution cycle."
 

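    // Variant 1: hand-written stores. This relies on the compiler leaving the
    // bit-band address in R0 and the value 1 in R1, which is not guaranteed.
    while(1)
    {
        BITBAND_PERI(P1OUT,0)=1;      // Set P1.0 (sets R0 to bitband address, sets R1 to 1)
        asm(" MOVS R2, #0");
        asm(" STRB R2, [R0]");
        asm(" STRB R1, [R0]");        // un-rolled toggle loop
        asm(" STRB R2, [R0]");
        asm(" STRB R1, [R0]");
        asm(" STRB R2, [R0]");
        asm(" STRB R1, [R0]");
        asm(" STRB R2, [R0]");
    }

    // Variant 2 (use instead of variant 1, which never exits):
    // let a function keep both values in registers.
    while(1)
    {
        toggle(1,0);
    }
}

// Declare toggle() before main() so the call above compiles cleanly.
#pragma optimize = medium
void toggle(char one, char zero)
{
    BITBAND_PERI(P1OUT,0)=one;
    BITBAND_PERI(P1OUT,0)=zero;
    BITBAND_PERI(P1OUT,0)=one;
    BITBAND_PERI(P1OUT,0)=zero;
    BITBAND_PERI(P1OUT,0)=one;
    BITBAND_PERI(P1OUT,0)=zero;
    BITBAND_PERI(P1OUT,0)=one;
    BITBAND_PERI(P1OUT,0)=zero;
}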

On the Tiva processors the I/O ports have a nice masking property: part of the address acts as a mask, so that only certain bits are accessed. For instance, if you write to a port at a certain offset from the port's base address, it will only change bits 1 and 2 of the port and leave the rest alone. That is handy if you need to read or change multiple bits at once (and faster than doing one bit at a time).

I don't know if the MSP432 has that feature, and it is only useful if you need to change more than one bit at once.
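
To make the Tiva scheme concrete: on those parts, address bits [9:2] of an access to the GPIO data register select which data bits take part. Below is a small host-side model of that behaviour; the helper names `gpiodata_offset` and `masked_write` are invented for this sketch and are not TivaWare functions.

```c
#include <stdint.h>

/* Byte offset from the port's data register base that selects `mask`:
 * the 8-bit mask is placed in address bits [9:2]. */
static uint32_t gpiodata_offset(uint8_t mask)
{
    return (uint32_t)mask << 2;
}

/* Host-side model of the effect of a masked write: only the bits set in
 * `mask` take their new value, all other bits keep their current state. */
static uint8_t masked_write(uint8_t current, uint8_t value, uint8_t mask)
{
    return (uint8_t)((current & ~mask) | (value & mask));
}
```

For bits 1 and 2 the mask is 0x06, so the access goes to offset 0x18 from the data register base. Writing 0x00 there while the port holds 0xFF leaves 0xF9, with no read-modify-write needed in software.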
