
StellarPad: cbi() + sbi() 4.6x Faster than digitalWrite()


cbi() and sbi() are up to 4.6x faster than digitalWrite() on the LaunchPad Stellaris.
Is there a way to improve the implementation of digitalWrite()?
Results are:

Stellaris on embedXcode

digitalWrite() - cbi() sbi() Comparison
Times in ms
digitalWrite() 1500
cbi() sbi()    325

Stellaris on Energia

digitalWrite() - cbi() sbi() Comparison
Times in ms
digitalWrite() 1500
cbi() sbi()    400
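
For anyone who wants to check the ratios: 1500 / 325 is about 4.6 with embedXcode and 1500 / 400 is 3.75 with Energia. With 1,000,000 iterations, that works out to roughly 1.5 µs per HIGH/LOW pair for digitalWrite() against 0.325 µs to 0.4 µs for cbi()/sbi(), loop overhead included. Here is a small host-side sketch of that arithmetic. The helper names speedup() and ns_per_pair() are made up for illustration and are not part of the benchmark sketch.

```c
#include <stdint.h>

/* Speed-up of the fast path over the slow one, from two elapsed times in ms. */
static double speedup(uint32_t slow_ms, uint32_t fast_ms)
{
    return (double)slow_ms / (double)fast_ms;
}

/* Average time in nanoseconds for one HIGH+LOW pair,
 * given the total elapsed time in ms and the loop count. */
static double ns_per_pair(uint32_t total_ms, uint32_t loops)
{
    return (double)total_ms * 1e6 / (double)loops;
}
```

Calling speedup(1500, 325) and speedup(1500, 400) reproduces the two ratios from the tables above.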

cbi() and sbi() require functions that are not implemented for the MSP430, namely portBASERegister and GPIOPinWrite.

The measurements come from the following basic sketch, compiled with Energia or embedXcode:

// Core library for code-sense
#include "Energia.h"

#define LOOPS 1000000
#define _pin RED_LED

// Stellaris only: portBASERegister() and GPIOPinWrite() are not available on the MSP430
#define portOutputRegister(x) (regtype)portBASERegister(x)
#define cbi(reg, mask) GPIOPinWrite(reg, mask, 0)
#define sbi(reg, mask) GPIOPinWrite(reg, mask, mask)

typedef volatile uint32_t regtype;
typedef uint8_t regsize;

uint32_t chrono;
regtype _port;
regsize _bit;

void setup()
{
  // put your setup code here, to run once:
  Serial.begin(9600);  // assumed baud rate
  Serial.println("digitalWrite() - cbi() sbi() Comparison");
  Serial.println("Times in ms");

  // Test 1: toggle the pin with digitalWrite()
  Serial.print("digitalWrite() ");
  pinMode(_pin, OUTPUT);

  chrono = millis();
  for (uint32_t i=0; i<LOOPS; i++) {
    digitalWrite(_pin, HIGH);
    digitalWrite(_pin, LOW);
  }
  chrono = millis() - chrono;

  Serial.println(chrono, DEC);

  // Test 2: toggle the pin with sbi() / cbi(); port and bit mask are looked up once, outside the loop
  Serial.print("cbi() sbi()    ");
  _port   = portOutputRegister(digitalPinToPort(_pin));
  _bit    = digitalPinToBitMask(_pin);

  chrono = millis();
  for (uint32_t i=0; i<LOOPS; i++) {
    sbi(_port, _bit);
    cbi(_port, _bit);
  }
  chrono = millis() - chrono;

  Serial.println(chrono, DEC);
}

void loop()
{
  // nothing to do: the benchmark runs once in setup()
}


Well, you moved most of the overhead of digitalWrite()/digitalRead() outside of the loop with sbi()/cbi(). Move the calculation of _port and _bit into the second loop and the two results will probably get much closer.


When you look at the source of StellarisWare / driverlib, you will see that GPIOPinRead and GPIOPinWrite are each a single statement:

return(HWREG(ulPort + (GPIO_O_DATA + (ucPins << 2))));


HWREG(ulPort + (GPIO_O_DATA + (ucPins << 2))) = ucVal;
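
The reason for the `ucPins << 2` is that on the Stellaris parts, address bits [9:2] of an access to the GPIO data register act as a bit mask: only the pins selected by those address bits are read or changed. That is why sbi() can write `mask` and cbi() can write `0` without a read-modify-write. Below is a rough host-side model of that behaviour, just to illustrate. The function names are made up, and this is not driver code.

```c
#include <stdint.h>

/*
 * Host-side model of a masked GPIODATA write (not driver code).
 * offset is the byte offset from the port base, i.e. (ucPins << 2);
 * address bits [9:2] select which data bits the write may change.
 */
static uint8_t gpio_masked_write(uint8_t data, uint32_t offset, uint8_t val)
{
    uint8_t mask = (uint8_t)((offset >> 2) & 0xFFu);
    return (uint8_t)((data & (uint8_t)~mask) | (val & mask));
}

/* What sbi(reg, mask) from the sketch does to the data bits. */
static uint8_t model_sbi(uint8_t data, uint8_t mask)
{
    return gpio_masked_write(data, (uint32_t)mask << 2, mask);
}

/* What cbi(reg, mask) from the sketch does to the data bits. */
static uint8_t model_cbi(uint8_t data, uint8_t mask)
{
    return gpio_masked_write(data, (uint32_t)mask << 2, 0);
}
```

With mask 0x02, model_sbi() sets only bit 1 and model_cbi() clears only bit 1; the other pins of the port keep their state.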



I'd venture to guess that moving those lines directly into wiring_digital.c instead of calling the ROM_GPIO functions would already make a big difference. Given that we save a call, the size penalty should be small.
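
To make the idea concrete, here is a minimal sketch of what such an inlined write could look like. It is not taken from Energia's wiring_digital.c: fast_pin_write() and HWREG32 are names made up for this example. HWREG32 is adapted from driverlib's HWREG and uses uint32_t so the snippet also compiles on a host. GPIO_O_DATA is 0 in driverlib's hw_gpio.h.

```c
#include <stdint.h>

/* GPIODATA sits at offset 0 from the port base in driverlib's hw_gpio.h. */
#define GPIO_O_DATA 0x00000000u

/* Adapted from driverlib's HWREG; uint32_t keeps the access 32 bits wide. */
#define HWREG32(x) (*((volatile uint32_t *)(x)))

/* Hypothetical inlined replacement for the ROM_GPIOPinWrite() call:
 * the pin mask goes into the address, the value into the data. */
static inline void fast_pin_write(uintptr_t port_base, uint8_t pins, uint8_t val)
{
    HWREG32(port_base + (GPIO_O_DATA + ((uintptr_t)pins << 2))) = val;
}
```

On the target, port_base would be the value from portBASERegister(digitalPinToPort(pin)) and pins the value from digitalPinToBitMask(pin), as in the sketch from the first post. Whether this actually beats the ROM call would still have to be measured.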


Another thing I notice is that we first create a bitmask out of the single pin, and that mask then gets shifted by 2 in the code above. Maybe the two operations can be combined. Though I guess creating another look-up table just for this minor improvement is not worth the trade-off.
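
For what it's worth, the combination itself is just one shift instead of two, assuming the pin's bit position within the port (0..7) is available. The sketch below compares both forms; the function names are made up for illustration.

```c
#include <stdint.h>

/* Two steps, as today: build the mask, then shift it into an address offset. */
static uint32_t offset_two_steps(uint8_t bit_index)
{
    uint8_t mask = (uint8_t)(1u << bit_index);
    return (uint32_t)mask << 2;
}

/* Combined: a single shift gives the same offset for bit_index 0..7. */
static uint32_t offset_combined(uint8_t bit_index)
{
    return 1u << (bit_index + 2u);
}
```

The saving is one shift per call, so it only matters if the mask is not needed separately as the value to write.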
