WS2812 Matrix


Hi everyone,

 

So I've been trying to control WS2812/WS2812B LED strips with my Tiva LaunchPad, the TM4C1294XL. First I will explain what I've been doing; later, when I have clean code for you all to read, I will post it. I use only the WS2812B LED strip.

 

 

I wanted to make a big RGB matrix, so I wanted a lot of outputs with the least processor usage, taking advantage of the ARM peripherals.

First I tried using the SSI module. It worked, but it could be better, and it used a lot of RAM. Here it is working:

 

Then I saw a lot of controllers using DMA transfers: some change PWM duty values and others just change the state of GPIO pins. I went for the second approach of sending data to the GPIO.

It's the same method the Teensy uses. The idea is to send 3 values per bit: 0xFF, the data value, then 0x00. This should explain it better:

https://www.pjrc.com/teensy/td_libs_OctoWS2811.html
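To make the 3-values-per-bit idea concrete, here is a minimal sketch of how the colors of 8 parallel strips could be expanded into port bytes. The function name `ws_encode_led` and the packed 24-bit GRB layout are assumptions for illustration (they are not in the code further down); the bit order follows the usual WS2812B convention of green, red, blue, most significant bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WS_BITS_PER_LED   24u  /* 8 bits each for green, red, blue */
#define WS_SLOTS_PER_BIT  3u   /* 0xFF, data, 0x00 */

/*
 * Expand one LED position for 8 parallel strips into 72 port bytes.
 * grb[s] holds the 24-bit color for strip s (assumed wired to port pin s),
 * green in bits 23..16, red in 15..8, blue in 7..0, sent MSB first.
 */
void ws_encode_led(const uint32_t grb[8],
                   uint8_t out[WS_BITS_PER_LED * WS_SLOTS_PER_BIT])
{
    assert(grb != NULL && out != NULL);

    for (unsigned bit = 0; bit < WS_BITS_PER_LED; bit++) {
        uint8_t data = 0;

        /* Gather the same bit of every strip into one port byte. */
        for (unsigned strip = 0; strip < 8; strip++) {
            if (grb[strip] & (1ul << (23u - bit))) {
                data |= (uint8_t)(1u << strip);
            }
        }

        out[bit * 3 + 0] = 0xFF; /* all pins go high at the start of the bit */
        out[bit * 3 + 1] = data; /* pins sending a 0 drop low early */
        out[bit * 3 + 2] = 0x00; /* all pins are low for the rest of the bit */
    }
}
```

Calling this once per LED position and writing the results back to back would fill a buffer in the same layout the DMA reads from.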

 

Well, that approach uses 2 timer interrupts and a GPIO interrupt. The guys at TI E2E taught me that the Tiva PWM module can invert its output on 2 comparator matches. So what do I do? I use just 1 PWM output and 1 GPIO interrupt. The PWM inverts its output state (HIGH or LOW) at 0.4 us, 0.8 us and 1.25 us (the end of the PWM period). The GPIO triggers the DMA on both edges, so it always sends the 3 values needed at the right time.
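As a rough check of the numbers above, the times can be converted to PWM clock ticks. This is only a sketch: `ns_to_ticks` is a hypothetical helper, and the 120 MHz clock is the one the code further down configures. How a tick offset maps to a comparator value depends on the count direction, so that step is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a time in nanoseconds to PWM clock ticks, rounded to nearest. */
uint32_t ns_to_ticks(uint32_t clock_hz, uint32_t ns)
{
    assert(clock_hz > 0);
    return (uint32_t)(((uint64_t)clock_hz * ns + 500000000ull) / 1000000000ull);
}
```

At 120 MHz this gives 150 ticks for the 1.25 us period, 48 ticks for 0.4 us and 96 ticks for 0.8 us.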

 

With this I can control 8 outputs for the WS2812B. But wait! The TM4C1294 has 15 GPIO ports! Unfortunately, just 4 of them have all 8 pins available on the LaunchPad headers. So I use the same PWM signal and 3 more GPIO pins as DMA triggers. With this I control 32 outputs using only 4 GPIO interrupts and 1 PWM module output. So if you use 512 LEDs per output like the Teensy 3.1, you have control over 16384 WS2812Bs (32 x 512).

 

 

Well, now the problems:

This method uses 1-byte values, since it sends 8 bits to the GPIO pins, right? But I need 3 values for each data bit (0xFF, 0xXX, 0x00), so I require 24*3 = 72 bytes to control 1 WS2812B on each of the 8 outputs (so 8 WS2812Bs in total are being controlled). This method uses a lot of RAM.
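The RAM cost can be put into numbers with a small sketch. The helper names are hypothetical; the 72 bytes per LED position come straight from the 24*3 figure above.

```c
#include <assert.h>
#include <stddef.h>

#define WS_BYTES_PER_LED (24u * 3u)  /* 24 data bits, 3 port bytes per bit */

/* Buffer bytes for one 8-pin port driving leds_per_strip LEDs on each pin. */
size_t ws_port_buffer_bytes(size_t leds_per_strip)
{
    return leds_per_strip * WS_BYTES_PER_LED;
}

/* Total for several ports, each with its own buffer. */
size_t ws_total_buffer_bytes(size_t ports, size_t leds_per_strip)
{
    assert(ports > 0);
    return ports * ws_port_buffer_bytes(leds_per_strip);
}
```

For 14 LEDs per output this is 1008 bytes per port; for 4 ports with 512 LEDs per output it comes to 147456 bytes.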

Second problem: the Tiva DMA can only transfer 1024 items per transfer setup. That means it can only control 14 WS2812Bs (14 x 72 = 1008 bytes) before the processor needs to set up the transfer again. Since this takes a lot of time (relative to the timing of the WS2812B), I am going to implement DMA ping-pong mode to solve this. (Already solved.)
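One way to think about ping-pong mode is as a plan that cuts the buffer into pieces of at most 1024 items and alternates them between the primary and alternate control structures. The sketch below only computes that plan on the host; `dma_chunk_t` and `dma_chunk_at` are hypothetical names, and this is not the ping-pong code I ended up with. On the hardware, each chunk would presumably be loaded with `uDMAChannelTransferSet()` using `UDMA_PRI_SELECT` or `UDMA_ALT_SELECT` and `UDMA_MODE_PINGPONG`.

```c
#include <assert.h>
#include <stddef.h>

#define UDMA_MAX_ITEMS 1024u  /* per-transfer item limit mentioned above */

typedef struct {
    size_t offset;     /* index of the first byte of this chunk */
    size_t count;      /* number of items in this chunk */
    int    alternate;  /* 0 = primary control structure, 1 = alternate */
} dma_chunk_t;

/*
 * Describe chunk number `index` of a buffer of `total` bytes.
 * Returns 1 and fills *chunk if the chunk exists, 0 once past the end.
 */
int dma_chunk_at(size_t total, size_t index, dma_chunk_t *chunk)
{
    assert(chunk != NULL);

    size_t offset = index * UDMA_MAX_ITEMS;
    if (offset >= total) {
        return 0;
    }

    size_t remaining = total - offset;
    chunk->offset = offset;
    chunk->count = remaining < UDMA_MAX_ITEMS ? remaining : UDMA_MAX_ITEMS;
    chunk->alternate = (int)(index & 1u);
    return 1;
}
```

For 30 LEDs per output (2160 bytes) this yields two full chunks of 1024 items and a final chunk of 112 items, so the processor only has to reload one control structure while the other is still being sent.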

 

 

TODO:

Write the code to receive new data and update, possibly from UART or USB.

Optimize the control with scatter-gather. This would solve both problems I have with the control, but it's really complex and there isn't much information about scatter-gather.

 

 

 

Hope the explanation wasn't too boring to read. Here is the code to control 8 outputs with 14 LEDs each:

/*


This code uses 8 outputs to control WS2812B LED strips.

It still uses only DMA basic mode, so it can't reliably (if at all) control
more than 14 LEDs per output.


If you have any suggestions or need any explanation, feel free to ask.

*/
void
GPIOPortFIntHandler(void);
void
InituDMA(void);
void
InitGPIO(void);
void
InitPWM(void);
void setup();
void loop(void);
void SendData();


#include <stdint.h>
#include <stdbool.h>
#include "stdlib.h"
#include "inc/hw_ints.h"
#include "inc/hw_memmap.h"
#include "inc/hw_uart.h"
#include "inc/hw_gpio.h"
#include "inc/hw_pwm.h"
#include "inc/hw_types.h"
/* Use the driverlib headers and link against the TivaWare driverlib library
   instead of including the .c sources directly. */
#include "driverlib/gpio.h"
#include "driverlib/timer.h"
#include "driverlib/interrupt.h"
#include "driverlib/pin_map.h"
#include "driverlib/rom.h"
#include "driverlib/rom_map.h"
#include "driverlib/sysctl.h"
#include "driverlib/uart.h"
#include "driverlib/udma.h"
#include "driverlib/pwm.h"
#include <string.h>



/*
 * OUTPUTs definitions
 *
 * Outputs Tested ( x means they are working)
 * PA: 0[x], 1[x],2[x]3[x],4[x],5[x],6[x],7[x] 
 *
 *(Yay they all work)
 */

/*
These are the outputs for the WS2812B.
You can change the port to any of the following:
GPIOK, GPIOA, GPIOD, GPIOM
*/
#define GPIO_BASE_OUTPUT1 GPIO_PORTA_BASE
#define GPIO_PERIPH_OUTPUT1 SYSCTL_PERIPH_GPIOA


/*
 * End of OUTPUTs definitions
 */

//Size of the transfer buffer: 24*3 bytes per WS2812B position, times 14 LEDs per output
#define WS2812_BUF_SIZE (24*3*14)

//How many DMA transfers are needed per frame; each transfer is limited to 1024 items, or 14 LEDs
//(values above 1 are not working right now)
#define multiple 1

/*
Number of items per DMA transfer (I had this separate for testing, but now
it's equal to the buffer size). uDMAChannelTransferSet() takes the item count
and subtracts 1 itself when it writes the control word, so the value passed
here is the count, not count-1.
*/
#define CycleSize WS2812_BUF_SIZE
volatile uint32_t g_ui32SysClock;

/*
 * These are the arrays for the outputs (only 1 of the planned 4 is used here)
 *  g_ui8TxBuf1A is for the set of outputs on GPIO_BASE_OUTPUT1
 */
static uint8_t g_ui8TxBuf1A[WS2812_BUF_SIZE*multiple];

//This counts how many DMA cycles have happened in the current transfer. Reset it to start a new transfer
volatile uint8_t DMACycleCount = 0;


//*****************************************************************************
//
// The control table used by the uDMA controller.  This table must be aligned
// to a 1024 byte boundary.
//
//*****************************************************************************
#if defined(ewarm)
#pragma data_alignment=1024
uint8_t pui8ControlTable[1024];
#elif defined(ccs)
#pragma DATA_ALIGN(pui8ControlTable, 1024)
uint8_t pui8ControlTable[1024];
#else
uint8_t pui8ControlTable[1024] __attribute__ ((aligned(1024)));
#endif

//*****************************************************************************
//
// Interrupt handler called when the DMA done signal is received
//
//*****************************************************************************
void
GPIOPortFIntHandler(void)
{
    //Stop the PWM so no more DMA requests are triggered
    PWMGenDisable(PWM0_BASE, PWM_GEN_0);
    GPIOIntClear(GPIO_PORTF_BASE,GPIO_INT_DMA);

    //Count the finished DMA cycle; SendData() and loop() wait on this counter
    DMACycleCount++;

}

//*****************************************************************************
//
// Initialize uDMA to send data from memory to the output port
// (GPIO_BASE_OUTPUT1) when there is a DMA request
//
//*****************************************************************************
void
InituDMA(void)
{

  //Disable and reset before enabling it 
  SysCtlPeripheralDisable(SYSCTL_PERIPH_UDMA);
  SysCtlPeripheralReset(SYSCTL_PERIPH_UDMA);
  SysCtlPeripheralEnable(SYSCTL_PERIPH_UDMA);

  SysCtlDelay(10);

  uDMAEnable();

  uDMAControlBaseSet(pui8ControlTable);
  
/*
 * This is for setting up the GPIO_BASE_OUTPUT1 with CH15 GPIOF
 */
  uDMAChannelAssign(UDMA_CH15_GPIOF);

  //Set the DMA to a known state by disabling all attributes
  uDMAChannelAttributeDisable(UDMA_CH15_GPIOF,
  UDMA_ATTR_ALTSELECT | UDMA_ATTR_USEBURST |
    UDMA_ATTR_HIGH_PRIORITY |
    UDMA_ATTR_REQMASK);

  /*
  I set the transfer to 8 bits with the source address incrementing by 8 bits
  each transfer. The destination never increments since it's the GPIO address. The
  arbitration size is 1, so only 1 transfer (8 bits) happens per trigger request.
  */
  uDMAChannelControlSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
  UDMA_SIZE_8 | UDMA_SRC_INC_8 | UDMA_DST_INC_NONE |
    UDMA_ARB_1);

  /*
  Set the transfer to basic mode, with the source being our data array and the
  destination being the output GPIO base + 0x3FC, so we write directly to the pin states.
  The transfer size is equal to CycleSize
  */
  uDMAChannelTransferSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
  UDMA_MODE_BASIC,
  g_ui8TxBuf1A, (void *)(GPIO_BASE_OUTPUT1 + 0x3FC),
  CycleSize);
/*
 * End of CH15
 */


  //Enable the DMA channels
  uDMAChannelEnable(UDMA_CH15_GPIOF);

}

//*****************************************************************************
//
// Initialize the GPIOs: PF1 is set up as an input that requests a uDMA
// transfer on both edges, and the 8 pins of GPIO_BASE_OUTPUT1 are set up
// as outputs for the WS2812B strips
//
//*****************************************************************************
void
InitGPIO(void)
{
  
  /*
  Set up PF1 as the trigger for the DMA on both edges.
  We don't enable the GPIOF peripheral here since that was already done in the
  PWM setup
  */
        SysCtlDelay(3);
	GPIOPinTypeGPIOInput(GPIO_PORTF_BASE, GPIO_PIN_1);

	GPIOIntTypeSet(GPIO_PORTF_BASE,GPIO_PIN_1,GPIO_BOTH_EDGES);
	GPIOIntRegister(GPIO_PORTF_BASE,GPIOPortFIntHandler);

	GPIOIntClear(GPIO_PORTF_BASE,0x1FF);

	GPIODMATriggerEnable(GPIO_PORTF_BASE,GPIO_PIN_1);

	GPIOIntEnable(GPIO_PORTF_BASE,GPIO_INT_DMA);

	IntEnable(INT_GPIOF);
  /*
  * End of PF1 setup
  */


/*
 *===================================================
 *
 *
 * Output setups
 *
 *
 *===================================================
 */

	/*
	* Start of GPIO_BASE_OUTPUT1 setup
	*/
	SysCtlPeripheralDisable(GPIO_PERIPH_OUTPUT1);
	SysCtlPeripheralReset(GPIO_PERIPH_OUTPUT1);
	SysCtlPeripheralEnable(GPIO_PERIPH_OUTPUT1);
	SysCtlDelay(10);

	//The following GPIO ports have all 8 pins available on the LaunchPad:
	// GPIOK, GPIOA, GPIOD, GPIOM
	//

	GPIOPinTypeGPIOOutput(GPIO_BASE_OUTPUT1, 0xFF);
	GPIOPinWrite(GPIO_BASE_OUTPUT1,0xFF,0x0);
	/*
	* End of GPIO_BASE_OUTPUT1 setup
	*/

}

//*****************************************************************************
//
// Initialize the PWM module for a 1.25 us period in count-down mode. With a
// 120 MHz clock, Comparator A (96) matches roughly 0.45 us and Comparator B
// (48) roughly 0.85 us after the load. The PWM toggles on Load, CMPA Down
// and CMPB Down, and its output is connected to a GPIO for the uDMA request
//
//*****************************************************************************
void
InitPWM(void)
{
  SysCtlPeripheralDisable(SYSCTL_PERIPH_GPIOF);
  SysCtlPeripheralReset(SYSCTL_PERIPH_GPIOF);
  SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOF);
  SysCtlPeripheralDisable(SYSCTL_PERIPH_PWM0);
  SysCtlPeripheralReset(SYSCTL_PERIPH_PWM0);
  SysCtlPeripheralEnable(SYSCTL_PERIPH_PWM0);
  SysCtlDelay(3);
  //
  // Unlock the Pin PF0 and Set the Commit Bit
  //
  HWREG(GPIO_PORTF_BASE + GPIO_O_LOCK) = GPIO_LOCK_KEY;
  HWREG(GPIO_PORTF_BASE + GPIO_O_CR)   |= 0x01;
  GPIOPinConfigure(GPIO_PF0_M0PWM0);

  //
  // Configure the PWM function for this pin.
  // Consult the data sheet to see which functions are allocated per pin.
  //
  GPIOPinTypePWM(GPIO_PORTF_BASE, GPIO_PIN_0);

  //
  // Set the PWM clock to the system clock.
  //
  PWMClockSet(PWM0_BASE, PWM_SYSCLK_DIV_1);

  //
  // Configure the PWM0 to count down without synchronization.
  //
  PWMGenConfigure(PWM0_BASE, PWM_GEN_0,
  PWM_GEN_MODE_DOWN | PWM_GEN_MODE_NO_SYNC);

  //
  // Set the PWM frequency to 800 kHz.  To calculate the appropriate parameter
  // use the following equation: N = (1 / f) * SysClk.  Where N is the
  // function parameter, f is the desired frequency, and SysClk is the
  // system clock frequency.
  // In this case you get: (1 / 800 kHz) * 120 MHz = 150 cycles.  Note that
  // the maximum period you can set is 2^16.
  //
  PWMGenPeriodSet(PWM0_BASE, PWM_GEN_0, 150);

  //
  // Set the comparators: CMPA matches at count 96 and CMPB at count 48
  //
  HWREG(PWM0_BASE+PWM_O_0_CMPA) = 96;
  HWREG(PWM0_BASE+PWM_O_0_CMPB) = 48;
  
  //This sets the PWM to invert on load and on CMPA and CMPB match, all in
  //count down mode.
  HWREG(PWM0_BASE+PWM_O_0_GENA) = 0x444;
  
  //I set the counter to 40; not really needed, I just wanted to have the counter
  //at a known value
  HWREG(PWM0_BASE+PWM_O_0_COUNT ) = 40;

  //
  // Enable the PWM0 output signal (PF0).
  //
  PWMOutputState(PWM0_BASE, PWM_OUT_0_BIT, true);


}
//*****************************************************************************
//
// Start a new transfer: wait for the previous one to finish, reload the uDMA
// channel with the buffer and re-enable the PWM that triggers the requests.
//
//*****************************************************************************



void SendData(){

       //Wait if any transfer is in progress
       while(DMACycleCount < multiple){
       }
       //Delay to ensure the reset time of the WS2812B, just for testing purposes
        SysCtlDelay(20000);
       
       //Set outputs to 0
        HWREG(GPIO_BASE_OUTPUT1 + (0xFF << 2))=0x00;
        
        //Reconfigure the transfer to the original state
	uDMAChannelTransferSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
	UDMA_MODE_BASIC,
	g_ui8TxBuf1A, (void *)(GPIO_BASE_OUTPUT1 + 0x3FC),
	CycleSize);

        //Reset counter
        DMACycleCount = 0;
        //Set PWM counter to a known state
	HWREG(PWM0_BASE+PWM_O_0_COUNT ) = 40;

        //Enable DMA
	uDMAChannelEnable(UDMA_CH15_GPIOF);

        //Re-enable the PWM 
	PWMGenEnable(PWM0_BASE, PWM_GEN_0);

}

int main(){
  loop();
}

void loop(void)
{
  uint32_t ui32Index;

  //
  // Set the clocking to run at 120 MHz from the 25 MHz crystal using the PLL
  // (480 MHz VCO).
  //
  
  g_ui32SysClock = MAP_SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ |
    SYSCTL_OSC_MAIN | SYSCTL_USE_PLL |
    SYSCTL_CFG_VCO_480), 120000000);

  //Needs to be in this order due to SysPeripheral enable of GPIOF being in Init PWM
  InitPWM();
  InitGPIO();
  InituDMA();

  
  //Set all outputs to 0
  for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
  {
	if((ui32Index%3) == 2 )
	{
		g_ui8TxBuf1A[ui32Index] = 0x00;
	}
	else if((ui32Index%3) == 1 )
	{
		g_ui8TxBuf1A[ui32Index] = 0x00;//rand()%256;
	}
	else if((ui32Index%3) == 0)
	{
		g_ui8TxBuf1A[ui32Index] = 0xFF;
	}


  }

  //
  // Enables the PWM generator block.
  //
  PWMGenEnable(PWM0_BASE, PWM_GEN_0);

  while(DMACycleCount < multiple){
  }

  
  g_ui8TxBuf1A[0] = 0xFF;
  g_ui8TxBuf1A[1] = 0x00;
  g_ui8TxBuf1A[2] = 0x00;
while(1){
  
  
   while(DMACycleCount < multiple){
   }  
  //Use random values for the LEDs
  for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
   {
		if((ui32Index%3) == 2 )
		{
			g_ui8TxBuf1A[ui32Index] = 0x00;
		}
		else if((ui32Index%3) == 1 )
		{
			/*g_ui8TxBuf1A[ui32Index] = 0x00;//rand()%256;*/
			g_ui8TxBuf1A[ui32Index] = rand()%255;
		}
		else if((ui32Index%3) == 0 || ui32Index==0)
		{
			g_ui8TxBuf1A[ui32Index] = 0xFF;
		}


	  }

   SendData();

   SysCtlDelay(20000000);
   
    while(DMACycleCount < multiple){
    }
   //Use random values for the LEDs
   for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
    {
		if((ui32Index%3) == 2 )
		{
			g_ui8TxBuf1A[ui32Index] = 0x00;
		}
		else if((ui32Index%3) == 1 )
		{
			g_ui8TxBuf1A[ui32Index] = rand()%255;
		}
		else if((ui32Index%3) == 0 )
		{
			g_ui8TxBuf1A[ui32Index] = 0xFF;
		}


	  }

    SendData();
    

    SysCtlDelay(20000000);
}





}














This method uses 1-byte values, since it sends 8 bits to the GPIO pins, right? But I need 3 values for each data bit (0xFF, 0xXX, 0x00), so I require 24*3 = 72 bytes to control 1 WS2812B on each of the 8 outputs (so 8 WS2812Bs in total are being controlled). This method uses a lot of RAM.

 

This should not use 3x RAM. First and last byte should be loaded from the same address. The simplest way to solve it would be to use 3 DMA channels.
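A host-side model may help show why three channels avoid the 3x buffer. This is only a sketch of the idea in plain C, not uDMA code: the function name is hypothetical, and each "channel" is just one of the three writes in the loop. Two of them always read the same constant byte (source address not incremented), so only the data bytes need to live in RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of three DMA channels sharing one output port.
 * Channels 0 and 2 read a fixed byte every time, channel 1 walks through
 * the packed data, one byte per WS2812B bit.
 */
void ws_three_channel_stream(const uint8_t *data, size_t bits,
                             uint8_t *port_writes)
{
    static const uint8_t all_high = 0xFF;
    static const uint8_t all_low  = 0x00;

    assert(data != NULL && port_writes != NULL);

    for (size_t i = 0; i < bits; i++) {
        port_writes[3 * i + 0] = all_high; /* channel 0, start of the bit */
        port_writes[3 * i + 1] = data[i];  /* channel 1, first PWM edge */
        port_writes[3 * i + 2] = all_low;  /* channel 2, second PWM edge */
    }
}
```

The sequence of port writes is the same as with the expanded buffer, but the stored data shrinks from 72 to 24 bytes per LED position, plus the two constants.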


This should not use 3x RAM. First and last byte should be loaded from the same address. The simplest way to solve it would be to use 3 DMA channels.

 

I wanted to do this initially, how did I forget, haha. I kind of discarded that idea because for the 32 outputs I would need something like 8 timers and 4 GPIOs for that, plus the 4 GPIO ports for output. It is possible with this Tiva, but I wanted to try scatter-gather to avoid that, and it has proven to be quite hard.

Remember, this method uses just 1 PWM module + 4 GPIOs + 4 GPIO ports for output. It's kind of a tradeoff: high RAM usage or high peripheral usage. But with 14 split timers, I think I will do that.

Thanks for reminding me, RobG.


I wanted to do this initially, how did I forget, haha. I kind of discarded that idea because for the 32 outputs I would need something like 8 timers and 4 GPIOs for that, plus the 4 GPIO ports for output. Remember, this method uses just 1 PWM module + 4 GPIOs + 4 GPIO ports for output. It's kind of a tradeoff: high RAM usage or high peripheral usage. But with 14 split timers, I think I will do that.

 

 

 

Thank you, seeing examples helps.
 
Why would it take so many more timers?
 
e.g., Couldn't you do something at least part-way toward what RobG suggests by using 2 DMA channels and 2 PWM (or timers)?
One DMA to output the 0xFF and 0x00 values (triggered on a PWM or timer corresponding to the first and last triggers of 800KHz, sending 1 byte per trigger, fed from a buffer that was just a series of 0xFF, 0x00, 0xFF, ...)
The other channel, fed from your pixel data, sending 1 byte per trigger, triggered by a synchronized PWM (or something of the sort) 
which just initiates 1 trigger at the right offset into the 800KHz wave.
 
That would bring memory down to something like (2048 bytes + pixel data), rather than (3x pixel data), using one extra PWM?
 
I am probably still missing some things about what the DMA controller can and can't do.

There's probably a way I wouldn't need so many timers, but simply multiplying the method RobG mentioned would mean that many timers. I can keep using what I have, but I would need a complex DMA mode.

 

The thing is that you can set up a DMA channel for only 1 configuration at a time. So sending 0xFF every time from 1 source, then sending data from an array, and then sending 0x00 every time from 1 source would mean 3 different configurations. So you need 3 DMA channels. You can use just 2 PWM signals for the method that RobG mentioned, but you also need 3 DMA channels per 8 outputs.

Having 1 DMA channel output both 0x00 and 0xFF would require the same RAM, since you can't make a loop in basic DMA operation.


Ah, yes, keep forgetting about it. Back to 3 uDMAs then.

 

You can still use just 2. You just need to keep to a max of 512 bits for the WS2812. The big problem is the RAM usage.

Maybe I should really learn to use the flash for this. I would just need to use 1024 bytes in the flash.

 

Btw, 1 more test :P I'm having fun


How about this:

 

1. use ping pong to send 0xFF, 0x00s. Store 0xFF and 0x00 as constant and do not increment source. This way you could do 1024/1024 and use only 2 bytes in Flash for 0xFF & 0x00 and 1024 in RAM for data.

 

2. use scatter-gather. This would require small const array of 0xFF & 0x00s and several control tables. Source addr in each control table would point to the same address. This way you could have 256 byte array of 0xFF & 0x00s for example, and 8 control tables to get 2048 bytes transfer. Add more control tables and you can do 2048+. Same for data, split data array in few pieces and you could send more than 1024. So, RAM usage is just data, Flash usage is whatever size your array is.
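The second option can be pictured as a task list in which every entry points back at the same small constant array. The sketch below uses a hypothetical `sg_task_t` descriptor for illustration; it is not the TivaWare control-table layout and leaves out the destination and control word that a real scatter-gather entry would carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task descriptor, not the TivaWare control-table layout. */
typedef struct {
    const uint8_t *src;  /* start of the source block */
    size_t count;        /* items moved by this task */
} sg_task_t;

/*
 * Fill `tasks` so that every entry re-reads the same constant pattern.
 * Returns the total number of items the whole list would transfer.
 */
size_t sg_build_repeat(sg_task_t *tasks, size_t n_tasks,
                       const uint8_t *pattern, size_t pattern_len)
{
    assert(tasks != NULL && pattern != NULL);

    size_t total = 0;
    for (size_t i = 0; i < n_tasks; i++) {
        tasks[i].src = pattern;      /* same address for every task */
        tasks[i].count = pattern_len;
        total += pattern_len;
    }
    return total;
}
```

With a 256-byte pattern and 8 tasks the list covers 2048 transfers while only 256 bytes of constants are stored, which matches the example figures in the post above.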

 

BTW, we should move this thread to Stellarisiti.


Well, consider these problems:

 

1. This would involve the processor interrupting on every transfer, so it would take away the point of using DMA to avoid processor overhead, and the DMA method would just be for timing purposes.

 

2. I do want to use scatter-gather; it's just too complex and there's little documentation, so I'm still studying how to do that.

 

 

I posted this because of a conversation with igor in a thread in this forum. But yes, please, let's start discussing this here:

http://forum.stellarisiti.com/topic/2107-ws2812b-matrix/
