L.R.A 78 Posted August 31, 2014

Hi everyone,

I've been trying to control WS2812/WS2812B LED strips with my Tiva LaunchPad, the TM4C1294XL. First I will explain what I've been doing; later, when I have clean code for you all to read, I will post it. I use only the WS2812B LED strip.

I wanted to make a big RGB matrix, so I wanted a lot of outputs with the least processor usage, taking advantage of the ARM peripherals. First I tried using the SSI module. It worked, but it could be better, and it used a lot of RAM. Here it is working:

Then I saw a lot of controllers using DMA transfers: some change PWM duty values and others just change the state of GPIO pins. I went for the second approach of sending data to the GPIO. It's the same method the Teensy uses. The idea is to send 3 values per bit: 0xFF, the data value, then 0x00. This should explain it better: https://www.pjrc.com/teensy/td_libs_OctoWS2811.html

That method uses 2 timer interrupts and a GPIO interrupt. The guys at TI E2E taught me that the Tiva PWM module can invert its output using 2 comparators. So what do I do? I use just 1 PWM output and 1 GPIO interrupt. The PWM inverts its state (HIGH or LOW) at roughly 0.4 us, 0.8 us and 1.25 us (the end of the PWM period). The GPIO triggers the DMA on both edges, so it always sends the 3 values needed at the right timing.

With this I can control 8 outputs for the WS2812B. But wait! The TM4C1294 has 15 GPIO ports! Unfortunately just 4 of them have all 8 pins available on the breakout. So I use the same PWM signal and 3 more GPIO pins for interrupts. With this I control 32 outputs using only 4 GPIO interrupts and 1 PWM module output. So if you use 512 LEDs per output like the Teensy 3.1, you have control over 16384 WS2812B.

Now, the problems. This method uses 1-byte values, since it sends the 8 bits for the GPIO pins, right?
But I need 3 values for each brightness bit (0xFF, 0xXX, 0x00), so I require 24*3 bytes to control 1 WS2812B on each of the 8 outputs (so a total of 8 WS2812B are being controlled). This method uses a lot of RAM.

Second problem: the Tiva DMA can only transfer 1024 items per transfer set. That means it can only control 14 WS2812B (14 * 72 = 1008 bytes) before the processor needs to set up the transfer again. Since this takes a lot of time (relative to the timing of the WS2812B), I am going to implement DMA ping-pong mode to solve this. (Already solved.)

TODO:
Write the code to receive new data and update, possibly from UART or USB.
Optimize the control with scatter-gather. This would solve both problems I have with the control, but it's really complex and there isn't much information about scatter-gather.

I hope the explanation wasn't too boring to read. Here is the code to control 8 outputs with 14 LEDs each:

/*
This code uses 8 outputs to control WS2812B LED strips.
It still uses only DMA basic mode, so it can't reliably (if at all) control more than 14 LEDs.
If you have any suggestions or need any explanation, feel free to ask.
*/
void GPIOPortFIntHandler(void);
void InituDMA(void);
void InitGPIO(void);
void InitPWM(void);
void setup();
void loop(void);
void SendData();

#include <stdint.h>
#include <stdbool.h>
#include "stdlib.h"
#include "inc/hw_ints.h"
#include "inc/hw_memmap.h"
#include "inc/hw_uart.h"
#include "inc/hw_gpio.h"
#include "inc/hw_pwm.h"
#include "inc/hw_types.h"
#include "driverlib/interrupt.c"
#include "driverlib/sysctl.c"
#include "driverlib/timer.c"
#include "driverlib/udma.c"
#include "driverlib/gpio.c"
#include "driverlib/pwm.c"
#include "driverlib/interrupt.h"
#include "driverlib/pin_map.h"
#include "driverlib/rom.h"
#include "driverlib/rom_map.h"
#include "driverlib/sysctl.h"
#include "driverlib/uart.h"
#include "driverlib/udma.h"
#include "driverlib/pwm.h"
#include <string.h>

/*
 * OUTPUTs definitions
 *
 * Outputs tested (x means they are working)
 * PA: 0[x], 1[x], 2[x], 3[x], 4[x], 5[x], 6[x], 7[x]
 *
 * (Yay, they all work)
 */

/*
These are the outputs for the WS2812B.
You can change it to any of the following:
GPIOK, GPIOA, GPIOD, GPIOM
*/
#define GPIO_BASE_OUTPUT1 GPIO_PORTA_BASE
#define GPIO_PERIPH_OUTPUT1 SYSCTL_PERIPH_GPIOA
/*
 * End of OUTPUTs definitions
 */

//This is the size of each array: 24*3 bytes per WS2812, times 14 LEDs
#define WS2812_BUF_SIZE (24*3*14)

//This is how many times the DMA needs to repeat due to the 1024 transfer limit, or 14 LEDs
//(not working right now)
#define multiple 1

/*
This sets how many transfers happen per DMA cycle (I had this for testing, but now
it's equal to the buffer size).
I expected it to be that value -1, since that's how the transfer size field works
(0 is equal to 1 transfer), but it has to be this value. Most likely
uDMAChannelTransferSet() takes the item count and does the -1 itself.
*/
#define CycleSize WS2812_BUF_SIZE

volatile uint32_t g_ui32SysClock;

/*
 * These are the arrays for the outputs
 * g_ui8TxBuf1A is for the set of OUTPUTs1
 */
static uint8_t g_ui8TxBuf1A[WS2812_BUF_SIZE*multiple];

//This will count how many DMA cycles happened per transfer.
//Reset to start a new transfer
volatile uint8_t DMACycleCount = 0;

//*****************************************************************************
//
// The control table used by the uDMA controller. This table must be aligned
// to a 1024 byte boundary.
//
//*****************************************************************************
#if defined(ewarm)
#pragma data_alignment=1024
uint8_t pui8ControlTable[1024];
#elif defined(ccs)
#pragma DATA_ALIGN(pui8ControlTable, 1024)
uint8_t pui8ControlTable[1024];
#else
uint8_t pui8ControlTable[1024] __attribute__ ((aligned(1024)));
#endif

//*****************************************************************************
//
// Interrupt handler called when DMA done is received
//
//*****************************************************************************
void GPIOPortFIntHandler(void)
{
    PWMGenDisable(PWM0_BASE, PWM_GEN_0);
    GPIOIntClear(GPIO_PORTF_BASE,GPIO_INT_DMA);
    //Count the finished DMA cycle, otherwise the waits on DMACycleCount never exit
    DMACycleCount++;
}

//*****************************************************************************
//
// Initialize uDMA to send data from memory to the output port
// (GPIO_BASE_OUTPUT1) when there is a DMA request
//
//*****************************************************************************
void InituDMA(void)
{
    //Disable and reset before enabling it
    SysCtlPeripheralDisable(SYSCTL_PERIPH_UDMA);
    SysCtlPeripheralReset(SYSCTL_PERIPH_UDMA);
    SysCtlPeripheralEnable(SYSCTL_PERIPH_UDMA);
    SysCtlDelay(10);

    uDMAEnable();
    uDMAControlBaseSet(pui8ControlTable);

    /*
     * This is for setting up the GPIO_BASE_OUTPUT1 with CH15 GPIOF
     */
    uDMAChannelAssign(UDMA_CH15_GPIOF);

    //Set the DMA to a known state by disabling all attributes
    uDMAChannelAttributeDisable(UDMA_CH15_GPIOF,
        UDMA_ATTR_ALTSELECT | UDMA_ATTR_USEBURST |
        UDMA_ATTR_HIGH_PRIORITY | UDMA_ATTR_REQMASK);

    /*
    I set the transfer to 8 bits, with the source address incrementing 8 bits each transfer.
    The destination never increments since it's the GPIO address.
    The arbitration size is 1: only 1 transfer (8 bits) per trigger request.
    */
    uDMAChannelControlSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
        UDMA_SIZE_8 | UDMA_SRC_INC_8 | UDMA_DST_INC_NONE | UDMA_ARB_1);

    /*
    Setting the transfer to basic mode, with the source being our data array and the
    destination being the output GPIO base + 0x3FC so we go directly to the pin states.
    The transfer size is equal to CycleSize.
    */
    uDMAChannelTransferSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
        UDMA_MODE_BASIC, g_ui8TxBuf1A,
        (void *)(GPIO_BASE_OUTPUT1 + 0x3FC), CycleSize);
    /*
     * End of CH15
     */

    //Enable the DMA channels
    uDMAChannelEnable(UDMA_CH15_GPIOF);
}

//*****************************************************************************
//
// Initialize PF1 as the uDMA request trigger (both edges of the PWM signal,
// which is connected to this pin) and set up the output port for the
// WS2812B data.
//
//*****************************************************************************
void InitGPIO(void)
{
    /*
    Set up PF1 as the trigger for the DMA with both edges.
    We don't enable the GPIOF peripheral since it was already done in the PWM setup.
    */
    SysCtlDelay(3);
    GPIOPinTypeGPIOInput(GPIO_PORTF_BASE, GPIO_PIN_1);
    GPIOIntTypeSet(GPIO_PORTF_BASE,GPIO_PIN_1,GPIO_BOTH_EDGES);
    GPIOIntRegister(GPIO_PORTF_BASE,GPIOPortFIntHandler);
    GPIOIntClear(GPIO_PORTF_BASE,0x1FF);
    GPIODMATriggerEnable(GPIO_PORTF_BASE,GPIO_PIN_1);
    GPIOIntEnable(GPIO_PORTF_BASE,GPIO_INT_DMA);
    IntEnable(INT_GPIOF);
    /*
     * End of PF1 setup
     */

    /*
     *===================================================
     *
     * Output setups
     *
     *===================================================
     */

    /*
     * Start of GPIO_BASE_OUTPUT1 setup
     */
    SysCtlPeripheralDisable(GPIO_PERIPH_OUTPUT1);
    SysCtlPeripheralReset(GPIO_PERIPH_OUTPUT1);
    SysCtlPeripheralEnable(GPIO_PERIPH_OUTPUT1);
    SysCtlDelay(10);

    //The following GPIO ports have all 8 pins on the LaunchPad:
    // GPIOK, GPIOA, GPIOD, GPIOM
    GPIOPinTypeGPIOOutput(GPIO_BASE_OUTPUT1, 0xFF);
    GPIOPinWrite(GPIO_BASE_OUTPUT1,0xFF,0x0);
    /*
     * End of GPIO_BASE_OUTPUT1 setup
     */
}

//*****************************************************************************
//
// Initialize the PWM module for a 1.25 us period in down-count mode. With the
// compare values used below, Comparator A fires after about 0.45 us and
// Comparator B after about 0.85 us. The PWM will toggle on Load, CMPA Down
// and CMPB Down, and its output is connected to a GPIO for the uDMA request.
//
//*****************************************************************************
void InitPWM(void)
{
    SysCtlPeripheralDisable(SYSCTL_PERIPH_GPIOF);
    SysCtlPeripheralReset(SYSCTL_PERIPH_GPIOF);
    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOF);

    SysCtlPeripheralDisable(SYSCTL_PERIPH_PWM0);
    SysCtlPeripheralReset(SYSCTL_PERIPH_PWM0);
    SysCtlPeripheralEnable(SYSCTL_PERIPH_PWM0);
    SysCtlDelay(3);

    //
    // Unlock the pin PF0 and set the commit bit
    //
    HWREG(GPIO_PORTF_BASE + GPIO_O_LOCK) = GPIO_LOCK_KEY;
    HWREG(GPIO_PORTF_BASE + GPIO_O_CR) |= 0x01;

    GPIOPinConfigure(GPIO_PF0_M0PWM0);

    //
    // Configure the PWM function for this pin.
    // Consult the data sheet to see which functions are allocated per pin.
    //
    GPIOPinTypePWM(GPIO_PORTF_BASE, GPIO_PIN_0);

    //
    // Set the PWM clock to the system clock.
    //
    PWMClockSet(PWM0_BASE, PWM_SYSCLK_DIV_1);

    //
    // Configure the PWM0 to count down without synchronization.
    //
    PWMGenConfigure(PWM0_BASE, PWM_GEN_0,
        PWM_GEN_MODE_DOWN | PWM_GEN_MODE_NO_SYNC);

    //
    // Set the PWM frequency to 800 kHz. To calculate the appropriate parameter
    // use the following equation: N = (1 / f) * SysClk. Where N is the
    // function parameter, f is the desired frequency, and SysClk is the
    // system clock frequency.
    // In this case you get: (1 / 800 kHz) * 120 MHz = 150 cycles. Note that
    // the maximum period you can set is 2^16.
    //
    PWMGenPeriodSet(PWM0_BASE, PWM_GEN_0, 150);

    //
    // Set the comparators. Counting down from 150 at 120 MHz, CMPA = 96
    // matches after about 0.45 us and CMPB = 48 after about 0.85 us.
    //
    HWREG(PWM0_BASE+PWM_O_0_CMPA) = 96;
    HWREG(PWM0_BASE+PWM_O_0_CMPB) = 48;

    //This sets the PWM to invert on load (after the counter reaches 0) and on
    //the CMPA and CMPB matches. All in count down mode.
    HWREG(PWM0_BASE+PWM_O_0_GENA) = 0x444;

    //I set the counter to 40. Not really needed, I just wanted to have the counter
    //at a known value
    HWREG(PWM0_BASE+PWM_O_0_COUNT ) = 40;

    //
    // Enable the PWM0 output signal (PF0).
    //
    PWMOutputState(PWM0_BASE, PWM_OUT_0_BIT, true);
}

//*****************************************************************************
//
// Wait for the previous transfer to finish, then re-arm the uDMA channel and
// restart the PWM so the buffer is sent again.
//
//*****************************************************************************
void SendData(){

    //Wait if any transfer is in progress
    while(DMACycleCount < multiple){
    }

    //Delay to ensure the reset time of the WS2812B, just for testing purposes
    SysCtlDelay(20000);

    //Set outputs to 0
    HWREG(GPIO_BASE_OUTPUT1 + (0xFF << 2))=0x00;

    //Reconfigure the transfer to the original state
    uDMAChannelTransferSet(UDMA_CH15_GPIOF | UDMA_PRI_SELECT,
        UDMA_MODE_BASIC, g_ui8TxBuf1A,
        (void *)(GPIO_BASE_OUTPUT1 + 0x3FC), CycleSize);

    //Reset counter
    DMACycleCount = 0;

    //Set PWM counter to a known state
    HWREG(PWM0_BASE+PWM_O_0_COUNT ) = 40;

    //Enable DMA
    uDMAChannelEnable(UDMA_CH15_GPIOF);

    //Re-enable the PWM
    PWMGenEnable(PWM0_BASE, PWM_GEN_0);
}

int main(){
    loop();
}

void loop(void)
{
    uint32_t ui32Index;

    //
    // Set the clocking to run at 120 MHz using the PLL (25 MHz crystal,
    // 480 MHz VCO).
    //
    g_ui32SysClock = MAP_SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ |
        SYSCTL_OSC_MAIN | SYSCTL_USE_PLL | SYSCTL_CFG_VCO_480), 120000000);

    //Needs to be in this order because the SysCtlPeripheralEnable of GPIOF is in InitPWM
    InitPWM();
    InitGPIO();
    InituDMA();

    //Set all outputs to 0
    for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
    {
        if((ui32Index%3) == 2 ) {
            g_ui8TxBuf1A[ui32Index] = 0x00;
        }
        else if((ui32Index%3) == 1 ) {
            g_ui8TxBuf1A[ui32Index] = 0x00;//rand()%256;
        }
        else if((ui32Index%3) == 0) {
            g_ui8TxBuf1A[ui32Index] = 0xFF;
        }
    }

    //
    // Enables the PWM generator block.
    //
    PWMGenEnable(PWM0_BASE, PWM_GEN_0);

    while(DMACycleCount < multiple){
    }

    g_ui8TxBuf1A[0] = 0xFF;
    g_ui8TxBuf1A[1] = 0x00;
    g_ui8TxBuf1A[2] = 0x00;

    while(1){
        while(DMACycleCount < multiple){
        }

        //Use random values for the LEDs
        for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
        {
            if((ui32Index%3) == 2 ) {
                g_ui8TxBuf1A[ui32Index] = 0x00;
            }
            else if((ui32Index%3) == 1 ) {
                //Each bit of this byte belongs to a different output pin
                g_ui8TxBuf1A[ui32Index] = rand()%256;
            }
            else if((ui32Index%3) == 0 ) {
                g_ui8TxBuf1A[ui32Index] = 0xFF;
            }
        }

        SendData();
        SysCtlDelay(20000000);

        while(DMACycleCount < multiple){
        }

        //Use random values for the LEDs
        for(ui32Index=0;ui32Index<(WS2812_BUF_SIZE*multiple);ui32Index++)
        {
            if((ui32Index%3) == 2 ) {
                g_ui8TxBuf1A[ui32Index] = 0x00;
            }
            else if((ui32Index%3) == 1 ) {
                g_ui8TxBuf1A[ui32Index] = rand()%256;
            }
            else if((ui32Index%3) == 0 ) {
                g_ui8TxBuf1A[ui32Index] = 0xFF;
            }
        }

        SendData();
        SysCtlDelay(20000000);
    }
}
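The posted code only fills the middle byte of each triplet with random values. As a minimal sketch of how real colours could be laid out in the 0xFF / data / 0x00 format described in the post, the helper below packs one LED position for 8 parallel strips into 72 bytes. The function name `ws2812_pack_led` and the two macros are hypothetical and not part of the original code; the sketch assumes strip s is wired to pin s of the output port and that the strips expect green, red, blue with the most significant bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WS2812_BITS_PER_LED  24  /* 8 bits each for green, red, blue */
#define WS2812_BYTES_PER_BIT 3   /* 0xFF, data slice, 0x00 */

/*
 * Hypothetical helper: pack one LED position for 8 parallel strips.
 * grb[s] is the 24-bit colour for the strip on port pin s, laid out as
 * 0xGGRRBB. out must have room for 24 * 3 = 72 bytes.
 */
void ws2812_pack_led(const uint32_t grb[8], uint8_t *out)
{
    assert(grb != NULL && out != NULL);

    for (int bit = 0; bit < WS2812_BITS_PER_LED; bit++) {
        uint32_t mask = 1UL << (WS2812_BITS_PER_LED - 1 - bit); /* MSB first */
        uint8_t slice = 0;

        /* Collect the same colour bit from all 8 strips into one port byte. */
        for (int strip = 0; strip < 8; strip++) {
            if (grb[strip] & mask) {
                slice |= (uint8_t)(1u << strip);
            }
        }

        out[bit * WS2812_BYTES_PER_BIT + 0] = 0xFF;  /* all pins go high */
        out[bit * WS2812_BYTES_PER_BIT + 1] = slice; /* pins sending a 0 drop low */
        out[bit * WS2812_BYTES_PER_BIT + 2] = 0x00;  /* all pins low until the next bit */
    }
}
```

Calling it once per LED position, with `out` advanced by 72 bytes each time, would fill a buffer shaped like `g_ui8TxBuf1A`.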
RobG 1,892 Posted August 31, 2014

> This method uses 1-byte values, since it sends the 8 bits for the GPIO pins, right? But I need 3 values for each brightness bit (0xFF, 0xXX, 0x00), so I require 24*3 bytes to control 1 WS2812B on each of the 8 outputs (so a total of 8 WS2812B are being controlled). This method uses a lot of RAM.

This should not use 3x RAM. The first and last bytes should be loaded from the same address. The simplest way to solve it would be to use 3 DMA channels.
L.R.A 78 Posted August 31, 2014 Author

> This should not use 3x RAM. The first and last bytes should be loaded from the same address. The simplest way to solve it would be to use 3 DMA channels.

I wanted to do this initially, how did I forget, haha. I kind of discarded that idea because for the 32 outputs I would need something like 8 timers and 4 GPIO for that, plus the 4 GPIO for output. It is possible with this Tiva, but I wanted to try scatter-gather to avoid that, and it has proven to be quite hard. Remember, this method uses just 1 PWM module + 4 GPIO + 4 GPIO for output. It's kind of a tradeoff: high RAM usage or high peripheral usage. But with 14 split timers, I think I will do that.

Thanks for reminding me, RobG.
L.R.A 78 Posted September 1, 2014 Author

So I implemented the ping-pong mode. Here it is working with a 30 LED, 1 m strip:
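The ping-pong code itself was not posted, so the following is only a host-side sketch of the bookkeeping such a scheme needs: splitting a frame into chunks that each fit the 1024-item limit and alternate between the primary and alternate control structures. The names `chunk_t` and `plan_chunks` are hypothetical, and ending each chunk on an LED boundary (14 LEDs, 1008 bytes) is an assumption made here for simplicity, not something the thread states.

```c
#include <assert.h>
#include <stddef.h>

#define UDMA_MAX_ITEMS 1024u  /* per-transfer limit mentioned in the first post */
#define BYTES_PER_LED  72u    /* 24 bits * 3 bytes per bit */

/* Largest chunk that fits the limit and ends on an LED boundary: 1008 bytes. */
#define CHUNK_BYTES ((UDMA_MAX_ITEMS / BYTES_PER_LED) * BYTES_PER_LED)

typedef struct {
    size_t offset;   /* start of the chunk inside the frame buffer */
    size_t length;   /* number of bytes (DMA items) in the chunk */
    int    use_alt;  /* 0 = primary control structure, 1 = alternate */
} chunk_t;

/* Hypothetical planner: fills chunks[] and returns how many were needed. */
size_t plan_chunks(size_t num_leds, chunk_t *chunks, size_t max_chunks)
{
    assert(chunks != NULL);

    size_t total = num_leds * BYTES_PER_LED;
    size_t offset = 0;
    size_t n = 0;

    while (offset < total && n < max_chunks) {
        size_t len = total - offset;
        if (len > CHUNK_BYTES) {
            len = CHUNK_BYTES;
        }
        chunks[n].offset = offset;
        chunks[n].length = len;
        chunks[n].use_alt = (int)(n & 1u);  /* alternate primary / alternate */
        offset += len;
        n++;
    }
    return n;
}
```

For the 30 LED strip mentioned above, this plan yields three chunks of 1008, 1008 and 144 bytes. On the target, each finished chunk would be re-armed from the interrupt handler while the other control structure keeps the transfer running.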
igor 163 Posted September 1, 2014

> I wanted to do this initially, how did I forget, haha. I kind of discarded that idea because for the 32 outputs I would need something like 8 timers and 4 GPIO for that, plus the 4 GPIO for output. Remember, this method uses just 1 PWM module + 4 GPIO + 4 GPIO for output. It's kind of a tradeoff: high RAM usage or high peripheral usage. But with 14 split timers, I think I will do that.

Thank you, seeing examples helps.

Why would it take so many more timers? For example, couldn't you go at least part-way toward what RobG suggests by using 2 DMA channels and 2 PWMs (or timers)?

One DMA channel would output the 0xFF and 0x00 values. It would be triggered by a PWM or timer on the first and last triggers of each 800 kHz period, sending 1 byte per trigger, fed from a buffer that is just a series of 0xFF, 0x00, 0xFF, ...

The other channel would be fed from your pixel data, sending 1 byte per trigger, triggered by a synchronized PWM (or something of the sort) that initiates just 1 trigger at the right offset into the 800 kHz wave.

That would bring memory down to something like (2048 bytes + pixel data), rather than (3x pixel data), using one extra PWM.

I am probably still missing some things about what the DMA controller can and can't do.
L.R.A 78 Posted September 1, 2014 Author

There's probably a way where I wouldn't need so many timers, but simply multiplying the method RobG mentioned would mean that many timers. I can keep the setup I have, but I would need a more complex DMA mode.

The thing is that you can set up a DMA channel for only 1 configuration at a time. So sending 0xFF every time from 1 source, then sending data from an array, and then sending 0x00 every time from 1 source would mean 3 different configurations. So you need 3 DMA channels.

So you can use just 2 PWM signals for the method RobG mentioned, but you also need 3 DMA channels per 8 outputs.

Having 1 DMA channel output both 0x00 and 0xFF would require the same RAM, since you can't make a loop in basic DMA operation.
RobG 1,892 Posted September 2, 2014

That's not a bad idea, @igor. You could set uDMA1's count (data) to numOfBytes and uDMA2's count (0xFF & 0x00) to numOfBytes x 2. However, I would use flash instead of RAM to store the 2048 0xFF and 0x00 bytes.
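To put numbers on the two-channel idea, here is a small host-side sketch comparing it with the single buffer from the first post. The type `budget_t` and the function names are hypothetical; the sketch assumes one data byte per WS2812B bit (shared by the 8 outputs), two constant bytes per bit for the 0xFF/0x00 channel, and the 1024-item limit per transfer mentioned in the first post.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LED   24u
#define UDMA_MAX_ITEMS 1024u

typedef struct {
    size_t data_items;     /* count for the data channel (lives in RAM) */
    size_t pattern_items;  /* count for the 0xFF/0x00 channel (could be flash) */
} budget_t;

/* Single buffer as in the first post: 0xFF, data and 0x00 all in RAM. */
size_t single_buffer_ram(size_t leds)
{
    return leds * BITS_PER_LED * 3u;
}

/* Split scheme: numOfBytes for the data, numOfBytes x 2 for the constants. */
budget_t split_budget(size_t leds)
{
    assert(leds > 0);

    budget_t b;
    b.data_items = leds * BITS_PER_LED;
    b.pattern_items = 2u * b.data_items;
    return b;
}

/* LEDs per output that fit before the pattern channel hits the item limit. */
size_t max_leds_per_transfer(void)
{
    return UDMA_MAX_ITEMS / (2u * BITS_PER_LED);
}
```

For 14 LEDs per output, the single buffer needs 1008 bytes of RAM, while the split scheme needs 336 bytes of RAM plus a 672-byte constant pattern. The pattern channel reaches the 1024-item limit after 21 LEDs, so it is the first to need re-arming.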
L.R.A 78 Posted September 2, 2014 Author

Yes RobG, that would be a better idea, but since the DMA can only do 1024 transfers, you would just store 512 0xFF and 512 0x00.
RobG 1,892 Posted September 2, 2014

Ah, yes, I keep forgetting about it. Back to 3 uDMAs then.
L.R.A 78 Posted September 2, 2014 Author

> Ah, yes, I keep forgetting about it. Back to 3 uDMAs then.

You can still use just 2. You just need to do a maximum of 512 bits for the WS2812. The big problem is the RAM usage. Maybe I should really learn to use the flash for this. I would just need to use 1024 bytes of flash.

By the way, 1 more test. I'm having fun:
RobG 1,892 Posted September 2, 2014

How about this:

1. Use ping-pong to send the 0xFF and 0x00 bytes. Store 0xFF and 0x00 as constants and do not increment the source. This way you could do 1024/1024 and use only 2 bytes in flash for 0xFF & 0x00 and 1024 in RAM for data.

2. Use scatter-gather. This would require a small const array of 0xFF & 0x00 bytes and several control tables. The source address in each control table would point to the same address. This way you could have a 256-byte array of 0xFF & 0x00 bytes, for example, and 8 control tables to get a 2048-byte transfer. Add more control tables and you can do more than 2048. The same goes for the data: split the data array into a few pieces and you could send more than 1024.

So RAM usage is just the data, and flash usage is whatever size your array is.

BTW, we should move this thread to Stellarisiti.
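The second option can be modelled on a PC to see how a short constant array stretches into a long transfer. The sketch below is not TivaWare code: `sg_task_t`, `sg_build` and `sg_run` are hypothetical stand-ins for the controller's task list, and the copy loop only imitates what the uDMA would do when it walks the tasks. It uses the 256-byte array and 8 entries from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_MAX_ITEMS 1024u  /* limit for a single task, as for any transfer */
#define PATTERN_LEN  256u
#define NUM_TASKS    8u

/* Hypothetical, simplified stand-in for one scatter-gather task entry. */
typedef struct {
    const uint8_t *src;  /* where this task reads from */
    size_t count;        /* number of bytes this task transfers */
} sg_task_t;

/* 256 constant bytes: 0xFF, 0x00, 0xFF, 0x00, ... (flash on the target). */
#define P2   0xFF, 0x00
#define P8   P2, P2, P2, P2
#define P32  P8, P8, P8, P8
#define P128 P32, P32, P32, P32
static const uint8_t pattern[PATTERN_LEN] = { P128, P128 };

/* Point every task at the same array; returns the total transfer length. */
size_t sg_build(sg_task_t *tasks, size_t n)
{
    assert(tasks != NULL);
    for (size_t i = 0; i < n; i++) {
        tasks[i].src = pattern;
        tasks[i].count = PATTERN_LEN;
    }
    return n * PATTERN_LEN;
}

/* Model of the controller walking the list; sink stands in for the GPIO. */
size_t sg_run(const sg_task_t *tasks, size_t n, uint8_t *sink, size_t sink_len)
{
    assert(tasks != NULL && sink != NULL);
    size_t written = 0;
    for (size_t t = 0; t < n; t++) {
        assert(tasks[t].count <= SG_MAX_ITEMS);
        for (size_t i = 0; i < tasks[t].count && written < sink_len; i++) {
            sink[written++] = tasks[t].src[i];
        }
    }
    return written;
}
```

In this model, 8 entries that reuse the same 256 bytes produce 2048 output bytes, which is the saving described above. The real task list format and its setup calls would have to come from the TivaWare documentation.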
L.R.A 78 Posted September 2, 2014 Author

Well, consider these problems:

1. This would involve the processor interrupting on every transfer, so it would take away the point of using DMA to avoid processor overhead, and the DMA method would just be for timing purposes.

2. I do want to use scatter-gather. It's just too complex and there's little documentation, so I'm still studying how to do it.

I posted this because of a conversation with igor in a thread on this forum. But yes, please, let's continue the discussion here: http://forum.stellarisiti.com/topic/2107-ws2812b-matrix/