sq7bti

Members
  • Content Count

    24
  • Days Won

    5

Reputation Activity

  1. Like
    sq7bti got a reaction from petertux in Newtonian/Dobsonian telescope controller   
    Hello everyone,
     
    There are a couple of similar projects available on the internet. Some of them, based on Arduino and PIC, perform very basic mount control without any math-intensive computation in the embedded controller. I decided to build my own with the following goals:
    - ease of use by an inexperienced amateur astronomer (fully automatic operation),
    - precision and resolution of positioning,
    - last but not least: the price.
     
    The final (or, better said, current) design comprises the following components:
    - Stellaris LM4F launchpad as the central control unit,
    - two ULN2003 unipolar stepper motor driver chips,
    - two 28byj-48 stepper motors, one moving in azimuth and one in elevation, via a gear train,
    - communication module: a Bluetooth serial module. It allows sending a coordinate set-point and provides position feedback to Stellarium,
    - GPS module providing position and a precise time reference (PPS gives 1 us accuracy),
    - Nokia 5110 display unit and joystick for standalone operation,
    - a now-obsolete PS/2 mouse, modified to provide independent (incremental) position information.
     
    The resolution reached is a single step of approx. 5". Given that the apparent size of Jupiter ranges from 30" to 50", this positioning resolution keeps the view comfortably stable in a standard 60° FOV eyepiece at reasonably high magnification, without the need to adjust AZ/ALT continuously.
     
    During the development I made use of several open-source projects available online, namely:
    - AccelStepper for stepper control,
    - TinyGPS++ for NMEA decoding,
    - Arduino telescope controller, which was my inspiration and my reference for Taki's matrix method for coordinate transformation,
    - and of course Energia as my IDE.
     
    Upon power-up the mount performs the following steps:
    - homing,
    - acquisition of the current location (longitude/latitude) and time via the NMEA stream,
    - moving to the 3 brightest (most convenient) stars in succession to perform the 3-star alignment procedure. They are selected from the list of over 500 stars in the built-in catalog (only the brightest are used for the alignment, though),
    - once aligned, the mount is in tracking mode: it tracks the view to counter the apparent movement of objects in the sky, waiting either for the user to select a particular object from the library of stars and Messier objects, or for a connection via Bluetooth from a PC running Stellarium with a plugin, and then slews to the selected object,
    - searching for the object that should be visible in the eyepiece and displaying important information on the LCD. I compiled in the 500 brightest stars from HYGXYZ and the full Messier catalog.
     
    I have very little experience as an amateur astronomer so far, so some of the objectives might not have been very obvious to me in the beginning. This project was also a good way to make use of my free time and gain experience in embedded system design.
     
    With kind regards,
    Szymon
     
  2. Like
    sq7bti reacted to VMM in Another 430 Watch   
    Hello.  I figured I would share a project I've been working on since I borrowed a lot of code from this forum.  It's a small watch using a g2553 and the same OLED display as "The Terminal".  Thanks bluehash for the breakout, RobG and gwdeveloper for code, and others.  

  3. Like
    sq7bti reacted to reaper7 in Two SPI modules   
    indeed, inspired by HardwareSerial
    You can manage touch to hardwarespi too
     
    P.S. yes, I push to energia git
    but we have to perform some tests
     
    Here is the next one
    with setModule(module) and setModule(module, ssPin) for "universal" SPI which can be set to (0),(1),(2),(3)
    or, of course, still exists SPI0, SPI1, new SPI2, SPI3 as a "hard" defined
     
    SPI2.zip
     
    We also need to speed up CS handling inside the lib; digitalWrite(ssPin... is not very fast.
     
    This is sample code from SPI2 and SPI0, 8 data bits and a very long time to disable CS:

  4. Like
    sq7bti reacted to roadrunner84 in SPI and I2C on same pin   
    No you can't, since your SPI device can send data through P1.6 to your MSP430, you can't make certain that no accidental START token is present on the pins during SPI mode.
    Also, I2C pins are open-collector (open-drain): they use a pull-up resistor and only ever pull the line down. SPI, on the other hand, is push-pull: it drives the line either high or low. Driving the line high while your I2C device (accidentally, after interpreting a START token) pulls it low might damage the driver on either side of the line.
    A way to maybe make it possible is by having some external hardware to disable SCL while in SPI mode. Since in SPI mode you'd have your CS# line low, you could use a single transistor to have P1.6 disconnect from your SCL net during SPI mode.
                  +3v3
                    |
                    |
                   +-+
                   |2|
                   |k|
                   |2|
                   +-+
                    |
                    +-+----------> to device SCL pin
                      |
                    |/ C
    MSP430 CS# pin--|   NPN-transistor
                    |\ E
                      |
                      |
    MSP430 P1.6 ------+------------< to device MISO pin
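To make the "accidental START token" concrete: an I2C START condition is SDA falling while SCL stays high. The sketch below is only an illustration and not part of the original answer. It assumes that the second shared pin carries SDA/MOSI, and it scans a recorded sequence of line states for that pattern; the `line_sample` type and the function name are hypothetical.

```c
#include <stddef.h>

/* One sampled state of the two shared lines (1 = high, 0 = low). */
struct line_sample {
    int scl; /* shared with the SPI MISO line in the scenario above */
    int sda; /* assumption: shared with the SPI MOSI line */
};

/* Returns the index of the first sample at which an I2C START condition
 * (SDA falling while SCL stays high) is seen, or -1 if none occurs. */
int find_i2c_start(const struct line_sample *s, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        if (s[i - 1].scl && s[i].scl && s[i - 1].sda && !s[i].sda)
            return (int)i;
    }
    return -1;
}
```

During an SPI transfer both lines carry arbitrary data, so nothing prevents the slave from holding MISO high while the master's data line falls. That is exactly the pattern this function reports.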
  5. Like
    sq7bti got a reaction from gsutton in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
    Hi,
     
    Having a nice LED matrix display lying around unused, I assembled a test circuit with a launchpad and modified the code of @simpleavr to run on it.

    The display is driven by four MAX7219 chips chained on the SPI bus. I noticed that the only floating-point operation your code was using is a square root in line 257. Once I added a fixed-point square root routine, linking with the math library was no longer necessary, which saved a lot of flash space. The fixed-point routine is also 3 times faster than the floating-point one from the math library: 50 us vs 150 us.

    #ifdef FLOATING_POINT
    // sqrt: 150us
    #include <math.h>
    #else // FLOATING_POINT
    // 50us
    // bit-by-bit integer square root: consumes two input bits per iteration
    unsigned char sqrt16(unsigned short a) {
        unsigned short rem = 0, root = 0;
        int i;
        for(i = 0; i < 8; ++i) {
            root <<= 1;
            rem = ((rem << 2) + (a >> 14));
            a <<= 2;
            ++root;
            if(root <= rem) {
                rem -= root;
                ++root;
            } else {
                --root;
            }
        }
        return (unsigned char)(root >> 1);
    }
    #endif // FLOATING_POINT

    Further, I've added windowing to minimize spectral leakage. The tables are named hamming, although window('kr', ...) in Scilab actually produces a Kaiser window:

    // scilab 255 * window('kr',64,6)
    //const unsigned short hamming[32] = {
    //    4, 6, 9, 13, 17, 23, 29, 35, 43, 51, 60, 70, 80, 91, 102, 114,
    //    126, 138, 151, 163, 175, 187, 198, 208, 218, 227, 234, 241, 247, 251, 253, 255 };
    // scilab 255 * window('kr',64,4)
    const unsigned short hamming[32] = {
        23, 29, 35, 42, 50, 58, 66, 75, 84, 94, 104, 113, 124, 134, 144, 154,
        164, 174, 183, 192, 201, 210, 217, 224, 231, 237, 242, 246, 250, 252, 254, 255 };
    // scilab 255 * window('kr',64,2)
    //const unsigned short hamming[32] = {
    //    112, 119, 126, 133, 140, 147, 154, 161, 167, 174, 180, 186, 192, 198, 204, 209,
    //    214, 219, 224, 228, 232, 236, 239, 242, 245, 247, 250, 251, 253, 254, 255, 255 };

    Applying the window to 32 samples uses fixed-point math:

    for (i=0;i<Nx;i++) {
        int hamm = hamming[i<(FFT_SIZE-1)?i:(Nx-1)-i] * data[i];
        data[i] = (hamm >> 8);
    }

    Finally, the display buffer is filled with the output of the FFT function:

    unsigned long mask = 1UL;
    for (i=0; i<32; ++i, mask <<= 1) {
        for(j=0;j<8;++j) {
            if(j<plot[i])
                dbuff.ulongs[j] |= mask;
            else
                dbuff.ulongs[j] &= ~(mask);
        }
    }

    where dbuff is the display buffer, organized as:

    typedef union {
        unsigned long longs;
        unsigned int ints[2];
        unsigned char chars[4];
    } longbytes;

    union {
        unsigned char bytes[8*4];
        longbytes lbytes[8];
        unsigned long ulongs[8];
    } dbuff;

    which makes it easy to manipulate:

    unsigned char spibuff[8];

    void SPI_Write(unsigned char* array) {
        P2OUT &= ~LED_CS;
        __delay_cycles(50);
        unsigned int h = 8;
        while(h) {
            UCB0TXBUF = *array;
            while (UCB0STAT & UCBUSY);
            ++array;
            --h;
        }
        P2OUT |= LED_CS;
    }

    void update_display() {
        unsigned char i;
        for(i = 0; i < 8; ++i) {
            // i+1 is the MAX7219 digit register address, one per chained chip
            spibuff[0] = spibuff[2] = spibuff[4] = spibuff[6] = i+1;
            spibuff[1] = dbuff.lbytes[i].chars[3];
            spibuff[3] = dbuff.lbytes[i].chars[2];
            spibuff[5] = dbuff.lbytes[i].chars[1];
            spibuff[7] = dbuff.lbytes[i].chars[0];
            SPI_Write(spibuff);
        }
    }

    BTW, I got a second 8x32 LED matrix display from the same eBay offer, and it turned out to be mirrored: the LED matrix is rotated by 180 degrees, so the update_display() routine for that one has to push the bytes in order 0 through 3, and the bit-shift operations ( << and >> ) scroll the display in the opposite direction (left vs right).
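As a sanity check on the fixed-point square root from the post, the sketch below repeats the routine with fixed-width types and adds a small exhaustive checker. The checker `sqrt16_mismatches` is a hypothetical helper and not part of the original code; it only uses integer arithmetic, so it needs no math library either.

```c
#include <stdint.h>

/* Same bit-by-bit algorithm as in the post, written with fixed-width types.
 * Each iteration consumes the next two most significant bits of the input;
 * 'root' holds twice the partial result until the final shift. */
uint8_t sqrt16(uint16_t a)
{
    uint16_t rem = 0, root = 0;
    for (int i = 0; i < 8; ++i) {
        root <<= 1;
        rem = (uint16_t)((rem << 2) + (a >> 14));
        a = (uint16_t)(a << 2);
        ++root;
        if (root <= rem) {
            rem -= root;
            ++root;
        } else {
            --root;
        }
    }
    return (uint8_t)(root >> 1);
}

/* Hypothetical checker: counts inputs for which the result r is not the
 * floor of the true square root, i.e. where r*r <= a < (r+1)*(r+1) fails. */
unsigned long sqrt16_mismatches(void)
{
    unsigned long bad = 0;
    for (unsigned long a = 0; a <= 0xFFFFUL; ++a) {
        unsigned long r = sqrt16((uint16_t)a);
        if (r * r > a || (r + 1) * (r + 1) <= a)
            ++bad;
    }
    return bad;
}
```

Running the checker on a PC before flashing is a cheap way to confirm that a hand-written integer routine behaves over the whole 16-bit input range.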
     
    https://youtu.be/Nen6yd5kvZs
     
    s.
     
    Edit: corrected photo attachment and source code snippets
    led_fft.tar.bz2
  6. Like
    sq7bti reacted to tonyp12 in tiny msp430 preemptive multitasking system   
    Tested on G2553 Launchpad with IAR. I recommend G2955 with 1K RAM if you want more than 3 tasks.

    #include "msp430.h"
    #include "common.h"

    //=========================(C) Tony Philipsson 2016 =======================
    funcpnt const taskpnt[]={ task1, task2, task3,  // <- PUT YOUR TASKS HERE
    };
    const int stacksize[tasks] = {28};              // a blank value defaults to 24 stack words
    //=========================================================================
    int taskstackpnt[tasks];
    unsigned int taskdelay[tasks];
    char taskrun;

    int main( void )
    {
      WDTCTL = WDTPW + WDTHOLD;                     // Stop watchdog timer
      if (CALBC1_8MHZ != 0xff){                     // erased by mistake?
        BCSCTL1 = CALBC1_8MHZ;                      // Set DCO to factory calibrated 8MHz
        DCOCTL = CALDCO_8MHZ;
      }
      int* multistack = (int*) __get_SP_register();
      int i=0;
      while(i<tasks-1){
        int j = stacksize[i];
        if (!j) j = 24;
        multistack -= j;
        *(multistack) = (int) taskpnt[++i];         // prefill in PC
        *(multistack-1) = GIE;                      // prefill in SR
        taskstackpnt[i] = (int) multistack-26;      // needs 12 dummy push words
      }
      WDTCTL = WDTPW+WDTTMSEL+WDTCNTCL;             // 4ms interval at 8MHz smclk
      IE1 |= WDTIE;
      __bis_SR_register(GIE);
      asm ("br &taskpnt");                          // indirect jmp to first task
    }

    //============= TASK SWITCHER ISR =============
    #pragma vector = WDT_VECTOR
    __raw __interrupt void taskswitcher(void)
    {
      asm ("push R15\n push R14\n push R13\n push R12\n"
           "push R11\n push R10\n push R9\n push R8\n"
           "push R7\n push R6\n push R5\n push R4");
      taskstackpnt[taskrun] = __get_SP_register();
      if (++taskrun == tasks) taskrun = 0;
      __set_SP_register(taskstackpnt[taskrun]);
      asm ("pop R4\n pop R5\n pop R6\n pop R7\n"
           "pop R8\n pop R9\n pop R10\n pop R11\n"
           "pop R12\n pop R13\n pop R14\n pop R15");
    }

    #include "msp430.h"
    #include "common.h"
    __task void task1(void){
      P1DIR |= BIT0;
      while(1){
        __delay_cycles(800000);
        P1OUT |= BIT0;
        __delay_cycles(800000);
        P1OUT &=~BIT0;
      }
    }

    #include "msp430.h"
    #include "common.h"
    __task void task2(void){
      P1DIR |= BIT6;
      while(1){
        __delay_cycles(1200000);
        P1OUT |= BIT6;
        __delay_cycles(1200000);
        P1OUT &=~BIT6;
      }
    }

    #include "msp430.h"
    #include "common.h"
    unsigned int fibo(int);
    __task void task3(void){
      int temp = 0;
      while(1){
        fibo(++temp);
      }
    }
    unsigned int fibo(int n){
      if (n < 2)
        return n;
      else
        return (fibo(n-1) + fibo(n-2));
    }

    // common.h
    #ifndef COMMON_H_
    #define COMMON_H_
    #define tasks (sizeof(taskpnt)/2)
    __task void task1(void);
    __task void task2(void);
    __task void task3(void);
    typedef __task void (*funcpnt)(void);
    #endif
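The stack carving in main() above is easy to misread, so here is a portable sketch of just that arithmetic, meant for checking a configuration on a PC. It is an illustration under stated assumptions, not part of the original code: the function name and both constants are made up, and offsets are counted in 16-bit words below the stack pointer that main() starts with.

```c
#define DEFAULT_STACK_WORDS 24 /* default the post uses for a blank entry */
#define SAVED_CONTEXT_WORDS 13 /* prefilled SR + 12 register slots = 26 bytes */

/* Fills offset_words[k] with the distance, in 16-bit words below the initial
 * stack pointer, of the saved SP that task k is first resumed from.
 * Task 0 simply keeps running on the initial stack, so its offset is 0. */
void initial_sp_offsets(const int *stacksize, int ntasks, int *offset_words)
{
    int carved = 0; /* words reserved so far for the preceding tasks */

    offset_words[0] = 0;
    for (int k = 1; k < ntasks; ++k) {
        int words = stacksize[k - 1] ? stacksize[k - 1] : DEFAULT_STACK_WORDS;
        carved += words;
        offset_words[k] = carved + SAVED_CONTEXT_WORDS;
    }
}
```

With the configuration from the post ({28} for three tasks) this gives offsets of 0, 41 and 65 words. Note that the loop in main() never reads the last entry of stacksize, so the last task gets whatever RAM remains below its start address.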
  7. Like
    sq7bti reacted to Fmilburn in Getting Started with Printed Circuit Board (PCB) Design   
    This is the first PCB that I have designed and sent off to be manufactured.  Yesterday I received the boards, soldered them up, and they work!

    This write-up outlines the process I used in the hope that it will be useful to other hobbyists and builders.  There are links at the end which provide additional detail. 
     
    Selecting a Project
    The project consists of a small board with a MSP430G2553 microcontroller and an nRF24L01 radio.  I started with a radio attached with jumpers to a LaunchPad quite some time back and then built one on a proto-board.  The photograph below shows a G2553 with radio powered by a buck-boost converter attached to a super capacitor and solar panel.  I used it for a while with my weather station which never was quite completed.

    Although I could have started with that, I actually chose to start with something simpler.  The goal was to focus on the PCB design process and to minimize the issues associated with a new or technically challenging project.  The objectives, strategies, and constraints I decided on included the following:
    Inexpensive






  8. Like
    sq7bti got a reaction from tripwire in Newtonian/Dobsonian telescope controller   
    Progress:

    A summary of changes:
    - instead of unipolar motor drivers, I now use bipolar drivers that are very popular in RepRap projects, here the A4988 (or DRV8825),
    - 28byj-48 motors modified for bipolar operation,
    - cheap HC-05 for Bluetooth SPP,
    - GPS module U-blox NEO-6m,
    - added an RTC (DS1307) to provide a date/time reference even in the first seconds after power-on, plus 56 bytes of NVRAM,
    - added an (optional) humidity and temperature sensor DSTH01,
    - added an I2C socket to connect external temperature sensors that report the motor temperatures,
    - added a PCF8574 for microstepping configuration of the A4988 drivers,
    - added a buzzer for audible indication,
    - added an output for the 12 V DC fan of the main mirror (PWM controlled),
    - Nokia 5110 display replaced with one that has a red backlight.
     
    As far as the software is concerned, there were several improvements as well. The most important one is that the motors are now driven by an interrupt-driven AccelStepper.
     

     
    With kind regards,
    Szymon
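As an illustration of the PCF8574 microstepping configuration mentioned in the list above, here is a minimal sketch. The MS1..MS3 patterns follow the A4988 datasheet truth table; the assignment of expander pins (azimuth driver on P0..P2, elevation driver on P3..P5) and both function names are assumptions made up for this example, since the post does not describe the wiring.

```c
/* Returns the MS1..MS3 pattern (bit 0 = MS1, bit 1 = MS2, bit 2 = MS3) for
 * an A4988 microstep divisor, or -1 if the divisor is not supported. */
int a4988_ms_bits(int divisor)
{
    switch (divisor) {
    case 1:  return 0x0; /* full step:      MS1=L MS2=L MS3=L */
    case 2:  return 0x1; /* half step:      MS1=H MS2=L MS3=L */
    case 4:  return 0x2; /* quarter step:   MS1=L MS2=H MS3=L */
    case 8:  return 0x3; /* eighth step:    MS1=H MS2=H MS3=L */
    case 16: return 0x7; /* sixteenth step: MS1=H MS2=H MS3=H */
    default: return -1;
    }
}

/* Hypothetical wiring: azimuth driver on expander pins P0..P2, elevation
 * driver on P3..P5. Returns the byte to write to the PCF8574, or -1 if
 * either divisor is not supported. */
int pcf8574_microstep_byte(int az_divisor, int el_divisor)
{
    int az = a4988_ms_bits(az_divisor);
    int el = a4988_ms_bits(el_divisor);

    if (az < 0 || el < 0)
        return -1;
    return (el << 3) | az;
}
```

The DRV8825 mentioned as an alternative uses a different mode table, so this mapping would need to be adapted for that driver.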
  9. Like
    sq7bti got a reaction from tripwire in Newtonian/Dobsonian telescope controller   
    When powered on, the mount moves to the first alignment star. The user then provides the correction vector: the star just needs to be centered in the eyepiece's field of view. The first star roughly corrects the misalignment of the telescope's orientation with respect to north. The second star also helps to correct the leveling error. A third star improves the alignment even further. I have not (yet) implemented any periodic error correction. The whole alignment procedure takes a couple of minutes and requires the user to center the stars in the eyepiece with the attached joystick and confirm with the fire button. GPS takes care of the time/date and location when starting up in an unknown location.
     
    with kind regards,
    Szymon
  10. Like
    sq7bti got a reaction from tripwire in Newtonian/Dobsonian telescope controller   
  11. Like
    sq7bti got a reaction from bluehash in Newtonian/Dobsonian telescope controller   
    Follow-up in https://hackaday.io/project/9268-telescope-controller
     
    s.
  12. Like
    sq7bti got a reaction from Fmilburn in Newtonian/Dobsonian telescope controller   
  13. Like
    sq7bti reacted to simpleavr in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
    Happy holidays! It turns out great.
     

     
    Button press cycles through sleep, 1 dot, 2 dots and 3 dots.
     
    Press and hold cycles through the following 3 menu options:
    Hamming windows choice; none, low, mid, high (via short presses)
    Dimmer choice; 0...3
    Sampling rate; 0...7 (0 is fastest)
     
    This is the smallest spectrum analyser I have built.
    I managed to use 1 gpio for ADC, and left 15 pins for the 8x8 matrix (63 pixels used).
  14. Like
    sq7bti got a reaction from Fmilburn in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
    Hi @simpleavr,
     
    You might try to hand-pick values from a wider window, but here are the windows of size 16 (mind the symmetry). The tables are named hamming, although window('kr', ...) in Scilab produces a Kaiser window:

    // round(255 * window('kr',16,6))
    //const unsigned char hamming[8] = { 4, 18, 46, 86, 136, 187, 228, 252 }; //, 252, 228, 187, 136, 86, 46, 18, 4 };
    // round(255 * window('kr',16,4))
    const unsigned char hamming[8] = { 23, 51, 88, 130, 172, 210, 238, 253 }; //, 253, 238, 210, 172, 130, 88, 51, 23 };
    // round(255 * window('kr',16,2))
    //const unsigned char hamming[8] = { 112, 141, 170, 196, 218, 236, 248, 254 }; //, 254, 248, 236, 218, 196, 170, 141, 112 };

    You will find more information about the window function in the Scilab help.
     
    S.
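"Mind the symmetry" means that only the first half of the window is stored and the second half is read back mirrored. The sketch below shows one way to apply such a half table to a 16-sample frame in fixed point. It is an illustration only: the names `half_window`, `N_SAMPLES` and `apply_window` are made up, the table is the middle variant quoted above, and the sample type is an assumption.

```c
#include <stdint.h>

#define N_SAMPLES 16

/* First half of the 16-point window quoted above, scaled to 0..255. */
static const uint8_t half_window[N_SAMPLES / 2] = {
    23, 51, 88, 130, 172, 210, 238, 253
};

/* Scales each sample by its window coefficient, treating the coefficient
 * as an 8-bit fraction (coefficient / 256). */
void apply_window(int16_t *data)
{
    for (int i = 0; i < N_SAMPLES; ++i) {
        /* mirror the index for the second half of the frame */
        int k = (i < N_SAMPLES / 2) ? i : (N_SAMPLES - 1) - i;

        /* widen to 32 bits so the product cannot overflow a 16-bit int */
        int32_t scaled = (int32_t)half_window[k] * data[i];

        /* the post uses >> 8; dividing avoids relying on how the compiler
         * right-shifts negative values */
        data[i] = (int16_t)(scaled / 256);
    }
}
```

Feeding in a frame of constant 256 returns the window itself (23, 51, ... 253, 253, ... 51, 23), which is a quick way to check the mirroring.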
  15. Like
    sq7bti got a reaction from simpleavr in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  16. Like
    sq7bti got a reaction from CorB in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  17. Like
    sq7bti got a reaction from tripwire in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  18. Like
    sq7bti got a reaction from Fmilburn in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  19. Like
    sq7bti got a reaction from bluehash in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  20. Like
    sq7bti got a reaction from oPossum in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
    Hi,
     
    Having a nice matrix led display laying around with no use of I assembled a test circuit with launchpad and modified the code of @simpleavr to run on it.

    The display is driven by four MAX7219 chips chained on SPI bus. I noticed that the only floating point operation that your code was using is a square root operation in line 257. Once I added a fixed-point square root routine, linking with math lib is not necessary anymore - spared a lot of flash space. The fixed-point routine is also 3 times faster than mathlib floating point one: 50us vs 150us
    #ifdef FLOATING_POINT // sqrt: 150us #include <math.h> #else // FLOATING_POINT // 50us unsigned char sqrt16(unsigned short a) { unsigned short rem = 0, root = 0; int i; for(i = 0; i < 8; ++i) { root <<= 1; rem = ((rem << 2) + (a >> 14)); a <<= 2; ++root; if(root <= rem) { rem -= root; ++root; } else { --root; } } return (unsigned char)(root >> 1); } #endif // FLOATING_POINT Further I've added Hamm windowing to minimize spectral leakage:
    // scilab 255 * window('kr',64,6) //const unsigned short hamming[32] = { 4, 6, 9, 13, 17, 23, 29, 35, 43, 51, 60, 70, 80, 91, 102, 114, 126, 138, 151, 163, 175, 187, 198, 208, 218, 227, 234, 241, 247, 251, 253, 255 }; // scilab 255 * window('kr',64,4) const unsigned short hamming[32] = { 23, 29, 35, 42, 50, 58, 66, 75, 84, 94, 104, 113, 124, 134, 144, 154, 164, 174, 183, 192, 201, 210, 217, 224, 231, 237, 242, 246, 250, 252, 254, 255 }; // scilab 255 * window('kr',64,2) //const unsigned short hamming[32] = { 112, 119, 126, 133, 140, 147, 154, 161, 167, 174, 180, 186, 192, 198, 204, 209, 214, 219, 224, 228, 232, 236, 239, 242, 245, 247, 250, 251, 253, 254, 255, 255 }; Applying windowing on 32 samples uses fixed point math:
    for (i=0;i<Nx;i++) { int hamm = hamming[i<(FFT_SIZE-1)?i:(Nx-1)-i] * data[i]; data[i] = (hamm >> 8); } Finally the display buffer is filled in with output of FFT function:
            unsigned long mask = 1UL;         for (i=0; i<32; ++i, mask <<= 1) {             for(j=0;j<8;++j) {                 if(j<plot[i])                     dbuff.ulongs[j] |= mask;                 else                     dbuff.ulongs[j] &= ~(mask);             }         } where dbuff is display buffer organized as:
    typedef union { unsigned long longs; unsigned int ints[2]; unsigned char chars[4]; } longbytes; union { unsigned char bytes[8*4]; longbytes lbytes[8]; unsigned long ulongs[8]; } dbuff; which make it easy to manipulate:
    unsigned char spibuff[8]; void SPI_Write(unsigned char* array) { P2OUT &= ~LED_CS; __delay_cycles(50); unsigned int h = 8; while(h) { UCB0TXBUF = *array; while (UCB0STAT & UCBUSY); ++array; --h; } P2OUT |= LED_CS; } void update_display() { unsigned char i; for(i = 0; i < 8; ++i) { spibuff[0] = spibuff[2] = spibuff[4] = spibuff[6] = i+1; spibuff[1] = dbuff.lbytes[i].chars[3]; spibuff[3] = dbuff.lbytes[i].chars[2]; spibuff[5] = dbuff.lbytes[i].chars[1]; spibuff[7] = dbuff.lbytes[i].chars[0]; SPI_Write(spibuff); } } BTW I got a second 8x32 led matrix displays from the same ebay offer, and it turned out to be mirrored - LED matrix is turned around 180deg, so the update_display() routine for the other one have to push bytes in order 0 through 3, and bit-shift operations ( << and >> ) yield opposite display reaction (left vs right scroll).
     
    https://youtu.be/Nen6yd5kvZs
     
    s.
     
    Edit: corrected photo attachment and source code snippets
    led_fft.tar.bz2
  21. Like
    sq7bti got a reaction from pine in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
    Hi,
     
    Having a nice LED matrix display lying around unused, I assembled a test circuit with a LaunchPad and modified @simpleavr's code to run on it.

    The display is driven by four MAX7219 chips chained on the SPI bus. I noticed that the only floating-point operation your code uses is the square root in line 257. Once I added a fixed-point square root routine, linking with the math library was no longer necessary, which saved a lot of flash space. The fixed-point routine is also 3 times faster than the math library's floating-point one: 50 us vs 150 us.

        #ifdef FLOATING_POINT
        // sqrt: 150us
        #include <math.h>
        #else // FLOATING_POINT
        // 50us
        unsigned char sqrt16(unsigned short a) {
            unsigned short rem = 0, root = 0;
            int i;
            for (i = 0; i < 8; ++i) {
                root <<= 1;
                rem = ((rem << 2) + (a >> 14));
                a <<= 2;
                ++root;
                if (root <= rem) {
                    rem -= root;
                    ++root;
                } else {
                    --root;
                }
            }
            return (unsigned char)(root >> 1);
        }
        #endif // FLOATING_POINT

    Further, I've added windowing to minimize spectral leakage (the table is still named hamming, although the scilab comments show the coefficients come from the 'kr', i.e. Kaiser, window):

        // scilab 255 * window('kr',64,6)
        //const unsigned short hamming[32] = {
        //      4,   6,   9,  13,  17,  23,  29,  35,
        //     43,  51,  60,  70,  80,  91, 102, 114,
        //    126, 138, 151, 163, 175, 187, 198, 208,
        //    218, 227, 234, 241, 247, 251, 253, 255 };
        // scilab 255 * window('kr',64,4)
        const unsigned short hamming[32] = {
             23,  29,  35,  42,  50,  58,  66,  75,
             84,  94, 104, 113, 124, 134, 144, 154,
            164, 174, 183, 192, 201, 210, 217, 224,
            231, 237, 242, 246, 250, 252, 254, 255 };
        // scilab 255 * window('kr',64,2)
        //const unsigned short hamming[32] = {
        //    112, 119, 126, 133, 140, 147, 154, 161,
        //    167, 174, 180, 186, 192, 198, 204, 209,
        //    214, 219, 224, 228, 232, 236, 239, 242,
        //    245, 247, 250, 251, 253, 254, 255, 255 };

    Applying the window to the samples uses fixed-point math:

        for (i = 0; i < Nx; i++) {
            int hamm = hamming[i < (FFT_SIZE - 1) ? i : (Nx - 1) - i] * data[i];
            data[i] = (hamm >> 8);
        }

    Finally the display buffer is filled with the output of the FFT function:

        unsigned long mask = 1UL;
        for (i = 0; i < 32; ++i, mask <<= 1) {
            for (j = 0; j < 8; ++j) {
                if (j < plot[i])
                    dbuff.ulongs[j] |= mask;
                else
                    dbuff.ulongs[j] &= ~(mask);
            }
        }

    where dbuff is the display buffer, organized as:

        typedef union {
            unsigned long longs;
            unsigned int ints[2];
            unsigned char chars[4];
        } longbytes;

        union {
            unsigned char bytes[8*4];
            longbytes lbytes[8];
            unsigned long ulongs[8];
        } dbuff;

    which makes it easy to manipulate:

        unsigned char spibuff[8];

        void SPI_Write(unsigned char* array) {
            P2OUT &= ~LED_CS;
            __delay_cycles(50);
            unsigned int h = 8;
            while (h) {
                UCB0TXBUF = *array;
                while (UCB0STAT & UCBUSY);
                ++array;
                --h;
            }
            P2OUT |= LED_CS;
        }

        void update_display() {
            unsigned char i;
            for (i = 0; i < 8; ++i) {
                spibuff[0] = spibuff[2] = spibuff[4] = spibuff[6] = i+1;
                spibuff[1] = dbuff.lbytes[i].chars[3];
                spibuff[3] = dbuff.lbytes[i].chars[2];
                spibuff[5] = dbuff.lbytes[i].chars[1];
                spibuff[7] = dbuff.lbytes[i].chars[0];
                SPI_Write(spibuff);
            }
        }

    BTW, I got a second 8x32 LED matrix display from the same eBay offer, and it turned out to be mirrored: the LED matrix is rotated by 180 degrees, so the update_display() routine for that one has to push the bytes in order 0 through 3, and the bit-shift operations (<< and >>) scroll the display in the opposite direction (left vs right).
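The byte-order difference between the two modules can be isolated in one small helper. The sketch below is only an illustration, not code from the attached archive: build_row_frame and its mirrored flag are hypothetical names, and it assumes the same layout as update_display() (four chained MAX7219s, row register i+1, most significant byte of the 32-bit row sent first on the normal module). It uses shifts instead of the union, so it can be tried on a PC regardless of endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 8-byte SPI frame for one display row: four (register, data)
 * pairs, one per chained MAX7219. Hypothetical helper for illustration. */
void build_row_frame(uint8_t frame[8], uint8_t row, uint32_t bits, int mirrored)
{
    int k;
    assert(row < 8);    /* the display has 8 rows; MAX7219 digit registers are 1..8 */
    for (k = 0; k < 4; ++k) {
        /* Normal module: byte 3 (the MSB) goes out first.
         * Mirrored module: byte 0 goes out first. */
        int byte_index = mirrored ? k : 3 - k;
        frame[2 * k]     = (uint8_t)(row + 1);
        frame[2 * k + 1] = (uint8_t)(bits >> (8 * byte_index));
    }
}
```

With this, update_display() would only need one flag per module, for example build_row_frame(spibuff, i, dbuff.ulongs[i], 1) for the mirrored display. The opposite scroll direction is not handled here.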
     
    https://youtu.be/Nen6yd5kvZs
     
    s.
     
    Edit: corrected photo attachment and source code snippets
    led_fft.tar.bz2
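The fixed-point square root from this post is easy to sanity-check on a PC before flashing. A minimal sketch, assuming a hosted C compiler with a 16-bit unsigned short: sqrt16() is copied from the post, while isqrt_ref and sqrt16_mismatches are hypothetical helper names added only for the check.

```c
#include <assert.h>

/* Copied from the post: digit-by-digit integer square root, 8 result bits.
 * root holds twice the partial result, hence the final shift. */
unsigned char sqrt16(unsigned short a)
{
    unsigned short rem = 0, root = 0;
    int i;
    for (i = 0; i < 8; ++i) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 14));   /* bring down the next two bits */
        a <<= 2;
        ++root;
        if (root <= rem) {
            rem -= root;
            ++root;
        } else {
            --root;
        }
    }
    return (unsigned char)(root >> 1);
}

/* Slow but obviously correct reference: largest r with r*r <= a. */
static unsigned isqrt_ref(unsigned a)
{
    unsigned r = 0;
    while ((r + 1) * (r + 1) <= a)
        ++r;
    return r;
}

/* Compare sqrt16() with the reference over the whole 16-bit input range
 * and return the number of inputs where they disagree. */
unsigned long sqrt16_mismatches(void)
{
    unsigned long bad = 0;
    unsigned long a;
    assert(sizeof(unsigned short) == 2);  /* sqrt16() relies on a <<= 2 wrapping at 16 bits */
    for (a = 0; a <= 65535UL; ++a)
        if (sqrt16((unsigned short)a) != isqrt_ref((unsigned)a))
            ++bad;
    return bad;
}
```

Calling sqrt16_mismatches() from a small main() and printing the result is enough; a return value of 0 means the routine agrees with the reference for every 16-bit input.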
  22. Like
    sq7bti got a reaction from CorB in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  23. Like
    sq7bti got a reaction from tripwire in Educational BoosterPack 8 bit FFT Spectrum Analyzer Attempt   
  24. Like
    sq7bti reacted to oPossum in Faster printing of single precision floating point   
    Printf supports the %e, %f and %g format specifiers for printing floating point numbers. Due to the automatic type promotion of variadic functions in C, the printf code must always print double precision floating point. This makes printing single precision floating point numbers slower than optimal. The precision rounding done by printf also imposes a speed penalty. The code presented here will print single precision floating point numbers much faster than printf - up to 75 times faster. It allows the number of significant digits to be 3 to 8 (a maximum of 7 is recommended). The printed number will be normalized to the range 1.0 to less than 1000.0 and the appropriate SI prefix appended. This is similar to what the printf %e format does (1.0 to less than 10.0), but a bit easier to interpret. The full range of single precision float cannot be represented by SI prefixes, so very small and very large numbers will have '?' as the prefix. This code is intended for display only. It should not be used to store values in a file for later conversion back to float due to limitations in rounding and range.
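For comparison, the normalization described above (scale into 1.0 to less than 1000.0, then pick an SI prefix) can be written the slow way with double math and snprintf. This is only a reference sketch for checking output on a PC, not the fast routine from this post: si_format_ref is a hypothetical name, it always prints three decimals, and it makes no attempt at the integer-only tricks explained below.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Slow reference formatter: scales |v| into [1, 1000) and appends an SI
 * prefix. Hypothetical helper for comparison only; not the post's ftoas(). */
void si_format_ref(char *out, size_t n, double v)
{
    static const char prefixes[] = "yzafpnum kMGTPEZY";  /* index 8 (' ') means no prefix */
    double m = (v < 0.0) ? -v : v;
    int idx = 8;

    assert(out != NULL && n > 0);
    if (v != v) {                       /* NaN never compares equal to itself */
        snprintf(out, n, "NaN");
        return;
    }
    if (m > DBL_MAX) {                  /* only infinity exceeds DBL_MAX */
        snprintf(out, n, "%sInf", v < 0.0 ? "-" : "");
        return;
    }
    if (m != 0.0) {
        while (m >= 1000.0) { m /= 1000.0; ++idx; }
        while (m < 1.0)     { m *= 1000.0; --idx; }
    }
    /* Outside yocto..yotta there is no SI prefix, so print '?' as the post does. */
    snprintf(out, n, "%s%.3f%c", v < 0.0 ? "-" : "", m,
             (idx >= 0 && idx <= 16) ? prefixes[idx] : '?');
}
```

For example, si_format_ref(buf, sizeof buf, 1500.0) produces "1.500k". Note that this reference rounds inside snprintf, so a value just under 1000 can print as 1000.000 with the smaller prefix; avoiding that is part of what the rounding step in the fast code has to deal with.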
     
    A brief explanation of the code.

    Do a bitwise conversion of the float to a 32 bit unsigned integer. This is done by getting the address of the float, casting it to an unsigned integer pointer, and then dereferencing the pointer. Simply casting to an unsigned integer would not produce a bitwise copy.

        uint32_t s = *(uint32_t *)&f;

    The msb contains the sign flag. Append a '-' to the string if the floating point number is negative.

        if (s & (1UL << 31)) *a++ = '-';

    Extract the 8 bit exponent.

        int e = (s >> 23) & 0xFF;

    Move the significand to the upper 24 bits. Set the msb to 0.

        s <<= 8;
        s &= ~(1UL << 31);

    An exponent of 255 is used for special values of NaN (not a number) and infinity. Handle these special cases and return.

        if (e == 255) {
            if (s) {
                strcpy(a, "NaN");
            } else {
                strcpy(a, "Inf");
            }
            return;

    An exponent of 0 is used to represent a value of 0 and for denormals. A value of 0 is handled by setting the exponent to 127 - the same exponent used to represent 1.0, and leaving the significand as 0. Denormal numbers do not have the implicit msb of 1, so they are normalized by shifting until the leading 1 is in the msb position.

        } else if (e == 0) {
            if (s) {
                e = 1;
                while (!(s & (1UL << 31))) s <<= 1, --e;
            } else {
                e = 127;
            }

    If the exponent is some other value, then set the msb of the significand.

        } else {
            s |= (1UL << 31);
        }

    Set up a pointer to the SI prefix. This will be adjusted as the value is normalized.

        char const * sp = "???????yzafpnum kMGTPEZY????" + 15;

    If the value is less than 1.0 it must be multiplied by 1000 until it is 1.0 or greater. Multiplication by 1000 is done by an implicit multiply by 1024 and then subtracting a multiply by 16 and a multiply by 8.
        if (e < 127) {
            do {
                s = s - (s >> 6) - (s >> 7);
                if (!(s & (1UL << 31))) s <<= 1, --e;
                e += 10;
                --sp;
            } while (e < 127);

    If the value is 1000.0 or more it must be divided by 1000 until it is less than 1000.0. An unrolled floating point divide is used for maximum speed.

        } else if (e > 135) {
            while (e > (126 + 10) || (e == (126 + 10) && s >= (1000UL << (32 - 10)))) {
                uint32_t n = s;
                s = 0;
                uint32_t d = 1000UL << (32 - 10);
                if (n >= d) n -= d, s |= (1UL << 31); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 30); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 29); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 28); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 27); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 26); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 25); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 24); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 23); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 22); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 21); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 20); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 19); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 18); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 17); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 16); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 15); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 14); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 13); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 12); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 11); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 10); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 9); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 8); d >>= 1;
                if (n >= d) s += (1UL << 8);
                if (!(s & (1UL << 31))) s <<= 1, --e;
                e -= 9;
                ++sp;
            }

    The divide code is quite time consuming, so it would be advantageous to quickly reduce very large numbers. A divide by 1,000,000,000,000 is used to improve performance for these large numbers. The preceding multiply code could do the same for very small numbers, but there is no speed advantage due to the multiply by 1000 using 2 shift/subtract operations and multiply by larger values requiring more than 2 per 1000.

            while (e > (150 + 16) || (e == (150 + 16) && s > (999999995904ULL >> 16))) {
                uint64_t n = s;
                n <<= 32;
                s = 0;
                uint64_t d = 1000000000000ULL << (64 - 40);
                if (n >= d) n -= d, s |= (1UL << 31); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 30); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 29); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 28); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 27); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 26); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 25); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 24); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 23); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 22); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 21); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 20); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 19); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 18); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 17); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 16); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 15); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 14); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 13); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 12); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 11); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 10); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 9); d >>= 1;
                if (n >= d) n -= d, s |= (1UL << 8); //d >>= 1;
                //if (n >= d) s += (1UL << 8);
                if (n) s += (1UL << 8);
                if (!(s & (1UL << 31))) s <<= 1, --e;
                e -= 39;
                sp += 4;
            }

    Rounding is the most difficult part of printing floating point numbers. Precalculated float constants are applied based on the value of the float and the specified number of significant digits. This simple method is fast and allows for good results for up to 7 significant digits.
        typedef struct {
            uint32_t s;
            int e;
        } TFR;

        TFR const r[] = {
            0x800000UL << 7, 126 + 1,   // 0.5
            0xCCCCCDUL << 7, 122 + 1,   // 0.05
            0xA3D70AUL << 7, 119 + 1,   // 0.005
            0x83126FUL << 7, 116 + 1,   // 0.0005
            0xD1B717UL << 7, 112 + 1,   // 0.00005
            0xA7C5ACUL << 7, 109 + 1,   // 0.000005
            0x8637BDUL << 7, 106 + 1,   // 0.0000005
            0xD6BF95UL << 7, 102 + 1    // 0.00000005
        };

        if (d < 3) d = 3; else if (d > 8) d = 8;
        if (s) {
            TFR const *pr = &r[d - 3];
            if (e < (126 + 4) || (e == (126 + 4) && s < (10UL << (32 - 4)))) {          // < 10
                pr += 2;
            } else if (e < (126 + 7) || (e == (126 + 7) && s < (100UL << (32 - 7)))) {  // < 100
                ++pr;
            }
            s += (pr->s >> (e - pr->e));
            if (e == (126 + 10) && s >= (1000UL << (32 - 10)))
                s = (1UL << 31), e = 127, ++sp;
            else if (!(s & (1UL << 31)))
                s >>= 1, s |= (1UL << 31), ++e;
        }

    The integer part is printed using iterative subtraction of base 10 constants. This is typically faster than the common divide/modulus method.

        unsigned i = s >> 16;
        i >>= (136 - e);
        unsigned id = 1;
        char c;
        if (i >= (100 << 6)) {
            ++id;
            c = '0';
            while (i >= (100 << 6)) i -= (100 << 6), ++c;
            *a++ = c;
        }
        if (id == 2 || i >= (10 << 6)) {
            ++id;
            c = '0';
            while (i >= (10 << 6)) i -= (10 << 6), ++c;
            *a++ = c;
        }
        c = '0';
        while (i >= (1 << 6)) i -= (1 << 6), ++c;
        *a++ = c;

    The fractional part is printed by iterative multiplication by 10.

        *a++ = '.';
        if (e < 130) s >>= (130 - e); else s <<= (e - 130);
        d -= id;
        while (d) {
            s &= ((1UL << 28) - 1);
            s = (s << 3) + (s << 1);
            *a++ = '0' + (s >> 28);
            --d;
        }

    The SI prefix is appended and the string is terminated.

        *a++ = *sp;
        *a = 0;

    The resulting performance increase was more than I expected.

                     TI 4.4.4           GCC 4.9.1
        ------------------------------------------------
        %e           16402   6.47 s     non-functional
        %f           16380   5.78 s     non-functional
        %g           16402   4.69 s     non-functional
        ftoas(7)      8480   0.12 s     11892   0.27 s
        ftoas(3)      8480   0.09 s     11892   0.17 s

    Results of the test code using ftoas(7), %e, %g, and %f
        ftoas(7)   %e            %g       %f
        0.000000   0.000000e+00  0        0.000000
        1.401298?  0.000000e+00  0        0.000000
        1.401298?  0.000000e+00  0        0.000000
        9.809089?  0.000000e+00  0        0.000000
        99.49219?  0.000000e+00  0        0.000000
        1.000527?  0.000000e+00  0        0.000000
        9.999666?  0.000000e+00  0        0.000000
        99.99946?  0.000000e+00  0        0.000000
        1.000000?  0.000000e+00  0        0.000000
        12.00000?  1.200000e-38  1.2e-38  0.000000
        100.0000?  1.000000e-37  1e-37    0.000000
        1.000000?  1.000000e-36  1e-36    0.000000
        10.00000?  1.000000e-35  1e-35    0.000000
        100.0000?  1.000000e-34  1e-34    0.000000
        1.000000?  1.000000e-33  1e-33    0.000000
        10.00000?  1.000000e-32  1e-32    0.000000
        100.0000?  1.000000e-31  1e-31    0.000000
        1.000000?  1.000000e-30  1e-30    0.000000
        10.00000?  1.000000e-29  1e-29    0.000000
        100.0000?  1.000000e-28  1e-28    0.000000
        1.000000?  1.000000e-27  1e-27    0.000000
        10.00000?  1.000000e-26  1e-26    0.000000
        100.0000?  1.000000e-25  1e-25    0.000000
        1.000000y  1.000000e-24  1e-24    0.000000
        10.00000y  1.000000e-23  1e-23    0.000000
        100.0000y  1.000000e-22  1e-22    0.000000
        1.000000z  1.000000e-21  1e-21    0.000000
        10.00000z  1.000000e-20  1e-20    0.000000
        100.0000z  1.000000e-19  1e-19    0.000000
        1.000000a  1.000000e-18  1e-18    0.000000
        10.00000a  1.000000e-17  1e-17    0.000000
        100.0000a  1.000000e-16  1e-16    0.000000
        1.000000f  1.000000e-15  1e-15    0.000000
        10.00000f  1.000000e-14  1e-14    0.000000
        100.0000f  1.000000e-13  1e-13    0.000000
        1.000000p  1.000000e-12  1e-12    0.000000
        10.00000p  1.000000e-11  1e-11    0.000000
        100.0000p  1.000000e-10  1e-10    0.000000
        1.000000n  1.000000e-09  1e-09    0.000000
        10.00000n  1.000000e-08  1e-08    0.000000
        100.0000n  1.000000e-07  1e-07    0.000000
        1.000000u  1.000000e-06  1e-06    0.000001
        10.00000u  1.000000e-05  1e-05    0.000010
        100.0000u  1.000000e-04  0.0001   0.000100
        1.000000m  1.000000e-03  0.001    0.001000
        10.00000m  1.000000e-02  0.01     0.010000
        100.0000m  1.000000e-01  0.1      0.100000
        1.000000   1.000000e+00  1        1.000000
        1.234568   1.234568e+00  1.23457  1.234568
        10.00000   1.000000e+01  10       10.000000
        100.0000   1.000000e+02  100      100.000000
        1.000000k  1.000000e+03  1000     1000.000000
        10.00000k  1.000000e+04  10000    10000.000000
        100.0000k  1.000000e+05  100000   100000.000000
        1.000000M  1.000000e+06  1e+06    1000000.000000
        10.00000M  1.000000e+07  1e+07    10000000.000000
        100.0000M  1.000000e+08  1e+08    100000000.000000
        1.000000G  1.000000e+09  1e+09    1000000000.000000
        10.00000G  1.000000e+10  1e+10    10000000000.000000
        100.0000G  1.000000e+11  1e+11    99999997952.000010
        1.000000T  1.000000e+12  1e+12    999999995903.999925
        10.00000T  1.000000e+13  1e+13    9999999827968.000174
        100.0000T  1.000000e+14  1e+14    100000000376832.008362
        1.000000P  1.000000e+15  1e+15    999999986991104.125977
        10.00000P  1.000000e+16  1e+16    10000000272564222.812653
        100.0000P  1.000000e+17  1e+17    99999998430674934.387207
        1.000000E  1.000000e+18  1e+18    999999984306749343.872070
        10.00000E  1.000000e+19  1e+19    9999999980506448745.727539
        100.0000E  1.000000e+20  1e+20    100000002004087710380.554199
        1.000000Z  1.000000e+21  1e+21    1000000020040877103805.541992
        10.00000Z  1.000000e+22  1e+22    9999999778196308612823.486328
        100.0000Z  1.000000e+23  1e+23    99999997781963086128234.863281
        1.000000Y  1.000000e+24  1e+24    1000000013848427772521972.656250
        10.00000Y  1.000000e+25  1e+25    9999999562023527622222900.390625
        100.0000Y  1.000000e+26  1e+26    100000002537764322757720947.265625
        1.000000?  1.000000e+27  1e+27    999999988484154701232910156.250000
        10.00000?  9.999999e+27  1e+28    9999999442119691371917724609.375000
        100.0000?  1.000000e+29  1e+29    100000001504746651649475097656.250000
        1.000000?  1.000000e+30  1e+30    1000000015047466516494750976562.500000
        10.00000?  1.000000e+31  1e+31    9999999848243210315704345703125.000000
        100.0000?  1.000000e+32  1e+32    100000003318135333061218261718750.000000
        1.000000?  1.000000e+33  1e+33    999999994495727896690368652343750.000000
        10.00000?  1.000000e+34  1e+34    9999999790214771032333374023437500.000000
        100.0000?  1.000000e+35  1e+35    100000004091847860813140869140625000.000000
        1.000000?  1.000000e+36  1e+36    999999961690316438674926757812500000.000000
        10.00000?  1.000000e+37  1e+37    9999999933815813064575195312500000000.000000
        100.0000?  1.000000e+38  1e+38    99999996802856898307800292968750000000.000000
        340.0001?  3.400000e+38  3.4e+38  339999995214436411857604980468750000000.000000
        NaN        nan           nan      nan
        Inf        +inf          +inf     +inf

    Complete code with test case.
        #include <msp430.h>
        #include <stdio.h>
        #include <stdint.h>
        #include <string.h>
        #include <math.h>

        static void print(char const *s)
        {
            while(*s) {
                while(!(UCA1IFG & UCTXIFG));
                UCA1TXBUF = *s++;
            }
        }

        typedef struct {
            uint32_t s;
            int e;
        } TFR;

        TFR const r[] = {
            0x800000UL << 7, 126 + 1,   // 0.5
            0xCCCCCDUL << 7, 122 + 1,   // 0.05
            0xA3D70AUL << 7, 119 + 1,   // 0.005
            0x83126FUL << 7, 116 + 1,   // 0.0005
            0xD1B717UL << 7, 112 + 1,   // 0.00005
            0xA7C5ACUL << 7, 109 + 1,   // 0.000005
            0x8637BDUL << 7, 106 + 1,   // 0.0000005
            0xD6BF95UL << 7, 102 + 1    // 0.00000005
        };

        void ftoas(char *a, float const f, unsigned d)
        {
            uint32_t s = *(uint32_t *)&f;
            if (s & (1UL << 31)) *a++ = '-';
            int e = (s >> 23) & 0xFF;
            s <<= 8;
            s &= ~(1UL << 31);
            if (e == 255) {
                if (s) {
                    strcpy(a, "NaN");
                } else {
                    strcpy(a, "Inf");
                }
                return;
            } else if (e == 0) {
                if (s) {
                    e = 1;
                    while (!(s & (1UL << 31))) s <<= 1, --e;
                } else {
                    e = 127;
                }
            } else {
                s |= (1UL << 31);
            }

            char const * sp = "???????yzafpnum kMGTPEZY????" + 15;

            if (e < 127) {
                do {
                    s = s - (s >> 6) - (s >> 7);
                    if (!(s & (1UL << 31))) s <<= 1, --e;
                    e += 10;
                    --sp;
                } while (e < 127);
            } else if (e > 135) {
                while (e > (150 + 16) || (e == (150 + 16) && s > (999999995904ULL >> 16))) {
                    uint64_t n = s;
                    n <<= 32;
                    s = 0;
                    uint64_t d = 1000000000000ULL << (64 - 40);
                    if (n >= d) n -= d, s |= (1UL << 31); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 30); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 29); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 28); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 27); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 26); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 25); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 24); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 23); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 22); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 21); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 20); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 19); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 18); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 17); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 16); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 15); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 14); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 13); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 12); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 11); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 10); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 9); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 8); //d >>= 1;
                    //if (n >= d) s += (1UL << 8);
                    if (n) s += (1UL << 8);
                    if (!(s & (1UL << 31))) s <<= 1, --e;
                    e -= 39;
                    sp += 4;
                }
                while (e > (126 + 10) || (e == (126 + 10) && s >= (1000UL << (32 - 10)))) {
                    uint32_t n = s;
                    s = 0;
                    uint32_t d = 1000UL << (32 - 10);
                    if (n >= d) n -= d, s |= (1UL << 31); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 30); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 29); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 28); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 27); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 26); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 25); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 24); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 23); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 22); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 21); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 20); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 19); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 18); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 17); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 16); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 15); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 14); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 13); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 12); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 11); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 10); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 9); d >>= 1;
                    if (n >= d) n -= d, s |= (1UL << 8); d >>= 1;
                    if (n >= d) s += (1UL << 8);
                    if (!(s & (1UL << 31))) s <<= 1, --e;
                    e -= 9;
                    ++sp;
                }
            }

            if (d < 3) d = 3; else if (d > 8) d = 8;
            if (s) {
                TFR const *pr = &r[d - 3];
                if (e < (126 + 4) || (e == (126 + 4) && s < (10UL << (32 - 4)))) {          // < 10
                    pr += 2;
                } else if (e < (126 + 7) || (e == (126 + 7) && s < (100UL << (32 - 7)))) {  // < 100
                    ++pr;
                }
                s += (pr->s >> (e - pr->e));
                if (e == (126 + 10) && s >= (1000UL << (32 - 10)))
                    s = (1UL << 31), e = 127, ++sp;
                else if (!(s & (1UL << 31)))
                    s >>= 1, s |= (1UL << 31), ++e;
            }

            unsigned i = s >> 16;
            i >>= (136 - e);
            unsigned id = 1;
            char c;
            if (i >= (100 << 6)) {
                ++id;
                c = '0';
                while (i >= (100 << 6)) i -= (100 << 6), ++c;
                *a++ = c;
            }
            if (id == 2 || i >= (10 << 6)) {
                ++id;
                c = '0';
                while (i >= (10 << 6)) i -= (10 << 6), ++c;
                *a++ = c;
            }
            c = '0';
            while (i >= (1 << 6)) i -= (1 << 6), ++c;
            *a++ = c;

            *a++ = '.';
            if (e < 130) s >>= (130 - e); else s <<= (e - 130);
            d -= id;
            while (d) {
                s &= ((1UL << 28) - 1);
                s = (s << 3) + (s << 1);
                *a++ = '0' + (s >> 28);
                --d;
            }

            *a++ = *sp;
            *a = 0;
        }

        #define smclk_freq (32768UL * 31UL)     // SMCLK frequency in hertz
        #define bps (9600UL)                    // Async serial bit rate

        int main(void)
        {
            WDTCTL = WDTPW | WDTHOLD;           // Stop watchdog timer

            P4SEL = BIT4 | BIT5;                // Enable UART pins
            P4DIR = BIT4 | BIT5;

                                                // Initialize UART
            UCA1CTL1 = UCSWRST;                 // Hold USCI in reset to allow configuration
            UCA1CTL0 = 0;                       // No parity, LSB first, 8 bits, one stop bit, UART (async)
            const unsigned long brd = (smclk_freq + (bps >> 1)) / bps; // Bit rate divisor
            UCA1BR1 = (brd >> 12) & 0xFF;       // High byte of whole divisor
            UCA1BR0 = (brd >> 4) & 0xFF;        // Low byte of whole divisor
            UCA1MCTL = ((brd << 4) & 0xF0) | UCOS16; // Fractional divisor, oversampling mode
            UCA1CTL1 = UCSSEL_2;                // Use SMCLK for bit rate generator, release reset

            char s[32], t[96];
            float const tv[] = {
                0.0f, 7.1e-46f,
                1.0e-45f, 1.0e-44f, 1.0e-43f, 1.0e-42f, 1.0e-41f, 1.0e-40f, 1.0e-39f,
                1.2e-38f, 1.0e-37f, 1.0e-36f, 1.0e-35f, 1.0e-34f, 1.0e-33f, 1.0e-32f,
                1.0e-31f, 1.0e-30f, 1.0e-29f, 1.0e-28f, 1.0e-27f, 1.0e-26f, 1.0e-25f,
                1.0e-24f, 1.0e-23f, 1.0e-22f, 1.0e-21f, 1.0e-20f, 1.0e-19f, 1.0e-18f,
                1.0e-17f, 1.0e-16f, 1.0e-15f, 1.0e-14f, 1.0e-13f, 1.0e-12f, 1.0e-11f,
                1.0e-10f, 1.0e-9f, 1.0e-8f, 1.0e-7f,
                0.000001f, 0.00001f, 0.0001f, 0.001f, 0.01f, 0.1f,
                1.0f, 1.23456789f, 10.0f, 100.0f, 1000.0f, 10000.0f, 100000.0f,
                1000000.0f, 10000000.0f, 100000000.0f, 1000000000.0f,
                1.0e10f, 1.0e11f, 1.0e12f, 1.0e13f, 1.0e14f, 1.0e15f, 1.0e16f,
                1.0e17f, 1.0e18f, 1.0e19f, 1.0e20f, 1.0e21f, 1.0e22f, 1.0e23f,
                1.0e24f, 1.0e25f, 1.0e26f, 1.0e27f, 1.0e28f, 1.0e29f, 1.0e30f,
                1.0e31f, 1.0e32f, 1.0e33f, 1.0e34f, 1.0e35f, 1.0e36f, 1.0e37f,
                1.0e38f, 3.4e38f,
                NAN, INFINITY
            };

            TA0EX0 = 7;
            TA0CTL = TASSEL_2 | ID_3 | MC_2;
            TA0CTL |= TACLR;

            unsigned i;
            for (i = 0; i < sizeof(tv) / sizeof(tv[0]); ++i) {
                float const f = tv[i];
                ftoas(s, f, 7);
                //print(s); print("\r\n");
                //sprintf(t, "%e", f);
                //sprintf(t, "%f", f);
                //sprintf(t, "%g", f);
                //sprintf(t, "%e %f %g %s\r\n", f, f, f, s);
                sprintf(t, "%s %e %g %f\r\n", s, f, f, f);
                print(t);
            }

            // Elapsed time in seconds (one timer tick is 63 us: SMCLK / 8 / 8)
            volatile float et = ((float)TA0R + ((TA0CTL & TAIFG) ? 65536.0 : 0.0)) * 63.0f / 1000000.0f;
            ftoas(t, et, 5);
            //sprintf(t, "%f", et);
            print(t); print("s\r\n");

            for(;;);
            return 0;
        }
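One portability note on the first line of ftoas(): reading a float through a uint32_t pointer is something the C standard does not guarantee to work (it runs against the aliasing rules), even though it behaves as intended on the compilers tested here. The usual well-defined way to get the same bits is memcpy, which compilers typically turn into a plain register move. A minimal sketch, with float_fields and float_unpack as hypothetical names that are not part of the original code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 single precision value (hypothetical type). */
typedef struct {
    unsigned sign;          /* 1 if negative */
    int      exponent;      /* biased exponent, 0..255 */
    uint32_t significand;   /* 23 stored fraction bits, implicit 1 not included */
} float_fields;

/* Split a float into sign, exponent and significand using a bitwise copy. */
float_fields float_unpack(float f)
{
    uint32_t bits;
    float_fields r;

    assert(sizeof(float) == sizeof(uint32_t));  /* assumes 32-bit IEEE 754 float */
    memcpy(&bits, &f, sizeof bits);             /* bitwise copy, no value conversion */

    r.sign        = (unsigned)(bits >> 31);
    r.exponent    = (int)((bits >> 23) & 0xFF);
    r.significand = bits & 0x7FFFFFUL;
    return r;
}
```

In ftoas() the equivalent change would be to declare `uint32_t s;` and fill it with `memcpy(&s, &f, sizeof s);`. Whether this costs anything on a given MSP430 compiler would need to be measured.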
  25. Like
    sq7bti reacted to igor in How to use interrupts in Energia?   
    You will probably need to go below the level of Energia to use Timer Interrupts.
    (Or find a timer library.  I was working on one, but haven't done much with it in a while.)
     
    Look at the Tivaware/Stellarisware library documentation on timer interrupts (or ADC).
     
    If you want an example that uses them in creating an Energia library, you can look at this LED driver library I adapted to Tiva
    https://github.com/ecdr/RGB-matrix-Panel
        MAP_SysCtlPeripheralEnable( TIMER_SYSCTL );
        MAP_TimerConfigure( TIMER_BASE, TIMER_CFG );
        IntRegister( TIMER_INT, TmrHandler );
        MAP_IntMasterEnable();
        MAP_IntEnable( TIMER_INT );
        MAP_TimerIntEnable( TIMER_BASE, TIMER_TIMEOUT );
        MAP_TimerLoadSet( TIMER_BASE, TIMER_AB, rowtime );
        MAP_TimerEnable( TIMER_BASE, TIMER_AB );
        ...
        void TmrHandler(void)
        {
            ...
            TimerIntClear( TIMER_BASE, TIMER_TIMEOUT );   // MAP_ version of library call takes longer
            // HWREG(TIMER_BASE + TIMER_O_ICR) = TIMER_TIMEOUT;   // Inlining timer library code - just in ISR, where speed helpful
        }

    In my case timing was critical in the Timer ISR, so I wound up inlining the timer interrupt clear code (commented out above).
    For most uses calling the TimerIntClear library would be better.
     
    Using IntRegister means I do not have to modify Startup_gcc.c - which makes the library more portable/easier for others to use.
    (It may also take a little more space, not sure on speed.)  
    Since I was making a library for general use, the portability was much more important.  Of course your application may be different.
     
    A neater way to handle this (best of both worlds) would be to use Weak default interrupt handlers in Startup_gcc.c
    (as is done for the UARTIntHandlers in Energia's Startup_gcc.c)
     
    Then you could substitute your own handler just by declaring it to have a certain name, and it would override the built in (weak) handler.
    (If the built in Startup_gcc.c had such weak definitions, then further editing of Startup_gcc would not be necessary.)
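To make the weak-handler idea concrete, here is a minimal sketch of how a startup file can provide overridable defaults. It is an assumption-laden illustration, not Energia's actual Startup_gcc.c: the names Default_Handler, Timer0A_Handler, vector_table and simulate_irq are hypothetical, the table has a single entry, and the weak/alias attributes are GCC/Clang extensions that work on ELF targets. It is written so it can be compiled and run on a PC.

```c
#include <assert.h>

/* Counts calls to the default handler so the behaviour can be observed on a PC. */
static volatile unsigned default_calls;

/* Fallback used for every interrupt that has no dedicated handler. */
void Default_Handler(void)
{
    ++default_calls;
}

/* Weak alias: this name resolves to Default_Handler unless some other
 * object file provides a normal (strong) definition of Timer0A_Handler. */
void Timer0A_Handler(void) __attribute__((weak, alias("Default_Handler")));

typedef void (*isr_t)(void);

/* Stand-in for the vector table in a startup file (one entry here). */
static isr_t const vector_table[] = { Timer0A_Handler };

/* Simulate the hardware taking interrupt number irq. */
void simulate_irq(unsigned irq)
{
    assert(irq < sizeof vector_table / sizeof vector_table[0]);
    vector_table[irq]();
}
```

To override the default, a sketch or library would define `void Timer0A_Handler(void) { ... }` in another source file; the linker then picks that strong definition and the startup file stays untouched. Unlike IntRegister, nothing is copied or patched at run time.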
     
    If you want more examples of weak - I proposed a fix for an initialization issue in Energia using this feature over on 43oh.
    Startup files for some other ARM processor libraries use weak (sorry, I don't remember which ones, but I ran into them while working on the code for eLua).