spirilis

[Energia Library] CMSIS-DSP for Energia


Hi folks-

I sat down and played around with this today for kicks.  Here's a CMSIS_DSP library (includes the .c files from ARM's CMSIS-DSP distribution, header files, and a single CMSIS_DSP.h that includes arm_math.h to make it simpler) for Energia.

 

CMSIS_DSP_Energia.zip

 

CMSIS-DSP is ARM's standard library for utilizing DSP algorithms on the ARM Cortex-M series chips; in particular it contains optimizations suitable for the Cortex-M4, along with primitives for using the extended instructions.

 

API documentation is located here: http://www.keil.com/pack/doc/CMSIS/DSP/html/index.html

 

The FIR32_TEST example came from something I saw (possibly a CMSIS-DSP stock example) back when I was compiling CMSIS-DSP under Linux for straight C TivaWare use.

 

The library provides 4 datatypes: F32 (float), Q31 (int32), Q15 (int16) and Q7 (int8). The latter 3 are fixed-point formats that can only store fractional numbers (-1.0 <= n < 1.0), but if your datasets fit that range I believe the library makes use of the Cortex-M4's SIMD instructions for parallelism.
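To make the fractional range concrete, here is a minimal host-side C++ sketch of how a Q15 value maps to a float and how a fractional multiply works. The helper names (float_to_q15, q15_to_float, q15_mul) are made up for illustration and are not the CMSIS-DSP API; the library has its own conversion and multiply routines.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert a float in [-1.0, 1.0) to Q15.
// Out-of-range inputs saturate, because +1.0 is not representable.
inline int16_t float_to_q15(float x) {
    assert(!std::isnan(x));
    float scaled = std::round(x * 32768.0f);
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return static_cast<int16_t>(scaled);
}

// Hypothetical helper: convert Q15 back to float.
inline float q15_to_float(int16_t q) {
    return static_cast<float>(q) / 32768.0f;
}

// Fractional multiply: the 32-bit product is shifted back down by 15 bits
// and saturated to the Q15 range.
inline int16_t q15_mul(int16_t a, int16_t b) {
    int32_t p = (static_cast<int32_t>(a) * b) >> 15;
    if (p > 32767) p = 32767;  // only reached by (-1.0) * (-1.0)
    return static_cast<int16_t>(p);
}
```

For example, 0.5 becomes 16384, and multiplying 16384 by itself with q15_mul gives 8192, which is 0.25.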

 

I couldn't get Energia to link a prebuilt .a archive, so I had to copy in all the .c files, and Energia has to build each one whenever you use the library; this lengthens the compile time a bit.

 

Enjoy!


Why not include it at core level?

Just as an experiment, I grabbed some files from a Teensy3 installation; they are the original CMSIS files with some #defines added for compatibility.

Using it is straightforward: instead of calling

#include <CMSIS_DSP.h>

you use:

 #include <arm_math.h>

Files to add:

hardware.zip

 


Even on Teensy3, in my experiments, I didn't see a fantastic speed improvement using the CMSIS library, only slightly faster (tested with sin and cos), and the rfft creates large 32-bit lookup tables that use almost all the flash! I don't know what the relationship is between the LM4F's internal math coprocessor (the FPU) and the CMSIS library, but I have the feeling it's only tied to the ARM core, so the coprocessor isn't used by this library. Or am I wrong? I hope so!


Also, I believe one thing to keep in mind is that the SIMD feature of the Cortex-M4 DSP extensions requires 16-bit (maybe 8-bit is available too?) datatypes, so that it can pack multiple data items into a 32-bit register. I think I recall seeing that in the CMSIS docs.
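The packing idea can be sketched in portable C++ that runs on a desktop. This is only an emulation: on a Cortex-M4 the dual multiply-accumulate below corresponds to a single SMLAD instruction, while the function names here (pack_q15x2, dual_mac, dot_q15) are made up for illustration and are not CMSIS-DSP calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack two 16-bit values into one 32-bit word, low half first.
inline uint32_t pack_q15x2(int16_t lo, int16_t hi) {
    return static_cast<uint32_t>(static_cast<uint16_t>(lo)) |
           (static_cast<uint32_t>(static_cast<uint16_t>(hi)) << 16);
}

// Portable stand-in for a dual 16-bit multiply-accumulate:
// acc + lo(a) * lo(b) + hi(a) * hi(b).
inline int32_t dual_mac(uint32_t a, uint32_t b, int32_t acc) {
    int16_t a_lo = static_cast<int16_t>(a & 0xFFFFu);
    int16_t a_hi = static_cast<int16_t>(a >> 16);
    int16_t b_lo = static_cast<int16_t>(b & 0xFFFFu);
    int16_t b_hi = static_cast<int16_t>(b >> 16);
    return acc + a_lo * b_lo + a_hi * b_hi;
}

// Dot product of two 16-bit arrays, consuming two samples per step.
inline int32_t dot_q15(const int16_t* x, const int16_t* y, std::size_t n) {
    assert(n % 2 == 0);  // this sketch only handles an even sample count
    int32_t acc = 0;
    for (std::size_t i = 0; i < n; i += 2) {
        acc = dual_mac(pack_q15x2(x[i], x[i + 1]),
                       pack_q15x2(y[i], y[i + 1]), acc);
    }
    return acc;
}
```

Because two products are accumulated per step, the loop runs half as many iterations as a plain 16-bit dot product, which is where the SIMD speedup would come from on real hardware.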


I've done this stupid sin test on an 80 MHz Stellaris:

#include <CMSIS_DSP.h>

  unsigned long stime, etime,tres;
  float32_t testval;
 
void setup()
{
  Serial.begin(9600);
  delay(1000);
}

void loop()
{
  Serial.println("simple test...");
  stime = micros();
  for (int i=0;i<100;i++){
    testval = arm_sin_f32((float32_t)0.1*i);
  }
  etime = micros();
  tres = etime-stime;
  Serial.print("ArmMath: ");
  Serial.print(tres,DEC);
  Serial.print(" uS\n");
  delay(3000);
  stime = micros();
  for (int i=0;i<100;i++){
    testval = sin((float32_t)0.1*i);
  }
  etime = micros();
  tres = etime-stime;
  Serial.print("Math: ");
  Serial.print(tres,DEC);
  Serial.print(" uS\n");
  delay(3000);
   Serial.println("Value comparison");
    Serial.println("Normal Math...");
  for (int i=0;i<100;i++){
    testval = sin((float32_t)0.1*i);
    Serial.print(testval,3);
    if (i<99){
      Serial.print(",");
    } else {
      Serial.print("\n");
    }
  }
  Serial.println("Arm Math...");
  for (int i=0;i<100;i++){
    testval = arm_sin_f32((float32_t)0.1*i);
    Serial.print(testval,3);
    if (i<99){
      Serial.print(",");
    } else {
      Serial.print("\n");
    }
  }
   Serial.print("---------------------------------\n");
}

I ran the same test on a 96 MHz Teensy 3.1.

 

Results:

--- Teensy 3.1, 96 MHz ---
ArmMath: 2453 uS
Math:    4358 uS
--- Stellaris, 80 MHz ---
ArmMath:  373 uS
Math:    3535 uS

 

Quite interesting: it looks much faster than the Teensy3, and a good speedup over regular math. I would have expected much more from a hardware math coprocessor, but I'm still not sure it's involved in this example.
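One plausible contributor to the gap is that a fast sine routine can trade accuracy for speed by using a lookup table with linear interpolation instead of a full-precision evaluation. The following is a minimal host-side C++ sketch of that general technique, not the CMSIS-DSP source: kTableSize, sine_table() and table_sin() are hypothetical names, and the 512-entry table size is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical table size; a real library's table may differ.
constexpr std::size_t kTableSize = 512;

// One full period of sine, plus a guard entry so interpolation can always
// read table[idx + 1].
inline const std::vector<float>& sine_table() {
    static const std::vector<float> table = [] {
        std::vector<float> t(kTableSize + 1);
        const double two_pi = 6.283185307179586;
        for (std::size_t i = 0; i <= kTableSize; ++i) {
            t[i] = static_cast<float>(
                std::sin(two_pi * static_cast<double>(i) / kTableSize));
        }
        return t;
    }();
    return table;
}

// Table-lookup sine with linear interpolation between adjacent entries.
inline float table_sin(float x) {
    const float inv_two_pi = 0.15915494309189535f;
    float phase = x * inv_two_pi;      // angle measured in periods
    phase -= std::floor(phase);        // wrap into [0, 1)
    if (phase >= 1.0f) phase = 0.0f;   // guard against rounding up to 1.0
    float pos = phase * static_cast<float>(kTableSize);
    std::size_t idx = static_cast<std::size_t>(pos);
    assert(idx < kTableSize);
    float frac = pos - static_cast<float>(idx);
    const std::vector<float>& t = sine_table();
    return t[idx] + frac * (t[idx + 1] - t[idx]);
}
```

With a table of this size the result differs from the standard library sine only in the fifth decimal place or so, which would not show up in the 3-digit value comparison printed by the sketch above.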


#include "math.h"
#include <CMSIS_DSP.h>

inline float dspSqrt(float x){
  float result;
  arm_sqrt_f32(x, &result);
  return result;
}


inline float invSqrt(float x) {
  float halfx = 0.5f * x;
  float y = x;
  long i= *(long*)&y;
  i = 0x5f375a86 - (i>>1);
  y = *(float*)&i;
  y = y * (1.5f - (halfx * y * y));
  return y;	
}

inline float dspInvSqrt(float x){
  float result;
  arm_sqrt_f32(x, &result);
  return 1 / result;
}

void setup() {
  Serial.begin(9600);
  delay(1000);
}


void loop(){
  Serial.print("Start\n");
  unsigned long time;  // micros() returns unsigned long
  volatile float f;
  time = micros();
  f=0;
  for (int i=0; i<1000; i++) {
    f += dspSqrt(i);
  }
  time = micros() -time;
  Serial.print("\n1000x dspSqrt:");
  Serial.print(time);
  Serial.print("us.  Result:");
  Serial.println(f,7);
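  time = micros();
  f=0;
  for (int i=0; i<1000; i++) {
    f += sqrt(i);  // standard library sqrt(), for comparison with dspSqrt()
  }
  // Subtract the start time before printing; without this line the raw
  // micros() timestamp is printed instead of the elapsed time.
  time = micros() -time;
  Serial.print("1000x sqrt   :");
  Serial.print(time);
  Serial.print("us.  Result:");
  Serial.println(f,7);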
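  time = micros();
  f=0;
  // i starts at 0 in the three inverse-sqrt loops, so the first term is
  // infinite (or huge, for invSqrt) and the printed sums are not meaningful;
  // only the timings are.
  for (int i=0; i<1000; i++) {
    f += dspInvSqrt(i);
  }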
  time = micros() -time;
  Serial.print("\n1000x 1 / dspSqrt:");
  Serial.print(time);
  Serial.print("us.  Result  :");
  Serial.println(f);
  time = micros();
  f=0;
  for (int i=0; i<1000; i++) {
    f += 1/sqrt(i);
  }
  time = micros() -time;
  Serial.print("1000x 1 / sqrt   :");
  Serial.print(time);
  Serial.print("us.  Result  :");
  Serial.println(f);
  time = micros();
  f=0;
  for (int i=0; i<1000; i++) {
    f += invSqrt(i);
  }
  time = micros() -time;
  Serial.print("1000x invSqrt    :");
  Serial.print(time);
  Serial.print("us.  Result:");
  Serial.println(f);
  while(1);
}

Another quick & dirty speed test, using the sketch above:

 

--- Stellaris ---
1000x dspSqrt:        6006us.  Result: 21065.8378906
1000x sqrt:        1036354us.  Result: 21065.8378906

1000x 1 / dspSqrt:    6205us.  Result: 4294967295.21474836472147483647
1000x 1 / sqrt:      28837us.  Result: 4294967295.21474836472147483647
1000x invSqrt:        1202us.  Result: 4294967295.21474836472147483647

--- Teensy3 ---
1000x dspSqrt:        7580us.  Result: 21065.8378906
1000x sqrt:          11700us.  Result: 21065.8378906

1000x 1 / dspSqrt:    9302us.  Result: 4294967295.21474836472147483647
1000x 1 / sqrt:      23203us.  Result: 4294967295.21474836472147483647
1000x invSqrt:        4576us.  Result: 4294967295.21474836472147483647

 

Again, quite interesting! The Stellaris outperforms the Teensy on DSP, though not by as much this time, but the sqrt on the Stellaris is quite slow; maybe some investigation is needed.
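Since invSqrt() is an approximation, its speed should be weighed against its accuracy. Below is a minimal host-side C++ sketch that measures the worst relative error against 1/sqrt(x) for x = 1 to n. It uses the same magic constant and single Newton-Raphson step as the sketch above, but with memcpy-based bit casts instead of pointer casts; fast_inv_sqrt and max_relative_error are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Same algorithm as invSqrt() in the sketch above, with memcpy-based bit
// casts so the float/integer reinterpretation is well-defined.
inline float fast_inv_sqrt(float x) {
    assert(x > 0.0f);  // the bit trick is only meaningful for positive inputs
    float halfx = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = 0x5f375a86u - (bits >> 1);   // initial guess from the exponent bits
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - halfx * y * y);  // one Newton-Raphson refinement
}

// Largest relative error against 1/sqrt(x) for x = 1 .. n.
inline float max_relative_error(int n) {
    float worst = 0.0f;
    for (int i = 1; i <= n; ++i) {
        float x = static_cast<float>(i);
        float exact = 1.0f / std::sqrt(x);
        float err = std::fabs(fast_inv_sqrt(x) - exact) / exact;
        if (err > worst) worst = err;
    }
    return worst;
}
```

Calling max_relative_error(999) covers the same inputs as the benchmark, minus x = 0. The error stays at a fraction of a percent, so the approximation is only a fair substitute where that precision is acceptable.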


Either way, if I'm not mistaken the Teensy doesn't have a built-in FPU (it's a Cortex-M4, not an M4F like the Tiva/Stellaris). I'll bet those CMSIS functions really start to pull ahead when you make use of SIMD with pure integer math (e.g. q15 or q7 data types with arrays?)


Hi sir,

 

I am quite new to the signal processing area. I have downloaded the CMSIS zip file above, but it is showing me an error that the content is bad:

!   C:\Users\SONY\Downloads\CMSIS_DSP_Energia.zip: The archive is either in unknown format or damaged

Why is that?

 


I have no idea why you're seeing this; I just tried downloading it right now and it unzipped without a problem.  This is under Linux though.  Should be fine in Windows but I haven't checked (never seen an issue zipping files under Linux and showing corruption under Windows...)

 

Maybe delete your copy and try the download again?
