DSP Libraries for Cortex-M3 and other ARM microcontrollers

I was looking around for a general-purpose DSP library to do FFTs when I came across this page.


Has a sh*t ton of stuff, hopefully helpful to you guys.



Four groups of functions:


Windowing functions

Fast Fourier Transform

Complex magnitude (absolute value of complex frequency)

Miscellaneous functions: logarithm(x), exp(x), pseudorandom generator


Three library versions

GCC (Rowley CrossWorks, Raisonance, …)


IAR Embedded Workbench



Windowing functions (e.g. Hamming window)

Windowing is a very common step before FFT calculation

Performs speed-optimized windowing of the input signal before the FFT

The 16-to-32-bit version properly scales a 16-bit signal for the 32-bit FFT
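To make the windowing step concrete, here is a minimal sketch in plain C. It is not the library's code: the function names, the 8-point length, and the Q15 coefficient table (Hamming, w[n] = 0.54 - 0.46*cos(2*pi*n/(N-1)), rounded to Q15) are all assumptions for illustration. The second function shows one plausible way a 16-to-32-bit version could scale the result for a 32-bit FFT.

```c
#include <stddef.h>
#include <stdint.h>

#define WIN_LEN 8 /* hypothetical length, kept tiny for the example */

/* Hamming coefficients in Q15 (value * 32768, rounded). A real build would
   keep a table like this in Flash next to the FFT coefficients. */
static const int16_t hamming_q15[WIN_LEN] = {
    2621, 8297, 21049, 31275, 31275, 21049, 8297, 2621
};

/* 16-bit in, 16-bit out: Q15 * Q15 = Q30, rounded back to Q15.
   Assumes arithmetic right shift of negative values (true for GCC on ARM). */
void window_q15(const int16_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n && i < WIN_LEN; i++) {
        int32_t p = (int32_t)in[i] * hamming_q15[i];
        out[i] = (int16_t)((p + (1 << 14)) >> 15);
    }
}

/* 16-bit in, 32-bit out: keep the full Q30 product and double it to Q31,
   so no precision is thrown away before a 32-bit FFT. */
void window_q15_to_q31(const int16_t *in, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n && i < WIN_LEN; i++) {
        out[i] = ((int32_t)in[i] * hamming_q15[i]) * 2;
    }
}
```

The point of the second variant is that the 16 x 16 multiply already produces a 32-bit result, so feeding it straight into a 32-bit FFT costs nothing extra and avoids the rounding step.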



FFT functions

Complex and real FFT, in 16-bit and 32-bit versions

Radix-4/2 FFT: sizes 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 and 4096

Inverse FFT available

Real FFT enables much more efficient processing of real signals

16-bit FFT precision is comparable with other fixed-point implementations; precision is determined by the necessary scaling by 0.5 in every FFT stage

32-bit FFT increases dynamic range by 90 dB, and needs an extra 20% to 50% cycles

Coefficients are located in Flash. Placing them in RAM gives a faster FFT when Flash access latency is high.
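The "scaling by 0.5 in every FFT stage" point is easier to see on a single butterfly. The sketch below is a generic radix-2 decimation-in-time butterfly in Q15, not the library's radix-4/2 kernel; the type and function names are made up for the example. Each output is halved so the result cannot grow past the 16-bit range, which is also why a 16-bit FFT loses roughly one bit of precision per radix-2 stage.

```c
#include <stdint.h>

/* Complex sample in Q15 (hypothetical type for this sketch). */
typedef struct { int16_t re, im; } cq15_t;

/* Clamp a 32-bit value into the int16_t range. */
static int16_t sat16(int32_t v)
{
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* One radix-2 butterfly with built-in scaling by 0.5:
     a' = (a + w*b) / 2
     b' = (a - w*b) / 2
   w is a twiddle factor on the unit circle in Q15.
   Assumes arithmetic right shift of negative values. */
void butterfly_scaled(cq15_t *a, cq15_t *b, cq15_t w)
{
    /* t = w * b, Q15 * Q15 = Q30, shifted back to Q15 */
    int32_t t_re = ((int32_t)w.re * b->re - (int32_t)w.im * b->im) >> 15;
    int32_t t_im = ((int32_t)w.re * b->im + (int32_t)w.im * b->re) >> 15;

    int32_t a_re = a->re;
    int32_t a_im = a->im;

    /* The >> 1 is the per-stage scaling by 0.5 */
    a->re = sat16((a_re + t_re) >> 1);
    a->im = sat16((a_im + t_im) >> 1);
    b->re = sat16((a_re - t_re) >> 1);
    b->im = sat16((a_im - t_im) >> 1);
}
```

For example, with a = b = 0.5 and w = -j, the butterfly gives a' = 0.25 - 0.25j and b' = 0.25 + 0.25j.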



Magnitude functions

Calculate complex frequency magnitude: mag = sqrt(re^2 + im^2)

Based on custom 32-bit square root algorithm (7/13 cycles)

Multiple versions with different speed/precision tradeoffs for the 64-bit sqrt
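For reference, the magnitude calculation can be sketched with a textbook bit-by-bit integer square root. This is definitely not the custom 7/13-cycle algorithm the page mentions (that one isn't described), just a portable stand-in; the function names are assumptions.

```c
#include <stdint.h>

/* Textbook bit-by-bit integer square root: returns floor(sqrt(x)). */
uint32_t isqrt32(uint32_t x)
{
    uint32_t res = 0;
    uint32_t bit = 1u << 30; /* highest power of four that fits in 32 bits */

    while (bit > x) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* Magnitude of a 16-bit complex FFT bin: sqrt(re^2 + im^2).
   The sum of two squared 16-bit values is at most 2^31, so it fits
   in an unsigned 32-bit word. */
uint32_t cmag16(int16_t re, int16_t im)
{
    uint32_t sum = (uint32_t)((int32_t)re * re) + (uint32_t)((int32_t)im * im);
    return isqrt32(sum);
}
```

With 32-bit FFT output the sum of squares needs 64 bits, which is presumably where the 64-bit sqrt variants come in.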



Logarithm and exponent functions

Calculate log2(x) and exp2(x) = 2^x

log2 input, exp2 output: 16q16 unsigned, 1/65536 to 65535+65535/65536

log2 output, exp2 input: 5q27 signed, -15.99999 to 15.99999

Speed: 11/10 cycles; precision: 0.4 ppm / 3 ppm for log2 / exp2

Single-multiply conversion to log10(x), ln(x), 10^x, e^x and generic-base log and exp
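The number formats and the "single multiply" conversion can be sketched like this. The log2 routine is a slow shift-and-square reference, nothing like the 11-cycle version the page describes, and all names are hypothetical. It takes an unsigned 16q16 input and returns a signed 5q27 result; the second function converts that to log10 using log10(x) = log2(x) * log10(2), with log10(2) stored as a Q31 constant.

```c
#include <stdint.h>

/* Reference log2: input unsigned 16.16, output signed 5.27.
   Returns INT32_MIN for x == 0, where log2 is undefined. */
int32_t log2_q16_to_q27(uint32_t x)
{
    if (x == 0) {
        return INT32_MIN;
    }

    /* Integer part: index of the top set bit relative to the binary point */
    int msb = 31;
    while ((x >> msb) == 0) {
        msb--;
    }
    int32_t result = (int32_t)(msb - 16) * (1 << 27);

    /* Normalize the mantissa to [1, 2) with 31 fractional bits */
    uint64_t m = (uint64_t)x << (31 - msb);

    /* Each squaring yields one more fractional bit of the logarithm */
    for (int i = 26; i >= 0; i--) {
        m = (m * m) >> 31;      /* now in [1, 4) */
        if (m >> 32) {          /* m >= 2: this bit is 1, renormalize */
            m >>= 1;
            result += (int32_t)1 << i;
        }
    }
    return result;
}

/* log10(2) in Q31: 0.30102999566 * 2^31, rounded */
#define LOG10_2_Q31 646456993

/* Single-multiply base change: log10(x) = log2(x) * log10(2).
   Input and output are both signed 5.27.
   Assumes arithmetic right shift of negative values. */
int32_t log2_to_log10_q27(int32_t log2_q27)
{
    return (int32_t)(((int64_t)log2_q27 * LOG10_2_Q31) >> 31);
}
```

The same pattern works for ln(x) with ln(2) as the constant, and in the other direction for 10^x and e^x by multiplying the exponent by log2(10) or log2(e) before calling exp2.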


Parallel MLS pseudorandom generator for ARM CPUs

Maximum Length Sequence generated by Linear Feedback Shift Registers

Period 2^31-1 to 2^64-1 words (1 to 64 bits wide)

1 to 64 bits generated in parallel

An order of magnitude faster than a bit-based approach: 3-10 cycles per whole word
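Here is a rough idea of what "bits generated in parallel" means for an LFSR. The page doesn't say which polynomials the library uses, so this sketch assumes the well-known PRBS31 polynomial x^31 + x^28 + 1 (period 2^31-1); the function names are made up. Because the lowest tap is 28 positions away from the input end, the next 16 output bits depend only on bits already in the register and can be computed with two shifts and one XOR instead of 16 single-bit steps.

```c
#include <stdint.h>

#define PRBS31_MASK 0x7FFFFFFFu /* 31-bit state */

/* Bit-serial reference: one new bit per call.
   The state must be non-zero, otherwise the sequence stays at zero. */
uint32_t lfsr_step_bit(uint32_t s)
{
    uint32_t newbit = ((s >> 30) ^ (s >> 27)) & 1u;
    return ((s << 1) | newbit) & PRBS31_MASK;
}

/* Parallel version: 16 new bits per call.
   Bit p of the new word equals s[15 + p] ^ s[12 + p], which are exactly
   the taps the serial version would see on its (16 - p)-th step. */
uint32_t lfsr_step_word16(uint32_t s, uint16_t *out)
{
    uint32_t bits = ((s >> 15) ^ (s >> 12)) & 0xFFFFu;
    *out = (uint16_t)bits;
    return ((s << 16) | bits) & PRBS31_MASK;
}
```

One call to lfsr_step_word16 leaves the register in the same state as 16 calls to lfsr_step_bit, which is where the order-of-magnitude speedup over the bit-based approach comes from.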
