basil4j

Optimising math


Hi All,

 

Hope no-one minds me bombarding the forums with these n00b questions :)

I am writing the math for my rocket altimeter project and have some questions regarding the best way to do it.

 

I'm using CCS and an MSP430FR5739, which has a hardware multiplier, and I have included the MSP430_math routines.

 

Let's say I need to perform the following equation (as an example)

 

P = (D1 * SENS / 2 - OFF) / 2

D1 = unsigned long

SENS = signed long

OFF = signed long

 

Obvious optimisations aside (e.g. D1 / 2 before the FP mult to save an FP div...), would it give me faster code to break it into parts (not as concerned about code size)

P = SENS / 2

P = P * D1

P = P - OFF

P = P / 2

or leave it as a single equation as in the example?

 

 

Is it faster to divide or multiply? This example is probably a bad one since div2 would be nice and easy, but let's assume I need to divide by 200 and end up with a floating point result. Would I be better off multiplying by 0.005 instead?

 

I have enabled the hardware multiply. What is the trade off between 16 and 32 bit mult?

 

This next one isn't really MSP430 or C related, but I still need help :) Math has never been my strong point...

 

If n =  5.257, is x ^ (1/n) the same as x ^ (0.1902225)? Or will I need to use logarithmic math? <---This bit rings a bell from high school...

 

0.1902225 being the result of performing 1/n

 

Thanks again in advance!


(please excuse crummy typing. Cat on lap takes one hand as he has paws wrapped around it)

 

Last thing first: it is the same. If n is a constant, it really doesn't matter; the compiler will precompute what it can. In a case like this I use #define for the constant, but a const variable should generally work the same: the compiler precomputes where it can, and only creates a variable if it needs to, such as when you make a pointer to it. Note that in C the ^ is exclusive-or, not exponentiation; to raise to a power you need to call a (slow) library function such as pow() from math.h. The underlying function uses log and antilog. If you have a fixed power, you can speed it up by expanding it as either a Taylor series or a continued fraction, since that is how the log and the antilog are computed. This would go faster as a Taylor series. If time isn't a major issue (from your previous posts, it looks like it isn't at 25MHz), use the lib function from math.h.
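
For reference, the identity behind the "same thing" answer and the log/antilog remark can be written out as below. This is the standard textbook relation for x > 0, not anything specific to the MSP430 library; the rounded value of 1/n is just the one from the question.

```latex
x^{1/n} = \exp\!\left(\frac{\ln x}{n}\right), \qquad x > 0
\\[4pt]
n = 5.257 \;\Rightarrow\; \frac{1}{n} \approx 0.19022, \qquad x^{1/5.257} = x^{0.19022\ldots}
```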

 

Trade-off: speed vs. range. 32-bit takes roughly 4 times as long, but has much more range (signed 16-bit tops out at about 32,000, signed 32-bit at about 2,000,000,000). For floating point, you have 32 bits no matter what (for float; in general, you don't use double unless you must on an embedded device).
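
A rough illustration of the range side of that trade-off: the sketch below multiplies two 16-bit values and keeps either the low 16 bits or the full 32-bit product. The function names are made up for this example, and fixed-width types are used so it should behave the same on a PC as on the MSP430.

```c
#include <stdint.h>

/* Hypothetical helper: keeps only the low 16 bits of the product,
   which is all a 16-bit result variable can hold. */
uint16_t mul_keep_16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * b);
}

/* Hypothetical helper: widens before multiplying, so the whole
   32-bit product survives. */
uint32_t mul_keep_32(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```

With a = b = 300 the true product is 90000, which does not fit in 16 bits: mul_keep_16 returns 24464 (90000 - 65536), while mul_keep_32 returns 90000.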

 

The first questions: what are the types? Both signed and unsigned long are integer types. The div by 2 will be done as a shift if the types are all long and unsigned long. Modern compilers are smart.

 

If one of them is a float, the arithmetic will be done as float when needed, and from then on. What can be done as integer will be done so. Div by two will be optimized by most compilers as a decrement of the binary exponent, so no worry about floating point divide there, either. Modern compilers are real smart. Don't break it up. The compiler will make it better than you can, unless there is something you haven't said. If you need the result to be float, and ALL of the vars are integer, you MUST use a cast to force conversion to floating point where you want the conversion done. Use parentheses to control exactly when the conversion happens, so it isn't done early. If P is float, and all else is integer, the conversion will be done when the truncated result is stored.
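
To make the point about cast placement concrete, here is a small sketch with two made-up helper functions. The only difference between them is where the conversion to float happens.

```c
#include <stdint.h>

/* Integer divide happens first; the fraction is already gone
   by the time the value is converted to float. */
float divide_then_convert(int32_t a, int32_t b)
{
    return (float)(a / b);
}

/* The cast converts a to float first, so the divide itself
   is done in floating point. */
float convert_then_divide(int32_t a, int32_t b)
{
    return (float)a / b;
}
```

For a = 7 and b = 2, divide_then_convert gives 3.0 and convert_then_divide gives 3.5.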


"P = (D1 * SENS / 2 - OFF) / 2" could become... I think...

 

"P = (((D1 * SENS) >> 1) - OFF) >> 1"

... or ...

"P = ((D1 * (SENS >> 1)) - OFF) >> 1"

 

I don't think the result is any different but I'm not sure what the rules of precedence are when dealing with mult/div and bit shifts.

 

Bit shifting right 1 bit is equivalent to dividing by two for non-negative values, as long as you don't need the remainder.


@abecedarian: the first one is good. The second can lose precision, as the LSB is thrown away before the multiply. It makes a difference if SENS is odd. The compiler should do it the first way.
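
A quick sketch of the difference, using made-up function names and small non-negative test values (the real D1 and SENS would be much larger, and D1 is unsigned in the original question):

```c
#include <stdint.h>

/* Multiply first, then halve: the low bit of sens still
   contributes to the product before the shift. */
int32_t shift_after_multiply(int32_t d1, int32_t sens)
{
    return (d1 * sens) >> 1;
}

/* Halve sens first: if sens is odd, its low bit is thrown
   away before the multiply ever sees it. */
int32_t shift_before_multiply(int32_t d1, int32_t sens)
{
    return d1 * (sens >> 1);
}
```

With d1 = 100 and sens = 5, the first form gives 250 and the second gives 200. With an even sens (say 4) both give 200.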


 

Is it faster to divide or multiply? This example is probably a bad one since div2 would be nice and easy, but let's assume I need to divide by 200 and end up with a floating point result. Would I be better off multiplying by 0.005 instead?

 

Since your MSP430 has a hardware multiplier, the multiply would be quicker. But modern divide functions don't take too long.

 

One interesting trick: since a lot of maths (especially with ADCs) involves multiplying by a fraction, you can often factor out the divide when you use a HW multiply.

X = Y * 2/3;  // original
X = Y * (2*(65536/3))/(3* (65536/3)); // multiply top/bottom to get 65536 on bottom
X = Y * (43691/65536); // the result of this adjustment.

(Y * 43691); // use the HW multi for this
X = RESHI; // take the high word

The trick is to get the result stored entirely within the higher word of the result.

 

To my knowledge, compilers won't do this.
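
A portable sketch of the same trick, for trying it out on a PC. The function name is made up; on the MSP430 the high word would be read from RESHI as in the lines above, whereas here it is taken with a 16-bit right shift of the 32-bit product.

```c
#include <stdint.h>

/* Hypothetical helper: computes y * 2/3 without a divide.
   43691 is 2/3 * 65536 rounded up, so the product is (almost)
   y * 2/3 * 65536 and the answer sits in the high 16 bits. */
uint16_t scale_two_thirds(uint16_t y)
{
    uint32_t product = (uint32_t)y * 43691u;  /* 16x16 -> 32 bit multiply */
    return (uint16_t)(product >> 16);         /* keep the high word */
}
```

For example, scale_two_thirds(300) gives 200 and scale_two_thirds(3) gives 2. The result is truncated, not rounded, so scale_two_thirds(2) gives 1.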


Thanks guys, very helpful.

 

I forgot to ask, is the dynamic range of single precision FP enough to hold a result which might range from -1500.00 up to 30000.00?

I'm having a bit of trouble understanding the pages I've been reading to learn about FP.

 

For most of my results they only need 1 position before the decimal point, and max 8 after the DP, but I have a few with larger ranges. Fortunately, I need fewer DP with those.
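
One rough way to check this is to look at the gap between neighbouring float values at the magnitudes involved. The sketch below assumes IEEE 754 single precision (24-bit significand) and a finite, normal input; the function name is made up for the example.

```c
#include <float.h>

/* Hypothetical helper: approximate gap between neighbouring float
   values near x, assuming IEEE 754 single precision and a finite,
   normal (not extremely tiny) x. */
float float_spacing_near(float x)
{
    float mag = (x < 0.0f) ? -x : x;
    float step = FLT_EPSILON;   /* gap between floats in [1, 2) */

    if (mag == 0.0f) {
        return 0.0f;            /* not meaningful at zero; bail out */
    }
    /* Each doubling of the magnitude doubles the gap. */
    while (mag >= 2.0f) { mag /= 2.0f; step *= 2.0f; }
    while (mag < 1.0f)  { mag *= 2.0f; step /= 2.0f; }
    return step;
}
```

Near 30000 the gap is about 0.002, so two decimal places fit comfortably, and -1500.00 to 30000.00 is far inside the range of a float. Near 1 the gap is about 0.00000012, so a float carries roughly 7 significant digits in total; 8 digits after the decimal point will not all survive.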


Since your MSP430 has a hardware multiplier, the multiply would be quicker. But modern divide functions don't take too long.

 

One interesting trick: since a lot of maths (especially with ADCs) involves multiplying by a fraction, you can often factor out the divide when you use a HW multiply.

X = Y * 2/3;  // original
X = Y * (2*(65536/3))/(3* (65536/3)); // multiply top/bottom to get 65536 on bottom
X = Y * (43691/65536); // the result of this adjustment.

(Y * 43691); // use the HW multi for this
X = RESHI; // take the high word

The trick is to get the result stored entirely within the higher word of the result.

 

To my knowledge, compilers won't do this.

 

Interesting. Would this work with :

 

Result = ADC * 0.049?

 

and also

 

P0 = 9085466 (max 16777216)

ADC = something similar to P0 (max 16777216)

 

Result = P0 / ADC

 

This is the first part of the pressure to altitude calculation with data from a 24bit barometer :)

Well I think it is, trying to get my head around the hypsometric equations, as I want it to take into account lapse rate for altitudes over 11km and most equations stop there.



For the first yes.

 

For the second, no, unfortunately.

It only works when your denominator is fixed, because the scaled constant has to be worked out ahead of time.
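
For the first case, a sketch of what it might look like with a 24-bit reading. Everything here is an assumption for illustration: the function name, the choice of a 2^24 scale instead of 65536 (to keep more precision with 24-bit inputs), and the use of a 64-bit intermediate in place of reading the multiplier's result registers directly.

```c
#include <stdint.h>

/* 0.049 * 2^24 = 822083.584, rounded to the nearest integer. */
#define K_0P049_Q24 822084u

/* Hypothetical helper: returns adc * 0.049 rounded to the nearest
   integer, for adc values up to 24 bits wide. */
uint32_t scale_by_0p049(uint32_t adc)
{
    uint64_t product = (uint64_t)adc * K_0P049_Q24;  /* needs more than 32 bits */
    product += (uint64_t)1 << 23;                    /* add half an LSB to round */
    return (uint32_t)(product >> 24);                /* undo the 2^24 scaling */
}
```

For example, scale_by_0p049(1000) gives 49. P0 / ADC has no equivalent, since the reciprocal of ADC is not known until run time.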


Hi, 

 

Thanks for the info, I have read that in detail over the last few months and will tackle Kalman filtering once I have this thing working without it first. Seeing as this is my first time using an MSP device I want to keep it as simple as possible to start with :)


Since your MSP430 has a hardware multiplier, the multiply would be quicker. But modern divide functions don't take too long.

>snip<

But floats on MSP without FP units add overhead, and thus memory usage?


If float is used, moderate memory overhead. Most (all?) operations are implemented as function calls, so the functions used must be included. The difference between one add and 20 adds is minimal, tho. Once the function is in the build, calling it isn't a lot of space.


Big thing is time. Software implementations of FP can be slow. A device with hardware mult (integer) can do many FP operations a lot faster than those without hardware mult. Hardware div (integer) makes things better yet. A few operations are not going to be a big issue, timewise. The functions that use a lot of operations are the killer, like exponentiation and logs. These can be worth optimizing in many cases. If the previous thread hadn't given pretty loose timing for the altitude comp, I would call this a prime candidate for a specialty function for the exponentiation. Might still need it, but my guess is not.

 

Greeg's methods apply to the integer math (or fixed point), and can be used to avoid FP in cases where the final result needed is integer (or, again, fixed point), but intermediate comps may need FP or fractions.

 

 

A question I still have is: must the altitude be computed in-flight? Or can the sensor data be stored and converted on the ground? Is the altitude needed? Or only some property, such as detecting when max altitude is reached? The answers can make a big difference in what math needs to be done on the MSP430, and in how to do it.


Since your MSP430 has a hardware multiplier, the multiply would be quicker. But modern divide functions don't take too long.

I'm replying only to this statement, but in the spirit of the topic.

 

MSP430 has an average ALU by MCU-standards.  Shifting is fast.  Addition is pretty fast too.  Multiplication, however, occurs through a peripheral and it is quite slow.  It is faster than using a software multiplier, but it is still pretty slow because it does not work through registers.

 

Anyway, often it is much faster to add than to multiply.  In particular, one algorithm I wrote once had enormous, enormous, ENORMOUS improvement by taking logarithm (I made a table), then doing additions, then doing antilog (again, a table).

 

OK, that was just a vignette, but maybe it applies to you, too.


Hi Enl,

 

I have some algorithms which check if altitude is over a certain value, which is configured by the user.

I also calculate velocity based on change in altitude and again, this velocity can be checked against a user setting. 

 

If altitude had a predictable relationship to pressure (i.e. ADC value) I could convert those user altitude/velocity values into ADC/deltaADC before the flight (or even by the configuration program on a PC) and save a lot of headache.

 

Unfortunately, the altitude/pressure relationship changes depending on the atmospheric conditions at the time of launch, and also needs to be adjusted for differing ground altitude.

 

Maybe I could have the firmware determine these thresholds in realtime while sitting on the pad waiting for launch. It would only need to be done at a VERY low rate, and the threshold would be fixed once launch is detected and I need the headroom for the important stuff...not like the weather will change much in the very short time these rockets take to reach apogee :D
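
A minimal sketch of that idea, assuming the raw reading falls as altitude rises. All the type and function names are hypothetical, and the conversion function is a deliberately crude linear placeholder, not a real pressure/altitude model; the real one would do the slow floating-point math, but only on the pad.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical signature: turns a user altitude setting into the raw
   reading expected at that altitude, given the reading at ground level. */
typedef uint32_t (*altitude_to_adc_fn)(float altitude_m, uint32_t ground_adc);

typedef struct {
    uint32_t trigger_adc;   /* raw reading that corresponds to the user altitude */
    bool frozen;            /* set once launch is detected */
} flight_thresholds;

/* PLACEHOLDER ONLY: a made-up linear relation so the sketch is runnable.
   Replace with the real (slow) pressure/altitude conversion. */
uint32_t placeholder_convert(float altitude_m, uint32_t ground_adc)
{
    return ground_adc - (uint32_t)(altitude_m * 100.0f);
}

/* Called at a low rate while waiting on the pad. */
void thresholds_update_on_pad(flight_thresholds *t, float user_altitude_m,
                              uint32_t ground_adc, altitude_to_adc_fn convert)
{
    if (!t->frozen) {
        t->trigger_adc = convert(user_altitude_m, ground_adc);
    }
}

/* Called once when launch is detected; later updates are ignored. */
void thresholds_freeze(flight_thresholds *t)
{
    t->frozen = true;
}

/* In-flight check: a single integer compare, no floating point. */
bool above_user_altitude(const flight_thresholds *t, uint32_t adc)
{
    return adc <= t->trigger_adc;   /* lower reading means higher altitude */
}
```

The point of the split is that the expensive conversion runs only before launch, and the in-flight path is reduced to comparing raw readings.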

 

This would also work for acceleration/velocity from the accelerometer. However, I also have a magnetometer on board which initially I will be using to determine tilt (for safety), but ultimately I would like to integrate the magnetometer data with the accelerometer to get the vertical component of velocity for non-vertical flights...

 

Any thoughts are appreciated. As usual my posts tend to wander, but I guess this is still on subject, as in the end it is optimising/reducing math :)
