Floating Point
-
Just pushed the float changes.
To free up some bits in the header for the new fixed point types (and to simplify things), I'm considering reducing the commandTypes to just 3 values (SET, REQ, INTERNAL). The rest (PRESENTATION, STREAM) would be moved to INTERNAL messages.
I could make the serial interface unaffected by this change, but I'd rather remove it there as well.
@Yveaux: Regarding removing the unsigned variants (e.g. ULONG): it is actually good to keep these, as there are some sensors reporting large numbers, like meter ticks, which can be huge.
-
I 'ported' libfixmath to the Arduino (see https://github.com/Yveaux/Arduino_fixpt)
Ran into a lot of internal compiler error issues and had to convert the unittests to C++ variants (no need to test the C implementation only), which revealed some issues in the C++ wrapper.
Anyway, stuff is running now and I have some preliminary benchmark results.
I created a benchmark sketch which runs a number of multiplications/divisions/additions/subtractions/sqrt.
Code size results:
Variant                 Size
Bare (no float/fix16)   450
double                  2514
fix16                   2956
So code size slightly increases with fixed point calculations.
Calculation performance results:
with overflow detection and rounding:
Op     double    fixpt     speed improvement fixpt over double
Mult   870448    1708304   50.95%
Div    2303348   2986484   77.13%
Add    858280    296760    289.22%
Sub    858108    296644    289.27%
Sqrt   13164     15444     85.24%
without overflow detection and rounding (FIXMATH_NO_OVERFLOW & FIXMATH_NO_ROUNDING defined):
Op     double    fixpt     speed improvement fixpt over double
Mult   870452    1699392   51.22%
Div    2303348   4794636   48.04%
Add    858280    82568     1039.48%
Sub    858108    72776     1179.11%
Sqrt   13168     15528     84.80%
So additions & subtractions are significantly faster using Fix16 (we can probably live without overflow detection & rounding) and mul/div/sqrt are slower...
First conclusion: we don't gain in flash code space and don't gain (on average) in code execution speed. Maybe very specific applications can benefit from Fix16 implementation on AVR, but I seriously doubt if it's worth all the effort....
Seems like the AVR floating point library is very efficient, both in code size and execution speed.
Please review my code as I might be missing something...
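One possible reading of the numbers above, offered as an assumption and not as something measured in this thread: a 16.16 addition is a single 32-bit integer add, while a 16.16 multiplication needs a 64-bit intermediate product, which is costly on an 8-bit core. The sketch below illustrates that difference. It is not libfixmath's actual code; the names q16_16, Q16_ONE, q16_from_int, q16_add and q16_mul are hypothetical, and overflow detection and rounding are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical Q16.16 type: a signed 32-bit integer whose low 16 bits
// hold the fractional part.
using q16_16 = int32_t;

constexpr q16_16 Q16_ONE = static_cast<q16_16>(1) << 16;  // 1.0 in Q16.16

// Convert a whole number to Q16.16. The integer part has only 16 bits.
inline q16_16 q16_from_int(int v) {
    assert(v >= -32768 && v <= 32767);
    return static_cast<q16_16>(v) * Q16_ONE;
}

// Addition is a plain 32-bit integer add.
// No overflow check: the caller must keep the sum in range.
inline q16_16 q16_add(q16_16 a, q16_16 b) {
    return a + b;
}

// Multiplication needs the full 64-bit product before scaling back by 2^16.
// No rounding: the shift discards the low bits. Right-shifting a negative
// value is arithmetic on common compilers (and guaranteed from C++20 on).
inline q16_16 q16_mul(q16_16 a, q16_16 b) {
    const int64_t wide = static_cast<int64_t>(a) * b;
    return static_cast<q16_16>(wide >> 16);
}
```

For example, q16_mul(q16_from_int(3), Q16_ONE / 2) yields the Q16.16 encoding of 1.5.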
-
@hek I implemented a basic version of the 8.8 fixed point version of the library.
It doesn't pass the unittests completely yet, but here are the first results (sqrt not implemented yet):
Code size results:
Variant                Size
Bare (no float/fix8)   450
double                 1214
fix16                  2062
So, again code size slightly increases with fixed point calculations.
with overflow detection and rounding:
Op     double    fixpt    speed improvement fixpt over double
Mult   840016    291008   288.66%
Div    2240552   812044   275.92%
Add    792564    133344   594.38%
Sub    801984    136676   586.78%
without overflow detection and rounding (FIXMATH_NO_OVERFLOW & FIXMATH_NO_ROUNDING defined):
Op     double    fixpt    speed improvement fixpt over double
Mult   840012    312440   268.86%
Div    2240548   790992   283.26%
Add    792564    56204    1410.16%
Sub    801972    52960    1514.30%
This is a very nice speed increase in all cases, especially when ignoring overflow detection and rounding.
Conclusion: Using 8.8 fixed point can definitely bring the calculation time, and therefore power consumption, down! The use case for 8.8 values is limited, however, but for e.g. temperature or humidity sensors with limited range & accuracy it seems usable.
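To make the "limited range & accuracy" point concrete, here is a minimal sketch of storing a sensor reading given in tenths (for example 297 for 29.7) in a signed 8.8 value. It assumes a 16-bit signed container with 8 fractional bits, which covers roughly -128 to +127.996 in steps of 1/256. The names q8_8, q8_from_tenths and q8_to_tenths are hypothetical and are not taken from the library discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signed 8.8 type: 8 integer bits, 8 fractional bits.
using q8_8 = int16_t;

// Convert tenths of a unit (297 means 29.7) to 8.8, rounding to nearest.
inline q8_8 q8_from_tenths(int16_t tenths) {
    assert(tenths >= -1280 && tenths <= 1279);  // must fit the 8.8 range
    int32_t scaled = static_cast<int32_t>(tenths) * 256;
    scaled += (scaled >= 0) ? 5 : -5;           // round half away from zero
    return static_cast<q8_8>(scaled / 10);
}

// Convert an 8.8 value back to tenths of a unit, rounding to nearest.
inline int16_t q8_to_tenths(q8_8 v) {
    int32_t scaled = static_cast<int32_t>(v) * 10;
    scaled += (scaled >= 0) ? 128 : -128;       // round half away from zero
    return static_cast<int16_t>(scaled / 256);
}
```

With these definitions, 29.7 is stored as 7603, which represents 7603/256 = 29.69921875: close, but not exact. Rounding back to tenths recovers 297.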
Using 16.16 fixed point values has no clear advantage over using floating point values.
-
@hek Yep.
So, what shall I do? Invest some more time and get the 8.8 in (and modify some MySensors examples/libraries), or shall I park it for possible future usage?
-
Maybe we could park this for now (unless you have a strong urge to get it into 1.4). We can all see the benefits and can keep it in the pipeline for future versions.
-
Too bad the 16p16 is so slow and large, cool that 8p8 is better.
I suspect that either could be far faster if done in assembly, as floating point must be.
@Yveaux - can you share the test program you used?
Let's not throw the baby out with the bathwater. There are two problem domains here.
- Calculations using non-integer values as operands and results (plus/minus/divide/multiply/sqrt).
- Representing non-integer values over a network, and converting between this representation and the textual display used by humans (and whatever format is used by sensors).
Fixed point can be great for the first domain, although in this case the C/C++ implementation has some tradeoffs in size and speed compared to the compiler supported floating point.
But fixed point math is not as well targeted at the second domain. For a sensor value like 29.7 degrees, fixed point has an inexact and sometimes cumbersome representation. To see what I mean, follow the whole chain of converting a DHT-22 value into fixed point for OTA transport, and then into a comma-separated string representation for consumption by a home automation interface (and display to a user). The time and library size will be dominated not by basic arithmetic operations among fixed point values, but by the conversions into and out of that format. The DHT-22 value can be relatively efficiently converted to a 2-byte integer 297 and a 1-byte scaling factor 1 (one decimal place); and those can be converted to the string "297" and then "29.7" without a large library.
This is NOT to disparage fixed point math where it works well.
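The integer-plus-scaling-factor idea above can be sketched as follows: the value 297 with scaling factor 1 becomes "29.7" using only integer-to-text conversion and one character insertion. This is an illustration written in desktop C++ with std::string; the function name format_scaled is hypothetical, and on an AVR one would more likely write into a fixed char buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Format value * 10^(-decimals) as decimal text without floating point.
// Example: value = 297, decimals = 1 gives "29.7".
inline std::string format_scaled(int16_t value, uint8_t decimals) {
    const bool negative = value < 0;
    // Widen before negating so that -32768 is handled correctly.
    const int32_t magnitude =
        negative ? -static_cast<int32_t>(value) : static_cast<int32_t>(value);
    std::string digits = std::to_string(magnitude);

    const std::size_t d = decimals;
    if (d > 0) {
        // Left-pad with zeros so at least one digit precedes the point.
        if (digits.size() <= d) {
            digits.insert(0, d + 1 - digits.size(), '0');
        }
        digits.insert(digits.size() - d, 1, '.');
    }
    return negative ? "-" + digits : digits;
}
```

The same routine handles small magnitudes, so 5 with two decimals is formatted as "0.05".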
-