Floating Point
-
Just pushed the float changes.
To free up some bits in the header for the new fixed point types (and simplify things), I'm considering reducing the commandTypes to just 3 values (SET, REQ, INTERNAL); the rest (PRESENTATION, STREAM) would be moved to be INTERNAL messages.
I could leave the serial interface unaffected by this change, but I'd rather remove them there as well.
@Yveaux Regarding removing the unsigned variants (e.g. ULONG): it is actually good to keep them, as there are some sensors reporting large numbers, like meter ticks, which can be huge.
-
@hek Maybe not for 1.4, but you should consider removing a lot of the data from the header and leave only the routing info & message type in.
Depending on message type you then get a 'nested' header which tells you about the message-type specifics.
This also will help in the struggle to store data format types, for which you've now reserved 3 bits. They are always sent, also when there's no SET/GET data present in the message. Then you simply reserve e.g. a byte which will go a long way...
-
@Yveaux said:
This also will help in the struggle to store data format types, for which you've now reserved 3 bits. They are always sent, also when there's no SET/GET data present in the message. Then you simply reserve e.g. a byte which will go a long way...
Darn... you are so right..
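A rough sketch of how such a 'nested' header could look, in case it helps the discussion. All struct and field names and widths below are assumptions made up for illustration; none of them are taken from the MySensors sources. The outer header carries only routing info and the message type, and only SET/REQ messages pay for the extra data header, which can then afford a full byte for the payload type.

```cpp
#include <stdint.h>

// Hypothetical layout, for illustration only: names and field widths are
// assumptions, not the actual MySensors header.
enum CommandType { C_SET = 0, C_REQ = 1, C_INTERNAL = 2 };

// Outer header: routing info and message type only.
struct RoutingHeader {
  uint8_t sender;
  uint8_t destination;
  uint8_t last;
  uint8_t command;  // holds a CommandType value
};

// Nested header, present only in SET/REQ messages.
struct DataHeader {
  uint8_t sensor;
  uint8_t valueType;
  uint8_t payloadType;  // a full byte instead of 3 shared bits
};

// Number of header bytes that precede the payload for a given command.
inline uint8_t headerSize(uint8_t command) {
  uint8_t size = sizeof(RoutingHeader);
  if (command == C_SET || command == C_REQ) {
    size += sizeof(DataHeader);
  }
  return size;
}
```

With this layout an INTERNAL message carries 4 header bytes and a SET message 7, so the payload type byte is only sent when there is data to describe.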
-
@Zeph said:
(the decimals parameter is needed for this like for floats, as described earlier)
I don't see why the decimals parameter is needed. Currently it is used for the amount of decimals converted to textual presentation. This is not required for fixed point presentation (unless you want a scaling factor).
IMHO scaling just complicates things too much -- you also need to exchange the scaling factor with the gateway.
This implies creating new C++ types, in this example "fix8p8", which is basically a int16_t with an implicit radix point in the middle.
Adding is simple. Multiply of fix8p8 is easy because you can use a long as temp before renormalizing, but multiply of fix16p16 gets trickier, of course.
My idea is to just wrap the new types in a class library, which allows for easy conversion and maths with these new fixedpt types.
Getting the library right and educating users is going to be some work, tho. I use fixed point math fairly often, but it definitely has some gotchas that we are biting off.
The library should shield regular users from the internals and pitfalls of fixed point. Most sketches just get a value from a sensor library and pass it on to MySensors, without modifying the value.
As part of this exercise we also have to modify these libraries which return their values in float format, as it doesn't make sense to keep floats in part...
-
@Yveaux said:
@Zeph said:
(the decimals parameter is needed for this like for floats, as described earlier)
I don't see why the decimals parameter is needed. Currently it is used for the amount of decimals converted to textual presentation.
I was thinking of when packets are converted to the comma separated textual representation for the API.
Convert 25.7 degrees to fixed point at the node, then at the gateway convert the fixed point value to a text string, and you'll see what I mean. It's not for the scaling option.
IMHO scaling just complicates things too much -- you also need to exchange the scaling factor with the gateway.
There are tradeoffs either way. In the current architecture, to support scaling you'd need to at least tell the gateway the scaling factor as part of the one-time presentation configuration. (Alternately, the gateway could retrieve the scaling factor along with type and name from local configuration, rather than receiving all of those OTA, but that's another discussion)
Beyond that there's no need for new libraries, and it's easy to explain.
However, I understand that you are excited by the fixed point functionality (which could be useful for more than just OTA encoding). I don't want to discourage that exploration. I look forward to some examples of encoding at the node end, and decoding at the gateway end, for some sensors like the DHT-22 or 18B20.
This implies creating new C++ types, in this example "fix8p8", which is basically a int16_t with an implicit radix point in the middle.
Adding is simple. Multiply of fix8p8 is easy because you can use a long as temp before renormalizing, but multiply of fix16p16 gets trickier, of course.
My idea is to just wrap the new types in a class library, which allows for easy conversion and maths with these new fixedpt types.
Yes, that was what I was guessing. Go for it!
As part of this exercise we also have to modify these libraries which return their values in float format, as it doesn't make sense to keep floats in part...
Agreed. It would be nice if we rarely needed to even link the floating point library in nodes. (And it would make an ATtiny based node more feasible someday).
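To make the "fix8p8" idea quoted above concrete, here is a minimal sketch, assuming a plain typedef and free functions (the function names are made up for illustration and are not from any existing library). It shows addition as plain integer addition and multiplication through a 32-bit temporary that is renormalized afterwards. Overflow detection is left out.

```cpp
#include <stdint.h>

// Hypothetical 8.8 fixed-point type: an int16_t whose low 8 bits hold the fraction.
typedef int16_t fix8p8;

// Convert from float, rounding to the nearest representable step (1/256).
inline fix8p8 fix8p8_from_float(float v) {
  return (fix8p8)(v * 256.0f + (v >= 0 ? 0.5f : -0.5f));
}

// Convert back to float; exact, since every 8.8 value fits in a float.
inline float fix8p8_to_float(fix8p8 v) { return v / 256.0f; }

// Addition is plain integer addition (no overflow check in this sketch).
inline fix8p8 fix8p8_add(fix8p8 a, fix8p8 b) { return (fix8p8)(a + b); }

// Multiplication: widen to 32 bits, multiply, then divide by 256 to renormalize.
inline fix8p8 fix8p8_mul(fix8p8 a, fix8p8 b) {
  int32_t wide = (int32_t)a * (int32_t)b;
  return (fix8p8)(wide / 256);
}
```

For the 25.7 degrees example: fix8p8_from_float(25.7f) gives the raw value 6579, which decodes to 25.69921875. Printing that as "25.7" at the gateway requires knowing how many decimals to round to, which is the point about the decimals parameter.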
-
@Zeph said:
The payload types would be enhanced.
typedef enum { P_STRING, P_BYTE, P_INT16, P_UINT16, P_LONG32, P_ULONG32, P_CUSTOM } payload;
Darn, just realized we only got 3 bits to describe payload type. We need another one to fit the new ones.
MyMessage& set(double value, uint8_t decimals);
Shouldn't this be set(float, uint8_t)? Wouldn't it be confusing to have a double argument when only sending a 32-bit float?
-
@hek said:
MyMessage& set(double value, uint8_t decimals);
Shouldn't this be set(float, uint8_t)? Wouldn't it be confusing to have a double argument when only sending a 32-bit float?
I was quoting an excerpt of the current system. I would tend to agree with changing that to float, just for clarity of intent, even though they are the same in GCC for the AVR.
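A small sketch of the bit budget mentioned above, assuming the payload enum as quoted plus a hypothetical P_COUNT entry and constant names that are not in the real code. Three bits give 8 possible values, and the quoted list already uses 7 of them.

```cpp
#include <stdint.h>

// Payload types as quoted above; P_COUNT is a hypothetical helper entry
// that simply counts the entries before it.
typedef enum {
  P_STRING, P_BYTE, P_INT16, P_UINT16, P_LONG32, P_ULONG32, P_CUSTOM,
  P_COUNT
} payload;

// Assumed names for the header budget: 3 bits for the payload type.
const uint8_t PAYLOAD_BITS = 3;
const uint8_t PAYLOAD_MAX_TYPES = 1 << PAYLOAD_BITS;
const uint8_t PAYLOAD_FREE_SLOTS = PAYLOAD_MAX_TYPES - P_COUNT;

// Compile-time guard: fails as soon as the enum outgrows the bit field.
static_assert(P_COUNT <= PAYLOAD_MAX_TYPES, "payload type does not fit in 3 bits");
```

Only one slot is left in this sketch, so adding more than one new type would need a fourth bit or a different header layout.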
-
@Zeph said:
to support scaling you'd need to at least tell the gateway the scaling factor as part of the one-time presentation configuration
This method (and the same holds for the decimals parameter) seems attractive, but has a few drawbacks:
- The presentation message currently has no 'guaranteed' delivery; we would need to change that as without this info the gateway cannot interpret the incoming data
- It also has to be sent to the sensor (actuator actually) nodes from the gateway when data goes the other way. No 'presentation' mechanism from gateway to sensor currently exists.
- It places an administration burden on the gateway, and possibly on actuators
Beyond that there's no need for new libraries, and it's easy to explain.
Possibly, but when you start mixing up values with different scaling factors or want to do (simple) maths on them the story changes completely...
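As an illustration of that last point, here is a minimal sketch of adding two scaled-integer values that carry different scaling factors. The Scaled struct and both function names are hypothetical, made up for this example; the real value is taken to be raw / 10^decimals, and overflow is not checked.

```cpp
#include <stdint.h>

// Hypothetical scaled-integer value: real value = raw / 10^decimals.
struct Scaled {
  int32_t raw;
  uint8_t decimals;
};

// Bring a value up to a larger number of decimals (no overflow check).
inline Scaled rescale(Scaled v, uint8_t decimals) {
  while (v.decimals < decimals) {
    v.raw *= 10;
    v.decimals++;
  }
  return v;
}

// Add two values; both must first be brought to the same scaling factor.
inline Scaled addScaled(Scaled a, Scaled b) {
  uint8_t d = a.decimals > b.decimals ? a.decimals : b.decimals;
  a = rescale(a, d);
  b = rescale(b, d);
  Scaled sum = { a.raw + b.raw, d };
  return sum;
}
```

Adding 29.7 (raw 297, one decimal) to 25.12 (raw 2512, two decimals) first turns 297 into 2970, then yields raw 5482 with two decimals. Every operation has to carry this bookkeeping around, which a fixed radix point avoids.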
-
I've been googling around for existing fixed point C++ libraries which fit the ATMega and intend to try out the following: https://code.google.com/p/libfixmath
It has regular updates, a unit test suite, impressive performance advantages (especially addition/subtraction, see https://code.google.com/p/libfixmath/wiki/Benchmarks), has been tested on ATMega and uses an MIT license.
It only supports 16.16, but other derivatives like 8.8 seem doable.
Rolling my own from scratch is too much work for me, as the implementation is tricky at some points (unit tests are a requirement IMHO).
Does anyone have a better suggestion?
-
I 'ported' libfixmath to the Arduino (see https://github.com/Yveaux/Arduino_fixpt).
I ran into a lot of internal compiler error issues and had to convert the unit tests to C++ variants (no need to test the C implementation only), which revealed some issues in the C++ wrapper.
Anyway, stuff is running now and I have some preliminary benchmark results.
I created a benchmark sketch which runs a number of multiplications/divisions/additions/subtractions/sqrt.
Code size results:
Bare (no float/fix16)    450
double                  2514
fix16                   2956
So code size slightly increases with fixed point calculations.
Calculation performance results:
with overflow detection and rounding:
Op     double     fixpt      speed improvement fixpt over double (double/fixpt)
Mult    870448    1708304     50.95%
Div    2303348    2986484     77.13%
Add     858280     296760    289.22%
Sub     858108     296644    289.27%
Sqrt     13164      15444     85.24%
without overflow detection and rounding (FIXMATH_NO_OVERFLOW & FIXMATH_NO_ROUNDING defined):
Op     double     fixpt      speed improvement fixpt over double (double/fixpt)
Mult    870452    1699392     51.22%
Div    2303348    4794636     48.04%
Add     858280      82568   1039.48%
Sub     858108      72776   1179.11%
Sqrt     13168      15528     84.80%
So additions & subtractions are significantly faster using Fix16 (we can probably live without overflow detection & rounding) and mul/div/sqrt are slower...
First conclusion: we don't gain in flash code space and don't gain (on average) in code execution speed. Maybe very specific applications can benefit from a Fix16 implementation on AVR, but I seriously doubt it's worth all the effort...
Seems like the AVR floating point library is very efficient, both in code size and execution speed.
Please review my code as I might be missing something...
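For anyone checking the numbers: the percentage column works out to the double timing divided by the fixed point timing, so values below 100% mean fixed point was slower. A minimal sketch of that arithmetic (the function name is made up for illustration):

```cpp
// Relative speed of the fixed-point run versus the double run, in percent.
// Values above 100 mean fixed point was faster; below 100 means slower.
inline double relativeSpeedPercent(unsigned long doubleTime, unsigned long fixptTime) {
  return 100.0 * (double)doubleTime / (double)fixptTime;
}
```

For the Mult row with overflow detection, relativeSpeedPercent(870448, 1708304) comes out at about 50.95, matching the table.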
-
@hek I implemented a basic version of the 8.8 fixed point version of the library.
It doesn't pass the unit tests completely yet, but here are the first results (sqrt not implemented yet):
Code size results:
Bare (no float/fix8)     450
double                  1214
fix16                   2062
So, again, code size slightly increases with fixed point calculations.
with overflow detection and rounding:
Op     double     fixpt      speed improvement fixpt over double (double/fixpt)
Mult    840016     291008    288.66%
Div    2240552     812044    275.92%
Add     792564     133344    594.38%
Sub     801984     136676    586.78%
without overflow detection and rounding (FIXMATH_NO_OVERFLOW & FIXMATH_NO_ROUNDING defined):
Op     double     fixpt      speed improvement fixpt over double (double/fixpt)
Mult    840012     312440    268.86%
Div    2240548     790992    283.26%
Add     792564      56204   1410.16%
Sub     801972      52960   1514.30%
This is a very nice speed increase in all cases, especially when ignoring overflow detection and rounding.
Conclusion: Using 8.8 fixed point can definitely bring the calculation time, and therefore power consumption, down! The use case for 8.8 values is limited, however, but for e.g. temperature or humidity sensors with limited range & accuracy it seems usable.
Using 16.16 fixed point values has no clear advantage over using floating point values.
-
@hek Yep.
So, what shall I do? Invest some more time and get the 8.8 in (and modify some MySensors examples/libraries), or shall I park it for possible future usage?
-
Maybe we could park this for now (if you don't have a super urge to get it into 1.4). We can all see the benefits and have it in the pipe for future versions.
-
Too bad the 16p16 is so slow and large, cool that 8p8 is better.
I suspect that either could be far faster if done in assembly, as floating point must be.
@Yveaux - can you share the test program you used?
Let's not throw the baby out with the bathwater. There are two problem domains here:
- Calculations using non-integer values as operands and results (plus/minus/divide/multiply/sqrt).
- Representing non-integer values over a network, and converting between this representation and the textual display used by humans (and whatever format is used by sensors).
Fixed point can be great for the first domain, although in this case the C/C++ implementation has some tradeoffs in size and speed compared to the compiler supported floating point.
But fixed point math is not as well targeted at the second domain. For a sensor value like 29.7 degrees, fixed point has an inexact and sometimes cumbersome representation. To see what I mean, follow the whole chain of converting a DHT-22 value into fixed point for OTA transport, and then into a comma separated string representation for consumption by a home automation interface (and display to a user). The time and library size will be dominated not by basic arithmetic operations among fixed point values, but by the conversions into and out of that format. The DHT-22 value can be relatively efficiently converted to a 2 byte integer 297 and a one byte scaling factor 1; and those can be converted to the string "297" and then "29.7" without a large library.
This is NOT to disparage fixed point math where it works well.
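The scaled-integer path described above (297 with scaling factor 1 becoming "29.7") can be sketched with integer math only. This is a rough illustration, not existing MySensors code: the function name, the buffer contract and the clamp on decimals are all assumptions.

```cpp
#include <stdint.h>

// Format raw / 10^decimals as text using integer math only.
// Assumptions of this sketch: buf holds at least 10 bytes, and decimals is
// clamped to 5 because an int16_t never has more than 5 digits.
inline void scaledToString(int16_t raw, uint8_t decimals, char *buf) {
  if (decimals > 5) decimals = 5;
  int32_t value = raw;
  char *p = buf;
  if (value < 0) {
    *p++ = '-';
    value = -value;
  }
  // Collect digits least significant first, forcing at least decimals + 1
  // digits so that 5 with one decimal becomes "0.5".
  char digits[8];
  uint8_t n = 0;
  do {
    digits[n++] = (char)('0' + value % 10);
    value /= 10;
  } while (value > 0 || n <= decimals);
  // Emit most significant digit first; the point goes before the last
  // `decimals` digits.
  while (n > 0) {
    if (n == decimals) *p++ = '.';
    *p++ = digits[--n];
  }
  *p = '\0';
}
```

Calling scaledToString(297, 1, buf) leaves "29.7" in buf, with no float or fixed point conversion involved on either end.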
-