Build retry functionality into the mysensors library
-
Currently it's possible to request an echo from the controller, like so:
send(message.setSensor(CHILD_ID).set(some_value),1);
I know this has been discussed before, but how about building retry functionality into the library itself?
E.g.
send(message.setSensor(CHILD_ID).set(some_value),6);
Where the 6 indicates there will be six attempts, with exponentially increasing time between them. This could be standardised, for example:
1 -> ask echo now
2 -> retry in 10 milliseconds
3 -> retry in 100 milliseconds (and the above)
4 -> retry in 1 second (and the above)
5 -> retry in 10 seconds (and the above)
6 -> retry in 100 seconds (and the above)
7 -> retry in 1000 seconds (and the above)
etc.
Then the question becomes: how does the library know when the echo has been received? It would have to keep track of this.
Alternatively, how about something simpler like:
send(message.setSensor(CHILD_ID).set(some_value),&boolean);
Where the re-send will be attempted as long as the boolean is true. In the receive function the code then only has to set the boolean to false.
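A minimal sketch of the schedule proposed above, assuming each step simply multiplies the wait by ten. `retryDelayMs` and `totalWaitMs` are hypothetical helper names, not part of the MySensors API; the sketch only computes delays and does not send anything.

```cpp
#include <cassert>
#include <cstdint>

// Delay in milliseconds to wait before attempt number `attempt` (1-based).
// Attempt 1 is sent immediately, attempt 2 after 10 ms, attempt 3 after
// 100 ms, and so on (x10 per step), as in the list above.
// Hypothetical helper, not a MySensors function.
uint32_t retryDelayMs(uint8_t attempt) {
    assert(attempt >= 1 && attempt <= 10);  // 10^9 ms still fits in uint32_t
    if (attempt == 1) return 0;
    uint32_t delay = 10;
    for (uint8_t i = 2; i < attempt; ++i) delay *= 10;
    return delay;
}

// Total time spent waiting if all `attempts` attempts are needed.
uint32_t totalWaitMs(uint8_t attempts) {
    assert(attempts <= 10);
    uint32_t total = 0;
    for (uint8_t a = 1; a <= attempts; ++a) total += retryDelayMs(a);
    return total;
}
```

With six attempts the waits add up to 111110 ms, so a blocking implementation would hold the node for almost two minutes.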
-
@alowhum MySensors already retries multiple times (how many depends on the radio and on some configurable defines; for nrf24 the default is 15 times).
I am not sure how retrying even more times will help anything. Retrying more times could destroy user experience, since the node would seem to be frozen (not responding to input) until a message is eventually decidedly delivered.
A long and unfortunately not very fruitful discussion on the same topic: https://forum.mysensors.org/topic/3346/discussion-reliable-delivery/
-
I wonder about this. It's a neat idea by @alowhum but my experiences with 'ack' have been wildly frustrating. Even without 'ack' I sometimes get an instant response from a node to a command, other times it can take minutes (yes minutes) until the command is actually performed. More retries will cause more traffic on the airwaves and increase the possibility of message collisions, meaning even more retries are required. It could be a downward spiral from there.
-
@skywatch said in Build retry funtionality into the mysensors library:
it can take minutes (yes minutes) until the command is actually performed
All retries on send() calls should reach their maximum number of retries within seconds, so I don't see how this can be caused by MySensors. Very interested to hear about your setup.
@alowhum Would you expect the send() call to return and keep retrying in the background, or to block until the message is sent or the maximum number of retries is exceeded?
-
@yveaux
My setup is pi3B+ (boot from ssd so no SD card) with uno as GW.
GW RF is cdebyte NRF24L01+ pa/lna.
Been through many iterations of power supply, pi, nrfboards, gw (promini and uno) to try and get it stable, without success, but that's another story.
I think the issue is to do with the way mysensors/MyController handle the ack requests.
With a new release of MyController due next year (hopefully), I was waiting to see what happens then.
It is just bizarre that sometimes a light comes on as soon as a button is pressed and sometimes it can take up to 2 minutes. Any commands in between the first are cached somewhere and flood through when the node starts to respond.
-
@skywatch said in Build retry funtionality into the mysensors library:
It is just bizarre that sometimes a light comes on as soon as a button is pressed and sometimes it can take up to 2 minutes. Any commands in between the first are cached somewhere and flood through when the node starts to respond.
I agree, but as I said I don't see how the delay could be introduced by the MySensors library. It is probably an effect of the controller retrying to send messages for a few minutes. That doesn't solve your issue, but that also makes it hard to help...
-
@yveaux I concur with that. That's why I will wait and see how it goes in MyController V2.0
-
@alowhum this is (almost) exactly how I set up my MySensors network. The only difference is that I rely on "hardware" ack (the return value of the send function) and do not have any repeater nodes in my network - this prevents message floods on the network.
This approach works quite well.
@mfalkvidd
"I am not sure how retrying even more times will help anything. Retrying more times could destroy user experience, since the node would seem to be frozen (not responding to input) until a message is eventually decidedly delivered."
The retry mechanism should be implemented in a non-blocking way. I implemented it so that every sensor/actuator has a statically defined message, and the sending/retrying of this message is handled in the "background" in a non-blocking manner: one call to send in every main loop iteration, with the result of the send stored between loop iterations.
Not retrying until the message is sent is a very bad idea.
For example, you have a remote button that will toggle play/stop of your music player. Between the button press and the music starting to play there is some time, like 5 seconds. If the message from the button does not reach the gateway, the user will realise this after something like 10 seconds. This is not acceptable. The button should keep banging the same message over and over until it reaches the gateway - this is the only thing that this button does and it should do it 100% reliably (I'm not counting hardware failures here).
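A rough sketch of the non-blocking idea described above, written so it can run off-device. It is not the code from the toolkit: `RetrySlot`, `queueValue`, `serviceSlot` and the `SendFn` callback are hypothetical names, the callback stands in for a wrapper around MySensors' send() that returns the hardware ack, and a fixed retry interval is assumed for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the radio: returns true when the hardware ack
// arrived. On a real node this would wrap a call to send().
using SendFn = bool (*)(int16_t value);

// One statically allocated slot per sensor/actuator.
struct RetrySlot {
    int16_t value = 0;
    bool pending = false;        // true while the value is not yet delivered
    uint32_t lastAttemptMs = 0;  // time of the most recent send attempt
    uint16_t attempts = 0;       // attempts made for the current value
};

// Store a new value; it replaces any value still waiting for delivery
// (an assumption of this sketch).
void queueValue(RetrySlot &slot, int16_t value) {
    slot.value = value;
    slot.pending = true;
    slot.attempts = 0;
}

// Call once per loop() iteration. Makes at most one send attempt and
// returns immediately, so the rest of the sketch stays responsive.
void serviceSlot(RetrySlot &slot, uint32_t nowMs, uint32_t intervalMs, SendFn sendFn) {
    assert(sendFn != nullptr);
    if (!slot.pending) return;
    // Unsigned subtraction keeps working when a millis()-style clock wraps.
    if (slot.attempts > 0 && (nowMs - slot.lastAttemptMs) < intervalMs) return;
    slot.lastAttemptMs = nowMs;
    if (slot.attempts < UINT16_MAX) ++slot.attempts;
    if (sendFn(slot.value)) slot.pending = false;
}
```

On a node, `serviceSlot` would be called from loop() with millis() as `nowMs`; the slot keeps retrying until the send function reports success.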
-
@rozpruwacz that sounds like an interesting solution. Do you statically allocate the ram needed to store the transmitted values, or are you able to do it dynamically without causing fragmentation? How much extra ram is required? Is there support for larger messages, such as V_TEXT? How do you handle if the same actuator is changed multiple times before the earlier changes have been reliably delivered?
Would be awesome if we could work towards a pull request that could add this functionality to MySensors. Dropping support for repeaters would be a very big drawback for some uses though. Many users already complain about the ram requirements, so we'd have to consider our options regarding ram as well.
-
@yveaux Ideally the send function would try to re-send in the background. Otherwise the code would block the repeater functionality?
@rozpruwacz You're making me quite curious about your code
@mfalkvidd Perhaps the size of the message buffer could be defined somehow? To avoid fragmentation I've taken to just defining all my variables outside of the main loop as much as possible. Sure, it's not efficient, but if you have the room for it, it makes the node nice and stable by avoiding memory fragmentation.
-
@alowhum said in Build retry funtionality into the mysensors library:
Ideally the send function would try to re-send in the background.
I agree. The send function should be fire and forget, with e.g. a callback to indicate success/failure. The number of retries or delivery timeout should be configurable so the stack can handle the resending in the background.
Most users will probably not be interested in the result because what can you do if delivery eventually failed after many retries?
This however requires buffering messages, increasing RAM usage.
Otherwise the code would block the repeater functionality?
Not necessarily I guess. The stack is handling the send and could also handle repeater function while doing so.
-
@mfalkvidd said in Build retry funtionality into the mysensors library:
Do you statically allocate the ram needed to store the transmitted values, or are you able to do it dynamically without causing fragmentation?
For example You have a node with two sensors, then there are two MyMessage objects defined statically.
How much extra ram is required?
The size of MyMessage for every message
Is there support for larger messages, such as V_TEXT?
To my knowledge, in MySensors there is only one size of MyMessage object, which is a union of multiple representations of the message content, and the size of MyMessage is MAX_MESSAGE_LENGTH. So I don't understand the question.
How do you handle if the same actuator is changed multiple times before the earlier changes have been reliably delivered?
The last value of the actuator is taken. You can imagine that there are two threads. First is setting the atomic value shared between threads, and second is reading this value and sending it to the gateway. If first thread writes two times before second thread sends first time, the first value is not sent.
My approach is memory hungry if there are a lot of distinct values to send to the gateway. Maybe it could be improved to reuse a single message in the background.
One important thing is that there is a difference in approach between battery powered (sleeping) nodes and powered nodes.
In powered nodes, there is no problem going into a forever loop trying to deliver the message to the gateway.
My algorithm is essential for battery powered nodes, where banging messages in a forever loop will drain the battery quickly if, for example, the gateway goes down for a while.
You can look into my code here:
https://github.com/mczerski/MySensorsToolkit - this is the toolkit library with all the logic implemented
https://github.com/mczerski/MyMultiSensor - this is an example use of the toolkit library
-
Thanks for explaining. For some use cases (few messages per node) this could be a good solution. If I understand the code correctly, each MyMessage needs 40 bytes. For atmega328 this could be quite limiting (many sketches are tight on ram already), but for devices with more ram 40 bytes per message won't be a problem.
@rozpruwacz said in Build retry funtionality into the mysensors library:
So I don't understand the question
Since you're creating a full instance of the MyMessage, my question doesn't make sense. I thought you were doing something more "clever" to minimize ram usage, hence the question.
-
@mfalkvidd I believe that my code has a lot of room for optimisations. From the ram perspective, having multiple union structures that can each also hold text up to 32 chars (?) is not optimal. One way of optimising the ram usage would be to keep one MyMessage instance, and initialise this message with the required value just before sending it. This way the ram usage would be 1 MyMessage + sum(sizeof(value) for value in sensor_values), where sensor_values are the current sensor/actuator readings.
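To put rough numbers on that formula, here is a small sketch that compares the two layouts. It assumes the 40 bytes per MyMessage estimated earlier in the thread; the function names and the example value sizes are hypothetical, and per-message bookkeeping such as timestamps is not counted.

```cpp
#include <cassert>
#include <cstddef>

// Assumed size of one MyMessage, taken from the 40-byte estimate in the thread.
const std::size_t kMessageBytes = 40;

// RAM used when every sensor/actuator owns a full static message.
std::size_t ramPerSensorMessages(std::size_t sensorCount) {
    return sensorCount * kMessageBytes;
}

// RAM used with one shared message plus the raw value of each sensor:
// 1 MyMessage + sum(sizeof(value)).
std::size_t ramSharedMessage(const std::size_t *valueSizes, std::size_t count) {
    assert(valueSizes != nullptr || count == 0);
    std::size_t total = kMessageBytes;
    for (std::size_t i = 0; i < count; ++i) total += valueSizes[i];
    return total;
}
```

For a hypothetical node with two 4-byte floats, a 1-byte state and a 2-byte integer, this gives 160 bytes for the per-sensor layout against 51 bytes for the shared-message layout.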
-
@rozpruwacz yes that should be sufficient, at least if we only plan to support messages to the gateway (not to other nodes) and ditch repeater support.
I guess some sort of timestamp when the last send attempt occurred, and a retry counter is needed to be stored per message as well?
-
@mfalkvidd said in Build retry funtionality into the mysensors library:
I guess some sort of timestamp when the last send attempt occurred, and a retry counter is needed to be stored per message as well?
Yes, that could be useful for debugging purposes. In an ideal setup the retry counter should always be 0
-
@rozpruwacz you use a constant interval between retries? (No exponential backoff)?
-
@mfalkvidd said in Build retry funtionality into the mysensors library:
(No exponential backoff)?
It's exponential, but capped at a hardcoded value (I don't remember which value)
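For illustration, a capped exponential backoff could look roughly like this. The doubling factor, the function name `cappedBackoffMs` and the numbers in the usage note are assumptions; the actual factor and cap used in the toolkit are not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Delay before retry number `retry` (0-based): baseMs doubled once per
// retry, but never above capMs. Hypothetical helper for illustration.
uint32_t cappedBackoffMs(uint32_t baseMs, uint32_t capMs, uint8_t retry) {
    assert(baseMs > 0 && baseMs <= capMs);
    uint32_t delay = baseMs;
    for (uint8_t i = 0; i < retry && delay < capMs; ++i) {
        // Compare against capMs / 2 first so the doubling cannot overflow.
        delay = (delay > capMs / 2) ? capMs : delay * 2;
    }
    return delay;
}
```

With a base of 100 ms and a cap of 1000 ms the delays run 100, 200, 400, 800 and then stay at 1000 ms.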
-
@mfalkvidd said in Build retry funtionality into the mysensors library:
How do you handle if the same actuator is changed multiple times before the earlier changes have been reliably delivered?
How does MQTT QoS 1 do it? I think it should be similar to that.
Also I don't think changes in between should be dropped. That would be like dropouts in a metered system (logged to influx, for example) and would probably do strange things with scenes and group switching.
-
@sergio-rius said in Build retry funtionality into the mysensors library:
How does MQTT QoS 1 do it? I think it should be similar to that.
Also I don't think changes in between should be dropped. That would be like dropouts in a metered system (logged to influx, for example) and would probably do strange things with scenes and group switching.
But an MQTT broker is not running on a 1kB ram mcu. With lots of memory available, it is not a brain teaser to implement such functionality. For a small mcu you have to make some compromises. In one case dropping messages is completely ok, but in another it is not. It may turn out that there is no ONE algorithm that will fit all cases ...
-
@rozpruwacz said in Build retry funtionality into the mysensors library:
It may turn out that there is no ONE algorithm that will fit all cases ...
Agreed. It seems to me that a small home with about 10 to 15 devices should be able to run on Arduino Nanos and offer a decent level of predictability. If all devices send, on average, one message per minute, then this would mean one message every 4 to 6 seconds. Most of these devices will be connected to the gateway directly. Let's say half of them require
So in a basic home situation a buffer for two messages would be fine, and three would be a luxury.
If you want to run a large scale sensor network, then it makes sense to upgrade your parts too (bigger antenna on the gateway, more ram on nodes that extend the network).
Retry functionality, in the normal home scenario, probably isn't so much about how many messages are buffered, but about how long people turn on the microwave oven, which disrupts the network. So for me it's about having control over the (exponential) time period that the node keeps retrying. The memory/buffering capacity of my home network is fine, it's just that I don't want to implement this "keep retrying for longer" routine myself.
Which #define values could I already change today to get the nodes to keep retrying for longer than just a few seconds?
-
@alowhum said in Build retry funtionality into the mysensors library:
So in a basic home situation a buffer for two messages would be fine, and three would be a luxury.
Each nrf24 has this luxury of a 3 message hardware buffer
-
@rozpruwacz said in Build retry funtionality into the mysensors library:
But MQTT broker is not running on the 1kB ram mcu
I'm not saying we should make an mqtt broker run on an arduino. I'm just picking the process logic as a guideline.
MQTT is not a protocol made for raspberries, it's just that it is often run on them.
-
The main thing mqtt has (that MySensors doesn’t) is ”packet identifier” in mqtt lingo. It is similar to tcp’s sequence number.
Mqtt also assumes that
- the broker (I guess the analogue is the MySensors gateway) has persistent storage, or at least sufficient ram to buffer all messages until they have been marked as delivered
- everything is single hop (no repeaters)
- there is a DUP flag
- clients have sufficient ram to buffer all outgoing messages until they have been marked as delivered, and have timers to retransmit messages
I am not sure how mqtt handles ordering of messages. I think mqtt doesn’t guarantee ordering, regardless of qos level.
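To make the packet identifier and DUP flag concrete, here is a small sender-side sketch loosely modelled on MQTT QoS 1. It is not MySensors or MQTT library code: `Outgoing` and `QosOneSender` are hypothetical names, and a single in-flight message is assumed to keep the RAM cost at one slot.

```cpp
#include <cassert>
#include <cstdint>

// One buffered outgoing message.
struct Outgoing {
    uint16_t packetId = 0;  // identifies the message in the matching ack
    int16_t payload = 0;
    bool dup = false;       // set on retransmission, like MQTT's DUP flag
    bool inFlight = false;  // true until the matching ack arrives
};

class QosOneSender {
public:
    // Start a new delivery. Returns false while an earlier message is
    // still unacknowledged, because this sketch buffers only one message.
    bool publish(int16_t payload) {
        if (slot_.inFlight) return false;
        slot_.packetId = nextId_;
        // Packet identifiers are non-zero, so wrap from 65535 back to 1.
        nextId_ = (nextId_ == UINT16_MAX) ? 1 : static_cast<uint16_t>(nextId_ + 1);
        slot_.payload = payload;
        slot_.dup = false;
        slot_.inFlight = true;
        return true;
    }

    // Message to put on the air again; marks it as a duplicate.
    const Outgoing &retransmit() {
        assert(slot_.inFlight);
        slot_.dup = true;
        return slot_;
    }

    // Handle an ack. Only an ack carrying the in-flight id frees the slot.
    bool acknowledge(uint16_t packetId) {
        if (!slot_.inFlight || packetId != slot_.packetId) return false;
        slot_.inFlight = false;
        return true;
    }

    const Outgoing &current() const { return slot_; }

private:
    Outgoing slot_;
    uint16_t nextId_ = 1;
};
```

The identifier lets a late or stray ack be told apart from the ack for the message currently in flight, which a plain boolean flag cannot do.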
-
Voilà.
And it maintains a live inventory of clients.
Some points are so difficult to achieve. But I was looking at message identification as a response to the sequence switch problem.
-
I am not sure how retrying even more times will help anything