Multiple messages often fail
-
I am a delighted user of a MySensors network, and I take this opportunity to express my respect to the authors and contributors.
My nodes are based on Arduino Nano, with an RPi Ethernet gateway, using RF24 radios with standard settings (frequency and speed), some with PA and some without (all with 10 µF ceramic capacitors, and an external 3.3 V voltage regulator on the RPi radio module). Version 2.3.2 is in use.
I started with collecting hot water temperatures, electric energy, electric current, motion sensors - total 18 variables from 4 nodes. All works fine with very occasional lost measurements. I also use FOTA.
Recently I added additional 3 nodes with actuators (lights, garage gate and entry gate) and I set up Home Assistant as a controller. The overall experience is absolutely fantastic!
The problem is that a single lost message aimed at an actuator is noticeable and usually painful. Although communication is generally very reliable, the problem occurs very often when I trigger multiple switches at a time (using an HA group switch, or triggering the garage gate + garage lights by saying 'Hey google. Turn on the garage'). In most such cases only some of the switches are triggered.
I tested this by building a new node with 3 x V_STATUS children connected to LEDs, and triggering them by switching a group.
Triggering individual switches works fine and reliably, even through walls, but switching three at a time often (in about 20% of cases) results in the following behaviour:
This is an attempt to switch off all three switches:
- The first message is sent, then the second message is sent.
- Then the first message is acknowledged.
- The second message is not acknowledged, so HA still shows it as ON, but the LED on the node is off, so only the ack failed.
- Finally, the third message fails even to be sent.

Result: the first and second LEDs are OFF on the node, but only the first switch is OFF in Home Assistant.
The gateway log below:
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;1;1;1;2;0
Dec 27 20:46:33 DEBUG TSF:MSG:SEND,0-0-100-100,s=1,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=OK:0
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;2;1;1;2;0
Dec 27 20:46:33 DEBUG TSF:MSG:SEND,0-0-100-100,s=2,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=OK:0
Dec 27 20:46:33 DEBUG TSF:MSG:READ,100-100-0,s=1,c=1,t=2,pt=0,l=1,sg=0:0
Dec 27 20:46:33 DEBUG TSF:MSG:ECHO
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;3;1;1;2;0
Dec 27 20:46:33 DEBUG !TSF:MSG:SEND,0-0-100-100,s=3,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:0
In successful cases the second echo appears before the third message arrives from the controller, and the third message is then also successful.
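To make the pattern in the log easier to spot, here is a minimal Python sketch that summarizes, per child, whether the gateway's send succeeded and whether an echo came back. It only relies on the line shapes visible in the excerpt above; the function name `summarize` and the assumption that a `TSF:MSG:ECHO` line always refers to the immediately preceding `TSF:MSG:READ` line are mine, not something taken from the MySensors documentation.

```python
import re

# Radio send result lines; the child id is the s= field, the result is st=.
SEND = re.compile(r"TSF:MSG:SEND,[\d-]+,s=(\d+),.*st=(OK|NACK)")
# Incoming frame lines; again the child id is the s= field.
READ = re.compile(r"TSF:MSG:READ,[\d-]+,s=(\d+),")


def summarize(log_lines):
    """Return {child_id: {"sent_ok": bool, "echoed": bool}} for a log excerpt."""
    status = {}
    last_read_child = None
    for line in log_lines:
        m = SEND.search(line)
        if m:
            status[int(m.group(1))] = {"sent_ok": m.group(2) == "OK", "echoed": False}
            continue
        m = READ.search(line)
        if m:
            last_read_child = int(m.group(1))
            continue
        # Assumption: an ECHO line belongs to the READ line just before it.
        if "TSF:MSG:ECHO" in line and last_read_child in status:
            status[last_read_child]["echoed"] = True
    return status


LOG = """\
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;1;1;1;2;0
Dec 27 20:46:33 DEBUG TSF:MSG:SEND,0-0-100-100,s=1,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=OK:0
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;2;1;1;2;0
Dec 27 20:46:33 DEBUG TSF:MSG:SEND,0-0-100-100,s=2,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=OK:0
Dec 27 20:46:33 DEBUG TSF:MSG:READ,100-100-0,s=1,c=1,t=2,pt=0,l=1,sg=0:0
Dec 27 20:46:33 DEBUG TSF:MSG:ECHO
Dec 27 20:46:33 DEBUG GWT:RFC:C=1,MSG=100;3;1;1;2;0
Dec 27 20:46:33 DEBUG !TSF:MSG:SEND,0-0-100-100,s=3,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:0
""".splitlines()

for child, state in sorted(summarize(LOG).items()):
    print(child, state)
```

For this excerpt the summary matches the description: child 1 was sent and echoed, child 2 was sent but never echoed, and child 3 was not delivered at all.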
I suspected radio jamming, collisions, or perhaps collisions due to auto-retransmissions caused by radio jamming.
I moved the gateway and the tester node (just this one node) to channel 86 and 2 Mbps, to rule out both potential causes and to get away from any other MySensors traffic, but the result is the same.
I would be grateful if you could share solutions to this problem, or at least ideas on how to solve it. I see that 15 retransmissions after 1500 µs are hardcoded in the MyS library. Would you advise changing the delay, perhaps?
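As a rough back-of-the-envelope check on those numbers, the sketch below estimates how long one failing frame could keep the radio busy. The 15 retries and 1500 µs delay are the values quoted above; the `airtime_us` parameter and the 1000 µs used in the second call are placeholders I made up for illustration, not measured values.

```python
RETRY_COUNT = 15       # retransmissions, as quoted in the post
RETRY_DELAY_US = 1500  # delay before each retransmission, as quoted in the post


def worst_case_busy_ms(retries=RETRY_COUNT, delay_us=RETRY_DELAY_US, airtime_us=0):
    """Rough upper bound for one frame that is never acknowledged.

    Counts the initial attempt plus every retry (each taking airtime_us on air)
    and one delay before each retry. Radio settling times are ignored.
    """
    attempts = retries + 1
    return (attempts * airtime_us + retries * delay_us) / 1000.0


print(worst_case_busy_ms())                 # delays only
print(worst_case_busy_ms(airtime_us=1000))  # with a hypothetical 1000 µs per frame
```

Even counting only the delays, a frame that ends in NACK occupies the channel for over 20 ms, which might be long enough to overlap with an echo that the node is trying to send back at the same time.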
-
-
Welcome to the forum @BlueArrow
Thanks for your detailed post, and the testing you did to isolate the problem.
Yes, multiple messages often cause problems. We've especially seen it when presenting multiple child sensors.
Using the echo feature likely makes the situation much worse. Without echo, each message is (ideally) sent once and an acknowledgement is sent once. With echo, each message is (ideally) sent twice (once in each direction), with two corresponding acknowledgements. In a less than ideal situation, with lost messages, there will be much more traffic.
If a single node has multiple lights attached, a solution could be to add "virtual" lights representing all combinations. Example: Node 5 has three lights: A, B and C. Then you could have something like this:
Child ID  Function
1. Turn on/off A
2. Turn on/off B
3. Turn on/off C
4. Turn on/off A+B
5. Turn on/off B+C
6. Turn on/off A+C
7. Turn on/off A+B+C
This would make it possible to switch all 3 lights with a single message. Using a bit field, any combination of 7 lights would fit in a single message. But this solution doesn't work if the lights are connected to different nodes.
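The bit field idea could look roughly like the sketch below on the controller side: each light gets one bit, and the resulting integer is sent as a single payload. The light names, the bit order and the `encode`/`decode` helpers are hypothetical; the node's sketch would need matching decoding logic, which is not shown here.

```python
LIGHTS = ("A", "B", "C")  # hypothetical: bit 0 = A, bit 1 = B, bit 2 = C


def encode(states):
    """Pack {name: bool} into one integer payload, one bit per light."""
    value = 0
    for bit, name in enumerate(LIGHTS):
        if states.get(name, False):
            value |= 1 << bit
    return value


def decode(value):
    """Unpack an integer payload back into {name: bool}."""
    return {name: bool((value >> bit) & 1) for bit, name in enumerate(LIGHTS)}


payload = encode({"A": True, "B": False, "C": True})
print(payload)          # 5 (binary 101)
print(decode(payload))  # {'A': True, 'B': False, 'C': True}
```

With this layout, all three lights are switched by one message, so there is only one send and (if requested) one echo to lose.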
The only useful workaround I've seen is adding a delay between each message. But with many nodes, that might be impractical because the entire command would take too much time.
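A minimal sketch of that delay workaround, assuming the controller side can funnel outgoing commands through one function: `send_paced` and its `send` callback are hypothetical names, and the 0.1 s gap is only a starting value to experiment with, not a recommendation from the MySensors project.

```python
import time


def send_paced(messages, send, min_gap_s=0.1, sleep=time.sleep, clock=time.monotonic):
    """Call send(msg) for each message, leaving at least min_gap_s between calls.

    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    last = None
    for msg in messages:
        if last is not None:
            remaining = min_gap_s - (clock() - last)
            if remaining > 0:
                sleep(remaining)
        send(msg)
        last = clock()


if __name__ == "__main__":
    # The three "switch off" commands from the gateway log, printed instead of sent.
    send_paced(["100;1;1;1;2;0", "100;2;1;1;2;0", "100;3;1;1;2;0"], print)
```

The total time grows with the number of messages (roughly `(n - 1) * min_gap_s`), which is the impracticality mentioned above for large groups.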
-
The nRF24 supports multicast, but ack must be disabled, which will cause other problems instead. Reference: https://forum.arduino.cc/t/nrf24l01-multicast-receiver-requirements/495782/
Adding multicast support to MySensors might be a lot of work. And I'm not sure how repeaters should handle multicast. But it would be cool to have multicast support. Multicast FOTA, while probably not useful in practice, would be super cool.
-
Thank you for the swift reply.
Given that the problem occurs in ~20% of cases, it seems it is really caused by forcing echo. Since MySensors can cope with 80% of triple messages, I believe cutting the number of messages in half would solve the problem. I do not observe issues during presentation, though my heaviest node has 9 children.
It also seems to be a better way than pursuing the multicast feature.
So the problem becomes more Home Assistant related... Given the ack (Enhanced ShockBurst) mechanism, forcing echo seems to be redundant. Instead, a failure to send a message should be reported by the gateway. Is any such mechanism possible? I only noticed that the send() function returns a bool as a hint, but did not notice an equivalent that returns feedback from the gateway via ethernet/serial/mqtt.
The documentation says there is an 'optimistic' option, where "Home Assistant will assume any requested changes (turn on light, open cover) are applied immediately without waiting for feedback from the node", so I believe HA does not require echo.
Unfortunately, when I try to set it, nothing really changes, and I get the following HA log:
- optimistic option for mysensors is deprecated. Please remove optimistic from your configuration file
I will post a suggestion to the HA forum to restore this option, unless you know a (good) reason for removing optimistic from HA?