RFM69 MQTT gateway radio send loop



  • Hello all,

    I recently started a migration process from LowPowerLab library and gateway to MySensors network and Home Assistant controller.

    After playing with a ESP8266 as Mqtt gateway and a few test nodes with RFM69 radios, all working ok, I encountered a problem. Basically if the gateway tries to send 25 messages over the air and none of them are acknowledged the mqtt connection is restarted and the gateway starts resending the messages and enters a loop. Evidently this happens with retained mqtt messages.

    To make things easier I will start explaining with a simple setup:

    Default ESP8266 gateway sketch, only mqtt parameters, radio setup modified.
    A relay node with 25 outputs.
    Everything works ok.
    The mqtt messages look like this:

    2019-10-31 10:56:48.095 mygateway10-in/7/255/3/0/6 M
    2019-10-31 10:56:48.100 mygateway10-in/7/20/1/1/2 1
    2019-10-31 10:56:48.104 mygateway10-in/7/1/1/1/2 1
    2019-10-31 10:56:48.108 mygateway10-in/7/10/1/1/2 1
    2019-10-31 10:56:48.112 mygateway10-in/7/11/1/1/2 1
    2019-10-31 10:56:48.117 mygateway10-in/7/12/1/1/2 1
    2019-10-31 10:56:48.121 mygateway10-in/7/13/1/1/2 1
    2019-10-31 10:56:48.125 mygateway10-in/7/14/1/1/2 1
    2019-10-31 10:56:48.129 mygateway10-in/7/15/1/1/2 1
    2019-10-31 10:56:48.133 mygateway10-in/7/16/1/1/2 1
    2019-10-31 10:56:48.137 mygateway10-in/7/17/1/1/2 1
    2019-10-31 10:56:48.141 mygateway10-in/7/18/1/1/2 1
    2019-10-31 10:56:48.145 mygateway10-in/7/19/1/1/2 1
    2019-10-31 10:56:48.149 mygateway10-in/7/4/1/1/2 1
    2019-10-31 10:56:48.153 mygateway10-in/7/5/1/1/2 1
    2019-10-31 10:56:48.157 mygateway10-in/7/6/1/1/2 1
    2019-10-31 10:56:48.161 mygateway10-in/7/7/1/1/2 1
    2019-10-31 10:56:48.165 mygateway10-in/7/8/1/1/2 1
    2019-10-31 10:56:48.169 mygateway10-in/7/9/1/1/2 1
    2019-10-31 10:56:48.173 mygateway10-in/7/21/1/1/2 1
    2019-10-31 10:56:48.177 mygateway10-in/7/22/1/1/2 1
    2019-10-31 10:56:48.181 mygateway10-in/7/23/1/1/2 1
    2019-10-31 10:56:48.185 mygateway10-in/7/24/1/1/2 1
    2019-10-31 10:56:48.189 mygateway10-in/7/3/1/1/2 1
    2019-10-31 10:56:48.193 mygateway10-in/7/2/1/1/2 1
    

    Power down relay node and reset gateway:

    38 MCO:BGN:INIT GW,CP=RRNGF---,FQ=240,REL=9,VER=2.3.2-beta
    43 TSF:LRT:OK
    47 TSM:INIT
    50 TSF:WUR:MS=0
    57 TSM:INIT:TSP OK
    59 TSM:INIT:GW MODE
    63 TSM:READY:ID=0,PAR=0,DIS=0
    68 MCO:REG:NOT NEEDED
    159 GWT:TPC:CONNECTING...
    1161 MCO:BGN:STP
    1163 MCO:BGN:INIT OK,TSP=1
    1167 GWT:TPC:IP=192.168.1.22
    1172 GWT:RMQ:CONNECTING...
    1191 GWT:RMQ:OK
    1193 GWT:TPS:TOPIC=mygateway10-out/0/255/0/0/18,MSG SENT
    1201 TSM:READY:NWD REQ
    1209 ?TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    1683 GWT:IMQ:TOPIC=mygateway10-in/7/255/3/0/6, MSG RECEIVED
    1689 !TSF:RTE:7 UNKNOWN
    2917 !TSF:MSG:SEND,0-0-7-7,s=255,c=3,t=6,pt=0,l=1,sg=0,ft=0,st=NACK:M
    2923 GWT:IMQ:TOPIC=mygateway10-in/7/20/1/1/2, MSG RECEIVED
    2930 !TSF:RTE:7 UNKNOWN
    4159 !TSF:MSG:SEND,0-0-7-7,s=20,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    4165 GWT:IMQ:TOPIC=mygateway10-in/7/1/1/1/2, MSG RECEIVED
    4172 !TSF:RTE:7 UNKNOWN
    5401 !TSF:MSG:SEND,0-0-7-7,s=1,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    5407 GWT:IMQ:TOPIC=mygateway10-in/7/10/1/1/2, MSG RECEIVED
    5414 !TSF:RTE:7 UNKNOWN
    6643 !TSF:MSG:SEND,0-0-7-7,s=10,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    6649 GWT:IMQ:TOPIC=mygateway10-in/7/11/1/1/2, MSG RECEIVED
    6656 !TSF:RTE:7 UNKNOWN
    7885 !TSF:MSG:SEND,0-0-7-7,s=11,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    7891 GWT:IMQ:TOPIC=mygateway10-in/7/12/1/1/2, MSG RECEIVED
    7898 !TSF:RTE:7 UNKNOWN
    9127 !TSF:MSG:SEND,0-0-7-7,s=12,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    9133 GWT:IMQ:TOPIC=mygateway10-in/7/13/1/1/2, MSG RECEIVED
    9140 !TSF:RTE:7 UNKNOWN
    10370 !TSF:MSG:SEND,0-0-7-7,s=13,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    10376 GWT:IMQ:TOPIC=mygateway10-in/7/14/1/1/2, MSG RECEIVED
    10383 !TSF:RTE:7 UNKNOWN
    11612 !TSF:MSG:SEND,0-0-7-7,s=14,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    11618 GWT:IMQ:TOPIC=mygateway10-in/7/15/1/1/2, MSG RECEIVED
    11625 !TSF:RTE:7 UNKNOWN
    12854 !TSF:MSG:SEND,0-0-7-7,s=15,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    12860 GWT:IMQ:TOPIC=mygateway10-in/7/16/1/1/2, MSG RECEIVED
    12867 !TSF:RTE:7 UNKNOWN
    14096 !TSF:MSG:SEND,0-0-7-7,s=16,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    14102 GWT:IMQ:TOPIC=mygateway10-in/7/17/1/1/2, MSG RECEIVED
    14109 !TSF:RTE:7 UNKNOWN
    15338 !TSF:MSG:SEND,0-0-7-7,s=17,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    15344 GWT:IMQ:TOPIC=mygateway10-in/7/18/1/1/2, MSG RECEIVED
    15351 !TSF:RTE:7 UNKNOWN
    16580 !TSF:MSG:SEND,0-0-7-7,s=18,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    16589 GWT:IMQ:TOPIC=mygateway10-in/7/19/1/1/2, MSG RECEIVED
    16594 !TSF:RTE:7 UNKNOWN
    17823 !TSF:MSG:SEND,0-0-7-7,s=19,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    17829 GWT:IMQ:TOPIC=mygateway10-in/7/4/1/1/2, MSG RECEIVED
    17836 !TSF:RTE:7 UNKNOWN
    19067 !TSF:MSG:SEND,0-0-7-7,s=4,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    19073 GWT:IMQ:TOPIC=mygateway10-in/7/5/1/1/2, MSG RECEIVED
    19080 !TSF:RTE:7 UNKNOWN
    20309 !TSF:MSG:SEND,0-0-7-7,s=5,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    20315 GWT:IMQ:TOPIC=mygateway10-in/7/6/1/1/2, MSG RECEIVED
    20322 !TSF:RTE:7 UNKNOWN
    21551 !TSF:MSG:SEND,0-0-7-7,s=6,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    21557 GWT:IMQ:TOPIC=mygateway10-in/7/7/1/1/2, MSG RECEIVED
    21564 !TSF:RTE:7 UNKNOWN
    22793 !TSF:MSG:SEND,0-0-7-7,s=7,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    22799 GWT:IMQ:TOPIC=mygateway10-in/7/8/1/1/2, MSG RECEIVED
    22806 !TSF:RTE:7 UNKNOWN
    24035 !TSF:MSG:SEND,0-0-7-7,s=8,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    24041 GWT:IMQ:TOPIC=mygateway10-in/7/9/1/1/2, MSG RECEIVED
    24048 !TSF:RTE:7 UNKNOWN
    25277 !TSF:MSG:SEND,0-0-7-7,s=9,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    25283 GWT:IMQ:TOPIC=mygateway10-in/7/21/1/1/2, MSG RECEIVED
    25290 !TSF:RTE:7 UNKNOWN
    26519 !TSF:MSG:SEND,0-0-7-7,s=21,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    26525 GWT:IMQ:TOPIC=mygateway10-in/7/22/1/1/2, MSG RECEIVED
    26532 !TSF:RTE:7 UNKNOWN
    27761 !TSF:MSG:SEND,0-0-7-7,s=22,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    27767 GWT:IMQ:TOPIC=mygateway10-in/7/23/1/1/2, MSG RECEIVED
    27774 !TSF:RTE:7 UNKNOWN
    29003 !TSF:MSG:SEND,0-0-7-7,s=23,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    29009 GWT:IMQ:TOPIC=mygateway10-in/7/24/1/1/2, MSG RECEIVED
    29016 !TSF:RTE:7 UNKNOWN
    30245 !TSF:MSG:SEND,0-0-7-7,s=24,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    30251 GWT:IMQ:TOPIC=mygateway10-in/7/3/1/1/2, MSG RECEIVED
    30258 !TSF:RTE:7 UNKNOWN
    31487 !TSF:MSG:SEND,0-0-7-7,s=3,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    31493 GWT:IMQ:TOPIC=mygateway10-in/7/2/1/1/2, MSG RECEIVED
    31500 !TSF:RTE:7 UNKNOWN
    32729 !TSF:MSG:SEND,0-0-7-7,s=2,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    32737 GWT:TPC:IP=192.168.1.22
    32739 GWT:RMQ:CONNECTING...
    32868 GWT:RMQ:OK
    32869 GWT:TPS:TOPIC=mygateway10-out/0/255/0/0/18,MSG SENT
    32935 GWT:IMQ:TOPIC=mygateway10-in/7/255/3/0/6, MSG RECEIVED
    32941 !TSF:RTE:7 UNKNOWN
    34170 !TSF:MSG:SEND,0-0-7-7,s=255,c=3,t=6,pt=0,l=1,sg=0,ft=0,st=NACK:M
    34176 GWT:IMQ:TOPIC=mygateway10-in/7/20/1/1/2, MSG RECEIVED
    34183 !TSF:RTE:7 UNKNOWN
    35412 !TSF:MSG:SEND,0-0-7-7,s=20,c=1,t=2,pt=0,l=1,sg=0,ft=0,st=NACK:1
    35418 GWT:IMQ:TOPIC=mygateway10-in/7/1/1/1/2, MSG RECEIVED
    35425 !TSF:RTE:7 UNKNOWN
    

    And it goes in a loop.
    I tested with ESP8266, ESP32s, Mysensors Master branch, Mysensors development branch.
    No change.
    (esp8266 somethimes goes into panic mode)

    In my opinion any amount of failed consecutive radio messages should not force a mqtt server reconnection.
    If there are fewer then 25 messages, lets say 24 messages and the 25th comes after a while, no problem. I have a problem when in mqtt queue there are more then 24 consecutive messages that cannot be acknowledged.



  • After more investigations I believe the problem is caused by a mqtt server connection timeout.
    By default the gateway tries to send the radio message 5 times. If I lower the number of retries , 2 for example, no more loop. Probably now much more then 25 messages cause a loop.

    For testing I added _MQTT_client.loop() in transportSendWrite() (MyTransport.cpp) and, no more loop with default 5 retries.


 

357
Online

8.9k
Users

9.6k
Topics

100.8k
Posts