Instable ESP32 MQTT gateway with RFM69



  • Hello guys,

    I'm facing an issue with an ESP32 MQTT gateway (with RFM69) where it resets every 10-12 hrs with InstrFetchProhibited.

    Is anyone having the same problems? I'm using an ESP32 gateway with NRF24 for a long time already and don't have problems there.

    This is my setup:

    • version 2.3.2 of Mysensors,

    • ESP32DOIT board

    • espressif 1.12.4 (platform IO)

    • Arduino framework 1.0.4

    This is my gateway configuration:

    #define MY_RADIO_RFM69
    #define MY_RFM69_NEW_DRIVER
    #define MY_GATEWAY_ESP32
    #define MY_GATEWAY_MQTT_CLIENT
    

    For details of the exception see the GitHub issue I've made on

    https://github.com/mysensors/MySensors/issues/1435



  • I've done some more debugging and logging and this is a part of the log with some of my comments included.
    The controller is sending a test message every 5 seconds over MQTT. A test node is sending a message every 10 seconds to the controller
    .
    Just before the gateway crashes, it seems MQTT messages are missing, so I suspect this is related and the software is stuck in that process somewhere.
    @tekka @scalz can you think along about this? Help would be greatly appreciated.

    (za 15-08 20:51:38.981) 2625431 THA:DATA:AVAIL                            Message from node every 10 seconds
    (za 15-08 20:51:39.076) 2625486 THA:RCV:MSG=1F1F000A21030A06
    (za 15-08 20:51:39.076) 2625489 THA:RCV:MSG LEN=8
    (za 15-08 20:51:39.076) 2625494 TSF:MSG:READ,31-31-0,s=10,c=1,t=3,pt=1,l=1,sg=0:6
    (za 15-08 20:51:39.076) 2625501 GWT:TPS:TOPIC=ms-rfm-from/31/10/1/0/3,MSG SENT
    (za 15-08 20:51:43.551) 2630003 GWT:IMQ:TOPIC=ms-rfm-to/31/2/2/0/3, MSG RECEIVED  Message from controller every 5 seconds
    (za 15-08 20:51:43.595) 2630008 THA:SND:MSG=00001F0A02030237
    (za 15-08 20:51:43.666) 2630026 THA:SND:MSG LEN=8,RES=1
    (za 15-08 20:51:43.666) 2630028 TSF:MSG:SEND,0-0-31-31,s=2,c=2,t=3,pt=0,l=1,sg=0,ft=0,st=OK:7
    (za 15-08 20:51:43.798) 2630250 THA:DATA:AVAIL
    (za 15-08 20:51:43.898) 2630305 THA:RCV:MSG=1F1F000A21030207
    (za 15-08 20:51:43.898) 2630308 THA:RCV:MSG LEN=8
    (za 15-08 20:51:43.898) 2630313 TSF:MSG:READ,31-31-0,s=2,c=1,t=3,pt=1,l=1,sg=0:7
    (za 15-08 20:51:43.898) 2630320 GWT:TPS:TOPIC=ms-rfm-from/31/2/1/0/3,MSG SENT
    (za 15-08 20:51:43.898) 2630335 THA:DATA:AVAIL
    (za 15-08 20:51:43.940) 2630391 THA:RCV:MSG=1F1F000A21030A07
    (za 15-08 20:51:43.944) 2630394 THA:RCV:MSG LEN=8
    (za 15-08 20:51:43.949) 2630399 TSF:MSG:READ,31-31-0,s=10,c=1,t=3,pt=1,l=1,sg=0:7
    (za 15-08 20:51:43.956) 2630406 GWT:TPS:TOPIC=ms-rfm-from/31/10/1/0/3,MSG SENT
    (za 15-08 20:51:44.190) 2630642 THA:DATA:AVAIL
    (za 15-08 20:51:44.272) 2630723 THA:RCV:MSG=1F1F000A2117161F
    (za 15-08 20:51:44.318) 2630726 THA:RCV:MSG LEN=8
    (za 15-08 20:51:44.347) 2630731 TSF:MSG:READ,31-31-0,s=22,c=1,t=23,pt=1,l=1,sg=0:31
    (za 15-08 20:51:44.347) 2630738 GWT:TPS:TOPIC=ms-rfm-from/31/22/1/0/23,MSG SENT
    (za 15-08 20:51:48.562) 2635014 GWT:IMQ:TOPIC=ms-rfm-to/31/2/2/0/3, MSG RECEIVED  Message from controller every 5 seconds
    (za 15-08 20:51:48.663) 2635019 THA:SND:MSG=00001F0A02030238
    (za 15-08 20:51:48.663) 2635037 THA:SND:MSG LEN=8,RES=1
    (za 15-08 20:51:48.663) 2635039 TSF:MSG:SEND,0-0-31-31,s=2,c=2,t=3,pt=0,l=1,sg=0,ft=0,st=OK:8
    (za 15-08 20:51:48.801) 2635254 THA:DATA:AVAIL                               Message from node every 10 seconds
    (za 15-08 20:51:48.946) 2635309 THA:RCV:MSG=1F1F000A21030208
    (za 15-08 20:51:48.946) 2635312 THA:RCV:MSG LEN=8
    (za 15-08 20:51:48.946) 2635317 TSF:MSG:READ,31-31-0,s=2,c=1,t=3,pt=1,l=1,sg=0:8
    (za 15-08 20:51:48.946) 2635324 GWT:TPS:TOPIC=ms-rfm-from/31/2/1/0/3,MSG SENT
    (za 15-08 20:51:48.946) 2635338 THA:DATA:AVAIL
    (za 15-08 20:51:48.946) 2635394 THA:RCV:MSG=1F1F000A21030A08
    (za 15-08 20:51:48.997) 2635397 THA:RCV:MSG LEN=8
    (za 15-08 20:51:48.997) 2635402 TSF:MSG:READ,31-31-0,s=10,c=1,t=3,pt=1,l=1,sg=0:8
    (za 15-08 20:51:48.997) 2635409 GWT:TPS:TOPIC=ms-rfm-from/31/10/1/0/3,MSG SENT
                                  Somewhere a message from the controller should show up
    (za 15-08 20:51:58.608) 2645060 THA:DATA:AVAIL                                Message from node every 10 seconds
    (za 15-08 20:51:58.695) 2645115 THA:RCV:MSG=1F1F000A21171620
    (za 15-08 20:51:58.695) 2645118 THA:RCV:MSG LEN=8
    (za 15-08 20:51:58.695) 2645123 TSF:MSG:READ,31-31-0,s=22,c=1,t=23,pt=1,l=1,sg=0:32
    (za 15-08 20:51:58.695) 2645130 GWT:TPS:TOPIC=ms-rfm-from/31/22/1/0/23,MSG SENT
                                  Somewhere a message from the controller should show up
    (za 15-08 20:52:08.744) Guru Meditation Error: Core  1 panic'ed (InstrFetchProhibited). Exception was unhandled.<CR>
    (za 15-08 20:52:08.794) Core 1 register dump:<CR>
    (za 15-08 20:52:08.884) PC      : 0x00400825  PS      : 0x00060031  A0      : 0x8008a387  A1      : 0x3ffbe760  <CR>
    (za 15-08 20:52:08.884) A2      : 0x00000000  A3      : 0x00000001  A4      : 0x00000000  A5      : 0x0000ee00  <CR>
    (za 15-08 20:52:08.884) A6      : 0x00050e23  A7      : 0x3ffb8058  A8      : 0x80082064  A9      : 0x3ffc184c  <CR>
    (za 15-08 20:52:08.884) A10     : 0x00000003  A11     : 0x00060023  A12     : 0x00060021  A13     : 0x00000030  <CR>
    (za 15-08 20:52:08.884) A14     : 0x00000015  A15     : 0x00060e23  SAR     : 0x00000016  EXCCAUSE: 0x00000014  <CR>
    (za 15-08 20:52:08.884) EXCVADDR: 0x00400824  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  <CR>
    (za 15-08 20:52:08.884) Core 1 was running in ISR context:<CR>
    (za 15-08 20:52:08.884) EPC1    : 0x00400825  EPC2    : 0x00000000  EPC3    : 0x00000000  EPC4    : 0x40086a3d<CR>
    (za 15-08 20:52:08.884) <CR>
    (za 15-08 20:52:08.884) Backtrace: 0x00400825:0x3ffbe760 0x4008a384:0x3ffbe780 0x40088c7f:0x3ffbe7a0 0x4008b395:0x3ffbe7c0 0x4008478e:0x3ffbe7d0 0x4000c3f0:0x3ffb1ee0 0x400d1e5f:0x3ffb1f00 0x400d2ed2:0x3ffb1f50 0x400d30c7:0x3ffb1f70 0x400d3628:0x3ffb1f90 0x400d36f1:0x3ffb1fb0 0x40088b91:0x3ffb1fd0<CR>
    (za 15-08 20:52:08.884) <CR>
    

    decoded stack:

    0x40088c7f: xPortSysTickHandler at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)
    0x4008b395: _frxt_timer_int at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/portasm.S:303
    0x4008478e: _xt_lowint1 at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/xtensa_vectors.S:1154
    0x400d1e5f: PubSubClient::loop() at C:\Users\rik\...\MS_GW default/.pio/libdeps/esp32-rfm69/MySensors/core/MyMessage.cpp:212
    0x400d2ed2: gatewayTransportAvailable() at C:\Users\rik\...\MS_GW default/.pio/libdeps/esp32-rfm69/MySensors/core/MyMessage.cpp:212
    0x400d30c7: gatewayTransportProcess() at C:\Users\rik\...\MS_GW default/.pio/libdeps/esp32-rfm69/MySensors/core/MyMessage.cpp:212
    0x400d3628: _process() at C:\Users\rik\...\MS_GW default/.pio/libdeps/esp32-rfm69/MySensors/core/MyMessage.cpp:212
    0x400d36f1: loopTask(void*) at C:\Users\rik\...\MS_GW default/.pio/libdeps/esp32-rfm69/MySensors/core/MyMessage.cpp:212
    0x40088b91: vPortTaskWrapper at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)
    


  • Hi,

    How is you sensors powered?

    I had the same problem, but the gateway was not reseting as frequently. One thing I noticed, but am still testing is that when the batteries on the node go below 2.8v (on a 3.3V pro-mini), for some reason, the gateway halts (i have to reboot it manually). It seems weird, so I'm still testing.



  • The node is powered by an usb port supply and the gateway from an usb port on a laptop, to be able to log all output.


  • Admin

    @electrik Probably this is PubSubClient-related. Please provide additional infos:

    • ESP board definitions version
    • MySensors version
    • GW sketch


  • This is my setup:

    • version 2.3.2 of Mysensors,
    • ESP32DOIT board
    • espressif 1.12.4 (platform IO)
    • Arduino framework 1.0.4

    This is my gateway configuration:

    #define MY_RADIO_RFM69
    #define MY_RFM69_NEW_DRIVER
    #define MY_GATEWAY_ESP32
    #define MY_GATEWAY_MQTT_CLIENT
    

    For details of the exception see the GitHub issue I've made on

    https://github.com/mysensors/MySensors/issues/1435


  • Mod



  • Thanks, yes that seems related. It isn't clear for me if and how that was solved really.
    If someone has a suggestion what to change, I'm open for it!



  • So, I have added a check in the PubSubclient to work around this. Until now it works quite well and when an invalid packet is received this is not handled.

    However, now sometimes the gateway just hangs at some point.

    (za 29-08 13:06:53.126) IH<CR>
    (za 29-08 13:06:59.846) MQTTP topic
    (za 29-08 13:06:59.846) ms-rfm-to/31/2/2/0/3<CR>
    (za 29-08 13:06:59.960) 71554750 GWT:IMQ:TOPIC=ms-rfm-to/31/2/2/0/3, MSG RECEIVED
    (za 29-08 13:06:59.960) 71554758 THA:SND:MSG=00001F120203023237
    (za 29-08 13:06:59.960) IC<CR>
    (za 29-08 13:06:59.960) IH<CR>
    (za 29-08 13:06:59.960) IC<CR>
    (za 29-08 13:06:59.960) IH<CR>
    (za 29-08 13:06:59.960) 71554775 THA:SND:MSG LEN=9,RES=1
    (za 29-08 13:06:59.960) 71554779 TSF:MSG:SEND,0-0-31-31,s=2,c=2,t=3,pt=0,l=2,sg=0,ft=0,st=OK:27
    

    And then nothing happens.
    (The IC and IH are added by me to check on the interrupts, IC is interrupt clear and IH interrupt handle). I've switched on the internal LED on RFM69 interrupt and that is on when this occurs. It is not clear for me if this happens before or after the freeze.

    Now I've added the debug defines below, to get more information on what's happening before the freeze. But after this change the software is running for 47h already. So I suspect there could be some timing issue but not sure if, where and how.

    #define MY_DEBUG_VERBOSE_RFM69
    #define MY_DEBUG_VERBOSE_TRANSPORT_HAL
    #define MY_DEBUG_VERBOSE_RFM69_REGISTERS
    


  • So after more then 67h the gateway hangs again. This is the debug output:

    (di 01-09 09:58:49.398) 243740964 RFM69:SWR:ACK,FROM=31,SEQ=60,RSSI=-98
    (di 01-09 09:58:49.398) 243740969 THA:SND:MSG LEN=9,RES=1
    (di 01-09 09:58:49.398) 243740974 TSF:MSG:SEND,0-0-31-31,s=2,c=2,t=3,pt=0,l=2,sg=0,ft=0,st=OK:11
    (di 01-09 09:58:49.585) IC<CR>
    (di 01-09 09:58:49.585) IH<CR>
    (di 01-09 09:58:49.585) 243741189 THA:DATA:AVAIL
    (di 01-09 09:58:49.641) 243741241 RFM69:SAC:SEND ACK,TO=31,RSSI=-71
    (di 01-09 09:58:49.703) 243741246 RFM69:CSMA:RSSI=-100
    (di 01-09 09:58:49.703) 243741253 THA:RCV:MSG=1F1F000A2103020B
    (di 01-09 09:58:49.703) 243741256 THA:RCV:MSG LEN=8
    (di 01-09 09:58:49.703) 243741261 TSF:MSG:READ,31-31-0,s=2,c=1,t=3,pt=1,l=1,sg=0:11
    (di 01-09 09:58:49.703) 243741268 GWT:TPS:TOPIC=ms-rfm-from/31/2/1/0/3,MSG SENT
    (di 01-09 09:58:49.703) IC<CR>
    (di 01-09 09:58:49.703) IH<CR>
    (di 01-09 09:58:49.703) IC<CR>
    (di 01-09 09:58:49.703) IH<CR>
    (di 01-09 09:58:49.703) 243741282 THA:DATA:AVAIL
    (di 01-09 09:58:49.733) 243741334 RFM69:SAC:SEND ACK,TO=31,RSSI=-68
    (di 01-09 09:58:49.737) 243741339 RFM69:CSMA:RSSI=-96
    (di 01-09 09:58:49.742) 243741345 THA:RCV:MSG=1F1F000A21030A0B
    (di 01-09 09:58:49.747) 243741349 THA:RCV:MSG LEN=8
    (di 01-09 09:58:49.752) 243741354 TSF:MSG:READ,31-31-0,s=10,c=1,t=3,pt=1,l=1,sg=0:11
    (di 01-09 09:58:49.759) 243741361 GWT:TPS:TOPIC=ms-rfm-from/31/10/1/0/3,MSG SENT
    (di 01-09 09:58:49.766) IC<CR>
    (di 01-09 09:58:49.767) IH<CR>
    (di 01-09 09:58:49.862) IC<CR>
    (di 01-09 09:58:49.862) IH<CR>
    (di 01-09 09:58:49.911) 243741466 THA:DATA:AVAIL
    (di 01-09 09:58:49.915) 243741518 RFM69:SAC:SEND ACK,TO=31,RSSI=-69
    (di 01-09 09:58:49.921) 243741523 RFM69:CSMA:RSSI=-97
    (di 01-09 09:58:49.926) 243741529 THA:RCV:MSG=1F1F000A21171642
    (di 01-09 09:58:49.978) 243741533 THA:RCV:MSG LEN=8
    (di 01-09 09:58:49.978) 243741538 TSF:MSG:READ,31-31-0,s=22,c=1,t=23,pt=1,l=1,sg=0:66
    (di 01-09 09:58:49.978) 243741545 GWT:TPS:TOPIC=ms-rfm-from/31/22/1/0/23,MSG SENT
    (di 01-09 09:58:49.978) IC<CR>
    (di 01-09 09:58:49.978) IH<CR>
    (di 01-09 09:58:59.817) IC<CR>
    (di 01-09 09:58:59.817) IH<CR>
    (di 01-09 09:58:59.881) 243751420 THA:DATA:AVAIL
    (di 01-09 09:58:59.881) 243751473 RFM69:SAC:SEND ACK,TO=31,RSSI=-71
    (di 01-09 09:58:59.881) 243751478 RFM69:CSMA:RSSI=-100
    (di 01-09 09:58:59.881) 243751485 THA:RCV:MSG=1F1F000A21171643
    (di 01-09 09:58:59.909) 243751488 THA:RCV:MSG LEN=8
    (di 01-09 09:58:59.909) 243751493 TSF:MSG:READ,31-31-0,s=22,c=1,t=23,pt=1,l=1,sg=0:67
    (di 01-09 09:58:59.909) 243751500 GWT:TPS:TOPIC=ms-rfm-from/31/22/1/0/23,MSG SENT
    (di 01-09 09:58:59.909) IC<CR>
    (di 01-09 09:58:59.909) IH<CR>
    (di 01-09 09:59:09.771) IC<CR>
    (di 01-09 09:59:09.771) IH<CR>
    (di 01-09 09:59:09.771) 243761374 THA:DATA:AVAIL
    (di 01-09 09:59:09.860) 243761427 RFM69:SAC:SEND ACK,TO=31,RSSI=-70
    (di 01-09 09:59:09.907) 243761432 RFM69:CSMA:RSSI=-101
    (di 01-09 09:59:09.907) 243761439 THA:RCV:MSG=1F1F000A21171644
    (di 01-09 09:59:09.907) 243761442 THA:RCV:MSG LEN=8
    (di 01-09 09:59:09.907) 243761447 TSF:MSG:READ,31-31-0,s=22,c=1,t=23,pt=1,l=1,sg=0:68
    (di 01-09 09:59:09.907) 243761454 GWT:TPS:TOPIC=ms-rfm-from/31/22/1/0/23,MSG SENT
    (di 01-09 09:59:09.907) IC<CR>
    (di 01-09 09:59:09.907) IH<CR>
    (di 01-09 09:59:11.447) MQTTP topic
    (di 01-09 09:59:11.578) ms-rfm-to/31/2/2/0/3<CR>
    (di 01-09 09:59:11.578) 243763051 GWT:IMQ:TOPIC=ms-rfm-to/31/2/2/0/3, MSG RECEIVED
    (di 01-09 09:59:11.578) 243763059 THA:SND:MSG=00001F120203023132
    (di 01-09 09:59:11.578) 243763065 RFM69:SWR:SEND,TO=31,SEQ=65,RETRY=0
    (di 01-09 09:59:11.578) 243763072 RFM69:CSMA:RSSI=-98
    (di 01-09 09:59:11.578) IC<CR>
    (di 01-09 09:59:11.578) IH<CR>
    (di 01-09 09:59:11.578) IC<CR>
    (di 01-09 09:59:11.578) IH<CR>
    (di 01-09 09:59:11.578) 243763088 RFM69:SWR:ACK,FROM=31,SEQ=66,RSSI=-96
    (di 01-09 09:59:11.578) 243763092 THA:SND:MSG LEN=9,RES=1
    (di 01-09 09:59:11.578) 243763097 TSF:MSG:SEND,0-0-31-31,s=2,c=2,t=3,pt=0,l=2,sg=0,ft=0,st=OK:12
    

    After this, nothing happens anymore. I'm out of ideas...


Log in to reply
 

Suggested Topics

58
Online

11.4k
Users

11.1k
Topics

112.7k
Posts