Wifi GW gets sluggish after a while

  • For some time I had issues with MySensors communications that prevented the deployment I had planned. After spending some time rebuilding my setup for better debugging, I have finally found some problems:

    • The use of fake/cloned radio modules. Everything has already been said about them. They masked the other problems and prevented me from debugging properly.
    • Unwanted Domoticz replies that were interpreted as new commands. I never trusted Domoticz, and switching to node-red let me debug communications better. I could see that, now and then, Domoticz replies with a message that seems to announce its capabilities with respect to the received node and value. It's like an ack, but with relatively large content, including unwanted things like battery management capabilities (even for connected devices). Those messages were interpreted by the node as new commands and, for example, made my garage door node stop while it was opening or closing, without any notice (with the problems that implies). Those messages also sometimes interrupted device presentation.
    • The gateway getting sluggish after some time.
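
    For anyone wanting to spot controller replies like the ones in the second point, here is a minimal sketch in Python. It assumes the usual MySensors gateway line format `node-id;child-sensor-id;command;ack;type;payload`. The names `parse_line`, `is_unexpected_set` and the `allowed` set are made up for illustration and are not part of MySensors, Domoticz or node-red.

```python
from dataclasses import dataclass

# Command field values of the MySensors gateway line protocol.
COMMANDS = {0: "presentation", 1: "set", 2: "req", 3: "internal", 4: "stream"}


@dataclass
class GatewayMessage:
    node_id: int
    child_id: int
    command: str
    ack: bool
    msg_type: int
    payload: str


def parse_line(line: str) -> GatewayMessage:
    """Split one 'node;child;command;ack;type;payload' line into fields."""
    # maxsplit=5 keeps any ';' inside the payload intact.
    parts = line.strip().split(";", 5)
    if len(parts) != 6:
        raise ValueError(f"not a gateway protocol line: {line!r}")
    node_id, child_id, command, ack, msg_type = (int(p) for p in parts[:5])
    return GatewayMessage(
        node_id=node_id,
        child_id=child_id,
        command=COMMANDS.get(command, f"unknown({command})"),
        ack=(ack == 1),
        msg_type=msg_type,
        payload=parts[5],
    )


def is_unexpected_set(msg: GatewayMessage, allowed: set) -> bool:
    """True if a 'set' message targets a (node, child) pair not in `allowed`.

    `allowed` is a hypothetical whitelist of actuators that the controller
    is really expected to command.
    """
    return msg.command == "set" and (msg.node_id, msg.child_id) not in allowed
```

    Running every controller-to-gateway line through `is_unexpected_set` and logging the hits would show which replies reach nodes that never asked for them. The node and child numbers have to come from your own setup.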

    The first two problems are easy to solve and I have them under control. But this last problem is what is driving me nuts. It shows up in MYSController as communication activity becoming slow.
    I have several devices, and one of them is an 8-channel power meter embedded inside my home power distribution box. It monitors each phase of the installation and has a small screen for instant measurements. So far it has been a great help in detecting failing appliances with residual consumption, as I built it with very fine low-power resolution using ADC chips with amplification.

    Well, this device sends a minimum of 6 values in a row each minute. When the gateway problem is present, those values appear in the debug window at a rate of one every two seconds, and after the third one all the rest get lost. I noticed that the values for total consumption and kitchen were far off, and checked that the screen showed real, fresh values and announced that it was sending the data.
    For temp/hum sensors, the gw misses one of the values.
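
    To catch the moment the bursts start arriving incomplete, something like the following could be run over a timestamped receive log. It is only a sketch. The function name, the `(timestamp_s, node_id)` input shape and the 10-second gap threshold are assumptions for illustration, and the 6 values per burst come from the power meter described above.

```python
def incomplete_bursts(events, expected=6, max_gap_s=10.0):
    """Return (node_id, burst_start, count) for bursts shorter than `expected`.

    `events` is an iterable of (timestamp_s, node_id) sorted by time.
    Messages from the same node that are at most `max_gap_s` apart are
    counted as one burst.
    """
    current = {}  # node_id -> [burst_start, last_seen, count]
    short = []
    for ts, node in events:
        burst = current.get(node)
        if burst is not None and ts - burst[1] <= max_gap_s:
            # Same burst: extend it.
            burst[1] = ts
            burst[2] += 1
        else:
            # Previous burst (if any) is finished: check its size.
            if burst is not None and burst[2] < expected:
                short.append((node, burst[0], burst[2]))
            current[node] = [ts, ts, 1]
    # Check the bursts that were still open at the end of the log.
    for node, (start, _last, count) in current.items():
        if count < expected:
            short.append((node, start, count))
    return short
```

    Fed with receive times taken from the gateway or controller log, this gives the time of the first short burst, which can then be lined up with the gateway debug output.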

    When this happens, the only solution is getting a ladder, reaching the gateway in the ceiling and power-cycling it. After some hours, it happens again.
    Can someone help me fix this problem? Or give me advice on how the gw can be debugged further?

    The gateway is a NodeMCU Lolin v3 board with wires soldered directly to a genuine, factory-shielded PA+LNA RF24 radio. It has a capacitor fitted. It is powered from a good, clean power source.
    It's now flashed with the default v2.3.1 library, and I removed ArduinoOTA and the web portal I had implemented, so it is running the default espgw example. As I later convert the received serial data to MQTT, I also tried the MQTT sketch in case the conversion was inducing some problem, but had the same results.
    GW debug doesn't show anything bad. I don't see any problem with wifi. I have good control of my network.

    Any ideas?

  • It seems that I may have been hit again by the radio sync problem:

    613757 RF24:RBR:REG=23,VAL=17
    613788 RF24:RBR:REG=23,VAL=17
    613820 RF24:RBR:REG=23,VAL=17
    613851 RF24:RBR:REG=23,VAL=17
    613883 RF24:RBR:REG=23,VAL=17
    613914 RF24:RBR:REG=23,VAL=17
    613946 RF24:RBR:REG=23,VAL=17
    613977 RF24:RBR:REG=23,VAL=17
    614009 RF24:RBR:REG=23,VAL=17
    614040 RF24:RBR:REG=23,VAL=17
    614072 RF24:RBR:REG=23,VAL=17
    614103 RF24:RBR:REG=23,VAL=17
    614135 RF24:RBR:REG=23,VAL=17
    614166 RF24:RBR:REG=23,VAL=17
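
    For reading lines like these, here is a small Python sketch that decodes them. It assumes the debug output prints the register number and value in decimal and the leading number is a millisecond timestamp. Under that assumption REG=23 is 0x17, which is FIFO_STATUS in the nRF24L01+ register map, and the bit names below are taken from the datasheet. The function names are made up.

```python
import re

# "<millis> RF24:RBR:REG=<reg>,VAL=<val>", all numbers assumed decimal.
LINE_RE = re.compile(r"^\s*(\d+)\s+RF24:RBR:REG=(\d+),VAL=(\d+)\s*$")

FIFO_STATUS_REG = 0x17  # 23 decimal in the nRF24L01+ register map
FIFO_STATUS_BITS = {
    0: "RX_EMPTY",
    1: "RX_FULL",
    4: "TX_EMPTY",
    5: "TX_FULL",
    6: "TX_REUSE",
}


def parse_rbr(line):
    """Return (timestamp_ms, register, value), or None for other lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts, reg, val = (int(g) for g in m.groups())
    return ts, reg, val


def decode_fifo_status(value):
    """List the FIFO_STATUS flag names that are set in `value`."""
    return [name for bit, name in sorted(FIFO_STATUS_BITS.items())
            if value & (1 << bit)]


def poll_gaps(lines):
    """Millisecond gaps between consecutive FIFO_STATUS reads."""
    stamps = [row[0] for row in map(parse_rbr, lines)
              if row is not None and row[1] == FIFO_STATUS_REG]
    return [b - a for a, b in zip(stamps, stamps[1:])]
```

    If the decimal assumption holds, VAL=17 is 0x11, which decodes to RX_EMPTY and TX_EMPTY. The log would then show the gateway polling the radio roughly every 31 to 32 ms and being told each time that both FIFOs are empty.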

    @tekka I've run a debug test with the radio timing code, and the results can be found here:
    GW_DEBUG: https://drive.google.com/open?id=1Ji-6E_iAcE86gTzxSTjOihVlQREHjwLN
    NODE_DEBUG: https://drive.google.com/open?id=1skaAWlYWqjNRqAjcTAVudw0DAZiAFEqs