[SOLVED] GatewayESP8266MQTTClient - Recovery failure after gateway outage



  • Hello,

    I am using the "GatewayESP8266MQTTClient" and "LightSensor" Sketch with default settings.

    My System: MySensorsNet 2.0.0, Arduino IDE 1.6.10 hourly (22.07.2016), Windows 8.1 64bit

    I wanted to find out how robust this setup is, so I forcefully reset the gateway (power off) and waited until the "Light Sensor" node entered its failure state:

    **"Light Sensor" node**

    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1017<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1015<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1017<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1015<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    TSM:FPAR<\n>
    TSP:MSG:SEND 10-10-255-255 s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=bc:<\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1018<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR<\n>
    vcc: 3280<\r><\n>
    511, 1017<\r><\n>
    !TSP:SEND:TNR<\n>
    Send failed!<\r><\n>
    !TSP:SEND:TNR
    

    After the gateway is up again, the "Light Sensor" node still cannot find its way back to the gateway.

    And the gateway seems to be doing nothing useful; at least I do not understand what it is doing:

    .scandone<\n>
    state: 0 -> 2 (b0)<\n>
    state: 2 -> 3 (0)<\n>
    state: 3 -> 5 (10)<\n>
    add 0<\n>
    aid 1<\n>
    cnt <\n>
    <\n>
    connected with FRITZ!Box Fon WLAN 7390, channel 11<\n>
    dhcp client start...<\n>
    .............ip:192.168.178.67,mask:255.255.255.0,gw:192.168.178.1<\n>
    .IP: 192.168.178.67<\r><\n>
    0;255;3;0;9;No registration required<\n>
    0;255;3;0;9;Init complete, id=0, parent=0, distance=0, registration=1<\n>
    IP: 192.168.178.67<\r><\n>
    0;255;3;0;9;Attempting MQTT connection...<\n>
    0;255;3;0;9;MQTT connected<\n>
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:<\n>
    0;255;3;0;9;TSP:MSG:BC<\n>
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)<\n>
    0;255;3;0;9;TSP:CHKUPL:OK (FLDCTRL)<\n>
    0;255;3;0;9;TSP:MSG:GWL OK<\n>
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0<\n>
    pm open,type:2 0<\n>
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:<\n>
    0;255;3;0;9;TSP:MSG:BC<\n>
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)<\n>
    0;255;3;0;9;TSP:CHKUPL:OK<\n>
    0;255;3;0;9;TSP:MSG:GWL OK<\n>
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0<\n>
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:<\n>
    0;255;3;0;9;TSP:MSG:BC<\n>
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)<\n>
    0;255;3;0;9;TSP:CHKUPL:OK<\n>
    0;255;3;0;9;TSP:MSG:GWL OK<\n>
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0<\n>
    0;255;3;0;9;TSP:SANCHK:OK<\n>
    0;255;3;0;9;TSP:SANCHK:OK<\n>
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:<\n>
    0;255;3;0;9;TSP:MSG:BC<\n>
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)<\n>
    0;255;3;0;9;TSP:CHKUPL:OK<\n>
    0;255;3;0;9;TSP:MSG:GWL OK<\n>
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0<\n>
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:
    

    I can see that the gateway receives messages from the node, but it apparently fails to respond to them (st=fail).

    Can anybody enlighten me?

    Gateway code: 99.99% vanilla, except for the credentials and the MQTT broker (mosquitto).

    "Light Sensor" node code, included only for completeness; ignore the parts that are commented out:

    #define MY_NODE_ID 10 <--- manual node id is set!

    #include <Streaming.h>
    
    
    /**
     * The MySensors Arduino library handles the wireless radio link and protocol
     * between your home built sensors/actuators and HA controller of choice.
     * The sensors forms a self healing radio network with optional repeaters. Each
     * repeater and gateway builds a routing tables in EEPROM which keeps track of the
     * network topology allowing messages to be routed to nodes.
     *
     * Created by Henrik Ekblad <henrik.ekblad@mysensors.org>
     * Copyright (C) 2013-2015 Sensnology AB
     * Full contributor list: https://github.com/mysensors/Arduino/graphs/contributors
     *
     * Documentation: http://www.mysensors.org
     * Support Forum: http://forum.mysensors.org
     *
     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * version 2 as published by the Free Software Foundation.
     *
     *******************************
     *
     * REVISION HISTORY
     * Version 1.0 - Henrik EKblad
     * 
     * DESCRIPTION
     * Example sketch showing how to measure light level using a LM393 photo-resistor
     * http://www.mysensors.org/build/light
     */
    
    #define MY_NODE_ID 10
    #define MY_BAUD_RATE 9600
    
    // Enable debug prints to serial monitor
    #define MY_DEBUG 
    
    // Enable and select radio type attached
    #define MY_RADIO_NRF24
    //#define MY_RADIO_RFM69
    
    #include <SPI.h>
    #include <MySensors.h>  
    
    #define CHILD_ID_LIGHT 0
    #define LIGHT_SENSOR_ANALOG_PIN A3
    
    unsigned long SLEEP_TIME = 1000; // Sleep time between reads (in milliseconds)
    
    MyMessage msg(CHILD_ID_LIGHT, V_LIGHT_LEVEL);
    int lastLightLevel;
    
    //#include "WatchdogAVR.h"
    //typedef WatchdogAVR WatchdogType;
    //WatchdogType Watchdog;
    
    void setup()
    {
      //Watchdog.disable();
      //int countdownMS = Watchdog.enable();
      
      //Serial.print("Enabled the watchdog with max countdown of ");
      //Serial.print(countdownMS, DEC);
      //Serial.println(" milliseconds!");
      //Serial.println();
      
      //Serial.begin(57600);
      Serial.println("setup() begin");
        // LightSensor
      pinMode(A3,INPUT_PULLUP);
      pinMode(A2,OUTPUT);
      digitalWrite(A2,LOW);
    }
    void presentation()  {
      // Send the sketch version information to the gateway and Controller
      sendSketchInfo("Light Sensor", "1.0");
    
      // Register all sensors to gateway (they will be created as child devices)
      present(CHILD_ID_LIGHT, S_LIGHT_LEVEL);
    }
    
    void loop()      
    {     
      static long vcc = readVcc();
      static int vccpercent = map(vcc,2400,3007,0,100);
      sendBatteryLevel(max(min(vccpercent,100),0),false);
      Serial << "vcc: " << vcc << endl;
      // Required for ack
      //wait(100);
      
      analogRead(LIGHT_SENSOR_ANALOG_PIN);
      int lightLevel_raw = analogRead(LIGHT_SENSOR_ANALOG_PIN);
      int lightLevel = (1023-lightLevel_raw)/10.23; // as of 1023 !!
      lightLevel = 511;
      Serial.print(lightLevel);
      Serial.print(", ");
      Serial.println(lightLevel_raw);
      //if (lightLevel != lastLightLevel) {
          if(!send(msg.set(lightLevel),true))
            Serial.println("Send failed!");
          lastLightLevel = lightLevel;
      //}
      //Serial.print("RETR=");
      //Serial.println((0x0F & RF24_retrycount()));
      wait(100);
      sleep(SLEEP_TIME);
    }
    // https://forum.mysensors.org/topic/3463/m_ack_variable-or-m_set_variable/2
    void receive(const MyMessage &message) {
      if (message.isAck()) {
          Serial.println("This is an ack from gateway");
          //Serial.println("Reset Watchdog!");
          //Watchdog.reset();
          }
    }
    

    This line gives me a headache:

    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0<\n>
    

    After the node requests a parent, isn't the node supposed to listen for the response? Where should this happen in the code?
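For reference, the failing line can be picked apart field by field. The helper below is a minimal host-side sketch in plain C++ and is not part of MySensors. The names `SendLog`, `parseSendLog` and `sendFailed` are hypothetical, and the reading of the four dash-separated numbers as sender, last, next hop and destination is an assumption about the 2.0 debug format.

```cpp
#include <stdio.h>
#include <string.h>

// Fields of one "TSP:MSG:SEND" debug line. The field names are this sketch's
// reading of the 2.0 log format (an assumption), not an official API.
struct SendLog {
  int sender, last, next, destination;  // the four dash-separated ids
  int sensor, command, type;            // s=, c=, t=
  int failedTx;                         // ft=
  char status[8];                       // st= value, e.g. "fail" or "bc"
};

// Hypothetical helper: returns true if `line` contains a parsable SEND entry.
bool parseSendLog(const char *line, SendLog &out) {
  const char *p = strstr(line, "TSP:MSG:SEND ");
  if (p == nullptr) return false;
  int pt, l, sg;  // pt=, l=, sg= are parsed but not kept
  int n = sscanf(p,
                 "TSP:MSG:SEND %d-%d-%d-%d s=%d,c=%d,t=%d,pt=%d,l=%d,sg=%d,ft=%d,st=%7[^:]",
                 &out.sender, &out.last, &out.next, &out.destination,
                 &out.sensor, &out.command, &out.type,
                 &pt, &l, &sg, &out.failedTx, out.status);
  return n == 12;  // all twelve fields must be present
}

// True when the entry reports a failed transmission (st=fail).
bool sendFailed(const SendLog &entry) {
  return strcmp(entry.status, "fail") == 0;
}
```

Applied to the line quoted above, this yields sender 0, destination 10, c=3, t=8 and status "fail". In the MySensors message types, c=3 is an internal command, t=7 is a find-parent request and t=8 is the matching response. So the gateway does answer node 10's request, but the transmission of that answer back to the node is reported as failed.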


  • Admin

    @cimba007 I assume

    wait(100)
    sleep(SLEEP_TIME)
    

    is too short to fully reconnect.

    Try this code instead; it prevents the node from sleeping while the transport is not operational:

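    if (isTransportOK()) {
      // Transport is up: short wait, then sleep as usual
      wait(100);
      sleep(SLEEP_TIME);
    } else {
      // Transport is down: stay awake so the node can finish reconnecting
      wait(SLEEP_TIME);
    }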
    


  • @tekka I will try it at once! I was thinking about something like this too. Thanks for pointing out that using isTransportOK() might be the solution.

    A quick first test is looking very good. 😌

    Is it possible to include this wait in the send function? The self-healing feature of the network would then be completely transparent to the end user. Maybe add a configurable timeout or another parameter to send()?
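Until something like this exists in the library, the idea of a configurable timeout can be approximated in the sketch itself. The helper below is a minimal sketch and not part of MySensors; `waitForTransport`, `timeoutMs` and `stepMs` are hypothetical names. The transport check and the wait function are passed in as callables, so the helper has no library dependency and can be tried on a PC as well.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of MySensors): polls `transportOk` until it
// returns true or roughly `timeoutMs` have been spent inside `waitMs`.
template <typename TransportOk, typename WaitFn>
bool waitForTransport(TransportOk transportOk, WaitFn waitMs,
                      uint32_t timeoutMs, uint32_t stepMs = 50) {
  uint32_t waited = 0;
  while (!transportOk()) {
    if (waited >= timeoutMs) {
      return false;  // gave up; the caller decides whether to sleep anyway
    }
    waitMs(stepMs);  // in a sketch this would call the MySensors wait()
    waited += stepMs;
  }
  return true;  // transport reported OK within the timeout
}
```

In a node sketch the call might look like `waitForTransport([] { return isTransportOK(); }, [](uint32_t ms) { wait(ms); }, timeoutMs);` placed just before `sleep(SLEEP_TIME)`, with a timeout of a few seconds. Unlike a fixed `wait()`, this returns as soon as the transport is back, so a healthy node does not stay awake longer than necessary.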



  • I have the exact same problem: the node never reconnects to gateway after a gateway outage.

    In my case it is an ethernet MQTTClientGateway and a humidity node.

    I have experimented with your suggestion and found that 1000 ms was still not enough for a full reconnect before the node goes to sleep, so I made it 3000 ms. This seems to allow a reconnect every time (at least in my environment).

      // @TODO remove after testing gateway outage
      if (!isTransportOK()) {
        #ifdef MY_DEBUG
        Serial.print("Transport ERROR. Waiting for a proper transport reconnect");
        #endif
        wait(3000);
      }
    
      // Sleep for a while to save energy
      sleep(UPDATE_INTERVAL); 
    

    I am unsure why MySensors doesn't have this self-healing element built in. Maybe it is just a bug, or maybe there is a deeper reason.

