[SOLVED] GatewayESP8266MQTTClient - Recovery failure after gateway outage
-
Hello,
I am using the "GatewayESP8266MQTTClient" and "LightSensor" sketches with default settings.
My system: MySensorsNet 2.0.0, Arduino IDE 1.6.10 hourly (22.07.2016), Windows 8.1 64-bit
To find out how robust this setup is, I forcibly reset the gateway (power off) and waited until the "Light Sensor" node entered the failure state:
**"Light Sensor" node**

    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR
    vcc: 3280
    511, 1017
    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR
    vcc: 3280
    511, 1015
    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR
    vcc: 3280
    511, 1017
    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR
    vcc: 3280
    511, 1015
    !TSP:SEND:TNR
    Send failed!
    TSM:FPAR
    TSP:MSG:SEND 10-10-255-255 s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=bc:
    !TSP:SEND:TNR
    vcc: 3280
    511, 1018
    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR
    vcc: 3280
    511, 1017
    !TSP:SEND:TNR
    Send failed!
    !TSP:SEND:TNR

After the gateway is up again, the "Light Sensor" node still cannot find its way back to the gateway.
And the gateway seems to be doing nothing useful; at least I do not understand what it is doing.

    .scandone
    state: 0 -> 2 (b0)
    state: 2 -> 3 (0)
    state: 3 -> 5 (10)
    add 0
    aid 1
    cnt

    connected with FRITZ!Box Fon WLAN 7390, channel 11
    dhcp client start...
    .............ip:192.168.178.67,mask:255.255.255.0,gw:192.168.178.1
    .IP: 192.168.178.67
    0;255;3;0;9;No registration required
    0;255;3;0;9;Init complete, id=0, parent=0, distance=0, registration=1
    IP: 192.168.178.67
    0;255;3;0;9;Attempting MQTT connection...
    0;255;3;0;9;MQTT connected
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:
    0;255;3;0;9;TSP:MSG:BC
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)
    0;255;3;0;9;TSP:CHKUPL:OK (FLDCTRL)
    0;255;3;0;9;TSP:MSG:GWL OK
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0
    pm open,type:2 0
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:
    0;255;3;0;9;TSP:MSG:BC
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)
    0;255;3;0;9;TSP:CHKUPL:OK
    0;255;3;0;9;TSP:MSG:GWL OK
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:
    0;255;3;0;9;TSP:MSG:BC
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)
    0;255;3;0;9;TSP:CHKUPL:OK
    0;255;3;0;9;TSP:MSG:GWL OK
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0
    0;255;3;0;9;TSP:SANCHK:OK
    0;255;3;0;9;TSP:SANCHK:OK
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:
    0;255;3;0;9;TSP:MSG:BC
    0;255;3;0;9;TSP:MSG:FPAR REQ (sender=10)
    0;255;3;0;9;TSP:CHKUPL:OK
    0;255;3;0;9;TSP:MSG:GWL OK
    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0
    0;255;3;0;9;TSP:MSG:READ 10-10-255 s=255,c=3,t=7,pt=0,l=0,sg=0:

I can see that the gateway is receiving messages from the node, but its replies fail (st=fail), so it never responds properly.
Could anybody enlighten me?
Gateway code = 99.99% vanilla except credentials and MQTT broker (mosquitto)
The "Light Sensor" node code is included only for completeness; ignore the parts that are commented out.
#define MY_NODE_ID 10 <--- manual node id is set!

    #include <Streaming.h>

    /**
     * The MySensors Arduino library handles the wireless radio link and protocol
     * between your home built sensors/actuators and HA controller of choice.
     * The sensors forms a self healing radio network with optional repeaters. Each
     * repeater and gateway builds a routing tables in EEPROM which keeps track of the
     * network topology allowing messages to be routed to nodes.
     *
     * Created by Henrik Ekblad <henrik.ekblad@mysensors.org>
     * Copyright (C) 2013-2015 Sensnology AB
     * Full contributor list: https://github.com/mysensors/Arduino/graphs/contributors
     *
     * Documentation: http://www.mysensors.org
     * Support Forum: http://forum.mysensors.org
     *
     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * version 2 as published by the Free Software Foundation.
     *
     *******************************
     *
     * REVISION HISTORY
     * Version 1.0 - Henrik EKblad
     *
     * DESCRIPTION
     * Example sketch showing how to measue light level using a LM393 photo-resistor
     * http://www.mysensors.org/build/light
     */

    #define MY_NODE_ID 10
    #define MY_BAUD_RATE 9600

    // Enable debug prints to serial monitor
    #define MY_DEBUG

    // Enable and select radio type attached
    #define MY_RADIO_NRF24
    //#define MY_RADIO_RFM69

    #include <SPI.h>
    #include <MySensors.h>

    #define CHILD_ID_LIGHT 0
    #define LIGHT_SENSOR_ANALOG_PIN A3

    unsigned long SLEEP_TIME = 1000; // Sleep time between reads (in milliseconds)

    MyMessage msg(CHILD_ID_LIGHT, V_LIGHT_LEVEL);
    int lastLightLevel;

    //#include "WatchdogAVR.h"
    //typedef WatchdogAVR WatchdogType;
    //WatchdogType Watchdog;

    void setup()
    {
      //Watchdog.disable();
      //int countdownMS = Watchdog.enable();
      //Serial.print("Enabled the watchdog with max countdown of ");
      //Serial.print(countdownMS, DEC);
      //Serial.println(" milliseconds!");
      //Serial.println();
      //Serial.begin(57600);
      Serial.println("setup() begin");

      // LightSensor
      pinMode(A3,INPUT_PULLUP);
      pinMode(A2,OUTPUT);
      digitalWrite(A2,LOW);
    }

    void presentation()
    {
      // Send the sketch version information to the gateway and Controller
      sendSketchInfo("Light Sensor", "1.0");

      // Register all sensors to gateway (they will be created as child devices)
      present(CHILD_ID_LIGHT, S_LIGHT_LEVEL);
    }

    void loop()
    {
      static long vcc = readVcc();
      static int vccpercent = map(vcc,2400,3007,0,100);
      sendBatteryLevel(max(min(vccpercent,100),0),false);
      Serial << "vcc: " << vcc << endl;

      // Required for ack
      //wait(100);
      analogRead(LIGHT_SENSOR_ANALOG_PIN);
      int lightLevel_raw = analogRead(LIGHT_SENSOR_ANALOG_PIN);
      int lightLevel = (1023-lightLevel_raw)/10.23; // as of 1023 !!
      lightLevel = 511;
      Serial.print(lightLevel);
      Serial.print(", ");
      Serial.println(lightLevel_raw);

      //if (lightLevel != lastLightLevel) {
      if(!send(msg.set(lightLevel),true))
        Serial.println("Send failed!");
      lastLightLevel = lightLevel;
      //}

      //Serial.print("RETR=");
      //Serial.println((0x0F & RF24_retrycount()));
      wait(100);
      sleep(SLEEP_TIME);
    }

    // https://forum.mysensors.org/topic/3463/m_ack_variable-or-m_set_variable/2
    void receive(const MyMessage &message) {
      if (message.isAck()) {
        Serial.println("This is an ack from gateway");
        //Serial.println("Reset Watchdog!");
        //Watchdog.reset();
      }
    }

This line gives me a headache:

    0;255;3;0;9;!TSP:MSG:SEND 0-0-10-10 s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=fail:0

After the node requests a parent, isn't the node supposed to listen for the response? Where should this happen in the code?
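
For anyone reading along, the fields in that debug line can be pulled apart mechanically. What follows is a minimal host-side C++ sketch (not MySensors code) that splits a TSP:MSG:SEND line into its parts. The struct, the function name parseTspSend and the field names are made up for illustration. The reading of the fields (sender-last-next-destination, then s=sensor, c=command, t=type, pt=payload type, l=length, sg=signed, ft=failed transmissions, st=status) is my understanding of the 2.0 debug output and should be treated as an assumption.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical container for one "TSP:MSG:SEND" debug line.
// Field names are an assumed interpretation of the MySensors 2.0 log format.
struct TspSendLine {
  int sender, last, next, destination;  // routing: sender-last-next-destination
  int sensor, command, type;            // s=, c=, t=
  int payloadType, length;              // pt=, l=
  int signedFlag, failedTx;             // sg=, ft=
  char status[8];                       // st= (e.g. "ok", "fail", "bc")
};

// Returns true if `line` contains a complete TSP:MSG:SEND record.
bool parseTspSend(const char* line, TspSendLine& out) {
  const char* tag = "TSP:MSG:SEND ";
  const char* p = std::strstr(line, tag);
  if (p == nullptr) {
    return false;
  }
  p += std::strlen(tag);
  int matched = std::sscanf(
      p, "%d-%d-%d-%d s=%d,c=%d,t=%d,pt=%d,l=%d,sg=%d,ft=%d,st=%7[a-z]",
      &out.sender, &out.last, &out.next, &out.destination,
      &out.sensor, &out.command, &out.type,
      &out.payloadType, &out.length,
      &out.signedFlag, &out.failedTx, out.status);
  return matched == 12;  // 11 integers plus the status word
}
```

Applied to the gateway line quoted above, this yields sender 0, destination 10, command 3, type 8 and status "fail". If I read the internal message types correctly, c=3,t=7 is the find-parent request (the log itself labels it FPAR REQ) and c=3,t=8 is the response, so it is the gateway's reply to node 10 that fails to go out.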
-
@cimba007 I assume

    wait(100);
    sleep(SLEEP_TIME);

is too short to fully reconnect.
Try this code instead (it prevents the node from sleeping while the transport is not operational):

    if (isTransportOK()) {
      wait(100);
      sleep(SLEEP_TIME);
    } else {
      wait(SLEEP_TIME);
    }

-
@tekka I will try it at once! I was thinking about something like this too. Thanks for pointing out that using isTransportOK() might be the solution.
A quick first test is looking very good.
Would it be possible to include this wait in the send function? The self-healing feature of the network would then be completely transparent to the end user. Maybe add a configurable timeout or another parameter to send()?
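
Until something like that exists in the library, the idea can be sketched in user code. The helper below is hypothetical (sendWithRecovery is not a MySensors function): it polls a "transport OK" check in small steps up to a configurable timeout and only then attempts the send. The three callables are passed in so the sketch compiles on its own; in a real node they would presumably wrap isTransportOK(), wait() and send().

```cpp
#include <stdint.h>

// Hypothetical helper, not part of MySensors.
// transportOk: returns true when the transport is operational
// waitMs:      blocks for the given number of milliseconds while
//              still processing incoming messages (like wait())
// sendOnce:    performs one send attempt and returns its result
template <typename TransportOk, typename WaitMs, typename SendOnce>
bool sendWithRecovery(TransportOk transportOk, WaitMs waitMs, SendOnce sendOnce,
                      uint32_t timeoutMs, uint32_t stepMs = 100) {
  if (stepMs == 0) {
    stepMs = 1;  // avoid an endless loop with a zero step
  }
  uint32_t waited = 0;
  // Stay awake and give the transport time to find its parent again.
  while (!transportOk() && waited < timeoutMs) {
    waitMs(stepMs);
    waited += stepMs;
  }
  if (!transportOk()) {
    return false;  // transport did not recover within the timeout
  }
  return sendOnce();
}

// Possible use inside loop() of a node sketch (assumes the MySensors
// functions used elsewhere in this thread):
//
//   bool ok = sendWithRecovery(
//       [] { return isTransportOK(); },
//       [](uint32_t ms) { wait(ms); },
//       [&] { return send(msg.set(lightLevel), true); },
//       3000);
```

The timeout is a parameter because the time needed for a reconnect seems to depend on the setup; the right value has to be found by testing.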
-
I have the exact same problem: the node never reconnects to the gateway after a gateway outage.
In my case it is an Ethernet MQTTClientGateway and a humidity node.
I have experimented with your suggestion and found that 1000 ms was still not enough for a full reconnect before the node goes to sleep, so I made it 3000 ms. This seems to allow a reconnect every time (at least in my environment).

    // @TODO remove after testing gateway outage
    if (!isTransportOK()) {
    #ifdef MY_DEBUG
      Serial.print("Transport ERROR. Waiting for a proper transport reconnect");
    #endif
      wait(3000);
    }

    // Sleep for a while to save energy
    sleep(UPDATE_INTERVAL);

I am unsure why MySensors doesn't have this self-healing element built-in. Maybe it is only a bug, maybe it has some deeper reason.