[SOLVED] ESP8266 MQTT Gateway - Non-Stable Operation, Interrupts after 2-4 hrs



  • I’m still struggling with my ESP8266 (NodeMCU), and I really hope someone can offer some insight.

    After many attempts I managed to get a working MQTT Gateway. The configuration is really simple: I have one BME280 connected, and I’m sending its readings to the MQTT Broker installed on a Raspberry Pi 3.

    /**
     * The MySensors Arduino library handles the wireless radio link and protocol
     * between your home built sensors/actuators and HA controller of choice.
     * The sensors form a self-healing radio network with optional repeaters. Each
     * repeater and gateway builds a routing table in EEPROM which keeps track of the
     * network topology, allowing messages to be routed to nodes.
     *
     * Created by Henrik Ekblad <henrik.ekblad@mysensors.org>
     * Copyright (C) 2013-2015 Sensnology AB
     * Full contributor list: https://github.com/mysensors/Arduino/graphs/contributors
     *
     * Documentation: http://www.mysensors.org
     * Support Forum: http://forum.mysensors.org
     *
     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * version 2 as published by the Free Software Foundation.
     *
     *******************************
     *
     * REVISION HISTORY
     * Version 1.0 - Henrik Ekblad
     *
     * DESCRIPTION
     * The ESP8266 MQTT gateway sends radio network (or locally attached sensors) data to your MQTT broker.
     * The node also listens to MY_MQTT_TOPIC_PREFIX and sends out those messages to the radio network
     *
     * LED purposes:
     * - To use the feature, uncomment any of the MY_DEFAULT_xx_LED_PINs in your sketch
     * - RX (green) - blink fast on radio message received. In inclusion mode will blink fast only on presentation received
     * - TX (yellow) - blink fast on radio message transmitted. In inclusion mode will blink slowly
     * - ERR (red) - fast blink on transmission error or receive CRC error
     *
     * See https://www.mysensors.org/build/connect_radio for wiring instructions.
     *
     * If you are using a "barebone" ESP8266, see
     * https://www.mysensors.org/build/esp8266_gateway#wiring-for-barebone-esp8266
     *
     * Inclusion mode button:
     * - Connect GPIO5 (=D1) via switch to GND ('inclusion switch')
     *
     * Hardware SHA204 signing is currently not supported!
     *
     * Make sure to fill in your ssid and WiFi password below for ssid & pass.
     */
    
    // Enable debug prints to serial monitor
    #define MY_DEBUG
    
    // Use a bit lower baudrate for serial prints on ESP8266 than default in MyConfig.h
    #define MY_BAUD_RATE 115200
    
    #define MY_GATEWAY_MQTT_CLIENT
    #define MY_GATEWAY_ESP8266
    
    // Set this node's subscribe and publish topic prefix
    #define MY_MQTT_PUBLISH_TOPIC_PREFIX "domoticz/in/fh1-interno"
    #define MY_MQTT_SUBSCRIBE_TOPIC_PREFIX "domoticz/out/fh1-interno"
    
    // Set MQTT client id
    #define MY_MQTT_CLIENT_ID "INTERNO-FH1"
    
    // Set WIFI SSID and password
    #define MY_WIFI_SSID "MYSSID"
    #define MY_WIFI_PASSWORD "MYPASS"
    
    // Set the hostname for the WiFi Client. This is the hostname
    // it will pass to the DHCP server if not static.
     #define MY_HOSTNAME "INTERNO-FH1"
    
    // Enable MY_IP_ADDRESS here if you want a static ip address (no DHCP)
    #define MY_IP_ADDRESS 192,168,1,55
    
    // If using static ip you can define Gateway and Subnet address as well
    #define MY_IP_GATEWAY_ADDRESS 192,168,1,1
    #define MY_IP_SUBNET_ADDRESS 255,255,255,0
    
    // MQTT broker ip address.
    #define MY_CONTROLLER_IP_ADDRESS 192, 168, 1, 39
    
    //MQTT broker if using URL instead of ip address.
    // #define MY_CONTROLLER_URL_ADDRESS ".mosquitto.org"
    
    // The MQTT broker port to open
    #define MY_PORT 1883
    
    #define MY_NODE_ID 11
    
    #include <ESP8266WiFi.h>
    #include <MySensors.h>
    #include <SPI.h>
    #include <Wire.h>
    #include <BME280_MOD-1022.h>
    
    #define BARO_CHILD 13
    #define TEMP_CHILD 14
    #define HUM_CHILD 15
    
    const float ALTITUDE = 124; // <-- adapt this value to your location's altitude (in m). Use your smartphone GPS to get an accurate value!
    
    //const unsigned long SLEEP_TIME = 6e+7; 
    
    const char *weather[] = { "stable", "sunny", "cloudy", "unstable", "thunderstorm", "unknown" };
    enum FORECAST
    {
      STABLE = 0,     // "Stable Weather Pattern"
      SUNNY = 1,      // "Slowly rising Good Weather", "Clear/Sunny "
      CLOUDY = 2,     // "Slowly falling L-Pressure ", "Cloudy/Rain "
      UNSTABLE = 3,   // "Quickly rising H-Press",     "Not Stable"
      THUNDERSTORM = 4, // "Quickly falling L-Press",    "Thunderstorm"
      UNKNOWN = 5     // "Unknown (More Time needed)"
    };
    
    float lastPressure = -1;
    float lastTemp = -1;
    float lastHum = -1;
    int lastForecast = -1;
    
    const int LAST_SAMPLES_COUNT = 5;
    float lastPressureSamples[LAST_SAMPLES_COUNT];
    
    
    // CONVERSION_FACTOR converts hPa to kPa in the forecast algorithm:
    // dividing hPa by 10 gives kPa, so the rate of change comes out in kPa/h
    #define CONVERSION_FACTOR (1.0/10.0)
    
    int minuteCount = 0;
    bool firstRound = true;
    // average value is used in forecast algorithm.
    float pressureAvg;
    // average after 2 hours is used as reference value for the next iteration.
    float pressureAvg2;
    
    float dP_dt;
    boolean metric;
    MyMessage tempMsg(TEMP_CHILD, V_TEMP);
    MyMessage humMsg(HUM_CHILD, V_HUM);
    MyMessage pressureMsg(BARO_CHILD, V_PRESSURE);
    MyMessage forecastMsg(BARO_CHILD, V_FORECAST);
    
    
    float getLastPressureSamplesAverage()
    {
      float lastPressureSamplesAverage = 0;
      for (int i = 0; i < LAST_SAMPLES_COUNT; i++)
      {
        lastPressureSamplesAverage += lastPressureSamples[i];
      }
      lastPressureSamplesAverage /= LAST_SAMPLES_COUNT;
    
      return lastPressureSamplesAverage;
    }
    
    
    // Algorithm found here
    // http://www.freescale.com/files/sensors/doc/app_note/AN3914.pdf
    // Pressure in hPa -->  forecast done by calculating kPa/h
    int sample(float pressure)
    {
      // Calculate the average of the last n minutes.
      int index = minuteCount % LAST_SAMPLES_COUNT;
      lastPressureSamples[index] = pressure;
    
      minuteCount++;
      if (minuteCount > 185)
      {
        minuteCount = 6;
      }
    
      if (minuteCount == 5)
      {
        pressureAvg = getLastPressureSamplesAverage();
      }
      else if (minuteCount == 35)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) // first time initial 3 hour
        {
          dP_dt = change * 2; // note this is for t = 0.5hour
        }
        else
        {
          dP_dt = change / 1.5; // divide by 1.5 as this is the difference in time from 0 value.
        }
      }
      else if (minuteCount == 65)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) //first time initial 3 hour
        {
          dP_dt = change; //note this is for t = 1 hour
        }
        else
        {
          dP_dt = change / 2; //divide by 2 as this is the difference in time from 0 value
        }
      }
      else if (minuteCount == 95)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) // first time initial 3 hour
        {
          dP_dt = change / 1.5; // note this is for t = 1.5 hour
        }
        else
        {
          dP_dt = change / 2.5; // divide by 2.5 as this is the difference in time from 0 value
        }
      }
      else if (minuteCount == 125)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        pressureAvg2 = lastPressureAvg; // store for later use.
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) // first time initial 3 hour
        {
          dP_dt = change / 2; // note this is for t = 2 hour
        }
        else
        {
          dP_dt = change / 3; // divide by 3 as this is the difference in time from 0 value
        }
      }
      else if (minuteCount == 155)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) // first time initial 3 hour
        {
          dP_dt = change / 2.5; // note this is for t = 2.5 hour
        }
        else
        {
          dP_dt = change / 3.5; // divide by 3.5 as this is the difference in time from 0 value
        }
      }
      else if (minuteCount == 185)
      {
        float lastPressureAvg = getLastPressureSamplesAverage();
        float change = (lastPressureAvg - pressureAvg) * CONVERSION_FACTOR;
        if (firstRound) // first time initial 3 hour
        {
          dP_dt = change / 3; // note this is for t = 3 hour
        }
        else
        {
          dP_dt = change / 4; // divide by 4 as this is the difference in time from 0 value
        }
        pressureAvg = pressureAvg2; // Use the pressure from the 2-hour mark as the new time-0 reference once 3 hours have passed.
        firstRound = false; // The first 3-hour window is over (firstRound is initialized to true above).
      }
    
      int forecast = UNKNOWN;
      if (minuteCount < 35 && firstRound) //if time is less than 35 min on the first 3 hour interval.
      {
        forecast = UNKNOWN;
      }
      else if (dP_dt < (-0.25))
      {
        forecast = THUNDERSTORM;
      }
      else if (dP_dt > 0.25)
      {
        forecast = UNSTABLE;
      }
      else if ((dP_dt > (-0.25)) && (dP_dt < (-0.05)))
      {
        forecast = CLOUDY;
      }
      else if ((dP_dt > 0.05) && (dP_dt < 0.25))
      {
        forecast = SUNNY;
      }
      else if ((dP_dt >(-0.05)) && (dP_dt < 0.05))
      {
        forecast = STABLE;
      }
      else
      {
        forecast = UNKNOWN;
      }
    
      return forecast;
    }
    
    
    void setup() {
      metric = getControllerConfig().isMetric;  // was getConfig().isMetric; before MySensors v2.1.1
      Wire.begin(); // I2C master on the default SDA/SCL pins; Wire.begin(sda, scl) selects other pins
    }
    
    void presentation()  {
      // Send the sketch version information to the gateway and Controller
      sendSketchInfo("BME280 Sensor", "1.6");
    
      // Register sensors to gw (they will be created as child devices)
      present(BARO_CHILD, S_BARO);
      present(TEMP_CHILD, S_TEMP);
      present(HUM_CHILD, S_HUM);
    }
    
    // Loop
    void loop() {
      
      // need to read the NVM compensation parameters
      BME280.readCompensationParams();
    
      /*
      // After taking the measurement the chip goes back to sleep, use when battery powered.
      // Oversampling settings (os1x, os2x, os4x, os8x or os16x).
      BME280.writeFilterCoefficient(fc_16);       // IIR Filter coefficient, higher numbers avoid sudden changes to be accounted for (such as slamming a door)
      BME280.writeOversamplingPressure(os16x);    // pressure x16
      BME280.writeOversamplingTemperature(os8x);  // temperature x8
      BME280.writeOversamplingHumidity(os8x);     // humidity x8
    
      BME280.writeMode(smForced);                 // Forced sample.  After taking the measurement the chip goes back to sleep
      
    */
      // Normal mode for regular automatic samples
      BME280.writeStandbyTime(tsb_0p5ms);         // tsb = 0.5ms
      BME280.writeFilterCoefficient(fc_16);       // IIR Filter coefficient 16
      BME280.writeOversamplingPressure(os16x);    // pressure x16
      BME280.writeOversamplingTemperature(os8x);  // temperature x8
      BME280.writeOversamplingHumidity(os8x);     // humidity x8
      BME280.writeMode(smNormal);
      
      while (1) {
        // Just to be sure, wait until the sensor is done measuring
        while (BME280.isMeasuring()) {
        }
    
        // Read out the data - must do this before calling the getxxxxx routines
        BME280.readMeasurements();
    
        float temperature = BME280.getTemperatureMostAccurate();                    // must get temp first
        float humidity = BME280.getHumidityMostAccurate();
        float pressure_local = BME280.getPressureMostAccurate();                    // Get pressure at current location
        float pressure = pressure_local/pow((1.0 - ( ALTITUDE / 44330.0 )), 5.255); // Adjust to sea level pressure using user altitude
        int forecast = sample(pressure);
    
        if (!metric)
        {
          // Convert to fahrenheit
          temperature = temperature * 9.0 / 5.0 + 32.0;
        }
        send(tempMsg.set(temperature, 2));
        send(pressureMsg.set(pressure, 3));
        send(humMsg.set(humidity, 2));
        send(forecastMsg.set(weather[forecast]));
        delay(5000);
      }
    }
    

    The whole thing works flawlessly for a couple of hours, then the MQTT Broker stops receiving messages: in the Serial Monitor I can still see readings from the attached sensor, but the Gateway does not seem to be delivering them to the MQTT Broker.

    According to the serial output, though, the messages are being sent to the MQTT Broker.
    I’m using Mosquitto as the Broker, which in my experience has proven quite reliable, and I can’t work out where the problem could be.

    I tried every combination of sleep(), delay(), and even ESP.deepSleep() in the sketch, but nothing changed.
    (I know, I know, deep sleep is not supposed to be active on a gateway. It was just a test.)


  • Hardware Contributor

    Hi,

    Could you show us your logs, please?



  • Hi,

    Thank you very much for your help.

    I have just reset the sensor again, trying to figure out what the problem is (I've changed the MQTT topic just for this test).

    This is the MQTT subscription:

    2018-08-10 15:04:37.862 Status: MQTT: Connecting to 192.168.1.39:1883
    2018-08-10 15:04:37.963 Status: MQTT: connected to: 192.168.1.39:1883
    2018-08-10 15:04:37.963 Status: MySensorsMQTT: connected to: 192.168.1.39:1883
    2018-08-10 15:04:38.063 Status: MQTT: Subscribed
    2018-08-10 15:04:44.821 Status: MQTT: Worker stopped.
    
    

    This is MQTT actually working and receiving updates:

    2018-08-10 15:08:59.432 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 29.00
    2018-08-10 15:08:59.436 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:08:59.532 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 57.60
    2018-08-10 15:08:59.536 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:08.283 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 29.00
    2018-08-10 15:09:08.287 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:08.383 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.80
    2018-08-10 15:09:08.387 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:17.392 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.90
    2018-08-10 15:09:17.397 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:17.493 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 57.70
    2018-08-10 15:09:17.497 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:26.326 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.90
    2018-08-10 15:09:26.329 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:26.426 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 58.00
    2018-08-10 15:09:26.429 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:35.436 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.80
    2018-08-10 15:09:35.440 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:35.536 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 57.90
    2018-08-10 15:09:35.541 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:44.364 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.80
    2018-08-10 15:09:44.369 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:44.465 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 57.90
    2018-08-10 15:09:44.469 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:53.474 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.80
    2018-08-10 15:09:53.479 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:09:53.575 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 58.00
    2018-08-10 15:09:53.580 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:10:02.408 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 28.80
    2018-08-10 15:10:02.412 (fh2-esterno) Temp + Humidity (Esterno)
    2018-08-10 15:10:02.508 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 58.00
    2018-08-10 15:10:02.513 (fh2-esterno) Temp + Humidity (Esterno)
    
    

    As soon as I get the "freeze" I'll post the log.



  • Sorry for the delay, here is the "freeze" log.

    I have realized that there is an intermediate step before the sensor data is lost: an anomalous message appears in MQTT, like the two posted below.

    The message value seems to be random, but each time this happens the controller marks the sensor as “unresponsive”, even though the Broker is still receiving updates.

    I am using Domoticz as Controller, and I noticed that when this message appears I lose the “aggregated data” from the sensor (which appears in My Devices as a “Temp+Humidity WTGR8000”), but I still receive data from the same sensor under two different devices (a “Temp La Crosse TX3” and a “Humidity LaCrosse TX3”).

    2018-08-11 17:03:26.468 (fh2-esterno) Humidity (Hum)
    2018-08-11 17:03:35.476 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/255/3/0/22, Message: 144446
    2018-08-11 17:03:35.576 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 30.00
    2018-08-11 17:03:35.578 (fh2-esterno) Temp (Temp)
    2018-08-11 17:03:35.676 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.70
    2018-08-11 17:03:35.678 (fh2-esterno) Humidity (Hum)
    2018-08-11 17:03:44.422 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 30.00
    2018-08-11 17:03:44.424 (fh2-esterno) Temp (Temp)
    2018-08-11 17:03:44.522 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.60
    2018-08-11 17:03:44.524 (fh2-esterno) Humidity (Hum)
    

    And after a RESET the "random value" changed:

    2018-08-11 17:04:47.574 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/255/3/0/22, Message: 216634
    2018-08-11 17:04:47.675 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 30.00
    2018-08-11 17:04:47.677 (fh2-esterno) Temp (Temp)
    2018-08-11 17:04:47.775 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.80
    2018-08-11 17:04:47.777 (fh2-esterno) Humidity (Hum)
    2018-08-11 17:04:56.610 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 30.00
    2018-08-11 17:04:56.612 (fh2-esterno) Temp (Temp)
    2018-08-11 17:04:56.711 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.50
    2018-08-11 17:04:56.713 (fh2-esterno) Humidity (Hum)
    2018-08-11 17:05:05.719 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/12/1/0/0, Message: 30.00
    2018-08-11 17:05:05.721 (fh2-esterno) Temp (Temp)
    2018-08-11 17:05:05.820 MySensorsMQTT: Topic: domoticz/in/fh2-esterno/0/11/1/0/1, Message: 56.50
    2018-08-11 17:05:05.822 (fh2-esterno) Humidity (Hum)
    
    

    For reference, this screenshot shows the "separated" devices issue I have noticed in my Controller:
    [image: alt text]
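
One way to read the "anomalous" line in the log above: a MySensors MQTT topic has the form prefix/node-id/child-sensor-id/command/ack/type, so domoticz/in/fh2-esterno/0/255/3/0/22 would be node 0, child 255, command 3 (internal), type 22. If I read the MySensors serial API correctly, internal type 22 is I_HEARTBEAT_RESPONSE, but that mapping is an assumption and was not confirmed in this thread. The payload also looks less random than it seems: the two values (144446 and 216634) differ by 72188, and the two log lines are about 72.1 s apart, which would fit a millisecond counter. The sketch below is plain C++ rather than Arduino code, and MySensorsTopic, parseTopic and isHeartbeatResponse are hypothetical helper names used only for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Fields of a MySensors MQTT topic: <prefix>/<node>/<child>/<command>/<ack>/<type>.
struct MySensorsTopic {
    std::string prefix;  // may itself contain '/', e.g. "domoticz/in/fh2-esterno"
    int node;
    int child;
    int command;         // 1 = set, 3 = internal
    int ack;
    int type;
};

// Split the topic on '/' and read the last five segments as small integers.
// Returns false if there are fewer than six segments or a field is not numeric.
bool parseTopic(const std::string& topic, MySensorsTopic& out) {
    std::vector<std::string> parts;
    std::stringstream stream(topic);
    std::string part;
    while (std::getline(stream, part, '/')) {
        parts.push_back(part);
    }
    if (parts.size() < 6) {
        return false;
    }

    const std::size_t first = parts.size() - 5;  // index of the node-id segment
    int fields[5];
    for (std::size_t i = 0; i < 5; ++i) {
        const std::string& s = parts[first + i];
        if (s.empty() || s.size() > 3 ||
            s.find_first_not_of("0123456789") != std::string::npos) {
            return false;
        }
        fields[i] = std::stoi(s);
    }

    // Everything before the node id is the configured topic prefix.
    out.prefix.clear();
    for (std::size_t i = 0; i < first; ++i) {
        if (i > 0) {
            out.prefix += '/';
        }
        out.prefix += parts[i];
    }
    out.node = fields[0];
    out.child = fields[1];
    out.command = fields[2];
    out.ack = fields[3];
    out.type = fields[4];
    return true;
}

// Assumption: command 3 is "internal" and internal type 22 is I_HEARTBEAT_RESPONSE.
bool isHeartbeatResponse(const MySensorsTopic& t) {
    return t.command == 3 && t.type == 22;
}
```

Calling parseTopic on domoticz/in/fh2-esterno/0/255/3/0/22 fills in node 0, child 255, command 3, ack 0 and type 22, so isHeartbeatResponse returns true. The regular temperature topic domoticz/in/fh2-esterno/0/12/1/0/0 is a command 1 (set) message and does not match.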


  • Mod

    @neo-mod interesting. Could you check whether defining MY_DEBUG_VERBOSE_GATEWAY sheds some light on the problem? (Info on how to define it: https://www.mysensors.org/build/raspberry#advanced )

    There are some debug outputs in the MQTT code that might be useful. They are activated by defining MY_DEBUG_VERBOSE_GATEWAY.
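
In an Arduino sketch (as opposed to the Raspberry Pi build that the link describes), MySensors configuration macros are set with #define lines placed before the MySensors include. A minimal fragment of that, assuming the gateway sketch from the first post and a MySensors version that recognizes this macro:

```cpp
// Enable debug prints to the serial monitor (already present in the sketch).
#define MY_DEBUG
// Additionally enable the verbose gateway/MQTT transport debug output.
#define MY_DEBUG_VERBOSE_GATEWAY

// All MY_* configuration defines must appear before this include.
#include <MySensors.h>
```

The extra output then shows up on the serial monitor at the baud rate set by MY_BAUD_RATE.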



  • After lengthy troubleshooting (and thanks to the debug output) I think I have found at least one problem: the DHCP lease renewal was interfering with correct ESP8266 operation.
    Every time the router renewed the lease, the board was unable to report to or reconnect to the MQTT Broker correctly.

    I have also discovered that on some occasions, even though the connection between Gateway and Broker was re-established correctly, the Controller was not acknowledging data received from the Broker. In that case "rebooting" the controller was enough to establish a new connection (with Domoticz it was simply a matter of disabling and then re-enabling the MySensors hardware).

    With the DHCP lease time set to “Infinity” this specific problem has disappeared, but I still feel the project is quite unstable: I will continue working on it in “debug mode” and will hopefully post an update on my findings.
    Thank you once again for the valuable support!
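
As an alternative to an infinite lease, a node can be taken out of DHCP entirely with the static-address defines that already appear in the sketch in the first post. The fragment below only restates that configuration: the addresses are copied from that sketch and would need adapting, and whether this removes the reconnect problem described here is an assumption that was not tested in this thread (it would only matter if the affected node was still using DHCP).

```cpp
// Static addressing for the ESP8266 gateway (values copied from the first post's sketch).
// These defines must come before #include <MySensors.h>.
#define MY_IP_ADDRESS 192,168,1,55
#define MY_IP_GATEWAY_ADDRESS 192,168,1,1
#define MY_IP_SUBNET_ADDRESS 255,255,255,0
```

With MY_IP_ADDRESS defined the gateway does not request a lease, so there is no renewal to interrupt it. The chosen address should sit outside the router's DHCP pool to avoid conflicts.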

