Watchdog on Ethernet Gateway


  • Contest Winner

    I'm still frustrated that my Ethernet Gateway will occasionally lock up...

    I can't predict it. I have been logging it, but I still cannot find the cause or culprit.

    So, I was thinking about attaching an ATtiny45 to it that constantly checks for a pulse and can (hardware) reset the gateway if the signal is missing for a certain timeout. The less than $1.00 cost seems like reasonable insurance to keep things chugging along.

    Has anyone already done something like this (@axillent , @hek calling on your experience)? I've searched here with no luck...
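
    A minimal sketch of the timeout logic such an external watchdog could run, assuming the gateway toggles a spare pin as its pulse. `HeartbeatMonitor`, `pulse()` and `expired()` are hypothetical names, and the time values are passed in so the logic can be tried off-target; on the ATtiny they would come from millis().

```cpp
#include <cstdint>

// Hypothetical heartbeat monitor for an external watchdog MCU (e.g. an ATtiny45).
// All names here are illustrative; none come from MySensors.
struct HeartbeatMonitor {
  uint32_t timeoutMs;    // how long the gateway may stay silent
  uint32_t lastPulseMs;  // time of the most recent pulse

  // Call whenever a pulse edge is seen on the heartbeat pin.
  void pulse(uint32_t nowMs) { lastPulseMs = nowMs; }

  // True once the gateway has been silent for longer than timeoutMs.
  // Unsigned subtraction keeps this correct across millis() rollover.
  bool expired(uint32_t nowMs) const {
    return static_cast<uint32_t>(nowMs - lastPulseMs) > timeoutMs;
  }
};
```

    On the ATtiny45, the main loop would call pulse(millis()) on each edge of the heartbeat pin and briefly pull the gateway's reset line low once expired(millis()) turns true. The timeout value is a free choice.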


  • Mod

    @BulldogLowell which version of EthernetGateway are you running?
    does ping stop responding?

    I'm running 1.4.1 without any issues

    the atmega contains a hardware watchdog, so there is no actual need for an external one.
    With mysensors 1.3 I made an AVR version with support for Atmel's watchdog
    I think it would be a good idea to add watchdog support to the official arduino version of the MySensors library.


  • Contest Winner

    @axillent said:

    @BulldogLowell which version of EthernetGateway are you running?
    does ping stop responding?

    I'm running 1.4.1 without any issues

    the atmega contains a hardware watchdog, so there is no actual need for an external one.
    With mysensors 1.3 I made an AVR version with support for Atmel's watchdog
    I think it would be a good idea to add watchdog support to the official arduino version of the MySensors library.

    Also 1.4.1. While it vastly improved my performance across the board, I still get the occasional lock-up (every two to three weeks). I can ping the gateway, but radio communication is kaput.

    I agree that we should include the watchdog functionality in both the Serial and Ethernet versions of the gateway code. What can I do to help?

    I'll try the:

    wdt_enable(WDTO_8S);
    

    and we shall see if it is the arduino locking up or the communication somewhere...


  • Mod

    @BulldogLowell wdt_enable alone is not sufficient

    first of all, you need a bootloader with watchdog support
    the official bootloader does not work with the watchdog
    optiboot works
    the other alternative is to remove the bootloader, but then you will need an ISP programmer to upload sketches

    second, wdt_reset() needs to be embedded in many places across the sketch and library sources


  • Contest Winner

    @axillent said:

    second, wdt_reset() needs to be embedded in many places across the sketch and library sources

    OK, so for now it's a hardware plan!


  • Mod

    @BulldogLowell why not play around with your own copy of the sources?

    with success you can be a contributor of this nice feature


  • Admin

    @axillent said:

    second, wdt_reset() needs to be embedded in many places across the sketch and library sources

    It's in general good practice to enable the watchdog to avoid deadlocks. What works best in my experience is to set the wdt to 8s and reset the wdt timer at the beginning of the main loop (as long as one pass through the loop takes at most 8s).

    Important: Avoid blocking functions and delay(). Replace delay() with mysensors::wait(), where the wdt is reset.
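
    The idea of a delay replacement that keeps the watchdog fed can be sketched as follows. This is not the MySensors implementation; `waitFeeding` and its two callback parameters are hypothetical, standing in for millis() and wdt_reset() so the loop can be run off-target.

```cpp
#include <cstdint>

// Hypothetical watchdog-friendly replacement for delay().
// now  : returns the current time in ms (millis() on an Arduino)
// feed : resets the watchdog (wdt_reset() on an AVR)
void waitFeeding(uint32_t durationMs, uint32_t (*now)(), void (*feed)()) {
  const uint32_t start = now();
  // Unsigned subtraction stays correct when the ms counter rolls over.
  while (static_cast<uint32_t>(now() - start) < durationMs) {
    feed();  // keep the watchdog from expiring while we wait
  }
}
```

    Any wait that could approach the watchdog timeout would need this shape; a plain delay() of the same length would let the watchdog expire.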

    The problems with the W5100 ethernet shield are discussed everywhere. Resetting the arduino won't help in that case; what helps is an external watchdog that power-cycles the shield...


  • Contest Winner

    @axillent said:

    you can be a contributor of this nice feature

    OK, I'll give it a stab.

    @tekka said:

    What works best in my experience is to set the wdt to 8s and reset the wdt timer at the beginning of the main loop (as long as one pass through the loop takes at most 8s).

    Do you mean generally, or are you already using it in this application?


  • Admin

    @BulldogLowell both, I'm using the wdt with Optiboot and MYSBootloader. Reading MCUSR during bootloading will give a hint on the reset cause...
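
    Decoding the reset cause from MCUSR could look like the sketch below. The bit positions are those of the ATmega328P reset flags (PORF, EXTRF, BORF and WDRF in <avr/io.h>); the enum, the constant names and `resetCause()` are hypothetical, and the register value is passed in as a plain byte so the function can be tried off-target.

```cpp
#include <cstdint>

// Bit positions of the reset flags in MCUSR on the ATmega328P.
// On the target these are PORF, EXTRF, BORF and WDRF from <avr/io.h>;
// they are repeated under other names so this compiles on a PC.
constexpr uint8_t kPowerOnBit  = 0;  // PORF
constexpr uint8_t kExternalBit = 1;  // EXTRF
constexpr uint8_t kBrownOutBit = 2;  // BORF
constexpr uint8_t kWatchdogBit = 3;  // WDRF

enum class ResetCause { Watchdog, BrownOut, External, PowerOn, Unknown };

// Hypothetical decoder: maps a saved MCUSR value to one reset cause.
// More than one flag can be set; the check order here is a choice.
ResetCause resetCause(uint8_t mcusr) {
  if (mcusr & (1u << kWatchdogBit)) return ResetCause::Watchdog;
  if (mcusr & (1u << kBrownOutBit)) return ResetCause::BrownOut;
  if (mcusr & (1u << kExternalBit)) return ResetCause::External;
  if (mcusr & (1u << kPowerOnBit))  return ResetCause::PowerOn;
  return ResetCause::Unknown;
}
```

    On the target, the value passed in would be MCUSR, read early and then cleared. A bootloader may clear MCUSR before the sketch starts, in which case the sketch only sees Unknown.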


  • Admin

    Here is a simple sketch:

    #include <avr/wdt.h>
    
    void setup() {
      // set watchdog to 8s
      wdt_enable(WDTO_8S);
    }
    
    void loop() {
      // watchdog reset
      wdt_reset();
      
      // do some stuff that does not take longer than 8s
      // else watchdog will trigger
    }

  • Contest Winner

    @tekka said:

    Here is a simple sketch:

    yes, got that far 😉 I thought you meant you had it working on a gateway.

    Working with the gateway (as @axillent said), with all of its nuances, and making it extensible for (the majority of?) the standard sensor sketches would be the challenge.


  • Mod

    @BulldogLowell setting the watchdog to time out after 8 seconds can cover you in most situations without a need to dig into the library source

    But this will not cover all cases: it is possible that your arduino will be reset in a normal situation where no reset is needed. The same can happen, though, if you attach an external watchdog as you describe in your first post here


  • Contest Winner

    @axillent said:

    But this will not cover all cases: it is possible that your arduino will be reset in a normal situation where no reset is needed. The same can happen, though, if you attach an external watchdog as you describe in your first post here

    I have a lot more flexibility with hardware, I believe... there is an (8s) limit on the Atmel watchdog, no?


  • Mod

    @BulldogLowell said:

    I have a lot more flexibility with hardware, I believe... there is an (8s) limit on the Atmel watchdog, no?

    it is a limit of the atmega328, other MCUs can differ

    but it is still possible to get a longer period by activating the WDT ISR handler
    this is not the best practice, but it will work
    the best practice is to set a very short timeout and to put wdt_reset() at all required points
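
    The longer-period approach can be sketched as a counter of watchdog interrupts. `LongWatchdog` is a hypothetical helper, written as plain C++ so the counting can be checked off-target; on an ATmega328P, tick() would be called from ISR(WDT_vect) with the watchdog in interrupt mode, and a true return would be the point to force a reset.

```cpp
#include <cstdint>

// Hypothetical helper: stretch the 8 s hardware limit by counting
// watchdog interrupts instead of resetting on the first one.
class LongWatchdog {
 public:
  explicit LongWatchdog(uint8_t maxTicks) : maxTicks_(maxTicks), ticks_(0) {}

  // Called by the application to signal that it is still alive.
  void feed() { ticks_ = 0; }

  // Called once per hardware watchdog period (e.g. every 8 s).
  // Returns true once maxTicks periods have passed without a feed(),
  // which is when the caller should force a reset.
  bool tick() {
    if (ticks_ < maxTicks_) {
      ticks_ = ticks_ + 1;
    }
    return ticks_ >= maxTicks_;
  }

 private:
  uint8_t maxTicks_;
  volatile uint8_t ticks_;  // shared between ISR and main code on the target
};
```

    With an 8 s hardware period, LongWatchdog(4) would tolerate roughly 32 s of silence. The application still has to call feed() somewhere, so the drawback mentioned above remains: a hang is only detected after the whole stretched period.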


  • Admin

    What happens with watchdog while sleeping?


  • Mod

    @hek said:

    What happens with watchdog while sleeping?

    it depends on what you want 🙂
    if I'm not mistaken, the LowPower library you are using uses the watchdog while sleeping
    actually the watchdog is the only timer running during POWER_DOWN
    waking up on a watchdog event is a common way to leave the deepest sleep when you need to wake up by time rather than by an external event



  • Hi, I would be interested in any more updates along this line. In my case my gateway is mostly stable as far as I can tell: several weeks without issues. My problem is that most of my nodes seem to lock up, so I would be very interested in making them more robust, either in software or, if necessary, with an external chip as an overlord watchdog.


  • Hero Member

    @BulldogLowell Any progress on this? My Ethernet gateway is quite reliable after solving power supply issues and implementing soft spi. But about once a month it stops and rebooting it makes it operative again. The failure seems to be periodic--once a month. May be a clue there--something reaching a limit? At any rate, any failure rate where I can't start it up without physically rebooting is too much.



  • millis() has a rollover after approximately 50 days. Could that be something?


  • Hero Member

    @maha Don't know. But your question did trigger the thought of using millis() to periodically trigger a reboot. Rather than checking for a lockup with a watchdog routine, I may try just rebooting it at a shorter interval than the one at which I have experienced the lockups, e.g., every 2 weeks, using millis() to measure the interval.
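
    A rollover-safe way to measure such an interval with a 32-bit millisecond counter might look like this. `intervalElapsed` and `kTwoWeeksMs` are hypothetical names; the time values are passed in so the arithmetic can be checked off-target, and on the gateway they would come from millis().

```cpp
#include <cstdint>

// Two weeks in milliseconds: 1,209,600,000, which fits in 32 bits.
constexpr uint32_t kTwoWeeksMs = 14UL * 24UL * 60UL * 60UL * 1000UL;

// True once at least intervalMs has passed since startMs.
// The unsigned subtraction wraps the same way the counter does, so the
// result stays correct across the rollover at 2^32 ms (about 49.7 days).
bool intervalElapsed(uint32_t nowMs, uint32_t startMs, uint32_t intervalMs) {
  return static_cast<uint32_t>(nowMs - startMs) >= intervalMs;
}
```

    Comparing timestamps directly (for example `now > start + interval`) is the pattern that breaks at rollover; subtracting first avoids it. How the reboot itself is triggered is left open here; letting an enabled watchdog expire is one option.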


  • Hero Member

    Added an 8 sec watchdog to my ethernet gateway and it's up and running now. Time will tell if it cures the occasional lockups by automatically resetting. Will report back the results.


  • Admin

    FYI: Watchdog reset is automatically called by process() nowadays in the development-branch.

    https://github.com/mysensors/Arduino/blob/development/libraries/MySensors/MySensor.cpp#L509


  • Hero Member

    @hek I assume this is with the intent to allow a watchdog reset on sensor sketches in the future, but does not apply to the Ethernet gateway sketch?


  • Hero Member

    @hek Please disregard my prior message. I looked further and see that the watchdog reset is part of the sensor wait routine and is in the current 1.4 version of MySensors.cpp


  • Hero Member

    It's been over 6 weeks since I added an 8 sec watchdog to my Ethernet gateway (to Vera) and have not had to reboot the gateway since then. Before adding the watchdog, the most it ever lasted before without rebooting was 4 weeks. So for now I have to assume that the watchdog is doing its job and automatically restoring the gateway. If so, then it is providing the gateway dependability that I wanted. Time will tell.


  • Contest Winner

    @Dan-S.

    great news, thanks for the update!

    can you share how you set it up? Are you using the stock gateway code?


  • Hero Member

    @BulldogLowell

    I got my basic info and code for a watchdog timer from a pdf by Nicolas Larson titled "Basic Watchdog Timer" at this site:

    http://forum.arduino.cc/index.php?action=dlattach;topic=63651.0;attach=3585

    Yes, I am using the stock gateway code with the watchdog code outlined in the pdf added. His code includes a lot of details in the setup that can probably be streamlined but I liked operating at the elementary level to better see what is going on.

    Will post the whole gateway sketch. What's the best way to go about posting it?



  • @Dan-S. To insert code on the forum, use the </> button above the 'compose box'.


  • Hero Member

    @BulldogLowell
    Here's the code. Should be the same as standard except that I didn't use ip address input.

    /*
     * Copyright (C) 2013 Henrik Ekblad <henrik.ekblad@gmail.com>
     * 
     * Contribution by a-lurker
     *
     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * version 2 as published by the Free Software Foundation.
     * 
     * DESCRIPTION
     * The EthernetGateway sends data received from sensors to the ethernet link. 
     * The gateway also accepts input on ethernet interface, which is then sent out to the radio network.
     *
     * The GW code is designed for Arduino 328p / 16MHz.  ATmega168 does not have enough memory to run this program.
     * 
     *
     * COMPILING WIZNET (W5100) ETHERNET MODULE
     * > Edit RF24_config.h in (libraries\MySensors\utility) to enable softspi (remove // before "#define SOFTSPI").
     *
     * COMPILING ENC28J60 ETHERNET MODULE
     * > Use Arduino IDE 1.5.7 (or later) 
     * > Disable DEBUG in Sensor.h before compiling this sketch. Otherwise the sketch will probably not fit in program space when downloading. 
     * > Remove Ethernet.h include below and include UIPEthernet.h 
     * > Remove DigitalIO include 
     * Note that I had to disable UDP and DHCP support in uipethernet-conf.h to reduce space. (which means you have to choose a static IP for that module)
     *
     * VERA CONFIGURATION:
     * Enter "ip-number:port" in the ip-field of the Arduino GW device. This will temporarily override any serial configuration for the Vera plugin. 
     * E.g. If you want to use the default values in this sketch enter: 192.168.178.66:5003
     *
     * LED purposes:
     * - RX (green) - blink fast on radio message received. In inclusion mode will blink fast only on presentation received
     * - TX (yellow) - blink fast on radio message transmitted. In inclusion mode will blink slowly
     * - ERR (red) - fast blink on transmission error or receive crc error
     * 
     * See http://www.mysensors.org/build/ethernet_gateway for wiring instructions.
     *
     */
    
    #include <DigitalIO.h>     // This include can be removed when using UIPEthernet module  
    #include <SPI.h>  
    #include <MySensor.h>
    #include <MyGateway.h>  
    #include <stdarg.h>
    
    //watchdog version of gateway
    #include <avr/wdt.h>
    
    // Use this if you have attached a Ethernet ENC28J60 shields  
    //#include <UIPEthernet.h>  
    
    // Use this for WizNET W5100 module and Arduino Ethernet Shield 
    #include <Ethernet.h>   
    
    
    #define INCLUSION_MODE_TIME 1 // Number of minutes inclusion mode is enabled
    #define INCLUSION_MODE_PIN  3 // Digital pin used for inclusion mode button
    
    #define RADIO_CE_PIN        5  // radio chip enable
    #define RADIO_SPI_SS_PIN    6  // radio SPI serial select
    #define RADIO_ERROR_LED_PIN 7  // Error led pin
    #define RADIO_RX_LED_PIN    8  // Receive led pin
    #define RADIO_TX_LED_PIN    9  // the PCB, on board LED
    
    #define IP_PORT 5003        // The port you want to open 
    //IPAddress myIp (192, 168, 1, 14);  // Configure your static ip-address here    COMPILE ERROR HERE? Use Arduino IDE 1.5.7 or later!
    // Commented out IPAddress to use DHCP router assigned address, Cannot check program with serial monitor with this
    //since the DHCP address will not be available till plugged into ethernet.
    // The MAC address can be anything you want but should be unique on your network.
    // Newer boards have a MAC address printed on the underside of the PCB, which you can (optionally) use.
    // Note that most of the Arduino examples use  "DEAD BEEF FEED" for the MAC address.
    byte mac[] = { 0x00, 0xAA, 0xBB, 0xCC, 0xDE, 0x02 };  
    
    // a R/W server on the port
    EthernetServer server = EthernetServer(IP_PORT);
    
    // No blink or button functionality. Use the vanilla constructor.
    MyGateway gw(RADIO_CE_PIN, RADIO_SPI_SS_PIN, INCLUSION_MODE_TIME);
    
    // Uncomment this constructor if you have leds and include button attached to your gateway 
    //MyGateway gw(RADIO_CE_PIN, RADIO_SPI_SS_PIN, INCLUSION_MODE_TIME, INCLUSION_MODE_PIN, RADIO_RX_LED_PIN, RADIO_TX_LED_PIN, RADIO_ERROR_LED_PIN);
    
    
    char inputString[MAX_RECEIVE_LENGTH] = "";    // A string to hold incoming commands from serial/ethernet interface
    int inputPos = 0;
    
    void setup()  
    { 
      Ethernet.begin(mac);
    
      // give the Ethernet interface a second to initialize
      delay(1000);
    
      // Initialize gateway at maximum PA level, channel 70 and callback for write operations 
      gw.begin(RF24_PA_LEVEL_GW, RF24_CHANNEL, RF24_DATARATE, writeEthernet);
    
      // start listening for clients
      server.begin();
      
      //set the watchdog
      watchdogSetup();
      
    }
    
    // This will be called when data should be written to ethernet 
    void writeEthernet(char *writeBuffer) {
      server.write(writeBuffer);
    }
    
    // Watchdog setup function
    void watchdogSetup(void)
    {
      cli();        // disable interrupts during the timed sequence
      wdt_reset();
      /*
       * WDTCSR configuration:
       * WDIE = 0: interrupt disabled
       * WDE  = 1: reset enabled
       *
       * WDP3 = 1, WDP2 = 0, WDP1 = 0, WDP0 = 1: 8000 ms time-out
       */
      // Enter watchdog configuration mode (timed sequence)
      WDTCSR |= (1<<WDCE) | (1<<WDE);
      // Set watchdog settings: no interrupt, reset enabled, 8 second timer
      WDTCSR = (0<<WDIE) | (1<<WDE) |
               (1<<WDP3) | (0<<WDP2) | (0<<WDP1) | (1<<WDP0);
      sei();        // re-enable interrupts
    }
    
    
    
    void loop()
    {
      // if an incoming client connects, there will be
      // bytes available to read via the client object
      EthernetClient client = server.available();
    
      if (client) {
          // if got 1 or more bytes
          if (client.available()) {
             // read the bytes incoming from the client
             char inChar = client.read();
    
             if (inputPos<MAX_RECEIVE_LENGTH-1) { 
               // if newline then command is complete
               if (inChar == '\n') {  
                  // a command was issued by the client
                  // we will now try to send it to the actuator
                  inputString[inputPos] = 0;
    
                  // echo the string to the serial port
                  Serial.print(inputString);
    
                  gw.parseAndSend(inputString);
    
                  // clear the string:
                  inputPos = 0;
               } else {  
                 // add it to the inputString:
                 inputString[inputPos] = inChar;
                 inputPos++;
               }
            } else {
               // Incoming message too long. Throw away 
               inputPos = 0;
            }
          }
       }  
       gw.processRadioMessage();  
     //reset watchdog
     wdt_reset();  
    }
    

  • Hero Member

    Would not recommend adding the watchdog until you have a relatively stable gateway, e.g. you have addressed any power/radio issues etc. Otherwise you will have the watchdog constantly tripping in response to issues that are best addressed by other means, i.e., proper troubleshooting.


  • Hero Member

    FWIW, I've also experienced rare intermittent lockups on a clone Mega2560 with a clone ethernet shield. By rare I mean an interval between lockups of anywhere from 2 weeks to 6 months, with a median of around 4 months. i.e. It happens, but not very frequently, and so far I haven't seen a pattern to it.

    Does the arduino's stock brownout protection circuit protect against all possible brownout scenarios, or just some of them? Anyone happen to know? Because I don't know the answer, the next step for me is to put it on a UPS to see if it makes any difference.

    If your arduino is powered via USB, then my impression is the arduino gives you no additional protection against voltage spikes. Most USB power sources are buck converters. Do buck converters alone typically do well at flattening voltage spikes, or would you need to add a voltage regulator to ensure all voltage spikes get flattened out? Anyone happen to know? Depending on the answer, it may be reason to power through the barrel jack, which goes through the onboard voltage regulator, rather than through the USB.

    Regarding use of the built-in watchdog protection, is there a possible brownout/spike scenario that could defeat the watchdog protection in addition to wedging the regular program loop? Or is the watchdog bulletproof in that regard?


  • Hero Member

    @NeverDie All I can say is that after adding the watchdog I have not had a lockup in 2+ months. Before that, like you, I had random lockups. Based on this positive experience I also added a watchdog to the one repeater I have in my system. Had a brief power outage the other day and everything came back online by itself with no problems.


