Occasional MySensors network drop outs
-
Occasionally the whole of the MySensors network will drop out and I'm trying to figure out why.
My system:
Raspberry Pi 2 gateway with nothing else running on the Pi.
I have 10 active sensors and most are built using the Newbie PCB running off either batteries or 5v USB power sources. Most sensors are a mix of DHT22, PIRs and MPR121 touch sensors.
I use Home Assistant for automation & data logging.
Most of the time everything just works, but on occasion everything MySensors will just stop, meaning the motion sensors that control lighting and the light switches stop too, much to the annoyance of the wife who claims my smart house isn't so smart :). I know it's MySensors as the other half of my system runs on Tinkerforge sensors which are rock solid. Whilst the system is down, if I check MYSController I see no messages on the network. The connection to the gateway Pi doesn't go down (Home Assistant would log it if it did). After 5 or so minutes (without intervention) it'll come back up.
There are no messages in the MySensors service log, and obviously checking the sensor serial logs is not so simple on deployed sensors, especially as this problem is quite random. It might happen more often than I know but is far more obvious at night (it's summer, so night doesn't start until 2130 and so far it's not a major problem, but come winter's pre-1600 sunsets it'll be more challenging!).
How do I go about debugging this situation? I thought about deploying a sensor that checks in every 10 seconds to spot the drop outs. That doesn't solve it, though. I don't feel like my comms are at fault, as that wouldn't affect all sensors at once. All sensors have a cap on the NRF radio.
Tips greatly appreciated!
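The idea of a node that checks in every 10 seconds could be paired with a small script that looks for holes in the check-in times. The following is only a sketch: the function name `find_dropouts`, the tolerance factor of 3 and the way the timestamps are collected (for example from the controller's log) are assumptions, not anything MySensors provides.

```python
from datetime import datetime, timedelta

def find_dropouts(checkins, expected_interval_s=10, tolerance=3):
    """Return (last_seen, next_seen, gap_seconds) for each suspicious gap.

    A gap is reported when two consecutive check-ins are further apart
    than tolerance * expected_interval_s. Both values are assumptions
    to tune for the real network.
    """
    limit = timedelta(seconds=expected_interval_s * tolerance)
    ordered = sorted(checkins)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > limit:
            gaps.append((earlier, later, (later - earlier).total_seconds()))
    return gaps

if __name__ == "__main__":
    # Made-up example data: three check-ins, a five minute hole, two more.
    start = datetime(2018, 7, 1, 21, 0, 0)
    seen = [start + timedelta(seconds=s) for s in (0, 10, 20, 320, 330)]
    for last_seen, next_seen, seconds in find_dropouts(seen):
        print(f"silent from {last_seen} to {next_seen} ({seconds:.0f} s)")
```

This only records when the outages happen and how long they last; it does not explain them, but the times can be lined up against other logs.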
-
Is it possible that your network problems are happening when you use a microwave at the same time?
-
Since everything goes down it sounds like a central issue? Did you have the possibility to debug the gateway during stops?
Maybe you could use some sort of SD-card logging on the nodes, but as a first step I would set up a computer listening to the gateway.
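A computer listening to the gateway could be as simple as the sketch below, which timestamps every line the gateway emits so the exact moment traffic stops is on record. It assumes an ethernet gateway on the default TCP port 5003 that accepts an extra client, and that lines follow the MySensors serial format `node-id;child-sensor-id;command;ack;type;payload`. The names `parse_line` and `log_gateway`, the log file name and the host address are made up for the example.

```python
import socket
import time
from typing import NamedTuple, Optional

# Command numbers from the MySensors serial protocol.
COMMANDS = {0: "presentation", 1: "set", 2: "req", 3: "internal", 4: "stream"}

class GatewayMessage(NamedTuple):
    node_id: int
    child_id: int
    command: str
    ack: int
    msg_type: int
    payload: str

def parse_line(line: str) -> Optional[GatewayMessage]:
    """Parse one gateway line; return None if it is not a protocol message."""
    parts = line.strip().split(";", 5)
    if len(parts) != 6:
        return None
    try:
        node_id, child_id, command, ack, msg_type = (int(p) for p in parts[:5])
    except ValueError:
        return None
    name = COMMANDS.get(command, "unknown")
    return GatewayMessage(node_id, child_id, name, ack, msg_type, parts[5])

def log_gateway(host: str, port: int = 5003, logfile: str = "gateway.log") -> None:
    """Append every received line to logfile with a local timestamp."""
    with socket.create_connection((host, port)) as sock, open(logfile, "a") as out:
        for line in sock.makefile("r", encoding="ascii", errors="replace"):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            parsed = parse_line(line)
            out.write(f"{stamp} {parsed if parsed else line.strip()!r}\n")
            out.flush()

if __name__ == "__main__":
    log_gateway("192.168.1.50")  # hypothetical gateway address
```

The last timestamp before a silent period, compared with the Pi's own system logs, should at least show whether the gateway process or the radio side went quiet first.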
-
One very easy way to monitor the network health is to send I_HEARTBEAT_REQUEST from the gateway to an arbitrary node; the node will respond with I_HEARTBEAT_RESPONSE. This works only for non-sleeping nodes, of course. If you are using the MQTT gateway it is very easy to write a Python script that sends a heartbeat request, for example every 30 seconds, and checks whether the response from the node came back (I have written one and can share it with you).
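Until the actual script is shared, here is a rough sketch of the bookkeeping such a heartbeat monitor needs. It is not the script mentioned above. It assumes the gateway's default topic prefixes (`mygateway1-in` for messages to the gateway, `mygateway1-out` for messages from it) and the MySensors 2.x numbering (internal command 3, I_HEARTBEAT_REQUEST 18, I_HEARTBEAT_RESPONSE 22, child id 255). The MQTT client itself is deliberately left out, and all function and class names are made up.

```python
import time

C_INTERNAL = 3
I_HEARTBEAT_REQUEST = 18
I_HEARTBEAT_RESPONSE = 22
INTERNAL_CHILD = 255

def heartbeat_request_topic(node_id, prefix="mygateway1-in"):
    """Topic to publish (with an empty payload) to ask a node for a heartbeat."""
    return f"{prefix}/{node_id}/{INTERNAL_CHILD}/{C_INTERNAL}/0/{I_HEARTBEAT_REQUEST}"

def heartbeat_response_node(topic, prefix="mygateway1-out"):
    """Return the node id if topic is a heartbeat response, else None."""
    parts = topic.rsplit("/", 5)
    if len(parts) != 6 or parts[0] != prefix:
        return None
    try:
        node_id, _child, command, _ack, msg_type = (int(p) for p in parts[1:])
    except ValueError:
        return None
    if command == C_INTERNAL and msg_type == I_HEARTBEAT_RESPONSE:
        return node_id
    return None

class HeartbeatTracker:
    """Remember when each request was sent and report nodes that never answered."""

    def __init__(self, timeout_s=5.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.pending = {}

    def sent(self, node_id):
        self.pending[node_id] = self.clock()

    def received(self, node_id):
        self.pending.pop(node_id, None)

    def overdue(self):
        now = self.clock()
        return sorted(n for n, t in self.pending.items() if now - t > self.timeout_s)
```

With any MQTT client library, the loop would publish `heartbeat_request_topic(node)` every 30 seconds and call `sent(node)`, pass each incoming topic through `heartbeat_response_node` and call `received`, and log whatever `overdue()` returns together with the time.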
-
I have had the exact same experience, sometimes system is up for days, sometimes only hours. I power off and then on the pi and all is back again. I use MyController just for info.
Here are some things that I need to know....
Is your gateway the pi itself or an arduino attached to it?
For the radio, where is it getting its power from?
How is the gateway connected to the pi?
What radio module are you using?
Have you added the suggested capacitor to the radio?
Maybe I can help more when I know this!
-
I recommend that you build a WiFi or Ethernet gateway; you will rule out several causes in one shot. I had the same problems with the RPi gw and that helped a lot.
It could be problems with the gw service or the wiring, but also the RPi 2 reaching its power limit on the USB lane, the power supply not being enough (especially if you added WiFi to the RPi), or the system being resource hungry at some moment.
Also, for diagnosing, I would stop some nodes and try again. After solving the first point I still had this problem. It happened to be a far node sending something weird to the gw due to the long distance.
You can stop the farthest nodes, or all the nodes of one kind. You could end up with a working system after excluding only one node, like in my case.
-
For information, I have a regular computer (Intel Core 2 Duo) with a serial gateway (I have had many different iterations); this one has the NRF24L01+ PA/LNA. I am using openHAB 2 to control it. I have tried changing everything: radios, cables, location. Most of the time the system just works perfectly. On occasion, and I can't find a pattern, it just locks up for, like you said, about 5 minutes. If I wait for a bit and try again everything goes back to working. Most of the time my rules (automation) run without a hitch, however if a rule is triggered while the system is locked up then, for example, my TV doesn't turn off at 10 like it should and I wake up at 3 with the TV still on, or the wife can't control her closet LED strip. I have mostly considered this just one of those things to live with, until I make the wife mad enough and just move over to Z-Wave for anything critical. In a weird way I'm glad to see that it's not just me having this issue. My issue seems to be exactly as you describe, other than I know that when I try to send commands I just get NoACKs to everything I send out while the "issue" is going on. Either rebooting everything or just waiting solves the issue.
-
@et al.
For what it’s worth, I too have experienced the issue. The only difference I experienced was communication never re-established itself. And in some of my FMA’s I guess I was too impatient to wait. ;-{
System under test:
RPi 3B running latest version of Stretch.
MySensors Gateway with MQTT (as service)
Node 1 Teensy 3.2 running simple loop waiting for request for Temp, Humidity, and Battery V.
Node 2 Arduino Uno running simple loop waiting for wind direction queries.
[In case you are interested, my direction sensor is also home grown, using a MLX90316. I strongly urge anyone to take a look at this device. Reports complete 360 degrees. A sweet device indeed. Only problem is 5V. The 3V part lead time is toooo long to wait.]
Both nodes get polled at 30 second intervals with 10 seconds between nodes.
(However - node #2 also transmits unsolicited every 10 seconds to my ‘home grown’ controller running on the same RPi as the gateway. But again, even with node 2 out of the picture, problem still occurs.)
Both nodes reporting TX and RX RSSI nominal. (Distance isn’t a problem, and I doubt the radios’ front ends are getting overloaded.)
Radios are all RFM69HCW @ 915
Fortunately (or unfortunately), the ‘lack of comms’ has happened while all nodes were in my office and I could experiment a little.
Here’s what I found:
Powering off any/all the nodes with the gw running did not resolve the problem.
Running with only 1 node (either node) at a time did not resolve the problem.
Service status of mosquitto always showed it was running (during time of no-comm).
Gateway logs simply stopped – nothing past the point of problem.
My controller s/w continued to poll the nodes but none responded when the issue was in effect.
After power cycling PI everything came back to operational – until it happened again.
Pretty sure Pi is receiving the current it needs during transmit – although I will monitor this as soon as I have time (I’ll need to set up a laptop to continuously record current draw on nodes AND on gateway).
In the meantime, I hope this information helps someone discover a potential cause.
My gut feel:
Race condition, or a stuck while(). Again haven’t had time to look through source.
-
@MrRobots that was exactly my first problem, and I never found the cause. I just switched to an ESP gateway with a good power source.
Then the second issue seems like the gw "banning" the network or a part of it after receiving something it wasn't prepared for. Just my speculation.
Does that last one seem like anything you have?
-
@sergio-rius said in Occasional MySensors network drop outs:
banning
Yes - sounds like it but I haven't had the chance to look much further yet.
The fact that you switched out the radio for a different type (previously RFM69?) and changed the power supply seems to point more towards s/w. I assume your new power supply is capable of sourcing more current.
That said, when I do get a chance, I'll look at the Pi side of the equation and see if there is something along the lines of a 'blocking call' happening (e.g. sleep, ISR, etc.) and during that time an ACK gets transmitted but the g/w never sees it. Sounds unlikely, but just something to think about.
Could even be something to do with message queue settings in the mqtt '.conf'. Another place to look at too.
-
@rozpruwacz Thanks! I'd definitely like that so I can be aware of the times the issues are happening.
Thanks for the replies - very helpful.
A bit more info: I built the gateway exactly as described here, with the NRF radio (inc cap), as an ethernet gateway: https://www.mysensors.org/build/raspberry
My problem happened again just now - suddenly all nodes are unreachable via MYSController and nothing works. 20 or so mins later as I watched it (without intervening) it all came back to life. During the outage I can still communicate to the gateway.
Previously I have had issues with the service crashing when some node spammed it but I don't have nodes like that now.
Perhaps something is interfering with it (someone else's microwave or something - I live in a city centre) but certainly I'm not doing anything to affect it.
Maybe I'm better off building a gateway then, instead of using the Pi - or perhaps using the Sensebender board. Given I'm serious about expanding my sensor network, it pays to have a solid core. And a good power supply too I'm sure would help (recommendations?). Is having the Sensebender gateway plugged into the Pi by USB (for wifi access) a good idea, or should I go with the ESP wifi gateway to keep it all separate? Inclined to do the latter.
Thanks for the help so far!
-
@nick-willis said in Occasional MySensors network drop outs:
20 or so mins later as I watched it (without intervening) it all came back to life. During the outage I can still communicate to the gateway.
What helped me was to ditch the nrf attached directly to the pi (prevents future HW signing anyway) and use an arduino pro mini as the gateway.
Use a good 3.3V supply for the radio; I also soldered most of the connections. Since doing that I've had consistent results.
-
@skywatch How are you connecting the Pro Mini to your LAN? USB to the Pi or via Ethernet?
I'm thinking Ethernet plugged into my router would eliminate the Pi completely. Now to find a way to package it up neatly.. hmmm.
-
It's simply a case of attaching the serial Tx/Rx on the pi to the Tx/Rx on the arduino (use 3.3V arduino or a level shifter if using 5V pro mini)....
Don't be tempted to use the pi 3.3V supply for the radio or pro mini - take the 5V supply from the pi and run it through a regulator to get cleaner 3.3V. Originally I used the 662k, but now use the AMS1117 3.3V as it gives cleaner DC and can supply more current for the radio tx pulses.