💬 Building a Raspberry Pi Gateway
-
I've been using your restarting method and it's been working OK; it has done one restart in 5 days.
However, I now see something else: although CPU use is fine, none of my nodes can communicate with the gateway. My controller can connect to the gateway without problems, but no messages are received. Restarting the nodes does not help. Only a manual restart of the gateway gets things going again, at least for a while.
Where can the log output from the gateway be found? I know it can be run from the bash prompt to write the output to the screen, but how about logging the output of the installed service?
@nick-willis the log is sent to syslog by default. See https://www.mysensors.org/build/raspberry#troubleshooting for details.
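As a rough sketch of what that looks like in practice, the helper below pulls the gateway's entries out of a syslog file. It assumes the service tags its lines as `mysgw` and that syslog is written to /var/log/syslog (the usual Raspbian location); the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the gateway's entries from a syslog file.
# Assumptions: entries are tagged "mysgw" (optionally "mysgw[PID]") and
# the log lives at /var/log/syslog unless another path is given.
mysgw_log() {
    logfile="${1:-/var/log/syslog}"
    grep -E ' mysgw(\[[0-9]+\])?: ' "$logfile"
}
```

For example, `mysgw_log | tail -n 50` shows the most recent gateway lines. On a systemd-based install, `journalctl -u mysgw` should give a similar view, assuming the unit is called mysgw.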
-
@marceloaqno : This is strange. I am using an RPi 1 for this gateway, running Raspbian Jessie. Besides the GW service there is only a small Python program which listens on a UDP port and plays an mp3 when it receives suitable data (for the doorbell). Since March 4, 20:15 there have been several restarts due to high CPU usage (after removing your patch):
2018-03-04 23:01:03 cpu (99.7) very high! --> restarting service
2018-03-05 07:31:03 cpu (99.7) very high! --> restarting service
2018-03-05 11:01:03 cpu (98.7) very high! --> restarting service
2018-03-05 13:31:04 cpu (94.7) very high! --> restarting service
2018-03-06 11:31:03 cpu (99.7) very high! --> restarting service
2018-03-06 12:31:03 cpu (98.7) very high! --> restarting service
2018-03-07 06:01:03 cpu (94.8) very high! --> restarting service
2018-03-07 22:31:03 cpu (99.8) very high! --> restarting service
With these "auto-restarts" the sensors are working fine; I did not notice any problems.
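For anyone who wants to try the same kind of auto-restart, here is a minimal sketch of such a watchdog. It is not the actual script behind the log above (that one isn't posted in this thread); the 90% threshold, the function names and the unit name `mysgw` are assumptions, and the log line only mimics the format shown.

```shell
#!/bin/sh
# Sketch of a cron-driven watchdog; NOT the script used in the post above.
# Assumptions: the process and the systemd unit are both named "mysgw",
# and 90% CPU is the restart threshold.
THRESHOLD=90

# Succeeds when the given %CPU value (e.g. " 99.7") reaches the threshold.
cpu_too_high() {
    cpu_int=$(printf '%s' "$1" | tr -d ' ')   # strip padding from ps output
    cpu_int=${cpu_int%%.*}                    # drop the fractional part
    [ "${cpu_int:-0}" -ge "$THRESHOLD" ]      # empty value counts as 0
}

# Reads the gateway's CPU usage and restarts the service if it is too high.
check_gateway() {
    cpu=$(ps -C mysgw -o %cpu= | head -n 1 | tr -d ' ')
    if cpu_too_high "$cpu"; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') cpu ($cpu) very high! --> restarting service"
        sudo systemctl restart mysgw
    fi
}
```

To use it, add a final line calling `check_gateway` and run the script from cron, for example every 30 minutes. Keep in mind that `ps` reports CPU time averaged over the lifetime of the process, so a gateway that has been up for a long time reacts slowly to a sudden spike.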
However, I just upgraded to the recent Stretch (which takes a LONG time on an RPi 1 and made me struggle a little with the new network names). I also rebuilt the GW, of course, and will report here what happens.
@Nick-Willis : did you apply marceloaqno's patch or not? Did you enable debugging? (see here )
Cheers,
Otto
-
@otto001 In a new attempt to trigger this problem, I am using a script that will force the fhem controller to reconnect every 30s to the gateway (built with mysensors master branch, unpatched), and sends me a note in case of high CPU usage:
I'll leave it running for a few days to see what happens.
Here is the script, if anyone is interested:
#!/bin/sh
while true; do
    (echo "set mysgw disconnect"; echo "quit") | telnet localhost 7072 > /dev/null 2>&1
    sleep 1
    (echo "set mysgw connect"; echo "quit") | telnet localhost 7072 > /dev/null 2>&1
    sleep 1
    cpu_percent=$(ps -C "mysgw" -o %cpu=)
    cpu_percent=${cpu_percent%%.*}
    if [ "$cpu_percent" -ge 90 ]
    then
        echo "ALERT: HIGH CPU USAGE"
        # pushbullet script from https://gist.github.com/outadoc/189bd3ccbf5d6e0f39e4
        /home/pi/pushbullet.sh "RPi1: HIGH CPU USAGE"
        exit
    fi
    sleep 30
done
-
@marceloaqno : Thank you!
Just a note: I do NOT disconnect, I just connect every 30 minutes. This has been working for me for more than 2 years (ancient ms-gw version on RPi).
Yesterday I did a dist-upgrade to Stretch on the "problematic" gw (as mentioned above), without your patch but with my monitoring script running. It is a little too early to tell, I presume, but there have been no high CPU problems so far. I will report back here in a few days. Maybe this problem is really related to old libraries or something?! If this runs stable for a few days I will apply your patch and see what happens.
Thanks again!
Cheers,
Pula
-
@otto001 After a few hours running the reconnect script, I noticed several error messages in my gateway log:
accept(): Too many open files
The problem was that the gateway wasn't releasing the socket descriptor after each disconnection of the controller. After reaching the limit (mine was 1036), the controller can't connect.
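A leak like this shows up as a descriptor count that keeps climbing towards the process limit. The sketch below reads both numbers from /proc for a given PID. It is Linux-only, the function names are illustrative, and since the gateway runs as root the functions need to be run as root too.

```shell
#!/bin/sh
# Sketch: compare a process's open descriptors with its soft limit.
# Linux-only (reads /proc); function names are illustrative.

# Number of file descriptors the process with the given PID holds open.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Soft "Max open files" limit of the process with the given PID.
fd_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example (as root):
#   pid=$(pidof mysgw)
#   echo "mysgw: $(fd_count "$pid") of $(fd_limit "$pid") descriptors in use"
```

If the first number grows with every controller reconnect and never drops, the gateway is most likely hitting this problem.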
I'm not entirely sure that this problem is related to what you're seeing. You can check how many file descriptors the gateway currently has open with the command below:
sudo lsof -u root | grep mysgw | wc -l
The fix:
https://github.com/marceloaqno/MySensors/commit/a40d4441b7460225100398ff6f2581c2b0df36ea
-
How do I update from master branch to development branch?
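One way this is usually done with git, sketched under the assumption that the source was cloned from the MySensors repository and that the remote is called `origin`. The function name is made up, and local modifications in the checkout may need to be stashed or committed first.

```shell
#!/bin/sh
# Sketch: move an existing git checkout onto another branch of its origin.
# Assumptions: the remote is named "origin" and the branch exists there.
switch_branch() {
    src_dir="$1"
    branch="$2"
    git -C "$src_dir" fetch origin &&
        git -C "$src_dir" checkout "$branch" &&
        git -C "$src_dir" pull --ff-only origin "$branch"
}

# Example (path is hypothetical):
#   switch_branch ~/MySensors development
```

After switching, the gateway has to be rebuilt and reinstalled: run `./configure` again with your usual options, then `make` and `sudo make install`, and restart the service.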
-
I'm using a Raspberry Pi gateway with Home Assistant.
Sometimes my gateway stops working, and when I restart the service it runs fine again.
I used:
./configure --my-transport=nrf24 --my-rf24-irq-pin=15 --my-gateway=ethernet --my-port=5003
-
@gieljnssns The fixes I mentioned above haven't yet been applied to the development branch. Eventually they will be, but I'm still doing some testing.
-
@marceloaqno :
Thank you!!! Should this patch be applied together with your first patch?
Interesting:
After some days without problems on the newest Raspbian packages, the gw hung again, but WITHOUT high CPU usage (and WITHOUT your first patch applied). There were just no more reads from the sensors in the log, only the reconnects of fhem. After restarting the service, everything is fine again.
Unfortunately I only saw your post after that :-(
But I checked the log and could not find the words "too" or "Too" in it (running the service with -d).
A few minutes after restarting I checked how many open files there were with the command you supplied: 21
In the meantime I have created a small UDP listening service which restarts the gateway service when a certain string is received, so I can restart it by double-clicking one of my switches without needing to log in to the gw (also for the wife)....
@gieljnssns : I think you are experiencing exactly the same problem we are discussing :-)
Thanks again!
Cheers,
Otto
-
@otto001 The second patch can be applied without the first one.
-
@marceloaqno :
THANKS! Just applied your patch.
I will report in a few days when I can see if the problem is solved...
Cheers,
Otto
-
Can someone tell me how to apply this patch?
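A possible way to do it, as a sketch rather than an official procedure: GitHub serves a commit as a patch file when `.patch` is appended to the commit URL, and `git apply` can apply that file to the source tree the gateway was built from. The function name and all paths below are made up for illustration; pass the patch file as an absolute path.

```shell
#!/bin/sh
# Sketch: apply a downloaded patch file to a source checkout.
# First checks that the patch applies cleanly, then applies it.
# Arguments: source directory, absolute path to the patch file.
apply_patch() {
    src_dir="$1"
    patch_file="$2"
    git -C "$src_dir" apply --check "$patch_file" &&
        git -C "$src_dir" apply "$patch_file"
}

# Example (paths are hypothetical):
#   curl -L -o /tmp/fix.patch \
#     https://github.com/marceloaqno/MySensors/commit/a40d4441b7460225100398ff6f2581c2b0df36ea.patch
#   apply_patch ~/MySensors /tmp/fix.patch
#   cd ~/MySensors && make && sudo make install && sudo systemctl restart mysgw
```

If the check fails, the checkout probably differs from the version the patch was written against, and nothing is changed.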
-
@gohan :
DietPi looks interesting! I did not know about it yet.
What hardware are you using? And which version of mysgw?
-
I need help with my Rpi mqtt-gw.
I tried to get dev-branch working but changed now to stable. All the install routines run without problems. The service just doesn't seem to work and I can't figure out why.
- Rpi W, Raspbian wheezy, nRF24-PA, MQTT-gw, Mosquitto and Openhab2 on same board
Mosquitto-log:
```
1521010232: New connection from 127.0.0.1 on port 1883.
1521010232: New client connected from 127.0.0.1 as mygateway1 (c1, k15, u'xxxx').
1521010232: Socket error on client mygateway1, disconnecting.
```
syslog:
```
Mar 14 08:51:02 GwMqOH2 systemd[1]: Starting MySensors Gateway daemon...
Mar 14 08:51:02 GwMqOH2 systemd[1]: Started MySensors Gateway daemon.
Mar 14 08:51:03 GwMqOH2 mysgw: Starting gateway...
Mar 14 08:51:03 GwMqOH2 mysgw: Protocol version - 2.2.0
Mar 14 08:51:03 GwMqOH2 mysgw: MCO:BGN:INIT GW,CP=RNNGLSQX,VER=2.2.0
Mar 14 08:51:03 GwMqOH2 mysgw: TSF:LRT:OK
Mar 14 08:51:03 GwMqOH2 mysgw: TSM:INIT
Mar 14 08:51:03 GwMqOH2 mysgw: TSF:WUR:MS=0
Mar 14 08:51:03 GwMqOH2 mysgw: TSM:INIT:TSP OK
Mar 14 08:51:03 GwMqOH2 mysgw: TSM:INIT:GW MODE
Mar 14 08:51:03 GwMqOH2 mysgw: TSM:READY:ID=0,PAR=0,DIS=0
Mar 14 08:51:03 GwMqOH2 mysgw: MCO:REG:NOT NEEDED
Mar 14 08:51:03 GwMqOH2 mysgw: MCO:BGN:STP
Mar 14 08:51:03 GwMqOH2 mysgw: MCO:BGN:INIT OK,TSP=1
Mar 14 08:51:03 GwMqOH2 mysgw: GWT:RMQ:MQTT RECONNECT
Mar 14 08:51:03 GwMqOH2 mysgw: connected to 127.0.0.1
Mar 14 08:51:03 GwMqOH2 mysgw: GWT:RMQ:MQTT CONNECTED
Mar 14 08:51:03 GwMqOH2 mysgw: GWT:TPS:TOPIC=mysensors-out/0/255/0/0/18,MSG SENT
Mar 14 08:51:03 GwMqOH2 systemd[1]: mysgw.service: main process exited, code=killed, status=11/SEGV
Mar 14 08:51:03 GwMqOH2 systemd[1]: Unit mysgw.service entered failed state
```
GW configure-line:
```
./configure --my-transport=nrf24 --my-gateway=mqtt --my-controller-ip-address=127.0.0.1 --my-mqtt-publish-topic-prefix=mysensors-out --my-mqtt-subscribe-topic-prefix=mysensors-in --my-mqtt-client-id=mygateway1 --my-rf24-irq-pin=15 --my-leds-err-pin=12 --my-leds-rx-pin=16 --my-leds-tx-pin=18 --my-mqtt-user=xxxx --my-mqtt-password=yyyyy --my-signing=password --my-signing-password=zzzzzz
```
I have checked mosquitto has correct user-pw combination, but what should I check next? Help!
-
@masmat said in 💬 Building a Raspberry Pi Gateway:
--my-mqtt-client-id=mygateway1
Do you have another client named mygateway1 connecting to the MQTT broker? Try changing it to a different name, just to be safe.