Hey everyone, I have been using a Raspberry Pi 3 as a gateway, and have several sensors around the house built with Arduino. Everything has been working fine for months, until 2 days ago, out of the blue, everything stopped working.
I am using the RFM69W radio in the Arduino nodes and the RFM69HW in the gateway. From what I can tell, the two modules are the same except that the HW variant supports higher transmit power, so I figured I would use the HW in the gateway and the W in the nodes, since some nodes are battery powered and I want to save battery. For months this worked flawlessly, and then it suddenly stopped. The problem appears to be at the gateway, since it is unlikely that 10 sensors all failed at the same time. I have tried swapping the RFM69HW module, swapping the antenna, and even replacing the Raspberry Pi altogether; nothing works. I had been using the development branch of MySensors, which worked fine until now, and I have also tried the 2.3.2 release, with the same result. Of course, I did a clean build and removed /etc/mysensors.eeprom between tests.
I have built a simple "ping-pong" node, which I use to test coverage around the house. The Arduino code is super simple; it only calls
bool rcv = send(msg.set(seconds), 1);
to get an ACK. If the ACK arrives, it is counted as a success; if not, it is counted as a failure. The node loops endlessly, sleeping 0.5 seconds between sends, and shows the results on an LCD display so I can move around the house. This worked fine when I was deploying the different sensors around the house.
After compiling mysgw on the rPi, I started it and got this log:
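For readers who want to reproduce the test, the counting described above can be sketched roughly as follows. This is not the actual sketch from the node, only an illustration: PingStats and pingOnce are made-up names, and sendFn is a stand-in for the real send(msg.set(seconds), 1) call so the logic can be tried without a radio or the MySensors library.

```cpp
// Running tally of ping attempts, as it might be shown on the LCD.
// (Hypothetical helper, not part of MySensors.)
struct PingStats {
    unsigned long success = 0;
    unsigned long failure = 0;

    // Percentage of attempts that were acknowledged (0 if none yet).
    double successRate() const {
        unsigned long total = success + failure;
        return total == 0 ? 0.0 : 100.0 * success / total;
    }
};

// Perform one ping: call sendFn (stand-in for the MySensors send() call)
// and record whether it reported an acknowledgement.
inline void pingOnce(PingStats &stats, bool (*sendFn)()) {
    if (sendFn()) {
        ++stats.success;
    } else {
        ++stats.failure;
    }
}
```

In a real node, pingOnce would be called from loop() with a small function wrapping send(), followed by the 0.5 second sleep and an LCD update.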
root@openhab:/home/pi/MySensors# ./bin/mysgw
Sep 04 21:13:56 INFO Config file /etc/mysensors.conf does not exist, creating new file.
Sep 04 21:13:56 INFO Starting gateway...
Sep 04 21:13:56 INFO Protocol version - 2.3.2
Sep 04 21:13:56 INFO EEPROM file /etc/mysensors.eeprom does not exist, creating new file.
Sep 04 21:13:56 DEBUG MCO:BGN:INIT GW,CP=RPNGL--X,FQ=NA,REL=255,VER=2.3.2
Sep 04 21:13:56 DEBUG TSF:LRT:OK
Sep 04 21:13:56 DEBUG TSM:INIT
Sep 04 21:13:56 DEBUG TSF:WUR:MS=0
Sep 04 21:13:56 DEBUG TSM:INIT:TSP OK
Sep 04 21:13:56 DEBUG TSM:INIT:GW MODE
Sep 04 21:13:56 DEBUG TSM:READY:ID=0,PAR=0,DIS=0
Sep 04 21:13:56 DEBUG MCO:REG:NOT NEEDED
Sep 04 21:13:56 DEBUG Listening for connections on d~:5003
Sep 04 21:13:56 DEBUG MCO:BGN:STP
Sep 04 21:13:56 DEBUG MCO:BGN:INIT OK,TSP=1
Sep 04 21:13:56 DEBUG TSM:READY:NWD REQ
Sep 04 21:13:56 DEBUG ?TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
Everything looks good... Now I turn on my ping-pong node, and I see this:
Sep 04 21:14:06 DEBUG TSF:MSG:READ,220-220-255,s=255,c=3,t=7,pt=0,l=0,sg=0:
Sep 04 21:14:06 DEBUG TSF:MSG:BC
Sep 04 21:14:06 DEBUG TSF:MSG:FPAR REQ,ID=220
Sep 04 21:14:06 DEBUG TSF:PNG:SEND,TO=0
Sep 04 21:14:06 DEBUG TSF:CKU:OK
Sep 04 21:14:06 DEBUG TSF:MSG:GWL OK
Sep 04 21:14:10 DEBUG !TSF:MSG:SEND,0-0-220-220,s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=NACK:0
Sep 04 21:14:11 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=24,pt=1,l=1,sg=0:1
Sep 04 21:14:11 DEBUG TSF:MSG:PINGED,ID=220,HP=1
Sep 04 21:14:15 DEBUG !TSF:MSG:SEND,0-0-220-220,s=255,c=3,t=25,pt=1,l=1,sg=0,ft=0,st=NACK:1
Sep 04 21:14:15 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=15,pt=6,l=2,sg=0:0100
Sep 04 21:14:19 DEBUG !TSF:MSG:SEND,0-0-220-220,s=255,c=3,t=15,pt=6,l=2,sg=0,ft=0,st=NACK:0100
Sep 04 21:14:20 DEBUG TSF:MSG:READ,220-220-0,s=255,c=0,t=17,pt=0,l=5,sg=0:2.3.2
Sep 04 21:14:21 DEBUG TSF:MSG:READ,220-220-0,s=255,c=0,t=17,pt=0,l=5,sg=0:2.3.2
Sep 04 21:14:22 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=6,pt=1,l=1,sg=0:0
Sep 04 21:14:24 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=6,pt=1,l=1,sg=0:0
Sep 04 21:14:25 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=6,pt=1,l=1,sg=0:0
Sep 04 21:14:28 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=11,pt=0,l=14,sg=0:Ping-Pong Node
Sep 04 21:14:29 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=11,pt=0,l=14,sg=0:Ping-Pong Node
Sep 04 21:14:31 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=11,pt=0,l=14,sg=0:Ping-Pong Node
Sep 04 21:14:32 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=12,pt=0,l=3,sg=0:1.0
Sep 04 21:14:34 DEBUG TSF:MSG:READ,220-220-0,s=255,c=3,t=12,pt=0,l=3,sg=0:1.0
Sep 04 21:14:35 DEBUG TSF:MSG:READ,220-220-0,s=0,c=0,t=36,pt=0,l=0,sg=0:
Sep 04 21:14:36 DEBUG TSF:MSG:READ,220-220-0,s=0,c=0,t=36,pt=0,l=0,sg=0:
Sep 04 21:14:38 DEBUG TSF:MSG:READ,220-220-0,s=0,c=0,t=36,pt=0,l=0,sg=0:
So the node is sending data just fine and the GW is receiving it; the problem occurs when the GW tries to send data back to the node (every SEND addressed to node 220 ends in st=NACK). Now, this could mean the RFM69HW module broke down, or the antenna is broken, or even that something in the rPi GPIO pins broke. But I have replaced pretty much every single part with brand new hardware (new rPi, new RFM69HW, new antenna) and I continue to see this. Does anyone have ANY suggestion on what to try next? I am a bit clueless here, and any advice is greatly appreciated.
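To make the "receive works, send fails" pattern easy to check on a longer capture, here is a minimal sketch that tallies the gateway log by direction. It only assumes the line markers visible in the log above (TSF:MSG:READ, TSF:MSG:SEND, st=OK, st=NACK); LinkTally and tallyLog are made-up names, not part of MySensors.

```cpp
#include <sstream>
#include <string>

// Counts of gateway log lines by direction and outcome.
struct LinkTally {
    int received = 0;  // lines containing TSF:MSG:READ
    int sentOk = 0;    // TSF:MSG:SEND lines with st=OK
    int sentNack = 0;  // TSF:MSG:SEND lines with st=NACK
};

// Scan log text line by line and tally reads and send outcomes.
inline LinkTally tallyLog(const std::string &logText) {
    LinkTally t;
    std::istringstream in(logText);
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("TSF:MSG:READ") != std::string::npos) {
            ++t.received;
        } else if (line.find("TSF:MSG:SEND") != std::string::npos) {
            if (line.find("st=NACK") != std::string::npos) {
                ++t.sentNack;
            } else if (line.find("st=OK") != std::string::npos) {
                ++t.sentOk;
            }
        }
    }
    return t;
}
```

Note that in the log above the only st=OK is the broadcast to 255 at startup, so a tally like this is most telling when the SEND lines are also filtered by destination.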
Thanks all,
Franco