@kolaf I've been having issues as well. I was running 1.5, then 1.6-beta, and now 2.0.0-beta. My setup has 6 "SwitchMotes" from LowPowerLab, all of them with RFM69HW radios. Since they are always powered from mains, it made sense to make them all repeaters. I also have several battery-powered nodes which are just plain nodes. I moved to 2.0.0-beta because of a routing loop issue in 1.6 (at least that was what the debugs were telling me), hoping that 2.0.0 would have it resolved. I cleared the EEPROMs of all the nodes and then applied the regular scripts. I watched each one boot, and they all found the gateway as their parent directly. Everything worked well for a few days, and now the nodes have completely stopped passing messages to the gateway. I can tell visually (TX/RX LEDs flashing nearly continuously at the SwitchMotes with no activity on the gateway) that something is looping. It happened before, and a route loop ended up in the EEPROM tables, so the nodes would reboot into a loop again. I'll pull some debugs to follow this up.
BenCranston
@BenCranston
Best posts made by BenCranston
-
RE: Testing development branch with RF69HW is not working as it should
Greetings! I've been trying a few things and am reporting in....
I added a 5-minute heartbeat to each of my nodes, and I can see them checking in now. However, the network still melts down within 24 hours. I replaced the gateway Moteino and got the same result. The patch that @kolaf suggested roughly quadrupled the functional time of the network, which is really cool. Looking at the routing, each node is offering up "stale" routes to the gateway, thereby creating a loop. Graphically something like this:
What I've been able to determine is that the trigger, at least several times, is related to the gateway basically going to sleep. After a power cycle we are back in business. The cascade of the routing loop is something like this:
Now, that's two issues..
Looking at just the routing stability: does it make sense to do something like a probe to determine that a route is valid before installing it in the table? I've yet to review the code base, but a Time To Live in a message would also stop a loop, since a looping message would effectively age out. I'm sure there is a lively discussion archived somewhere on how the routing works...
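To make the Time To Live idea a bit more concrete, here is a rough simulation in Python. It is not MySensors code: the `forward` function, the `next_hop` dictionaries, and the `ttl` parameter are all made up for illustration. It only shows how a per-message hop budget turns an endless loop into a dropped message.

```python
def forward(next_hop, source, gateway=0, ttl=8):
    """Follow next-hop entries from `source` toward `gateway`.

    `next_hop` maps node id -> the node it forwards to (hypothetical
    stand-in for a routing table). Returns (delivered, path).
    """
    path = [source]
    node = source
    while node != gateway:
        if ttl == 0 or node not in next_hop:
            # Hop budget used up (or no route): drop instead of looping forever.
            return False, path
        node = next_hop[node]
        ttl -= 1
        path.append(node)
    return True, path


# A healthy chain: 2 -> 7 -> 6 -> gateway (0).
healthy = {2: 7, 7: 6, 6: 0}
# A stale entry on node 6 points back at node 2, so traffic circles.
looped = {2: 7, 7: 6, 6: 2}

print(forward(healthy, 2))
print(forward(looped, 2, ttl=5))
```

With the looped table the message is discarded after five hops rather than bouncing between the repeaters indefinitely. A probe before installing a route would attack the same problem from the other side, by never letting the stale entry in.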
The other issue is that my gateway RFM69HW radios "appear" to be going to sleep, and then I have to power cycle the Moteino to get it back on the network. I'm wondering if something is putting the radio into some sort of sleep or low-power mode and it's getting stuck there...
sorry for the rambling.
Latest posts made by BenCranston
-
HassOS + serial gateway with MyController
Greetings,
I see folks are having some success doing OTA firmware updates with a network-connected gateway. I've got a serial connection on my gateway node tied directly to the RPi. If I were to use MyController to do firmware updates, could I add a serial node to the network and make it work? I guess the code updates would be node-to-node communication in this case, but I'm unsure. I really don't want to re-tool the gateway connection to something network based. I also see there is some work by one of the Home Assistant devs to add the OTA firmware options directly, but that was in 2018, and it appears no further work has been done, nor has it been merged into mainline code. (GitHub link to repo.) I'm also trying not to totally mess up the Home Assistant integration's view of the network, so I'm a bit hesitant to "just try it" and see...
Has anyone had success with a serial gateway on Home Assistant while using another controller via a second node to do OTA firmware updates? BTW, I'm running LowPowerLab Moteinos with the stock DualOptiBoot and local flash memory, so I should be able to accept the firmware OTA.
I'd even be happy with a CLI tool that can talk via a local serial node to do the OTA firmware update...
Thanks!
-
RE: rfm69 and atc
@Fleischtorte Sweet!! This I can do. I'll make the changes and start testing tonight. Thanks for the explicit help.
-Ben
-
RE: rfm69 and atc
@lafleur @Fleischtorte @frencho I think I'm a bit dense today. I don't understand where the code currently sits in terms of something that could be tested. I see that PR440 was closed and referenced to go back to PR437 or open a new PR? I'd love to give ATC code a go on my RFM69HW's. Is it being integrated into the development branch? How would I go about testing at this time? I did a quick look at the codebase in git and there is no mention of ATC in the MySensors dev branch... A quick diff of the RFM69.cpp code from Felix and MySensors shows a few differences, so they are not 100% in sync. Alas, I'm at a loss on how to apply the work already done to test.. I can "git clone" like a banshee, but beyond that I'm lost with Jenkins. Sorry, again, dense today...
Any advice on how I can help is appreciated. I can test pretty easily. All of my nodes are Moteino's with RFM69HW radios.
Thanks again for everyone's efforts and work to make ATC a reality in the MySensors codebase!
-
RE: Direct Node to Node communications with Signing
@Anticimex Excellent, I'll apply the changes manually. I think I understand the process. I'm planning on adding the additional signing presentation call toward the other node in the presentation section of my sketch right after the regular "present()" calls. Will I need to have both nodes present to each other, or will the single call kick it into motion for bi-directional communications?
-
RE: Direct Node to Node communications with Signing
@Anticimex Yes, I'm on the development branch. I'll give it a try and get back to you. Last night I was looking at the debug messages and then realized that the send was unsigned, hence the "Verify fail" on the receiving node. Thanks. Refreshing from git now....
-
Direct Node to Node communications with Signing
@Anticimex Greetings! I've been re-doing a few things in the home network and decided I want the ability to send commands directly from one node to another without bouncing through the gateway. That way a scene-controller-like node can control other nodes while also updating the gateway with those changes. I've got two of my Moteinos set up for testing. The scene node is sending to the action node without signing. The action node picks up the message and reports a "verify fail", as the message is not signed and signing is required by my config. Is there a way to craft a node-to-node message that is signed? Thanks!
-
RE: Testing development branch with RF69HW is not working as it should
@kolaf Excellent! I'll give that a try. Thanks for pointing it out. How are you determining the noise floor on the various frequencies? I'm running my network at 915 MHz, for what it's worth.
What are your thoughts on the routing loops I've been seeing? I've been able to sort of clean things up for a little while if I can re-establish connectivity right after a node reset and then send an I_CHILDREN with a payload of "C" to clear the route table. Or at least that's what I think I asked them to do, based on reading the API.
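For reference, this is roughly how I would build that message for the gateway's serial port, sketched in Python. The semicolon-separated layout (node;child;command;ack;type;payload) is the same one visible in the gateway output such as `[2;255;3;0;11;SwitchMote3]`. The numeric value 10 for I_CHILDREN is my assumption from the internal-type numbering (9 is the log message and 11 the sketch name in those logs), so check it against your library version. The helper names are made up.

```python
# Assumed constants; verify against the MySensors headers you are running.
C_INTERNAL = 3        # command class for internal messages
I_CHILDREN = 10       # assumed internal type number for I_CHILDREN
NODE_SENSOR_ID = 255  # child id used for node-level messages


def serial_message(node_id, child_id, command, ack, msg_type, payload=""):
    """Format one serial-protocol line as it would be written to the gateway."""
    return f"{node_id};{child_id};{command};{ack};{msg_type};{payload}\n"


def clear_routes_message(node_id):
    """I_CHILDREN with payload "C", addressed to one repeater node."""
    return serial_message(node_id, NODE_SENSOR_ID, C_INTERNAL, 0, I_CHILDREN, "C")


print(repr(clear_routes_message(2)))
```

Whether the node actually clears its table on receiving this depends on the library version handling it as I read the API; the snippet only covers the formatting.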
Currently I gave up on repeaters in the network a few days ago and moved them all to simple nodes. Still having issues with stability..
-
RE: Testing development branch with RF69HW is not working as it should
@kolaf There is a distinct possibility that it's two different issues. I'll give your patch a try on a few of my nodes. At one time I had tried to take the recent RFM69 library from Felix's GitHub and use it with MySensors. I really liked the ATC idea that is in the current codebase. I wish I had the stuff to dig down into the code like you did. I'm recovering from a major illness (still on disability) and the meds make it very hard to brain for longer than a few hours a day.
@hek should I be sending periodic heartbeats from my mains powered devices? Would that help the network to maintain convergence?
-
RE: Testing development branch with RF69HW is not working as it should
Ok, here's the gateway's view of the transaction..
[0;255;3;0;9;read: 2-6-0 s=255,c=3,t=11,pt=0,l=11,sg=0:SwitchMote3]
[2;255;3;0;11;SwitchMote3]
[0;255;3;0;9;read: 2-6-0 s=255,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:E206B5FF9C387BB7E1B3BB1FD3C88C05CE14FB6BE015DF204F]
[0;255;3;0;9;read: 2-6-0 s=255,c=3,t=12,pt=0,l=5,sg=0:1.1.0]
[2;255;3;0;12;1.1.0]
[0;255;3;0;9;read: 2-6-0 s=1,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:A1EC8E72F0B4B834D9DB0A5099942545B9D1A775D35EE05237]
[0;255;3;0;9;read: 2-6-0 s=1,c=0,t=3,pt=0,l=0,sg=0:]
[2;1;0;0;3;]
[0;255;3;0;9;read: 2-6-0 s=2,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:76F66319A6DC1BD70A3D0AF214BC445E0947F38D2A0A087DFE]
[0;255;3;0;9;read: 2-6-0 s=2,c=0,t=3,pt=0,l=0,sg=0:]
[2;2;0;0;3;]
[0;255;3;0;9;read: 2-6-0 s=3,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:E00D2258F1A1838AA0B6DAC5F5AB612916F04D94681281D322]
[0;255;3;0;9;read: 2-6-0 s=3,c=0,t=3,pt=0,l=0,sg=0:]
[2;3;0;0;3;]
[0;255;3;0;9;read: 2-6-0 s=4,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:42EF7EAB7EF35224564ABCBB832EF8452220C72E6FE571BB33]
[0;255;3;0;9;read: 2-6-0 s=4,c=0,t=3,pt=0,l=0,sg=0:]
[2;4;0;0;3;]
[0;255;3;0;9;read: 2-6-0 s=5,c=3,t=16,pt=0,l=0,sg=0:]
[0;255;3;0;9;send: 0-0-6-2 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:0FA8B16F58BF135EB9E55E2DA402BAF4BEF26ADDA1812398FB]
[0;255;3;0;9;read: 2-6-0 s=5,c=0,t=4,pt=0,l=0,sg=0:]
[2;5;0;0;4;]
[0;255;3;0;9;sign fail]
[0;255;3;0;9;send: 2-0-0-0 s=5,c=0,t=4,pt=0,l=0,sg=0,st=fail:]
I'm using MyController. Here is the node's view of that specific boot.
Starting repeater (RRORAS, 2.0.0-beta)
Encryption Enabled
Radio init successful.
Signing required
Skipping security for command 3 type 15
send: 2-2-7-0 s=255,c=3,t=15,pt=0,l=2,sg=0,st=ok:␁␁
Waiting for GW to send signing preferences...
Skipping security for command 3 type 15
read: 0-7-2 s=255,c=3,t=15,pt=0,l=2,sg=0:␁␁
Mark node 0 as one that require signed messages
Mark node 0 as one that do not require whitelisting
Skipping security for command 3 type 16
send: 2-2-7-0 s=255,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:AB31DBA98EE07CB671B2963E007346CE5FDD2DA182E2A9C5F6
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 0200560012FF322E302E302D62657461
Current nonce: AB31DBA98EE07CB671B2963E007346CE5FDD2DA182E2A9C5F6AAAAAAAAAAAAAA
HMAC: 8B06824B6BF99F22F06D1F40563FB0A6ABFDF0F2844C6B202CF1FA2454454746
Signature in message: 0106824B6BF99F22F06D1F40563FB0
Message signed
Message to send has been signed
send: 2-2-7-0 s=255,c=0,t=18,pt=0,l=10,sg=1,st=ok:2.0.0-beta
Skipping security for command 3 type 16
send: 2-2-7-0 s=255,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:E05E83E7C9D9A40DF0709B227D6A40B0542D1CE5B6819BB03C
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 02000E2306FF07
Current nonce: E05E83E7C9D9A40DF0709B227D6A40B0542D1CE5B6819BB03CAAAAAAAAAAAAAA
HMAC: A35B9352DAE7B008EB1A7FD07A5F63CC61C424202996C5862294717CD4840399
Signature in message: 015B9352DAE7B008EB1A7FD07A5F63CC61C424202996C586
Message signed
Message to send has been signed
send: 2-2-7-0 s=255,c=3,t=6,pt=1,l=1,sg=1,st=ok:7
Skipping security for command 3 type 16
read: 0-7-2 s=255,c=3,t=16,pt=0,l=0,sg=0:
Signing backend: ATSHA204Soft
SHA256: 51D9E435DB0D76325F2B043EC3C65CFDB4941DDA1783F29641AAAAAAAAAAAAAA
Transmittng nonce
Skipping security for command 3 type 17
send: 2-2-7-0 s=255,c=3,t=17,pt=6,l=25,sg=0,st=ok:51D9E435DB0D76325F2B043EC3C65CFDB4941DDA1783F29641
Signature in message: 01F9305455406EC03185820C564790A2CA
Message to process: 0002460B06FF496D70657269616C
Current nonce: 51D9E435DB0D76325F2B043EC3C65CFDB4941DDA1783F29641AAAAAAAAAAAAAA
HMAC: 20F9305455406EC03185820C564790A2CA36CA74EDE1C0D7714D5D17E64A0A16
Signature OK
read: 0-7-2 s=255,c=3,t=6,pt=0,l=8,sg=0:Imperial
Skipping security for ACK on command 3 type 6
send: 2-2-7-0 s=255,c=3,t=6,pt=0,l=8,sg=0,st=ok:Imperial
Skipping security for command 3 type 16
send: 2-2-7-0 s=255,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:D3152DA42E7827D878A6A3B3069BFE3CF37D015F1CC4B786ED
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020056C400FFFFFFFFFFFFFFFFFF0300
Current nonce: D3152DA42E7827D878A6A3B3069BFE3CF37D015F1CC4B786EDAAAAAAAAAAAAAA
HMAC: E789304B6589D6457D5465A5C3059CE062434B1E0834523764E55032503E0BB9
Signature in message: 0189304B6589D6457D5465A5C3059C
Message signed
Message to send has been signed
send: 2-2-7-0 s=255,c=4,t=0,pt=6,l=10,sg=1,st=ok:FFFFFFFFFFFFFFFF0300
Skipping security for command 3 type 16
send: 2-2-7-0 s=255,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:8B3B8DB4F3DAC3070F287736FDD9D47407E6760C5925984F3F
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 02005E030BFF5377697463684D6F746533
Current nonce: 8B3B8DB4F3DAC3070F287736FDD9D47407E6760C5925984F3FAAAAAAAAAAAAAA
HMAC: 07EC6A51765F376E8D16CD2C4E091FBBDE5695FED00CABF27E13EF4543429520
Signature in message: 01EC6A51765F376E8D16CD2C4E09
Message signed
Message to send has been signed
send: 2-2-7-0 s=255,c=3,t=11,pt=0,l=11,sg=1,st=ok:SwitchMote3
Skipping security for command 3 type 16
send: 2-2-7-0 s=255,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:E206B5FF9C387BB7E1B3BB1FD3C88C05CE14FB6BE015DF204F
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 02002E030CFF312E312E30
Current nonce: E206B5FF9C387BB7E1B3BB1FD3C88C05CE14FB6BE015DF204FAAAAAAAAAAAAAA
HMAC: 656EE1D40B058730D45C690128D07914514295556F3EC0CDE8D5758A9974C95A
Signature in message: 016EE1D40B058730D45C690128D0791451429555
Message signed
Message to send has been signed
send: 2-2-7-0 s=255,c=3,t=12,pt=0,l=5,sg=1,st=ok:1.1.0
Skipping security for command 3 type 16
send: 2-2-7-0 s=1,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:A1EC8E72F0B4B834D9DB0A5099942545B9D1A775D35EE05237
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020006000301
Current nonce: A1EC8E72F0B4B834D9DB0A5099942545B9D1A775D35EE05237AAAAAAAAAAAAAA
HMAC: 582AD81BE5DCDDACCC095AC487A4006BE34C7F9CA04239DCFE2F778BCCCDA407
Signature in message: 012AD81BE5DCDDACCC095AC487A4006BE34C7F9CA04239DCFE
Message signed
Message to send has been signed
send: 2-2-7-0 s=1,c=0,t=3,pt=0,l=0,sg=1,st=ok:
Skipping security for command 3 type 16
send: 2-2-7-0 s=2,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:76F66319A6DC1BD70A3D0AF214BC445E0947F38D2A0A087DFE
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020006000302
Current nonce: 76F66319A6DC1BD70A3D0AF214BC445E0947F38D2A0A087DFEAAAAAAAAAAAAAA
HMAC: C03E0B0D759CE6CA1D9D2AB097129FFD1CCA295F80E7825BE495F6EE6AC54233
Signature in message: 013E0B0D759CE6CA1D9D2AB097129FFD1CCA295F80E7825BE4
Message signed
Message to send has been signed
send: 2-2-7-0 s=2,c=0,t=3,pt=0,l=0,sg=1,st=ok:
Skipping security for command 3 type 16
send: 2-2-7-0 s=3,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:E00D2258F1A1838AA0B6DAC5F5AB612916F04D94681281D322
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020006000303
Current nonce: E00D2258F1A1838AA0B6DAC5F5AB612916F04D94681281D322AAAAAAAAAAAAAA
HMAC: 382B1AF5470719EE1C00D1B5E70137F69A5F098A75C21685442FBCE764C360B5
Signature in message: 012B1AF5470719EE1C00D1B5E70137F69A5F098A75C2168544
Message signed
Message to send has been signed
send: 2-2-7-0 s=3,c=0,t=3,pt=0,l=0,sg=1,st=ok:
Skipping security for command 3 type 16
send: 2-2-7-0 s=4,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:42EF7EAB7EF35224564ABCBB832EF8452220C72E6FE571BB33
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020006000304
Current nonce: 42EF7EAB7EF35224564ABCBB832EF8452220C72E6FE571BB33AAAAAAAAAAAAAA
HMAC: EA04A9123E20CAF354189809E30AA51A3F57B61A878489427C6F20B33A3D5A2E
Signature in message: 0104A9123E20CAF354189809E30AA51A3F57B61A878489427C
Message signed
Message to send has been signed
send: 2-2-7-0 s=4,c=0,t=3,pt=0,l=0,sg=1,st=ok:
Skipping security for command 3 type 16
send: 2-2-7-0 s=5,c=3,t=16,pt=0,l=0,sg=0,st=ok:
Nonce requested from 0. Waiting...
Skipping security for command 3 type 17
read: 0-7-2 s=255,c=3,t=17,pt=6,l=25,sg=0:0FA8B16F58BF135EB9E55E2DA402BAF4BEF26ADDA1812398FB
Nonce received from 0. Proceeding with signing...
Signing backend: ATSHA204Soft
Message to process: 020006000405
Current nonce: 0FA8B16F58BF135EB9E55E2DA402BAF4BEF26ADDA1812398FBAAAAAAAAAAAAAA
HMAC: 203850D0A1BF3EEBB94A59ADCC377CC2B0EEA1DB4546ADF05CD47EB65CA1FD58
Signature in message: 013850D0A1BF3EEBB94A59ADCC377CC2B0EEA1DB4546ADF05C
Message signed
Message to send has been signed
send: 2-2-7-0 s=5,c=0,t=4,pt=0,l=0,sg=1,st=ok:
Init complete, id=2, parent=7, distance=2
So what we see is that my network has "re-converged" from everyone talking directly to the gateway, and now I'm bouncing from node 2 via 7 to 6 and then to the gateway. What gives? These are all one hop from the gateway and should talk with it directly...
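For anyone wanting to pull the hops out of these logs without squinting, here is a small Python sketch. It assumes the dash-separated ids mean sender-last-next-destination on `send:` lines and sender-last-destination on `read:` lines, which is how I am reading the debug output; `parse_route` and `observed_path` are names I made up.

```python
import re

# Matches "send: 2-2-7-0 " or "read: 2-6-0 " anywhere in a log line.
ROUTE = re.compile(r"\b(send|read): (\d+(?:-\d+)+) ")


def parse_route(line):
    """Return (kind, fields) for a send/read debug line, or None."""
    m = ROUTE.search(line)
    if not m:
        return None
    kind = m.group(1)
    ids = [int(x) for x in m.group(2).split("-")]
    if kind == "send" and len(ids) == 4:
        keys = ("sender", "last", "next", "destination")  # assumed order
    elif kind == "read" and len(ids) == 3:
        keys = ("sender", "last", "destination")          # assumed order
    else:
        return None
    return kind, dict(zip(keys, ids))


def observed_path(node_send_line, gateway_read_line):
    """Stitch the hops visible at both ends of one message into a path."""
    _, sent = parse_route(node_send_line)
    _, got = parse_route(gateway_read_line)
    path = [sent["sender"]]
    if sent["next"] != sent["destination"]:
        path.append(sent["next"])          # first relay, seen by the node
    if got["last"] not in (sent["sender"], path[-1]):
        path.append(got["last"])           # final relay, seen by the gateway
    path.append(got["destination"])
    return path


node_line = "send: 2-2-7-0 s=5,c=0,t=4,pt=0,l=0,sg=1,st=ok:"
gateway_line = "[0;255;3;0;9;read: 2-6-0 s=5,c=0,t=4,pt=0,l=0,sg=0:]"
print(observed_path(node_line, gateway_line))
```

This only recovers the hops visible at the two ends; any relays between the first and the last one would not show up in either log.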
Now, this is the first time I've used the Signing Feature. But this routing thing has happened twice before without that feature enabled. It's a real pain in the butt as now the routing is stuck in the NVRAM and even rebooting the nodes won't fix it. The messages are very intermittent and that makes the controls not reliable. I have to pull the nodes out of the wall, run the clear config script and then reload the switch script to heal the network. I'm at a loss... thanks for any help.
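Since signing is new to me, one sanity check I can do purely from the node log is to compare each "Signature in message" line with the "HMAC" line next to it. In every pair the signature looks like the leading bytes of the HMAC with the first byte replaced by 01. That is only my observation from this log, not a statement about how the library defines it, and `signature_matches_hmac` is a made-up helper. A rough Python check:

```python
def signature_matches_hmac(hmac_hex, signature_hex, first_byte=0x01):
    """True if the signature is the truncated HMAC with its first byte swapped.

    `first_byte` = 0x01 is assumed from the log output, where every
    signature starts with 01.
    """
    hmac_bytes = bytes.fromhex(hmac_hex)
    sig = bytes.fromhex(signature_hex)
    if not sig or len(sig) > len(hmac_bytes):
        return False
    return sig[0] == first_byte and sig[1:] == hmac_bytes[1:len(sig)]


# Values copied from the node log above (first signed send, and the
# signed I_CONFIG reply that the node verified as "Signature OK").
sent_ok = signature_matches_hmac(
    "8B06824B6BF99F22F06D1F40563FB0A6ABFDF0F2844C6B202CF1FA2454454746",
    "0106824B6BF99F22F06D1F40563FB0",
)
received_ok = signature_matches_hmac(
    "20F9305455406EC03185820C564790A2CA36CA74EDE1C0D7714D5D17E64A0A16",
    "01F9305455406EC03185820C564790A2CA",
)
print(sent_ok, received_ok)
```

Both pairs are consistent, so the node side of signing looks internally sane in this boot. The "sign fail" shows up on the gateway, so the check says nothing about what went wrong there.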
Oh, and my code for the Mote is here: https://github.com/TheCranston/MY-Sensors.git