Signing fails due to NACKed messages



  • Hello everyone!

    I am experiencing issues with my "pilot wire" node (controlling one of my heaters) since a few days.

    At 8:00 a.m., the MySensors gateway is supposed to send a V_PERCENTAGE message to the heater with "0" as payload, so that it goes into comfort mode and starts heating.
    At the opposite, at 10:00 p.m., a V_PERCENTAGE message with "3" as payload is sent to the heater so that it goes into "eco" mode.

    All communications are encrypted on my MySensors network and the pilot wire node requires signed messages. Everything was correctly set up (after long discussions here 😜) and it has been working well for months.

    However, I'm starting to get "!TSF:MSG:SIGN FAIL" messages on the MySensors gateway and the heater therefore fails to change of mode. Here are relevant logs:

    Nov 10 07:35:21 DEBUG TSM:READY:NWD REQ
    Nov 10 07:35:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 07:35:21 DEBUG TSF:SAN:OK
    Nov 10 07:50:21 DEBUG TSF:SAN:OK
    Nov 10 07:55:21 DEBUG TSM:READY:NWD REQ
    Nov 10 07:55:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 08:00:01 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 10 08:00:01 DEBUG !TSF:MSG:SIGN FAIL
    Nov 10 08:05:21 DEBUG TSF:SRT:OK
    Nov 10 08:05:21 DEBUG TSF:SAN:OK
    Nov 10 08:15:21 DEBUG TSM:READY:NWD REQ
    Nov 10 08:15:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 08:20:21 DEBUG TSF:SAN:OK
    Nov 10 08:35:21 DEBUG TSF:SRT:OK
    Nov 10 08:35:21 DEBUG TSM:READY:NWD REQ
    Nov 10 08:35:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 08:35:21 DEBUG TSF:SAN:OK
    Nov 10 08:50:21 DEBUG TSF:SAN:OK
    Nov 10 08:55:21 DEBUG TSM:READY:NWD REQ
    Nov 10 08:55:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 09:05:21 DEBUG TSF:SRT:OK
    Nov 10 09:05:21 DEBUG TSF:SAN:OK
    Nov 10 09:15:21 DEBUG TSM:READY:NWD REQ
    Nov 10 09:15:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 09:20:21 DEBUG TSF:SAN:OK
    Nov 10 09:35:21 DEBUG TSF:SRT:OK
    Nov 10 09:35:21 DEBUG TSM:READY:NWD REQ
    Nov 10 09:35:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 09:35:21 DEBUG TSF:SAN:OK
    Nov 10 09:50:21 DEBUG TSF:SAN:OK
    Nov 10 09:55:21 DEBUG TSM:READY:NWD REQ
    Nov 10 09:55:21 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 10 10:05:21 DEBUG TSF:SRT:OK
    Nov 10 10:05:21 DEBUG TSF:SAN:OK
    Nov 10 10:10:30 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 10 10:10:30 DEBUG !TSF:MSG:SIGN FAIL
    Nov 10 10:10:30 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 10 10:10:34 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 10 10:10:34 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 10 10:10:34 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=1,t=3,pt=0,l=1,sg=1,ft=0,st=OK:0
    

    As you can see, when I noticed that the heater was cold at 10:10 a.m. this morning, I changed the mode back to something else ("Off", which failed) and changed again to comfort (and this time, it worked).

    Do you have any clue why this is happening? Note that I haven't touched to the hardware.
    Is there a way to tell MySensors to re-send a message if it fails? Or maybe I could do something on the HomeAssistant side?

    Also, note that the radios used are RFM69W modules. Regarding location, the node is at the basement, the gateway is upstairs and if you drew a line through the walls, the distance between both is approximately 15 meters, I'd say.

    Thanks in advance for your help!

    #-------------------- Appendix --------------------#

    My Raspberry Pi serial Gateway:
    0_1573387957073_P_20191110_130510.jpg

    My pilot wire prototype:
    0_1573388005746_pilot_wire.jpg



  • Hello!

    I've just noticed in the logs I provided that the node doesn't respond to network discovery requests... So I'd say there is definitely something wrong with my setup now...

    I rebooted the gateway and the node and it is working again...
    I guess the issue is temporarily resolved.

    I have a question: does the TX/RX power of the (RFM69W) radio automatically changes over time, depending on signal quality?
    I think I've read that somewhere on the MySensors forums but I may be wrong.

    If it does, maybe the radio signal was automatically set to a too low value at some point?



  • Hello!

    The issue persists finally, but network discovery seems to be working, it seems... Here are relevant logs:

    Nov 14 21:32:00 DEBUG TSM:READY:NWD REQ
    Nov 14 21:32:00 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 14 21:32:00 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=21,pt=1,l=1,sg=1:0
    Nov 14 21:41:59 DEBUG TSF:SRT:OK
    Nov 14 21:42:00 DEBUG TSF:SAN:OK
    Nov 14 21:52:00 DEBUG TSM:READY:NWD REQ
    Nov 14 21:52:00 DEBUG TSF:MSG:SEND,0-0-255-255,s=255,c=3,t=20,pt=0,l=0,sg=0,ft=0,st=OK:
    Nov 14 21:52:00 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=21,pt=1,l=1,sg=1:0
    Nov 14 21:57:00 DEBUG TSF:SAN:OK
    Nov 14 22:00:00 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 14 22:00:00 DEBUG !TSF:MSG:SIGN FAIL
    Nov 14 22:00:00 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 14 22:00:41 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 14 22:00:42 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 14 22:00:42 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=1,t=3,pt=0,l=1,sg=1,ft=0,st=OK:6
    Nov 14 22:00:43 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 14 22:00:43 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 14 22:00:43 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=1,t=3,pt=0,l=1,sg=1,ft=0,st=OK:3
    

    As you can see, the heater was supposed to go into "Eco" mode at 22:00:00 but signing failed.
    It seems that the nonce was received though, but the nonce request wasn't acknowledged...

    At 22:00:43, I changed manually the mode to "Off" and to "Eco" again (in HomeAssistant) and it finally worked...

    Do you have any clue why this is happening?
    Maybe @Anticimex -- the awesome security guy of MySensors -- has an idea? 😛


  • Contest Winner

    @encrypt the only things that comes to my mind is that your rf environment has shifted slightly or the noise level of the supply to the radio has increased to cause intermittent failure in the transmissions.
    You could try to switch channel for the radio and see if it helps? Or measure the stability of the power supply to the radio when the node(s) are communicating.



  • Hello @Anticimex, thanks for your answer!

    You're right, since nothing changed on the hardware side, it may be a noise / power supply issue.

    Also, I should really finish making clean MySensors PCBs for that project and not rely on breadboards and stuff like that which is very likely to degrade the signal.

    I'll keep you updated with my progress, thanks again 🙂


  • Contest Winner

    @encrypt for sure, going pcb will give the radio supply a better power signal (just remember to not make the trace too thin).
    But as you have had the rig operational for a long time, it might be worth checking other radio channels as well as perhaps some neighbor got some dirty rf toy or similar that mess with the rf environment.



  • Well, I'm using RFM69W radios running on the 868MHz band.
    So, there should be a very low traffic on that band (compared to Wi-Fi).

    I'll probably add a capacitor near Vcc pin of the radio, I've just seen on the datasheet that it's recommended , although it seems there is already one on the board.

    Also, regarding the gateway, I plan to make a dipole antenna, since I don't have any space constraint.


  • Contest Winner

    @encrypt 868 MHz is not unique so there could be interference. But yes, decouple the hell out of it. You cannot go overboard with decoupling so if you have the space, go for both electrolythic and ceramic caps. Perhaps even a tantal to try to catch as many frequencies as possible.
    Antenna will also help a ton if it is properly matched.



  • Hello again!

    I restarted the node this morning, just out of curiosity...

    I'm actually very surprised to see that presentation works without any problem (and the node sends its initial value correctly), but commands sent later failed...

    Here are the logs:

    Nov 16 12:12:16 DEBUG TSF:MSG:READ,2-2-255,s=255,c=3,t=7,pt=0,l=0,sg=0:
    Nov 16 12:12:16 DEBUG TSF:MSG:BC
    Nov 16 12:12:16 DEBUG TSF:MSG:FPAR REQ,ID=2
    Nov 16 12:12:16 DEBUG TSF:PNG:SEND,TO=0
    Nov 16 12:12:16 DEBUG TSF:CKU:OK
    Nov 16 12:12:16 DEBUG TSF:MSG:GWL OK
    Nov 16 12:12:16 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=8,pt=1,l=1,sg=0,ft=0,st=OK:0
    Nov 16 12:12:18 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=24,pt=1,l=1,sg=0:1
    Nov 16 12:12:18 DEBUG TSF:MSG:PINGED,ID=2,HP=1
    Nov 16 12:12:18 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=25,pt=1,l=1,sg=0,ft=0,st=OK:1
    Nov 16 12:12:18 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=15,pt=6,l=2,sg=0:0101
    Nov 16 12:12:18 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=15,pt=6,l=2,sg=0,ft=0,st=OK:0100
    Nov 16 12:12:18 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=16,pt=0,l=0,sg=0:
    Nov 16 12:12:18 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=0,ft=0,st=OK:<NONCE>
    Nov 16 12:12:18 DEBUG TSF:MSG:READ,2-2-0,s=255,c=0,t=17,pt=0,l=5,sg=1:2.3.1
    Nov 16 12:12:19 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:19 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:19 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=6,pt=1,l=1,sg=1:0
    Nov 16 12:12:19 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 16 12:12:19 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:12:19 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=6,pt=0,l=1,sg=1,ft=0,st=OK:M
    Nov 16 12:12:19 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:19 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:19 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=11,pt=0,l=10,sg=1:Fil pilote
    Nov 16 12:12:20 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:20 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:20 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=12,pt=0,l=3,sg=1:1.0
    Nov 16 12:12:20 DEBUG TSF:MSG:READ,2-2-0,s=0,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:20 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=0,c=0,t=4,pt=0,l=0,sg=1:
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=26,pt=1,l=1,sg=1:2
    Nov 16 12:12:21 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:12:21 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=27,pt=1,l=1,sg=1,ft=0,st=OK:1
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=0,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:21 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=0,c=1,t=2,pt=2,l=2,sg=1:1
    Nov 16 12:12:21 DEBUG TSF:MSG:READ,2-2-0,s=0,c=3,t=16,pt=0,l=0,sg=1:
    Nov 16 12:12:22 DEBUG TSF:MSG:SEND,0-0-2-2,s=255,c=3,t=17,pt=6,l=25,sg=1,ft=0,st=OK:<NONCE>
    Nov 16 12:12:22 DEBUG TSF:MSG:READ,2-2-0,s=0,c=1,t=3,pt=2,l=2,sg=1:2
    Nov 16 12:13:21 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 16 12:13:21 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:21 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=1,t=3,pt=0,l=1,sg=1,ft=0,st=OK:3
    Nov 16 12:13:25 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 16 12:13:25 DEBUG !TSF:MSG:SIGN FAIL
    Nov 16 12:13:25 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:30 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 16 12:13:30 DEBUG !TSF:MSG:SIGN FAIL
    Nov 16 12:13:30 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:39 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 16 12:13:39 DEBUG !TSF:MSG:SIGN FAIL
    Nov 16 12:13:39 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:44 DEBUG !TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=NACK:
    Nov 16 12:13:44 DEBUG !TSF:MSG:SIGN FAIL
    Nov 16 12:13:44 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:47 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
    Nov 16 12:13:47 DEBUG TSF:MSG:READ,2-2-0,s=255,c=3,t=17,pt=6,l=25,sg=1:<NONCE>
    Nov 16 12:13:47 DEBUG TSF:MSG:SEND,0-0-2-2,s=0,c=1,t=3,pt=0,l=1,sg=1,ft=0,st=OK:0
    

    As you can see, the node finished starting at 12:13:25 but my first command at 12:13:25 failed.

    If it was really a noise / power supply issue, I would expect presentation to fail, considering that it's a long process taking more than 5s and using the radio quite intensively.

    I've read that people here got range issue with MySensors 2.3.x, isn't that linked? Here for instance:

    @tekka: It seems you were working on that, was it supposed to be fixed in version 2.3.1?

    I can't enable radio logging easily but I suspect it could be caused by the auto-adjusting RSSI feature...


  • Contest Winner

    @encrypt noise sensitivity could be temperature related if the radio runs "hot"


  • Admin

    @encrypt said in Signing fails due to NACKed messages:

    I can't enable radio logging easily but I suspect it could be caused by the auto-adjusting RSSI feature...

    I'm afraid, but without additional debug information it's gonna stay a guessing game. If possible, enable

    #define MY_DEBUG_VERBOSE_SIGNING
    #define MY_DEBUG_VERBOSE_RFM69
    

    for additional info. Also, please post the node sketch to interpret the log in the right context.


 

240
Online

9.0k
Users

9.7k
Topics

101.3k
Posts