Radio FAIL after ~3 weeks [SOLVED]
-
@Reza
Based on your answer I have to confess I didn't read your log carefully enough. Being fixated on the missing ACKs, I didn't pay attention to the successful transmissions.
The difference between the Node 160 log and the Node 4 log is the complete absence of any messages from Node 0 (GW) in the Node 160 log, whereas the Node 4 log does contain answers from the GW.
So my next question would be: are these failures consistent?
If you are not fed up already, would you please log some boot sequences from Node 4? And some button presses too?
It would be interesting to see whether the failures are exactly identical or only almost identical (exactly identical points to a software problem, almost identical points to a hardware problem).
A parallel log from the GW would be nice too. Maybe there is a problem in the ACK system.
If you don't trust your OrangePi hardware: would a temporary switch to a laptop (Windows or Linux) take much effort? Just to prove the integrity of the GW Arduino and its radio.
-
@Reza
I forgot: you are completely right. Why should there be any need for a repeater node in between at 1 m distance?
To get a clean picture you could isolate your GW and Node 4 by changing their RF channel and look at their private conversation without being disturbed by the other (healthy) nodes.
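A minimal sketch of how the channel change could look, assuming a MySensors 2.x sketch with an nRF24 radio. The radio define name depends on the library version, and the channel number 100 is an arbitrary example, not a value taken from this thread.

```cpp
// Put the same lines into BOTH the gateway sketch and the Node 4 sketch,
// above the MySensors include, so only these two share the channel.
#define MY_DEBUG                 // keep the serial debug log switched on
#define MY_RADIO_NRF24           // assumption: define name used by MySensors 2.0 to 2.2
#define MY_RF24_CHANNEL 100      // arbitrary example channel, different from the other nodes

#include <MySensors.h>
```

All other nodes keep their old channel and will not hear the gateway while the test runs, so this is only meant as a temporary setup.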
-
Could it be that you are using modules with different chips? See https://forum.mysensors.org/topic/1153/we-are-mostly-using-fake-nrf24l01-s-but-worse-fakes-are-emerging
-
@tboha
Sorry for my English.
In general, my problem is the connection.
I have a controller (Domoticz on an OrangePi) and one serial gateway connected to the controller with a USB cable. There are some sensors scattered around the house and some relays for testing, just in my room. IDs 4 and 160 were chosen at random for the test. I sit near the controller and test various sketches, relays, radios, etc.
So my problem is only the connection between the relays and the gateway. The sensors are fine; they work well and report all their states. The relays do not. At first they cannot connect and show this error:
(4143 TSF:MSG:SEND,5-5-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
6150 !TSM:FPAR:NO REPLY
6152 TSM:FPAR
6188 TSF:MSG:SEND,5-5-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
8195 !TSM:FPAR:FAIL
8196 TSM:FAIL:CNT=1
8198 TSM:FAIL:PDT)
After some time (and a few power cycles) the relay can connect to the gateway. But once connected it is very unstable: some commands are sent, some are not, and an error is shown (NACK...).
I tested and changed the hardware and the sketch, but that did not resolve it :(
"A parallel log from the GW would be nice too. Maybe there is a problem in the ACK system."
Is this a solution? Please explain more. What should I do?
Use a repeater near the gateway and connect all the relays to the gateway through it?
-
@Jan-Gatzke
My radios are not fake. I use 3 types of nRF24L01+ radio, but I have the problem anyway.
-
Can you try to add the following line to your sketch?
#define MY_RF24_PA_LEVEL RF24_PA_LOW
Right at the top, along with the other defines. I had similar problems with an LED dimmer node. This solved the problem.
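For illustration, the top of a node sketch with that line in place might look like this. It is a sketch under assumptions: MySensors 2.x with an nRF24 radio, where the radio define name depends on the library version. The rest of the sketch stays as it is.

```cpp
// All MySensors options must come before the library include.
#define MY_DEBUG                          // serial debug output
#define MY_RADIO_NRF24                    // assumption: define name used by MySensors 2.0 to 2.2
#define MY_RF24_PA_LEVEL RF24_PA_LOW      // reduced transmit power, as suggested above

#include <MySensors.h>
```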
-
@Reza No, an additional repeater was not intended (though it may be a solution).
From your "Node 4" log I conclude:
- You have at least 3 fully functional nodes (nodes 2, 3 and 4). These nodes talk to each other within MySensors in the expected way.
- So the hardware (radio and power) is ok. The software is ok too.
- GW functionality is only partially ok; this may be a hardware or a software issue.
Now there is a decision to make: go on testing potentially defective hardware, or start over with known intact hardware.
Going on with testing means asking: are these failures reproducible? That is why I asked for logs of repeated boot sequences.
Until now we don't know the exact failure. The parallel log from the gateway would show whether transmissions are really missing (and how many) or only the ACKs.
I last used ACKs in MySensors 1.4, and in those days they caused more trouble than they were worth. In the end it turned out that the transmissions were ok (enough) and the ACK system was not.
For my purposes MySensors copes with transmission difficulties by itself, so I do not have to care about them (15 m distance, indoors, reinforced concrete). So I decided not to use ACKs anymore, and this did not reduce performance.
If there are not too many missing transmissions (compare node messages vs. GW messages), I would skip the whole ACK thing.
I would test the potentially defective hardware first, because it is easy to get some logs and it does not cost much time.
Jan Gatzke's proposal can be tried quickly too.
An additional repeater would only obscure severe GW problems. Since the GW is the heart of a MySensors system, you should not tolerate any problems with it (hardware or software), because they will come back to the surface in the very near future.
Starting over with known hardware takes more effort but should be doable in a manageable amount of time.
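To put a number on "not too many missing transmissions", a saved node log can be tallied by send status. The following is a small sketch for a PC, not part of MySensors: `SendStats` and `countSendStatus` are made-up names, and it assumes the status is spelled as in the logs in this thread (`st=OK`, anything else counts as a failure).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Tally of outgoing messages found in a MySensors debug log.
struct SendStats {
    std::size_t ok;      // TSF:MSG:SEND lines whose status is st=OK
    std::size_t failed;  // TSF:MSG:SEND lines with any other status
};

// Counts TSF:MSG:SEND lines by their "st=" field. All other lines
// (state machine messages, received messages) are ignored.
SendStats countSendStatus(const std::vector<std::string>& logLines) {
    SendStats stats = {0, 0};
    for (std::size_t i = 0; i < logLines.size(); ++i) {
        const std::string& line = logLines[i];
        if (line.find("TSF:MSG:SEND") == std::string::npos) continue;

        const std::size_t start = line.find(",st=");
        if (start == std::string::npos) continue;

        // The status text sits between ",st=" and the ':' before the payload.
        const std::size_t valueBegin = start + 4;
        const std::size_t valueEnd = line.find(':', valueBegin);
        const std::string status = line.substr(
            valueBegin,
            valueEnd == std::string::npos ? std::string::npos
                                          : valueEnd - valueBegin);

        if (status == "OK") {
            ++stats.ok;
        } else {
            ++stats.failed;
        }
    }
    return stats;
}
```

Feed it the lines of a node log and compare `ok` with the number of messages that show up in the gateway log for the same period. Note that `st=OK` on a broadcast (destination 255) does not prove that anyone heard it, as the find-parent lines in this thread show.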
-
Is this define only for nRF24L01+ PA+LNA modules? At the moment I use just the ordinary nRF24L01+.
-
@tboha
I tested my hardware, changed parts and tested again with different radios, wires and Arduinos...
Now about the parallel log: what should I do? Does this mean I have to build 2 gateways?
-
@Reza No.
"Parallel" may not be the exact description; I meant recording the GW and the node simultaneously.
I don't know Domoticz. Maybe it has a raw log function; then you are done.
Or, if you installed the Arduino IDE on your OrangePi, just use the serial monitor.
Otherwise just hook up a terminal to your gateway, i.e. on your OrangePi look for the device the GW is bound to and start a terminal program attached to it (e.g. putty, minicom, kermit).
-
@Reza
Sometimes it is better to step back and look at things from a distance.
A recipe for a clean restart.
You need:
- 2 working Arduinos with radio (node 4 and node 3 from your log)
- 1 working computer/laptop with 2 free USB ports, preferably Windows with the Arduino IDE installed (Linux might work too)
Connect both Arduinos to your USB ports. If the Arduinos are different, write down which Arduino is on which port (e.g. COM34 or similar).
- Start the Arduino IDE.
- From examples/MySensors load "Gateway serial". Choose one USB/COM port, don't change anything and upload it to Arduino No. 1.
- From examples/MySensors load "MockMySensors". Choose the other USB/COM port, don't change anything and upload it to Arduino No. 2.
You are done. Watch the Arduinos communicating via the serial monitor.
OK, you are right: it is not truly simultaneous, but close enough.
If this works, you should upload the relay sketch and watch what happens.
This should not take too long, and as a result you will hopefully know whether:
- my hardware is ok or not
- my software (MySensors) is ok or not
- my sketch is ok or not.
-
@tboha
I think I found the problem. I was careless about the parent: the relay works badly when it uses a parent device to connect to the gateway. When it connects directly it works better, much better. But now there are 2 questions!
First: I am near the gateway (1 meter) and the test relay is near the gateway. But after a start or reset the relay sometimes chooses a node 10 m away as its parent, although the gateway is right next to it. Why?
I know that I can use a static parent for nodes, but I want to know why the node chooses a parent 10 m away while the gateway is nearby (1 m).
Second question: why does the relay work so badly with a parent, with most commands not sent and errors shown? Is this related to the radio power of the parent?
Thank you.
-
@Reza This is weird. I have to think about it.
Just for clarification: it seems you issued a command at your controller which resulted in sending
60;2;1;0;2;1
dissected:
60; = to node 60
2; = sensor 2
1; = set value
0; = unacknowledged message (?)
2; = subtype is V_LIGHT
1; = payload is "1"
I think you hit the button "Relais 2 ON".
The weird thing is: you haven't asked for an acknowledgement, but the system is trying to get one. Strange.
I just got your next question; I have to think about this too.
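The dissection above follows the MySensors serial protocol layout `node-id;child-sensor-id;command;ack;type;payload`. As a hedged illustration, here is a small PC-side parser for such lines. `SerialMessage`, `parseSerialMessage` and `commandName` are names made up for this example and are not part of the MySensors library.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// One line of the serial protocol:
// node-id;child-sensor-id;command;ack;type;payload
struct SerialMessage {
    int nodeId;
    int childSensorId;
    int command;
    int ack;            // 0 = no acknowledgement requested, 1 = requested
    int type;
    std::string payload;
};

// Converts a whole string to int; rejects empty or partly numeric text.
static bool toInt(const std::string& text, int& out) {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);
        if (used != text.size()) return false;
        out = value;
        return true;
    } catch (const std::exception&) {
        return false;
    }
}

// Splits a line on ';' into five numeric fields plus the payload.
// Returns false if a field is missing or not a number.
bool parseSerialMessage(const std::string& line, SerialMessage& out) {
    std::istringstream in(line);
    std::string field;
    int numbers[5];
    for (int i = 0; i < 5; ++i) {
        if (!std::getline(in, field, ';')) return false;
        if (!toInt(field, numbers[i])) return false;
    }
    std::string payload;
    std::getline(in, payload);  // rest of the line, may be empty

    out.nodeId = numbers[0];
    out.childSensorId = numbers[1];
    out.command = numbers[2];
    out.ack = numbers[3];
    out.type = numbers[4];
    out.payload = payload;
    return true;
}

// Readable name for the command field.
const char* commandName(int command) {
    switch (command) {
        case 0: return "presentation";
        case 1: return "set";
        case 2: return "req";
        case 3: return "internal";
        case 4: return "stream";
        default: return "unknown";
    }
}
```

Called with the line from this post, `parseSerialMessage("60;2;1;0;2;1", msg)` fills in node 60, sensor 2, command 1 ("set"), ack 0, type 2 and payload "1", which matches the manual dissection.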
-
@Reza I think we are getting close to a solution.
"but after start or reset , some time relay choose a node (10 m far) for parent, while gateway is near that ! why ?"
This seems to be the crucial question.
I never dived into the core functions to see how MySensors decides on a parent. Maybe it depends on the speed of the answer?
A little excerpt from an earlier log (which I didn't read closely enough, as I told you):
16 TSM:FPAR
52 TSF:MSG:SEND,4-4-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
2059 !TSM:FPAR:NO REPLY
2061 TSM:FPAR
2097 TSF:MSG:SEND,4-4-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
2165 TSF:MSG:READ,2-2-4,s=255,c=3,t=8,pt=1,l=1,sg=0:1
2170 TSF:MSG:FPAR OK,ID=2,D=2
2340 TSF:MSG:READ,3-3-4,s=255,c=3,t=8,pt=1,l=1,sg=0:3
4105 TSM:FPAR:OK
4106 TSM:ID
At 2097 Node 4 broadcasts for a parent and is accepted for parenting by Node 2.
At 2340 Node 3 offers parenting too. Upon which request?
And why doesn't the GW offer parenting?? The GW shows up at 8240 with a response to PING, so it is alive and the connection was ok.
At least for the moment this is unexpected behavior.
So reading logs gives us some hints.
Could you provide a little more from the simultaneous logs? And don't stick to the NACK logs; the other messages are interesting as well.
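Reading these lines by eye gets tedious, so here is a rough PC-side sketch that splits `TSF:MSG:READ` and `TSF:MSG:SEND` lines into fields. It assumes the 2.x debug layout seen in this thread (READ: sender-last-destination, SEND: sender-last-next-destination). `LogMessage` and `parseLogMessage` are made-up names, not MySensors API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Fields of one TSF:MSG:READ or TSF:MSG:SEND debug line.
struct LogMessage {
    unsigned long millis;  // node uptime in ms when the line was printed
    bool received;         // true for READ, false for SEND
    int sender;
    int last;
    int next;              // only present on SEND lines, -1 for READ
    int destination;
    int sensor;
    int command;           // 3 = internal
    int type;              // for internal: 7 = find parent, 8 = find parent response
    std::string status;    // only present on SEND lines, e.g. "OK"
    std::string payload;
};

// Returns false for every line that is not a READ or SEND message.
bool parseLogMessage(const std::string& line, LogMessage& out) {
    // Failed sends carry a '!' in front of the tag; drop that marker.
    std::string text = line;
    const std::size_t tag = text.find("TSF:MSG:");
    if (tag != std::string::npos && tag > 0 && text[tag - 1] == '!') {
        text.erase(tag - 1, 1);
    }

    LogMessage m = LogMessage();
    int payloadType = 0, length = 0, signing = 0, failedCount = 0;
    char status[16] = "";
    char payload[64] = "";

    int n = std::sscanf(
        text.c_str(),
        "%lu TSF:MSG:READ,%d-%d-%d,s=%d,c=%d,t=%d,pt=%d,l=%d,sg=%d:%63[^\n]",
        &m.millis, &m.sender, &m.last, &m.destination,
        &m.sensor, &m.command, &m.type,
        &payloadType, &length, &signing, payload);
    if (n >= 10) {  // the payload (11th item) may be empty
        m.received = true;
        m.next = -1;
        m.payload = payload;
        out = m;
        return true;
    }

    n = std::sscanf(
        text.c_str(),
        "%lu TSF:MSG:SEND,%d-%d-%d-%d,s=%d,c=%d,t=%d,pt=%d,l=%d,sg=%d,"
        "ft=%d,st=%15[^:]:%63[^\n]",
        &m.millis, &m.sender, &m.last, &m.next, &m.destination,
        &m.sensor, &m.command, &m.type,
        &payloadType, &length, &signing, &failedCount, status, payload);
    if (n >= 13) {  // the payload (14th item) may be empty
        m.received = false;
        m.status = status;
        m.payload = payload;
        out = m;
        return true;
    }
    return false;
}
```

For the line logged at 2165 this yields a received message from sender 2 to destination 4 with command 3, type 8 and payload "1", i.e. node 2 answering the find-parent broadcast and reporting its own distance of 1.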
-
@tboha
Thanks for your time and effort, you're doing an excellent analysis. Please find below some answers to your questions:
"at 2097 Node 4 broadcasts for parent and gets accepted for parenting from Node 2.
at 2340 Node 3 offers parenting too - upon which request? and why doesn't GW offer parenting ?? GW shows up at 8240 with response to PING, so it is alive and connection was ok."
The find parent step is initiated by a local I_FIND_PARENT_REQUEST broadcast, i.e. all repeaters/GW will reply with I_FIND_PARENT_RESPONSE if their uplink connections are operational (this prevents circular referencing). Node 3 (@2340) is replying to the same request, but is ignored due to its greater distance to the GW (D=3+1=4 via node 3 vs. D=1+1=2 via node 2). The GW does not offer parenting, because it's either too close (radio interference) or too far from the requesting node. After a timeout (default 2000 ms) the node uses the closest repeater/GW that replied to the request, unless a static/preferred parent is defined.
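The selection rule described here can be modelled in a few lines. This is a simplified illustration of the explanation above, not the actual MySensors source: `ParentReply`, `ParentChoice` and `chooseParent` are invented names, and keeping the first reply on a tie is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One find-parent response as seen by the searching node.
struct ParentReply {
    int nodeId;             // who answered (0 is the gateway)
    int distanceToGateway;  // hops the replying node reports for itself
};

// Result of the selection once the reply window has closed.
struct ParentChoice {
    bool found;     // false if nobody answered
    int parentId;   // chosen parent, -1 if none
    int distance;   // own distance via that parent, -1 if none
};

// Picks the reply with the smallest resulting distance.
// On a tie the earlier reply is kept (assumption).
ParentChoice chooseParent(const std::vector<ParentReply>& replies) {
    ParentChoice choice = {false, -1, -1};
    for (std::size_t i = 0; i < replies.size(); ++i) {
        const int viaThisNode = replies[i].distanceToGateway + 1;
        if (!choice.found || viaThisNode < choice.distance) {
            choice.found = true;
            choice.parentId = replies[i].nodeId;
            choice.distance = viaThisNode;
        }
    }
    return choice;
}
```

With the two replies from the log (node 2 reporting distance 1, node 3 reporting distance 3) this picks node 2 with D=2. A reply from the gateway (distance 0) would win with D=1, but only if it arrives inside the timeout. A static parent bypasses the search; as far as I know this is configured with `MY_PARENT_NODE_ID` in the sketch.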
-
@tboha
My friend, I am so sorry. My English is weak, so I cannot work out what I should do, even though you told me what to do!
Thank you very, very much for the time and effort you spent on me and my problem. I cannot take up more of your time, I am sorry. I am very grateful for your efforts, but I think I have failed.
This is very complicated for me because I am a beginner.
The problems are many. The serial monitor is a problem in itself: every time there is a new problem and a new error in it.
I am ashamed to have taken your time, and thank you. I cannot understand your guidance.
-
@tekka This is very interesting. I think I will review my own system, just out of curiosity. Thank you for this explanation.
@Reza: don't despair. I think you are not a beginner, but if you are, your learning curve is fairly steep.
As mentioned before, sometimes it is better to step back and look from a distance.
If you aren't too annoyed, I would offer to guide you to a working two-Arduino system (and I am sure @tekka will give advice if necessary). From there you will probably manage on your own.
Started correctly, MySensors will give you a lot of fun, but today it is no fun for you.
So put this stuff away for today, go fishing and have a cold one.
Tomorrow (or Wednesday, because I don't know my schedule for tomorrow yet) we will build things up in an ordered way. (And please don't scavenge your current gateway. I think it is the source of all evil and I am too curious about the reason.)
Footnote: if you are worried about your English (I am not), give Google Translate a try, if it is available for your language.
