[solved] RS485 nodes stop sending data after some hours or days
-
@pjr Short story: Still no satisfying results, but to be honest, I haven't spent much time on it yet. The most important part for the moment (Node_1) works pretty reliably; the others I have to restart from time to time (Node_2 is always the first to fail).
Longer story:
- Ordered some MAX487 chips to replace the MAX485. They took some weeks to arrive from China and still need to be soldered when there's time for that...
- GW (seems to work reliably by now):
-- Tried to use a Pro Micro with hw-serial as GW. That didn't work as expected; I reported on that some weeks ago (may have been in the fhem forum).
-- Next step is to review it (Pro Micro, Nano or STM32F103) once more when replacing the transceivers, and to do some testing wrt resistor values.
- The timing on the nodes may also offer room for improvement. For now, my plan is to delay the startup procedures (or the first measurement) and pin the measurement times to fixed values. This should avoid overlap of the nodes' sending slots towards the GW as much as possible.
- Last step could be a review of powering issues. It seems Node_1 suffered from that at some point; maybe other nodes show similar effects too (all nodes have a lot of wires attached).
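The fixed-slot idea from the timing bullet above could look roughly like this. It is only a sketch: `firstSendDelayMs` and its parameters are hypothetical names, not part of MySensors, and it assumes every node reports with the same period. Since each node runs on its own clock, the slots will drift apart over time, so this reduces overlap rather than ruling it out.

```cpp
#include <cstdint>

// Hypothetical helper: spread the first transmission of each node evenly
// across one reporting period so the nodes do not all send at the same time.
// nodeIndex is the node's position on the bus (0-based), not its MySensors ID.
constexpr std::uint32_t firstSendDelayMs(std::uint32_t nodeIndex,
                                         std::uint32_t nodeCount,
                                         std::uint32_t periodMs) {
    // Guard against division by zero, then give every node its own slot.
    return nodeCount == 0 ? 0
                          : (nodeIndex % nodeCount) * (periodMs / nodeCount);
}
```

With 5 nodes and a 60 s period, the node at index 1 would wait 12000 ms before its first measurement and the node at index 4 would wait 48000 ms.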
-
Short update, thx for reminding me there's still work to do :grin::
- Moved the pull-up/pull-down resistors (440 Ohm) from one end of the network to the other (now: GW).
- Replaced all MAX485 with MAX487 (all placed on LC modules; most of the resistors 5-7 are desoldered, the 120 Ohm ones remain only on the GW and the last node).
- SOH count is now set to 3 on the GW and Nodes 1 to 3 and 5, so only Node 4 (BME280) remains at the default (1).
At first sight, everything's working, and Node 2 so far does not seem to fail as quickly as before these changes. But as always: whether this is really reliable over time, we'll see. So expect at least one more update; I hope it will be the last :grinning:
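For reference, the SOH count is a compile-time setting of the MySensors RS485 transport. A minimal configuration fragment could look like the following; the DE pin and baud rate shown are assumptions for illustration and have to match your own wiring and bus.

```cpp
// Sketch of RS485 transport settings for a MySensors 2.x node or gateway.
#define MY_RS485                  // use the RS485 transport layer
#define MY_RS485_DE_PIN 2         // assumption: driver-enable pin, adapt to your wiring
#define MY_RS485_BAUD_RATE 9600   // assumption: must be identical on all nodes of the bus
#define MY_RS485_SOH_COUNT 3      // send 3 SOH bytes per frame (default is 1)
#include <MySensors.h>
```

The defines have to appear before the `#include <MySensors.h>` line, otherwise the library falls back to its defaults.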
-
I had a quite strange problem with my smaller network. There is a Nano with an enc28j60 shield as gateway, and there were 2 relay/FET nodes for controlling lights. Everything was fine until I added one light switch node to the network. After that only the light switch was working. Strange...
I disconnected all the nodes from the network and checked what was causing the problem. It was the GW. The voltage I measured on the bus between A and B was ~2.5V, so it was pulling the bus to logical one all the time.
Changed the RS485 module... no help. Then I measured "MY_RS485_DE_PIN", for which I was using pin 2. The enc28j60 shield was pulling the pin to 0.6V, and that was causing the RS485 shield to drive the bus to ~2.5V. I changed the pin to 3 and now everything is working like a dream. Of course none of these nodes send all the time, so most likely there won't be any collisions.
So the next time some node/bus is hanging, measure the voltage of the DE pin :D
-
So once more a short update: No significant improvement from changing the transceivers and the placement of the resistors. Only two of 4 nodes are working as expected :sob: , the 5th (Node_2) is still turned off to prevent possible interference with Node_1.
As two of them have been online since my last post (around 18 days), I'm quite sure it's not a GW issue like the one @pjr reported. And as one of the nodes is powered from a different source than the other 3 and than the GW, it also doesn't seem to be power related. So I'm slowly running out of ideas for how to debug this further :sob: .
Now I'm thinking about reverting the baud rate back to 9600 and, in case this doesn't help (which is most likely), splitting the bus into two lines. That may help to find out what is going on with individual nodes.
So @otto001 At this point in time I'd say: It really depends...
If you only want to attach a few nodes (2-3 + GW), RS485 is a simple and secure option. But as soon as there are more, one failing node will affect the entire communication, and that's really no fun. So stay with nRF24 (or other wireless transceivers) for nodes that just send in data, and try RS485 with a few important switching/security-relevant nodes first.
Just my 2ct...
-
The problem with cables and signals is that every environment is different, cables are different, and there are a lot of possible causes that can screw up communication on a bus.
@gohan I absolutely agree. Wrt wiring: Most of the wires I use are twisted pairs from CAT6 network cables (one pair for signal and, when also distributing 12V, one for 12V+GND). Some newer parts (the trouble began before those) are 4-wire telephone cables with around the same copper diameter per wire.
Connections: Just one Wago between GW and Node_1; the others are either screwed directly into the modules or form short stubs (<20cm) from a Wago clamp with three connections (in / stub to node / out).
So if you see room for improvement, suggestions are welcome :smile:
-
That is what I suggest all the time: use CAN bus drivers instead of 485 bus drivers.
A CAN bus driver adds some safety because it disconnects the microcontroller from the bus in hardware if it sends a dominant state for too long (when the program hangs etc.).
So a single node cannot take down all communication on the bus.
And try node IDs other than 1 - 4.
In the wrong situation they may collide with the packet framing characters, defined in the standard ASCII table, that the 485 transport protocol uses.
#define SOH 1
#define STX 2
#define ETX 3
#define EOT 4
-
@kimot Thx for referring to CAN.
Some questions and remarks on that:
- The node IDs actually assigned are 97 and higher; the node numbers mentioned here just simplify the explanation by following the physical order in which the nodes are attached to the bus.
- How do you set up a CAN network with MySensors? I saw some suggestions on that in the past, but they didn't seem to be "ready to use" code and hardware. So is there an option to just replace the MAX48x with a different chip and use the MyS RS485 communication layer?
I have some MCP2515 modules lying around, but these connect to the MCU via SPI and would require an appropriate communication layer in the sketches (at least as far as I understand).
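On the node ID point: a quick way to check an ID against the framing characters quoted above is sketched here. `isFramingByte` is a hypothetical helper, not something from the MySensors code base, and the sketch says nothing about how the transport actually reacts to such a byte; it only tests for the value clash kimot describes.

```cpp
#include <cstdint>

// ASCII framing characters as quoted from the RS485 transport.
constexpr std::uint8_t kSoh = 1;  // start of header
constexpr std::uint8_t kStx = 2;  // start of text
constexpr std::uint8_t kEtx = 3;  // end of text
constexpr std::uint8_t kEot = 4;  // end of transmission

// Hypothetical helper: true if a byte value (e.g. a node ID) has the same
// value as one of the framing characters above.
constexpr bool isFramingByte(std::uint8_t value) {
    return value >= kSoh && value <= kEot;
}
```

IDs 1 to 4 are flagged, while 97 and higher are not, so the IDs used here stay clear of that range. Payload bytes can of course still take these values.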
But in general, standard RS485 also claims to be robust and not rocket science. So it's really frustrating to experience this many problems and drawbacks.
EDIT: I found this thread: https://forum.mysensors.org/topic/5327/can-bus-transport-implementation-for-mys. Really understanding most of its content will most likely take a lot of rereading. But as far as I understand, integrating CAN would still need a lot of development?
-
@rejoe2
I don't mean the CAN protocol.
Only CAN bus drivers:
(ebay link)
RS485 is robust, but not designed for multi-master communication, where two nodes can occupy the bus at the same time. CAN bus drivers are designed for this situation.
You can use CAN drivers like 485 ones, just forget about RE and DE.
A CAN bus driver always listens.
And you must use higher speeds (57 600), because the driver cuts off the controller if it sends a dominant state for longer than 250 μs (a 00hex byte must be sent faster than this timeout).
Or use the MCP2551, where this time is 1.25 ms (9 600).
(ebay link)
-
@kimot Thanks for the clarification, I just ordered a bunch of TJA1050 modules and a couple of bare MCP2551s (they seem to be pin compatible with the TJAs) :grin: . They may take some time to make it all the way from China (new year is coming...).
The next step will then be to change the baud rate to 57600 (seems to be the upper limit when using software serial, as needed for my Nano GW. Btw.: I did some really disappointing tests with a Pro Micro GW that didn't seem to work; I will most likely have to make another attempt at using HW serial :grin: ).
Then I'll replace the MAX48x modules with these TJAs and see if everything's fine then.
Just one remark: If it's as easy as that, wouldn't it be good to simply recommend this type of module as the standard instead of the problematic standard RS485 types?
EDIT: One more question: Mixing both (or all three) types of transceiver should be possible, or am I wrong? (That would not completely eliminate the MAX485 disadvantages, that's clear to me.)
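The link between baud rate and the driver's dominant timeout can be checked with a bit of arithmetic. This is a sketch under the assumption of plain 8N1 serial framing, where the longest continuous low level is the start bit plus the eight zero bits of a 00hex byte, i.e. 9 bit times. The function names are made up for illustration, and the timeout values are the ones kimot quoted, not verified datasheet figures.

```cpp
// Longest continuous low (dominant) time on TX with 8N1 framing:
// start bit + 8 zero data bits of a 0x00 byte = 9 bit times, in microseconds.
constexpr double longestDominantUs(double baud) {
    return 9.0 * 1e6 / baud;
}

// True if even a 0x00 byte finishes before the driver's dominant timeout.
constexpr bool fitsDominantTimeout(double baud, double timeoutUs) {
    return longestDominantUs(baud) < timeoutUs;
}
```

At 57600 baud a 00hex byte holds the line low for about 156 μs, which fits inside 250 μs. At 9600 baud it takes about 938 μs, which would exceed 250 μs but fits inside the 1.25 ms quoted for the MCP2551.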
-
@rejoe2 :
Thanks a lot!
I have some unusual sketches (washing machine monitoring, entry system with fingerprint) that I wrote myself on top of MySensors. Maybe I should wait. It is just annoying that the radio stuff is not always working as it should. Indeed, ESP with MQTT could be a solution too. But I don't know yet about the stability of the ESP stuff :-(
Cheers,
Otto
-
@mick It's still in use. I had one bus freeze since the last "update post".
The gateway pulled the line up to 190mV, which seems to be enough to freeze traffic. After a "reboot" of the gateway it was still pulling the bus to 160mV, so it must be some solder in the wrong place, or bad Chinese pin headers or protoboards... I have to rebuild the "motherboard" of the gateway.
-
@pjr that sounds promising. My serial gateway and one node ran for months with no problems. I've since added a node in between the two, and that's when I started having issues. Removing the resistors on the middle node (terminating, pull-up and pull-down) and removing the pull-up and pull-down on the end node seems to have fixed the issues for now. However, when I added an extra 20m to the cable run I started having issues again. I'd love to get this network working 100%, but I think a lot of persistence and patience is needed! :)
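Whether a given set of resistors still biases the idle bus properly can be estimated with a simple voltage divider. This is a simplified sketch: it assumes one pull-up on A and one pull-down on B of equal value, one termination resistor at each end of the line, and it ignores receiver input loads and cable resistance. `idleBiasVolts` is a hypothetical helper, and the 5V supply is an assumption.

```cpp
// Idle differential voltage V_AB of an undriven bus, in volts.
// vcc:      supply voltage of the bias network
// pullOhms: value of the pull-up (on A) and of the pull-down (on B)
// termOhms: value of each of the two termination resistors
constexpr double idleBiasVolts(double vcc, double pullOhms, double termOhms) {
    // The two terminators sit in parallel between A and B (termOhms / 2)
    // and form a divider with the pull-up and pull-down in series.
    return vcc * (termOhms / 2.0) / (2.0 * pullOhms + termOhms / 2.0);
}
```

With the 440 Ohm and 120 Ohm values mentioned earlier in this thread and 5V, this gives roughly 0.32V, above the 200mV an RS-485 receiver needs to see a defined level. With a much weaker bias pair, for example a hypothetical 20 kOhm, the same formula gives well under 10mV, so the idle state would be undefined on a terminated bus.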
-
Hints on RS-485 networks:
Termination resistors
You always need them. Termination resistors should be added to the nodes located at the ends of the line. The communication may work without them if the wire is short enough and/or the bit rate is low.
Pull up/down resistors a.k.a. failsafe bias resistors
Why you need them is well explained here: https://electronics.stackexchange.com/a/284788/88486
When you need them: It depends on the RS485 transceiver IC. Most modern transceivers include these.
Common ground
Do you have a common ground between your transceivers? RS-485 is not a 2-wire network. Besides the A-B lines it requires ground.
See: http://store.chipkin.com/articles/rs485-rs485-cables-why-you-need-3-wires-for-2-two-wire-rs485
Schematics at http://www.analog.com/media/en/technical-documentation/application-notes/AN-960.pdf page 4.
Isolation
When you deal with long links, you have to take care of isolation.
Page 8 at http://www.analog.com/media/en/technical-documentation/application-notes/AN-960.pdf
-