My weather station starts dropping packets after 1 week of use...



  • Hello team,

    I'm posting here because I'm lost.
    I built a weather station based on an Arduino Mega 2560 plus a PCB shield that I made (100 hours on this project, 50 of them troubleshooting).
    I use an NRF24+PA/LNA with a 4.7uF cap (I tried 100uF, with the same result). The antenna is powered externally. I use RF24_PA_MAX, but RF24_PA_LOW gives the same result. The "Nov 6th 2015" library version is used.
    I send 10 packets every 5 minutes (the packets in each burst are sent 100ms apart to avoid any starvation) to report the different values of my station.
    Everything works correctly for 6 or 7 days; after that, I start dropping packets. For example, I don't receive anything for 15 to 20 minutes, and then it starts working again! The strange thing is that I expected to lose a packet now and then, which is normal behavior for a wireless link. However, I lose every packet in a row for 10 or 20 minutes. Really weird. Is there a starvation of something? I don't know, and my knowledge of electronics is not enough to move forward.
    During the issue, I get st=fail on the station. If I resend the packet when the ack is missing, the second packet is dropped too. In fact, everything is dropped during the issue.
    As I report rain values, I cannot afford to drop packets.
    I tried changing a lot of things without success. My PA/LNA was bought from ITEAD and should be a genuine Nordic one; it is sold as such.
    My weather station reports good values WITHOUT any drop for one week.
    That means I receive 17280 packets without a single drop!
    I followed all the golden rules (I assume), and I have no more ideas on how to troubleshoot further. Should I enable "debug" in myconfig.h?
    Otherwise, is there a well-known PA/LNA that I can buy and be 100% sure is a genuine Nordic, tested by you guys? I am almost ready to drop this project in favour of Netatmo, because I don't know how to move forward now. I spent 50 hours troubleshooting, trying things in all directions without success! This is my message in a bottle, my last hope.
    Thank you in advance.
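
The 17280 figure in the post above is consistent with the stated schedule. A quick check, assuming exactly 10 packets per burst, one burst every 5 minutes, and six full days without a drop:

```latex
\[
\frac{60}{5}\ \text{bursts/hour} \times 24\ \text{hours/day} \times 10\ \text{packets/burst}
  = 2880\ \text{packets/day},
\qquad
2880\ \text{packets/day} \times 6\ \text{days} = 17280\ \text{packets}.
\]
```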


  • Admin

    Any neighbour running their microwave or something else that can affect the frequencies you're using?


  • Hardware Contributor

    If you get st:fail but still receive the message in the gateway, there might be something wrong with the radio on the receiving node/gateway.
    It would be a good idea to debug the serial output on both the node and the receiving node, I think... If you miss that ack from the gateway, the node will start to look for a new parent, and maybe that's why you miss packets.



  • @hek Nope, I don't have any microwave running when the drops happen.
    I don't have a microwave oven at home. Just a TV. It was switched off during the events.
    The weird thing is that it stops working for 15 minutes. I lose 3 bursts (3x10 messages) at 5-minute intervals.
    That means the issue lasts for a while. After this event, everything runs correctly.
    During the event, my other sensors (5 Sensebenders) send their packets correctly. That means the issue is not on the GW. And anyway, I don't have any issue with the GW.
    I have a big ceiling to pass through. On the GW and the sensor, I use 7 dB antennas.
    I don't understand why it works for 6 days and then suddenly stops for 15 minutes!

    @sundberg84 I forgot to mention that the route is forced to 0. So there is no route computation, and the route is direct. When I have st=fail, my packet doesn't arrive at my GW, as my value doesn't appear in my controller. I use a serial GW; if my packet doesn't arrive at my controller, I cannot check any further. As the problem happens suddenly, I can put a sniffer in the air also.


  • Admin

    Strange.. A millis() wrap issue would show up about every ~50 days...

    Most likely @sundberg84 is right. Somehow the node misses a few transmissions in a row and starts looking for a new route to the gateway. It will retry the search every time you make a new transmission (every 5 minutes?).
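
For reference, the ~50 day figure mentioned above comes from the 32-bit millisecond counter behind Arduino's `millis()`, which wraps after 2^32 ms. The sketch below shows that arithmetic and the usual wrap-safe way to test an interval. The names `intervalElapsed` and `kWrapDays` are made up for this illustration and do not come from the MySensors library or from the poster's sketch.

```cpp
#include <stdint.h>

// Days until a 32-bit millisecond counter wraps: 2^32 ms / (ms per day).
// This evaluates to roughly 49.7 days.
constexpr double kWrapDays = 4294967296.0 / (1000.0 * 60.0 * 60.0 * 24.0);

// Hypothetical helper: true once interval_ms has passed since start_ms.
// Unsigned subtraction is modulo 2^32, so the result stays correct even
// when the counter has wrapped between start_ms and now_ms.
bool intervalElapsed(uint32_t now_ms, uint32_t start_ms, uint32_t interval_ms) {
    return static_cast<uint32_t>(now_ms - start_ms) >= interval_ms;
}
```

On a node, `now_ms` would be the current `millis()` value and `start_ms` the value saved at the last transmission. A comparison written as `millis() > start_ms + interval_ms` is the form that can misbehave near the wrap, but that would not explain a failure after only 6 or 7 days.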


  • Admin

    What controller are you running? Could the issue be between gw/controller?



  • @hek, as explained in the previous post, I forced the next hop (GW), so it shouldn't use another route?
    gw.begin(NULL, 102, false, 0);  // no message callback, node ID 102, repeater mode off, parent node 0 (the GW)
    Here is my config.



  • @hek said:
    I looked at the other sensors (Sensebenders) for the same period. I have 5 Sensebenders, and there is no drop at that time. Their packets are received every 5 minutes during this period. So I don't suspect a freeze of the GW.
    I use JEEDOM.
    I don't expect an issue between the GW and the controller... but it could be a lead.
    It is difficult to troubleshoot the controller. Because it's a serial GW, if a packet doesn't arrive at the controller, I cannot check at the GW level, as there is no way to connect.


  • Admin

    Agree with you. Very strange. Especially when you use a direct route to gw.



  • @hek "Could the issue be between gw/controller?" ... I get st=fail, so it shouldn't be between the controller and the GW?



  • @hek said:

    Agree with you. Very strange. Especially when you use a direct route to gw.

    The strange thing: I received almost 19000 messages without any issue.
    Once it has happened for the first time, the issues become more and more frequent.
    The NRF PA/LNA provided by ITEAD should be good, no? Which PA/LNA do you recommend?
    One more thing: when the problem happens and then recovers, I tried to trigger a reboot from my controller, and the WS no longer communicates. I have to power it down and up to recover.


  • Admin

    I don't have much experience with the Mega 2560.. The only thing I've noticed with the one I got is that the 3v3 power rail is really bad. So I got really strange issues when trying to power the radio from it. Had to use a step down from 5V and decouple it.

    Would it be hard to test your setup on a vanilla radio? Or is the distance too far?



  • @hek My antenna is powered externally, directly at 3.3V with 4.7uF...
    What is a vanilla radio? The distance is 5m, but there is a big ceiling in between...

    [Attached screenshot: Capture d’écran 2016-02-01 à 11.02.56.png]


  • Admin



  • @hek ok :-) Yes of course, I tried, it doesn't work... the ceiling is really too thick!
    From 12:03 to 12:23 we can clearly see the drops... :-( and I get st=fail.
    I resend the packet on st=fail, but the resend is dropped too.
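
The resend-on-st=fail approach described above could be bounded and spaced out roughly as follows. This is only a sketch: `sendWithRetry`, `trySend` and `waitMs` are hypothetical names, not part of the MySensors library. On the node, `trySend` would presumably wrap the library's send call (assumed here to return true on st=ok) and `waitMs` would wrap `delay()`.

```cpp
#include <stdint.h>

// Hypothetical helper, not a MySensors API.
// trySend(): attempts one transmission, returns true if it was acknowledged.
// waitMs(ms): blocks for ms milliseconds (delay() on an Arduino).
// The pause doubles after every failed attempt, so retries are spread out
// instead of being fired back to back.
template <typename Send, typename Wait>
bool sendWithRetry(Send trySend, Wait waitMs,
                   uint8_t maxAttempts, uint32_t firstDelayMs) {
    uint32_t delayMs = firstDelayMs;
    for (uint8_t attempt = 0; attempt < maxAttempts; ++attempt) {
        if (trySend()) {
            return true;              // acknowledged, nothing more to do
        }
        if (attempt + 1 < maxAttempts) {
            waitMs(delayMs);          // back off before the next attempt
            delayMs *= 2;
        }
    }
    return false;                     // caller should keep the value for later
}
```

Since the outages in this thread last 15 to 20 minutes, a few quick retries would probably fail as well, which matches what the poster observes. A false return is the point where a value such as rainfall could be kept and sent again with the next 5-minute burst.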


  • Hardware Contributor

    I think it will try to find a parent even though you have it set to 0 (if it fails).
    I might be wrong, but this is something I noticed in an older version.



  • @sundberg84 I use the latest version (Nov 2015, I guess).
    @hek Could that be the case? Trying to use another route?
    Normally there is no other route available. I have disabled repeater mode on all powered sensors.
    Could it be a bad NRF24? Is there a well-known, really trusted brand?


  • Admin

    I don't think it should search...
    https://github.com/mysensors/Arduino/blob/master/libraries/MySensors/MySensor.cpp#L463

    But you might wanna try the development branch anyway to see if you notice any difference.



  • @hek

    @hek said:

    I don't think it should search...
    https://github.com/mysensors/Arduino/blob/master/libraries/MySensors/MySensor.cpp#L463

    But you might wanna try the development branch anyway to see if you notice any difference.

    I can try at the next occurrence, as I have no more ideas... I think the code is clean, and the problem is rather in the NRF. Maybe a starvation of something after a while, I don't know. Difficult to say.
    By the way, you didn't answer my question: is there a trusted NRF/PA/LNA seller, with an NRF24 tested by you and working as it should?

