[SOLVED] latest git-snapshot causes freezes



  • The only thing I can say so far is that this method call causes the freeze:

    Hardware setup:

    • node = 1 MHz Arduino Pro Mini (internal oscillator) with jboard and nrf24
    • gateway = esp8266

    the node is connected to an i2c adxl345 acceleration sensor

     send(_message,true);
    
    prepare rssi<\r><\n>
    12075 TSF:MSG:SEND,14-14-0-0,s=1,c=1,t=28,pt=0,l=11,sg=0,ft=0,st=OK:rssi: 8|0|2<\n>
    Update<\r><\n>
    12173 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
     | hwretry: 0<\r><\n>
    12255 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    12320 TSF:MSG:ACK<\n>
    prepare to sleep<\r><\n>
    12369 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255<\n>
    12419 MCO:SLP:TPD<\n>
    12435 MCO:SLP:WUP=1<\n>
    12
    

    Gateway

    0;255;3;0;9;TSF:MSG:SEND,0-0-14-14,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    0;255;3;0;9;Sending message on topic: mygateway-out/14/0/1/0/16<\n>
    0;255;3;0;9;TSF:MSG:READ,14-14-0,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    0;255;3;0;9;TSF:MSG:ACK REQ<\n>
    0;255;3;0;9;TSF:MSG:SEND,0-0-14-14,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    0;255;3;0;9;Sending message on topic: mygateway-out/14/0/1/0/16<\n>
    0;255;3;0;9;TSF:MSG:READ,14-14-0,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    0;255;3;0;9;TSF:MSG:ACK REQ<\n>
    0;255;3;0;9;TSF:MSG:SEND,0-0-14-14,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    0;255;3;0;9;Sending message on topic: mygateway-out/14/0/1/0/16<\n>
    0;255;3;0;9;Sending message on topic: mygateway-out/0/0/1/0/15<\n>
    0;255;3;0;9;Sending message on topic: mygateway-out/0/0/1/0/15<\n>
    
    

    whole project:

    node-debug-log:0_1475363295558_adxl345_vibrationtest_fabo_rf.rar

    You can see that the output stops while flushing a millis output ...

    I am using my own milliseconds wrapper, but this should make no difference. I can say that the freeze occurs within 1-5 minutes with the setup from the attached file.

    I sped up the loop to better reproduce the error. I noticed that the hangup occurs somewhere near the end of the loop. In addition, the freeze does not occur if I send no messages to the gateway/controller at all and only do internal processing.

    Tomorrow I might clean up my sketch and try to reproduce the hangup with a minimal setup.

    One possibility I am looking into is interference from:

    TWBR = 0;
    

    At the same time I got a random gateway crash:

    0;255;3;0;9;Sending message on topic: mygateway-out/14/0/1/0/16<\n>
    Fatal exception 28(LoadProhibitedCause):<\n>
    epc1=0x402026ac, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000003, depc=0x00000000<\n>
    <\r><\n>
    Exception (28):<\r><\n>
    epc1=0x402026ac epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000<\r><\n>
    <\r><\n>
    ctx: cont <\r><\n>
    sp: 3ffef980 end: 3ffefbe0 offset: 01a0<\r><\n>
    <\r><\n>
    >>>stack>>><\r><\n>
    3ffefb20:  00000000 3ffeff7c 3ffefdc4 4020589f  <\r><\n>
    3ffefb30:  001e8480 00085f6b 3ffefdc4 40202f98  <\r><\n>
    3ffefb40:  00000001 00000011 00000000 3ffeebac  <\r><\n>
    3ffefb50:  3fffdad0 00085f6b 3ffefc8c 40203282  <\r><\n>
    3ffefb60:  3fffdad0 00085f6b 3ffefc8c 40203bb8  <\r><\n>
    3ffefb70:  3fffdad0 00085f6b 0000ea60 402031bd  <\r><\n>
    3ffefb80:  3fffdad0 00085f6b 0000ea60 4020463d  <\r><\n>
    3ffefb90:  3fffdad0 00000000 0000ea60 4020465e  <\r><\n>
    3ffefba0:  00000001 00000000 0000ea60 40204683  <\r><\n>
    3ffefbb0:  00000000 00000000 3ffeeb80 40204b49  <\r><\n>
    3ffefbc0:  3fffdad0 00000000 3ffeeb80 40204d83  <\r><\n>
    3ffefbd0:  feefeffe feefeffe 3ffeebc0 40100114  <\r><\n>
    <<<stack<<<<\r><\n>
    <\r><\n>
     ets Jan  8 2013,rst cause:2, boot mode:(1,6)<\r><\n>
    <\r><\n>
    <\r><\n>
     ets Jan  8 2013,rst cause:4, boot mode:(1,6)<\r><\n>
    <\r><\n>
    wdt reset<\r><\n>
    

  • Admin

    @cimba007 You are living on the bleeding edge 🙂 Please provide additional information such as HW setup and debug logs of both, GW and node. Thanks


  • Admin

    @cimba007 Please post new messages instead of editing previous post - this makes it easier in terms of understanding and following the chronology of the thread.



  • Okay .. I am currently investigating:

    TWBR = 0;
    

    Might be somehow related, as removing it causes no hangups so far ... maybe I am seeing things with the newest snapshots 😄 Strangely, the code was working fine until I added the MySensors part ...

    whoops .. again

    264077 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    264142 TSF:MSG:ACK<\n>
    prepare to sleep<\r><\n>
    264208 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255<\n>
    264273 MCO:SLP:TPD<\n>
    264290 MCO:SLP:WUP=1<\n>
    .30
    

    So the i2c speed is probably unrelated .. even after commenting out the TWBR = 0; the hangup repeated.

    Next I will try to remove the acceleration-sensor readings ...



  • And again ...

    145440 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255<\n>
    145489 MCO:SLP:TPD<\n>
    145522 MCO:SLP:WUP=1<\n>
    .152002<\r><\n>
    Update<\r><\n>
    145522 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:1<\n>
     | hwretry: 1<\r><\n>
    145522 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:1<\n>
    145522 TSF:MSG:ACK<\n>
    
    

    But this time at a different part ...

      fabo3axis.readIntStatus();
      sleep(digitalPinToInterrupt(3),RISING,16);
    

    Removing this as well ..



  • and .. again ..

    .79366<\r><\n>
    Update<\r><\n>
    76038 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:1<\n>
     | hwretry: 0<\r><\n>
    76038 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:1<\n>
    76038 TSF:MSG:ACK<\n>
    

    My loop currently looks like this:

    void loop() { 
    
      motionSensor.set(1);  motionSensor.waitsend(50);
      /*
      uint8_t s = fabo3axis.readIntStatus();
      if(ADXL345_INT_ACTIVITY & s){
        Serial.print("activity ");
        Serial.println(millis());
        motionSensor.set(1);  motionSensor.waitsend(50);
      }
      else
      {
        Serial.println(s,HEX);
        motionSensor.set(0);  motionSensor.waitsend(50);
      }
      */
      Serial.println("prepare to sleep");
      Serial.flush();
      
      // Pending Int-Stati might prevent interrupt!
      //fabo3axis.readIntStatus();
      sleep(digitalPinToInterrupt(3),RISING,16);
      
      // helper.h
      //extern unsigned long millis_offset;
      millis_offset += 16;
      Serial.print(".");
      Serial.println(mymillis());
    }
    

    All references to the acceleration sensor removed .. the interrupt should not trigger, as the interrupt register is not "cleared" by issuing the readIntStatus ..



  • I forgot to mention that the node is battery powered @ 2x AA-Alkaline-Battery @ ~2,5volt .. no LDO involved .. and an NRF24+LNA+PA version of the NRF24 ..

    I got like 4-5 boards with this exact same setup and they run totally fine with the latest stable release ..

    .. investigating the root cause might be difficult as my code is a little bit customized with a small wrapper .. will try out a fresh sketch tomorrow ..



  • Okay ... I switched back between snapshot from yesterday and stable 2.0.0 and all I can say is that the stable is perfectly fine and the snapshot freezes.

    prepare to sleep
    17416 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255
    17481 MCO:SLP:TPD
    17498 MCO:SLP:WUP=1
    .18
    

    Nothing in the code has been changed. The only difference is that I had to modify the latest stable to include a few delays for my slow 1 MHz node ..

    PS: I currently don't have the time to debug this issue further .. last time it took me some days to figure out the delay-issue of slow nodes ...


  • Admin

    @cimba007 Could you upload a minimal sketch (ideally without external dependencies) to reproduce this?



  • // http://www.earth.org.uk/OpenTRV/Arduino/bootloader/ATmega328P-1MHz/README.txt
    #define MY_NODE_ID 14
    
    #define MY_BAUD_RATE 9600 // DON'T GO HIGHER THAN 57600 !!! 115200 might cause DEADLOCKS!!!
    
    // Enable debug prints to serial monitor
    #define MY_DEBUG
    #define MY_DEBUG_LIGHT
    
    #define MY_RF24_PA_LEVEL RF24_PA_HIGH
    
    // Enable and select radio type attached
    #define MY_RADIO_NRF24
    //#define MY_RADIO_RFM69
    
    #include <SPI.h>
    #include <MySensors.h>
    //#include "helper.h"
    
    //MySensorChild motionSensor = MySensorChild(0, S_MOTION, V_TRIPPED, "");
    
    void before()
    {
      //wdt_enable(WDTO_8S);
    }
    
    void presentation()
    {
      // Send the sketch version information to the gateway and Controller
      char datetime[30]; // 22 + 1 should be enough
      snprintf(datetime, 30, "%s | %s", __DATE__, __TIME__);
      #ifdef MY_DEBUG_LIGHT
      Serial.print(F("SketchInfo: "));
      Serial.println(datetime);
      #endif
    
      sendSketchInfo("adxl345", datetime);
    }
    
    #include <Wire.h>
    #include <FaBo3Axis_ADXL345.h>
    
    FaBo3Axis fabo3axis;
    
    //void ADXL_ISR()
    //{
    //
    //}
    
    void setup()
    {
      Serial.begin(9600); // start serial (for debugging)
      
      Serial.println("Checking I2C device...");
      
      //if(fabo3axis.searchDevice()){
      //  Serial.println("I am ADXL345");
      //}
      Serial.println("Init...");
      
      fabo3axis.configuration();
      fabo3axis.powerOn();
      //fabo3axis.enableTap();
      fabo3axis.enableActivity();
      //attachInterrupt(digitalPinToInterrupt(3), ADXL_ISR, RISING);
      fabo3axis.readIntStatus();
      
      // http://www.avrfreaks.net/forum/wire-library-increasing-i2c-speed
      //TWBR = ((F_CPU / 400000) - 16) / 2;
      TWBR = 0;
      Serial.print("i2c @ "); Serial.print(TWBR); Serial.println(); 
    }
    
    
    void loop() { 
      //wdt_reset();
      MyMessage msg(0, V_TRIPPED);
      //motionSensor.set(1);  motionSensor.waitsend(50);
      //send(msg,true);
      
      //uint8_t s = fabo3axis.readIntStatus();
      uint8_t s = 1;
      if(false){
        Serial.print("activity ");
        //Serial.println(millis());
        msg.set(1);
        send(msg,true);
        //motionSensor.set(1);  motionSensor.waitsend(50,60000);
      }
      else
      {
        Serial.print(s,HEX);
        msg.set(0);
        send(msg,true);
        //motionSensor.set(0);  motionSensor.waitsend(50,60000);
      }
      
      //Serial.println("prepare to sleep");
      Serial.flush();
      
      // Pending Int-Stati might prevent interrupt!
      fabo3axis.readIntStatus();
      sleep(digitalPinToInterrupt(3),RISING,16);
    
      //millis_offset += 16;
      Serial.print(" | ");
      Serial.println(hwMillis());
    }
    
    bool my_ack_received;
    void receive(const MyMessage &message)
    {
      if (message.isAck())
        my_ack_received = true;
    }
    

    I suspect the sleep/interrupt combination.

    I could not remove the accelerometer because it is the source of the interrupts.



  • no sleep = fine
    no i2c-communication/no interrupts = fine

    both = lockup



  • I noticed that I can make the node crash if I trigger some interrupts by force (moving the accelerometer) .. nice ..

    I guess the sleep/interrupt code has some kind of bug since the latest rework in the developer branch.



  • I hope you catch it, as I believe I'm hit by the same bug. It happens in a similar way on a Sensebender Micro & NRF24L01+. I have a thread on the troubleshooting forum on the topic of an interrupt mystery. I'm an Arduino newbie, so I'm not much help. I'm just about to try it on an Arduino Pro Mini for comparison.

    Somehow enabling interrupts locks it up within a minute or so. If no interrupt is enabled, no locking.

    BR,
    Ikke



  • @ikkeT Are you using the latest stable or snapshot version?


  • Admin

    @cimba007 Thanks, I'll try to reproduce it. What Arduino IDE & AVR Board defs are you using?
    @ikkeT Same here, can you upload your (minimal) sketch to reproduce and also indicate MySensors, Arduino IDE and AVR board def versions?


  • Hardware Contributor

    @cimba007
    just an idea..i don't know if it can help at all..but it looks like there is a while loop in FaBo3Axis::readI2c
    do you think if adding some debug here you get something??
    i was just thinking..a freeze that's weird..no wdt?



  • @tekka Arduino 1.6.12

    ##############################################################
    # Add the new board to boards.txt (normally located at "C:\Program Files\Arduino\hardware\arduino\avr")
    # The *.bootloader.* entries only matter if you want to program bootloader (and fuses) from Arduino IDE. 
    # See http://www.engbedded.com/fusecalc (select Atmega328p) for interpretation of fuse values and how 
    # extended fuses are written in different applications (07h in Arduino IDE = FFh in Atmel studio).
    ##############################################################
    
    apm96.name=APM Optiboot internal 1MHz noBOD 9600baud
    
    apm96.upload.tool=avrdude
    apm96.upload.protocol=arduino
    apm96.upload.maximum_size=32256
    apm96.upload.speed=9600
    apm96.bootloader.tool=avrdude
    apm96.bootloader.low_fuses=0x62
    apm96.bootloader.high_fuses=0xde
    apm96.bootloader.extended_fuses=0x06
    apm96.bootloader.path=optiboot_v50
    apm96.bootloader.file=atmega328_1a.hex
    apm96.bootloader.unlock_bits=0x3F
    apm96.bootloader.lock_bits=0x2F
    apm96.build.board=AVR_APM96
    apm96.build.mcu=atmega328p
    apm96.build.f_cpu=1000000L
    apm96.build.core=arduino
    apm96.build.variant=standard
    
    ##############################################################
    

    I changed the fuses to include brown-out detection (BOD) @ 1.8 V
    VCC of my setup is 2-3 V



  • @scalz Thanks for the hint with the while-loop. I changed the i2c-read the following way:

    void FaBo3Axis::readI2c(uint8_t register_addr, uint8_t num, uint8_t buffer[])
    {
      Wire.beginTransmission(_i2caddr);
      Wire.write(register_addr);
      Wire.endTransmission();
    
      Wire.requestFrom(_i2caddr, num);
    
      int i = 0;
      uint8_t limit = num;
      while(Wire.available() && limit--)
      {
    	Serial.print("r");
        buffer[i] = Wire.read();
        i++;
      }
    }
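
A host-side sketch of the same bounded-read idea, in case it helps to test the loop without hardware. `FakeBus` and `readBounded` are hypothetical names that are not part of the FaBo or Wire libraries; `FakeBus` only imitates the `available()`/`read()` pair used above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the Wire receive buffer, for host-side testing only.
struct FakeBus {
    std::deque<uint8_t> rx;

    int available() const { return static_cast<int>(rx.size()); }

    uint8_t read() {
        assert(!rx.empty());      // reading an empty buffer is a caller bug
        uint8_t value = rx.front();
        rx.pop_front();
        return value;
    }
};

// Copies at most `num` bytes into `buffer` and returns how many were copied.
// The explicit counter replaces the `limit--` trick and cannot wrap around.
std::size_t readBounded(FakeBus& bus, uint8_t num, uint8_t buffer[]) {
    std::size_t count = 0;
    while (count < num && bus.available() > 0) {
        buffer[count++] = bus.read();
    }
    return count;
}
```

Returning the byte count lets the caller notice a short read, which the `void` signature above cannot report.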
    

    Now it should under no circumstances read more than the number given. Sadly this had no impact:

     | 34291
    34291 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    34357 TSF:MSG:ACK
    134390 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    r34471 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255
    34537 MCO:SLP:TPD
    34553 MCO:SLP:WUP=1
     | 34553
    34553 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    34553 TSF:MSG:ACK
    134553 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    

    Just before the sleep ..

    I then added a Serial.flush() to the "r"-output ..

    void FaBo3Axis::readI2c(uint8_t register_addr, uint8_t num, uint8_t buffer[])
    {
    	Serial.print("readI2c");
    	Serial.flush();
      Wire.beginTransmission(_i2caddr);
      Wire.write(register_addr);
      Wire.endTransmission();
    
      Wire.requestFrom(_i2caddr, num);
    
      int i = 0;
      uint8_t limit = num;
      while(Wire.available() && limit--)
      {
    	Serial.print("r");
    	Serial.flush();
        buffer[i] = Wire.read();
        i++;
      }
    }
    
    54706 MCO:SLP:WUP=1
     | 54706
    54706 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    54706 TSF:MSG:ACK
    154706 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    

    Now the freeze is clearly before entering the sleep function ..

    but then .. on another round ..

     | 56623
    56623 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    56623 TSF:MSG:ACK
    156623 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    readI2c
    
    readI2cgot # from slave: 1
    r78987 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255
    79052 MCO:SLP:TPD
    79069 MCO:SLP:WUP=1
     | 79069
    79069 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    79069 TSF:MSG:ACK
    179069 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    readI2c
    

    So .. your tip was very good .. it seems to be related to the

      Wire.beginTransmission(_i2caddr);
      Wire.write(register_addr);
      Wire.endTransmission();
    

    -call .. only one question remains .. why was it working perfectly on the stable library?

    Just to be sure I will run exactly the same code in a non-stop loop with the stable library ..


  • Admin

    @cimba007 What AVR board defs are you on?



  • @tekka said:

    AVR board def

    What do you mean with AVR board def?

    https://forum.mysensors.org/topic/4999/latest-git-snapshot-causes-freezes/17

    I am using an arduino pro-mini china clone (328p) with 8mhz internal oscillator and DIV_8 fuse.

    The bootloader is an "APM Optiboot 1Mhz" .. can't find the link right now.


  • Admin

    @cimba007 You'll find the information in your Boards Manager (under Tools | Board | Boards Manager)



  • yeah .. sorry

    0_1475435692355_upload-68958f1f-2d52-4640-bdf0-b33688831f9f


  • Admin

    @cimba007 ok, do you see it happening if you downgrade the AVR board defs to 1.6.12 or 13?



  • btw .. the "old" library is running happy for 6 minutes now ... still continuing

    TSP:MSG:READ 0-0-14 s=0,c=1,t=16,pt=2,l=2,sg=0:0
    1TSP:MSG:SEND 14-14-0-0 s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=ok:0
    readI2cgot # from slave: 1
    r | 373997
    TSP:MSG:READ 0-0-14 s=0,c=1,t=16,pt=2,l=2,sg=0:0
    1TSP:MSG:SEND 14-14-0-0 s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=ok:0
    readI2cgot # from slave: 1
    r | 374177

    @tekka: I will try the "new snapshot" with the board defs you mentioned .. standby



  • @1.6.12 board defs

    r68550 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255
    68599 MCO:SLP:TPD
    68632 MCO:SLP:WUP=1
     | 68632
    68632 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    68632 TSF:MSG:ACK
    168632 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    readI2c
    

    @1.6.13 board defs

    readI2cgot # from slave: 1
    r62914 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255
    62980 MCO:SLP:TPD
    62996 MCO:SLP:WUP=1
     | 62996
    62996 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0
    62996 TSF:MSG:ACK
    162996 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0
    readI2c
    

  • Hardware Contributor

    hmm..too bad that didn't do the trick..but there was something there (a "hidden" while loop is an issue that can easily appear when using other libs), and it is difficult to reproduce exactly the same case on my side as i'm not using the same radio.
    i've downloaded the latest snapshot. so i will check on my side if one of my sketches is still running fine (with 4 ints) and if i get the problem i'll try to debug it too. but i hope not to get it lol 🙂



  • That would be nice .. my latest findings point out that the i2c function is getting stuck. I investigated the Arduino i2c source a bit and found out that it depends heavily on interrupts.

    This is now

    wild speculation

    For some reason interrupts got disabled while the i2c library is trying to write some data on the i2c-bus ..

    wild-speculation-end
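
To make the speculation concrete, here is a small host-side model, not the real Wire/TWI code. It assumes, as described above, that a blocking i2c write waits until an interrupt handler marks the transfer as done. `SimTwi`, `hardwareTick` and `writeCompletes` are made-up names, and the spin limit exists only so the demo can report a hang instead of hanging.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model: a transfer finishes only when the simulated
// TWI interrupt handler is allowed to run.
struct SimTwi {
    bool globalInterruptsEnabled = true; // stands in for the AVR I-flag (sei/cli)
    bool busy = false;

    // Stand-in for the TWI ISR: the hardware raises the request, but the
    // handler only executes while global interrupts are enabled.
    void hardwareTick() {
        if (busy && globalInterruptsEnabled) {
            busy = false; // the ISR marks the transfer as finished
        }
    }

    // Stand-in for a blocking write: start the transfer, then busy-wait.
    // Returns false when the wait would never end on real hardware.
    bool blockingWrite(uint32_t maxSpins) {
        assert(maxSpins > 0);
        busy = true;
        for (uint32_t spin = 0; spin < maxSpins; ++spin) {
            hardwareTick();
            if (!busy) {
                return true;
            }
        }
        return false;
    }
};

// Returns true when a simulated write completes with the given I-flag state.
bool writeCompletes(bool interruptsEnabled) {
    SimTwi twi;
    twi.globalInterruptsEnabled = interruptsEnabled;
    return twi.blockingWrite(1000);
}
```

In this model `writeCompletes(true)` finishes on the first tick, while `writeCompletes(false)` exhausts the spin limit, which is the host-side equivalent of the node freezing.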



  • Code working: (4minutes+)

      sei();
      fabo3axis.readIntStatus();
    

    Code not working:

      //sei();
      fabo3axis.readIntStatus();
    

    I noticed that interrupts in the sleep function are only detached if the interrupt occurs ..

    whereas in the stable

    	detachInterrupt(interrupt1);
    	if (interrupt2!=0xFF) detachInterrupt(interrupt2);
    

    They get detached after the wakeup in the sleep ..

    What if .. for some strange reason the interrupt is interfering with .. whatever?!

    EDIT: with the added sei(); the node is running for 10 minutes now ...
    EDIT2: Running for 20 minutes now ..



  • @cimba007 I cloned the master branch from git about week ago. Or which ever is the default. I'll come back tomorrow, already in bed now.



  • @tekka the sketch is here along with the libraries, just use it as the Arduino folder to reproduce:

    https://github.com/ikke-t/sensebender

    I'll check the versions tomorrow (UTC+3). The arduino ide is 1.6.xx whatever is the very latest in Fedora 24.



  • @tekka, the versions are:

    Mysensors is development branch, I didn't pay attention while cloning, it's set to default to development:

    commit 8cacb4825b256f63aa2fc51468fd11a90bb19678
    Merge: 75a100f 8ccb1ca
    Author: Patrick Fallberg <patrick@fallberg.net>
    Date:   Thu Sep 22 19:02:11 2016 +0200
    

    Arduino IDE is 1.6.4

    $ rpm -qa arduino*
    arduino-core-1.6.4-8.fc24.noarch
    arduino-doc-1.6.4-8.fc24.noarch
    arduino-1.6.4-8.fc24.noarch
    

    My IDE board manager shows Arduino AVR boards version 1.6.7, and it seems there is newer one available, 1.6.14. I will update that.

    Mysensors AVR board definition for Micro version is 1.0.1


  • Hardware Contributor

    so far, on my side, using custom sleep() and other mix, my ints are still working..i'm using latest snap' now, and still in 1.6.8..
    but that's not the same usecase as i'm not using mys sleep for this node..
    I will try your case; at least mine is working well 🙂 . it could be sort of race..always a dilemma somewhere..



  • I got suspicious about the code working after manually adding sei() before calling the i2cread function.

    So I changed the end of hwPowerDown to include a serial output with a flush.

    	// restore saved WDT settings
    	WDTCSR = WDTsave;
    	Serial.println("sei()");
    	Serial.flush();
    	sei();
    	// enable ADC
    	ADCSRA |= (1 << ADEN);
    }
    

    Now have a close look at how sei()-output is missing just before the freeze:

    57114 MCO:SLP:TPD<\n>
    sei()<\r><\n>
    57147 MCO:SLP:WUP=-1<\n>
     | 57180<\r><\n>
    57180 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    57245 TSF:MSG:ACK<\n>
    157262 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    readI2cgot # from slave: 1<\r><\n>
    r57376 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255<\n>
    57442 MCO:SLP:TPD<\n>
    sei()<\r><\n>
    57475 MCO:SLP:WUP=-1<\n>
     | 57491<\r><\n>
    57507 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    57573 TSF:MSG:ACK<\n>
    157589 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    readI2cgot # from slave: 1<\r><\n>
    r57704 MCO:SLP:MS=16,SMS=0,I1=1,M1=3,I2=255,M2=255<\n>
    57769 MCO:SLP:TPD<\n>
    57786 MCO:SLP:WUP=1<\n>
     | 57786<\r><\n>
    57786 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    57786 TSF:MSG:ACK<\n>
    157786 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    readI2c
    

    Just after "57769 MCO:SLP:TPD<\n>" the sei()-output is missing. Thus I suspect that interrupts are not enabled when the readI2c-function is called .. and as i2c is depending on interrupts the code gets stuck there.

    Still no idea why the sei() is "sometimes" not called ...

    Notice how the "millis" interrupt is not called after the missing sei .. thus all timestamps are the same:

    57786 MCO:SLP:WUP=1<\n>
     | 57786<\r><\n>
    57786 TSF:MSG:READ,0-0-14,s=0,c=1,t=16,pt=2,l=2,sg=0:0<\n>
    57786 TSF:MSG:ACK<\n>
    157786 TSF:MSG:SEND,14-14-0-0,s=0,c=1,t=16,pt=2,l=2,sg=0,ft=0,st=OK:0<\n>
    readI2c
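
A stalled clock like this is easy to spot mechanically. The helper below is a minimal sketch, assuming the timestamps have already been extracted from the log lines; `firstStall` is a hypothetical name, not a MySensors function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index where the first run of at least `minRun` identical
// consecutive timestamps begins, or -1 if the clock never appears to stall.
long firstStall(const std::vector<uint32_t>& stamps, std::size_t minRun) {
    if (minRun == 0 || stamps.size() < minRun) {
        return -1;
    }
    std::size_t runStart = 0;
    for (std::size_t i = 1; i <= stamps.size(); ++i) {
        // A run ends at the end of the data or when the value changes.
        if (i == stamps.size() || stamps[i] != stamps[runStart]) {
            if (i - runStart >= minRun) {
                return static_cast<long>(runStart);
            }
            runStart = i;
        }
    }
    return -1;
}
```

Fed with the timestamps from the longer log further up (57704, 57769, 57786, 57786, 57786, 57786) and a minimum run of 3, it reports index 2, which is the `MCO:SLP:WUP=1` line where the missing sei() output was noticed.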
    


  • Comparing the code to the atmel documentation here:

    http://www.atmel.com/webdoc/AVRLibcReferenceManual/group__avr__sleep.html

    MySensors explicitly disables interrupts after waking from sleep:

    	sei();
    	
        // Directly sleep CPU, to prevent race conditions! (see chapter 7.7 of ATMega328P datasheet)
    	sleep_cpu();
        sleep_disable();	
    	// restore previous WDT settings
    	cli();
    

    whereas the atmel documentation enables interrupts

    #include <avr/interrupt.h>
    #include <avr/sleep.h>
    
    ...
      set_sleep_mode(<mode>);
      cli();
      if (some_condition)
      {
        sleep_enable();
        sleep_bod_disable();
        sei();
        sleep_cpu();
        sleep_disable();
      }
      sei();
    

    Might not be totally related as the code did not change from last stable .. except one line:

    	WDTCSR |= (1 << WDIE); // set the WDIE flag to enable interrupt callback function.
    

    is now missing in the snapshot ..

    Adding an extra sei() after the sleep_disable() is not the solution .. still thought it might be noteworthy


  • Mod

    @cimba007 said:

    WDTCSR |= (1 << WDIE); // set the WDIE flag to enable interrupt callback function.
    is now missing in the snapshot ..

    Can you indicate where this change is located?
    I compared 2.0.0 (master) to 2.0.1beta (development) and WDTCSR is only touched in MyHwATMega328.cpp
    The watchdog related parts seem identical between both versions...

    Futhermore, is it the library located at https://github.com/FaBoPlatform/FaBo3Axis-ADXL345-Library/blob/master/src/FaBo3Axis_ADXL345.cpp that you are using?



  • Jep, this is the library.

    https://github.com/mysensors/MySensors/blob/development/core/MyHwATMega328.cpp#L86
    https://github.com/mysensors/MySensors/blob/master/core/MyHwATMega328.cpp#L75

    well .. I have no idea but on my code it is there:

    0_1475516793275_upload-115e8437-9933-4d71-ba85-0f3f11ea7036

    It seems to be in neither the stable nor the development branch .. but I must have gotten it somewhere .. let's check

    I think I added it for some reason on the latest stable. Otherwise the pin-change interrupt was not working with the MySensors library .. I will try to remove it on stable to see if I get stuck there too, and add it in the snapshot to see if it helps in some way.

    EDIT1: Adding in the latest snapshot doesn't change a thing
    EDIT2: Removing it in the latest stable doesn't change a thing

    I might have added it from another testsketch .. but it would be better located in that sketch right after the sleep function .. so just ignore it for now - my bad ;_(


  • Mod

    @cimba007 Well, I filed (https://github.com/mysensors/MySensors/issues/598) and fixed (https://github.com/mysensors/MySensors/pull/599) an issue with interrupts not being detached correctly in all cases when using hwSleep().
    This might be causing your issue, but is hard to say without actually debugging the code.
    Once the PR is merged, could you try getting the latest development branch and see if it fixes the issue? (please use the development branch, without any local modifications)



  • Just copy'pasted your fix and .. voila

    The problem seems to be solved - although I do not understand why?!

    • _wakeUp2Interrupt should be totally unrelated as it is never set
    • detach should happen in the ISR

    .. even if the interrupt is not detached correctly .. how should it lead to this code being skipped entirely:

    	cli();
    	wdt_reset();
    	// enable WDT changes
    	WDTCSR |= (1 << WDCE) | (1 << WDE);
    	// restore saved WDT settings
    	WDTCSR = WDTsave;
    	Serial.println("sei()");
    	Serial.flush();
    	sei();
    	// enable ADC
    	ADCSRA |= (1 << ADEN);
    	//WDTCSR |= (1 << WDIE); // set the WDIE flag to enable interrupt callback function.
    

    As this code snippet is skipped, sei() never gets called in my case .. and the subsequent call to i2c freezes the node ... as i2c depends on interrupts ..

    Anyway, thanks Yveaux .. although your fix is simple .. I don't know why it is working ...

    EDIT: Furthermore .. even if the interrupt is not detached in the isr .. there should be no further interrupt if the interrupt register is not read via i2c .. and that in fact doesn't happen >:>


  • Mod

    @cimba007 Great to hear it solves your problem!

    Regarding the serial print not showing: you call it from within a section with interrupts disabled; I don't know if this will work at all, or maybe even cause a hang... I would toggle some IO pins instead and use an oscilloscope to verify program flow.

    Sleeping an AVR using the watchdog and pin interrupts is some tricky stuff, which is very susceptible to race conditions and almost impossible to debug without a decent hardware debugger.
    Having a compiler that woes every now and then also isn't helping...



  • I only used the pin change interrupt in another sketch. I was doubting Serial in sections without isr too! But they are clever. Serial output depends on polling in sections where interrupts are disabled 😉


  • Mod

    @cimba007 said:

    Serial output depends on polling in sections where interrupts are disabled

    Ok, learned something today 😉

    The PR has now been merged to development btw.



  • @Yveaux While we are at this topic .. any chance to let the user-code know what caused the wakeup during sleep() ??

    For example .. if there was a wakeup at all or which of the two interrupt sources was the cause.

    Unfortunately the "cause" gets cleaned up at the end of hwSleep:

    bool interruptWakeUp()
    {
        return _wokeUpByInterrupt != INVALID_INTERRUPT_NUM;
    }
    

    Calling this from user-code will always return 0 as the end of hwSleep does this:

     _wokeUpByInterrupt = INVALID_INTERRUPT_NUM;
    

    Not clearing the "last interrupt source" would be nice.


  • Admin

    @cimba007 The return value of sleep() indicates the wake-up cause: irq (nr) or timer (=MY_WAKE_UP_BY_TIMER).
    See here: https://github.com/mysensors/MySensors/blob/development/core/MyHwATMega328.cpp#L146-L147
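
Acting on that return value could look roughly like the helper below. It is a host-side sketch with assumptions: `describeWakeUp` and `kWakeUpByTimer` are illustrative names, and the value -1 is taken from the `WUP=-1` log lines earlier in this thread; in a real sketch compare against `MY_WAKE_UP_BY_TIMER` instead.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: mirrors the timer wake-up value seen as "WUP=-1" in the logs.
// On the node, use MY_WAKE_UP_BY_TIMER from MySensors.h rather than this copy.
constexpr int8_t kWakeUpByTimer = -1;

// Maps the value returned by sleep() to a human-readable wake-up cause.
std::string describeWakeUp(int8_t cause) {
    if (cause == kWakeUpByTimer) {
        return "timer";
    }
    if (cause >= 0) {
        // Non-negative values are treated as the interrupt number that fired.
        return "interrupt " + std::to_string(static_cast<int>(cause));
    }
    return "unknown";
}
```

On the node, the input would be the return value of the existing call, for example `int8_t cause = sleep(digitalPinToInterrupt(3), RISING, 16);`.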



  • I also confirm it fixed my problem. Thank you!



  • @tekka Thanks, that is what I was looking for



  • @Yveaux Thank you for committing this fix. I had the same issue and it caused me a lot of headache

