What is the "robust" way to sleep / send messages?



  • Hello everyone!

    I recently built a water meter with MySensors based on:

    Here is the code running on the microcontroller:

    // MySensors configuration
    #define MY_SPLASH_SCREEN_DISABLED
    #define MY_RADIO_RFM69
    #define MY_RFM69_RST_PIN 8
    #define MY_RFM69_NEW_DRIVER
    #define MY_RFM69_ENABLE_ENCRYPTION 
    
    // Include libraries
    #include <MySensors.h>
    
    // Global settings
    #define BATT_ADC_PIN A1
    #define CHILD_ID 0
    #define SENSOR_PIN 3
    
    // Global variables
    byte last_batt_level = 0;
    unsigned long total_water = 0;
    MyMessage water_msg(CHILD_ID, V_VOLUME);
    
    // Check the battery voltage and send the level if it changed
    void report_battery_level() {
    
        // Read the battery voltage
        long adc = analogRead(BATT_ADC_PIN);
    
        // Compute the battery level
        // Vmin = 2/(3.3/1023) = 620
        // Vmax = 3/(3.3/1023) = 930
        long level = (adc-620)*100/310;
        byte batt_level = constrain(level, 0, 100);
        
        // The battery level changed, send it
        if (batt_level != last_batt_level) {
            sendBatteryLevel(batt_level);        
            last_batt_level = batt_level;
        }
    }
    
    void presentation() { 
        sendSketchInfo("Water meter", "1.0");
        present(CHILD_ID, S_WATER);
    }
    
    void setup() {
        pinMode(BATT_ADC_PIN, INPUT);
        pinMode(SENSOR_PIN, INPUT);
    }
    
    void loop() {
      
        // Sleep until we get a FALLING signal
        sleep(digitalPinToInterrupt(SENSOR_PIN), FALLING, 0);
    
        // Send the new water volume to the gateway
        total_water++;
        send(water_msg.set(total_water));
    
        // Send the battery level every 150L
        if (total_water % 150 == 0) {
            report_battery_level();
        }
    }
    

    Overall, it has been working well. However, I ran into two issues.

    First problem: For an unknown reason, my Raspberry Pi gateway stopped responding at some point. As a result, the send() call probably failed, and I suspect that sleep() then returned immediately with a value I don't check. When I restarted the gateway, the node reported a water consumption of a million liters, so the loop() method must have been called many times...
    This is the first issue to fix in the code.

    Second problem: When the battery voltage was too low, the same kind of problem occurred. I could see in the gateway logs that the sensor tried to reconnect multiple times without success (there were NACKs during the presentation step), since the radio didn't have enough power to operate normally. When presentation finally succeeded, the node reported a water consumption of a million liters. So here again, the loop() method had been called many times.

    So, what is the "robust" way to prevent these problems, in the code?

    The underlying questions probably are:

    • What will happen if the send() call fails? Will the code continue or will it retry sending the data a few times?
    • In which cases will the sleep() method actually not put the node into sleep mode?
    • Do you know why the loop() method was called multiple times, resulting in a large value of the total_water variable during the issue? According to this discussion, I would have thought the loop() method would not have been called again if the node wasn't successfully reconnected to the gateway due to the while (!isTransportReady() && (sleepDeltaMS < sleepingTimeMS) && (sleepDeltaMS < MY_SLEEP_TRANSPORT_RECONNECT_TIMEOUT_MS)) part of the code... Unless _process() actually calls loop()?

    It would be great to have complete "robust" sketch examples in the MySensors repository on GitHub, since all the examples assume that the gateway will always respond, that the node will always be correctly powered, and that it will always successfully send its payload.

    On the hardware side, the Connecting the radio page also assumes that the RFM69W radio will always be operating correctly and it's not recommended to connect the RESET pin of the radio to the Arduino / ATMEGA. That's unfortunate since connecting that pin on one of my projects recently fixed stability issues! 😉

    Thanks in advance for your help!



  • This is a pretty complicated example / problem. I think that may be why you have not received any replies. I know I had to read it multiple times, and even then, I am still not sure I understand the problem.

    Maybe try to break it down to one particular part of the problem, so us mortals perusing the forums whilst enjoying our coffee in the morning can at least have a chance at understanding it. 😉

    Have you made any progress on this in the meantime?



  • @Encrypt Sorry, meant to respond to this when you posted it but got distracted 🙄 again...
    If I recall correctly the calls to Gateway are limited to 7 or 8 attempts or thereabouts, but if your Gateway falls over every increment it will do the same. This can hammer Node batteries as I discovered after a power failure (regular feature) killed the Pi3, now solved with a DC/DC UPS.
    Never had comms issues with the RFM69 on 433MHz but they do use proper whips.

    Have a similar setup with an Elster sensor, but by the time the water meter was done I'd learned from similar "issues" from my first ever Node on the gas meter which uses a Reed...

    Interrupt on FALLING was causing rapid cycles of the increment; the trick was to ensure the increment only happened once on every wake from sleep.
    Set the interrupt to LOW, verify that the pin does indeed read LOW after waking (in case of false triggers), increment the count, send the updated value, then verify in a sleep loop that the pin has indeed gone HIGH again before going to deep sleep.
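
    For anyone following along, the sequence above could be sketched roughly as below. This is only an illustration in plain C++, not a tested MySensors sketch: `handle_wake`, `ReadPin`, `ShortNap` and `max_polls` are made-up names, and the two callbacks stand in for what would be `digitalRead(SENSOR_PIN)` and a short timed `sleep()` on a real node.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the hardware (assumptions, not MySensors API):
// on a real node these would wrap digitalRead(SENSOR_PIN) and a short sleep.
using ReadPin = std::function<bool()>;   // returns true when the pin reads HIGH
using ShortNap = std::function<void()>;  // waits a little between two polls

// Handle one wake-up from the LOW-level interrupt.
// Returns true if a pulse was counted, false if it was a false trigger.
bool handle_wake(const ReadPin& pin_is_high, const ShortNap& nap,
                 uint32_t& total, unsigned max_polls) {
    assert(max_polls > 0);

    // 1. Ignore false triggers: the pin must really read LOW after waking.
    if (pin_is_high()) {
        return false;
    }

    // 2. Count exactly once per wake-up (the updated value would be sent here).
    total++;

    // 3. Wait until the pin is HIGH again before going back to deep sleep,
    //    so a long LOW period (e.g. a reed staying closed) is not recounted.
    for (unsigned i = 0; i < max_polls && !pin_is_high(); ++i) {
        nap();
    }
    return true;
}
```

    The `max_polls` limit is a design choice of this sketch so that a sensor stuck LOW cannot keep the node awake forever; the description above does not mention one.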

    Hope this helps



  • Hello!

    Thank you @TRS-80 and @zboblamont for your answers and sorry for the delay...

    To answer your question @TRS-80, I'm actually trying to figure out what is the most robust way to use the MySensors library.
    Most examples out there assume that all underlying layers (hardware, transmission...) will just work, and the return value of sleep / send is never used.

    I've found that if the send call fails due to missing hardware ACKs, the node ends up registering again. So, I was considering retrying sending data if send failed, but MySensors already retries sending the data a few times as @zboblamont says.
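
    As a rough illustration of using the return value of send, here is a minimal sketch in plain C++. It assumes send() reports success as a bool (true when the next hop acknowledged the message); `WaterCounter`, `SendTotal`, `last_acked` and `failed_sends` are hypothetical names introduced for this example only. Since the reported value is a running total, a failed report does not lose data: the next successful one carries everything.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for send(water_msg.set(total)) (an assumption):
// returns true when the message was acknowledged by the next hop.
using SendTotal = std::function<bool(uint32_t)>;

struct WaterCounter {
    uint32_t total = 0;         // liters counted locally
    uint32_t last_acked = 0;    // last total known to have been acknowledged
    uint16_t failed_sends = 0;  // consecutive failed reports

    // Call once per verified pulse: always count, report once, remember the outcome.
    bool on_pulse(const SendTotal& send_total) {
        total++;
        if (send_total(total)) {
            last_acked = total;
            failed_sends = 0;
            return true;
        }
        if (failed_sends < UINT16_MAX) {
            failed_sends++;
        }
        return false;
    }

    // True while the gateway has not acknowledged the latest total.
    bool gateway_behind() const { return last_acked != total; }
};
```

    No extra retry is made here on purpose, since the library already retries on its own; `failed_sends` only gives the sketch a way to notice a dead link.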

    Regarding the sleep call, I'm now checking the return value and making sure that the return code is the interrupt, as specified by the API. Here is my new code:

    // MySensors configuration
    #define MY_SPLASH_SCREEN_DISABLED
    #define MY_RADIO_RFM69
    #define MY_RFM69_RST_PIN 8
    #define MY_RFM69_NEW_DRIVER
    #define MY_RFM69_ENABLE_ENCRYPTION 
    
    // Include libraries
    #include <MySensors.h>
    
    // Global settings
    #define BATT_ADC_PIN A1
    #define CHILD_ID 0
    #define SENSOR_PIN 3
    #define SENSOR_INT digitalPinToInterrupt(SENSOR_PIN)
    
    // Global variables
    byte last_batt_level = 0;
    unsigned long total_water = 0;
    MyMessage water_msg(CHILD_ID, V_VOLUME);
    
    // Check the battery voltage and send the level if it changed
    void report_battery_level() {
    
        // Read the battery voltage
        long adc = analogRead(BATT_ADC_PIN);
    
        // Compute the battery level
        // Vmin = 1.6/(3.3/1023) = 496
        // Vmax = 3/(3.3/1023) = 930
        long level = (adc-496)*100/434;
        byte batt_level = constrain(level, 0, 100);
        
        // The battery level changed, send it
        if (batt_level != last_batt_level) {
            sendBatteryLevel(batt_level);        
            last_batt_level = batt_level;
        }
    }
    
    void presentation() { 
        sendSketchInfo("Water meter", "1.0");
        present(CHILD_ID, S_WATER);
    }
    
    void setup() {
        pinMode(BATT_ADC_PIN, INPUT);
        pinMode(SENSOR_PIN, INPUT);
    }
    
    void loop() {
      
        // Sleep until we get a FALLING signal
        int8_t interrupt = sleep(SENSOR_INT, FALLING, 0);
    
        // We were woken up by the sensor
        if (interrupt == SENSOR_INT) {
    
            // Send the new water volume to the gateway
            total_water++;
            send(water_msg.set(total_water));
        
            // Send the battery level every 150L
            if (total_water % 150 == 0) {
                report_battery_level();
            }
        }
    }
    

    I've had communication issues since I changed the code, but the reported water consumption didn't skyrocket as it did previously, which is good.

    To answer @zboblamont: the FALLING interrupt is actually fine. The water volume is incremented "rather slowly", at most every 7s I'd say (if we open all taps :P).

    I'll have to fix the communication problems and to do that I'll have to design proper PCBs. The sensors / gateway are on breadboards right now, which is surely suboptimal.

    Cheers!



  • @Encrypt It sounds as if you are making progress; moving off the breadboard will certainly make for greater reliability.

    My reasoning for checking HIGH and LOW is that these are fixed states rather than in transition. FET type sensors are very fast, but the Elster Water Meter Node would send an intermittent false increment (No idea why/what), which is why the additional check on pin state was introduced.
    That pin check was spurred by experience with the Gas Node, which gave rapidly escalating readings, found to be caused by the reed closing for ca. 6 seconds. Checking in a sleep loop that the pin had changed back from LOW to HIGH before going to deep sleep resolved it; 3 years on, it's still in complete sync with the register.



  • Besides what Bob (?) mentioned (you seem to be making progress, breadboard unreliability, specifics), I would point out that there have been a number of discussions about ACKs, re-sending, reliability, keeping stats of NACK percentages (as a possible indicator of some problem), etc. over the years. The most recent I can recall was probably this one that I learned about when @BearWithBeard mentioned it here in my evidence based radio testing method (and capacitors) thread. My thread is more about hardware and radio testing, but we get into the subject of software ACKs etc. there in my thread a bit, and also in the linked thread.

    Is that the sort of stuff you are looking for?
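
    To make the "NACK percentage" idea concrete, here is a minimal sketch in plain C++. It is not taken from any of the linked threads; `LinkStats`, `record` and `nack_percent` are hypothetical names, and the `acked` flag is assumed to come from whatever success indication the sending call gives you.

```cpp
#include <cstdint>

// Running count of transmissions and failures (illustrative names only).
struct LinkStats {
    uint32_t sent = 0;   // messages attempted
    uint32_t nacks = 0;  // messages that were not acknowledged

    // Record the outcome of one transmission.
    void record(bool acked) {
        sent++;
        if (!acked) {
            nacks++;
        }
    }

    // Share of unacknowledged messages, in percent (0 when nothing was sent).
    uint8_t nack_percent() const {
        if (sent == 0) {
            return 0;
        }
        return static_cast<uint8_t>((nacks * 100ULL) / sent);
    }
};
```

    A node could report this figure now and then alongside the battery level, so that a slowly degrading link shows up before messages start getting lost in bulk.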

