Saving three bytes of memory with this crazy loop structure..



  • In order to use encryption on Arduino Nanos, I'm trying to save as much memory as possible. It's a real learning experience.

    In order to create a non-blocking 'once per second' clock without using a (4 byte) long variable I created the monster below. It uses a modulo function to measure seconds instead, as well as a single boolean (1 byte) to make sure the loop runs only once per second.

    #define LOOPDURATION 1000                                 // loop period in milliseconds
    
      static boolean loopDone = false;                        // makes sure the 'once every second' things only run once every second (on very rare occasions the millis() function skips a millisecond)
    
      // Main loop to time actions.
      if( (millis() % LOOPDURATION) < 4 && loopDone == false ) { 
        loopDone = true;
    
        // once-per-second stuff here.
      }
    
      // This resets the loopDone variable after the loop has run.
      if( (millis() % LOOPDURATION) > LOOPDURATION - 4 && loopDone == true ) {
        loopDone = false;  
      }
    
    
    

    I'd be curious what you think about this method.

    • It uses more space in the flash memory
    • Does using modulo use a lot of processing power?

  • Mod

    @alowhum trading one (or more) resource(s) for another is what optimizing is about. So if you need ram, this is reasonable. And no, using modulus will not affect the performance of most applications.

    What I would worry about is the maintainability / readability of the code. What happens if you need to modify the code 2 years from now? Will you remember how it works? What is the risk of introducing new bugs or strange side-effects? Is that risk worth saving the RAM? If it is, then you use it. I don't see it being anything more complicated than that.


  • Mod

    @alowhum I guess that if your loop runs slower than once every 4ms you might miss the 1 second deadline.

    How about this?

    void loop()
    {
        static uint8_t first = 1;
        static uint8_t secsPrev;
        uint8_t secs = millis() / 1000;
        if (first or (secs != secsPrev))
        {
            // ... code that runs every second...
            secsPrev = secs;
        }
        first = 0;
    }
    

    For AVR:
    Flash: 602 bytes vs 634
    Ram: 12 bytes vs 10 (you beat me there 😉 )

    Or even, when timing doesn't need to be too accurate (2.4% slower):

    void loop()
    {
        static uint8_t first = 1;
        static uint8_t secsPrev;
        uint8_t secs = millis() >> 10;
        if (first or (secs != secsPrev))
        {
            // ... code that runs every second...
            secsPrev = secs;
        }
        first = 0;
    }
    

    For AVR:
    Flash: 526 bytes vs 634
    Ram: 12 bytes vs 10

    Can we go even lower? 💪
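
    To make the 4 ms concern concrete, here is a minimal host-side sketch (plain C++, not an Arduino sketch) that replays both approaches against a simulated clock. The function names and the `stepMs` parameter, which stands in for how long one pass of `loop()` takes, are assumptions made for illustration; they do not appear in the posts above.

```cpp
#include <cstdint>

constexpr uint32_t LOOP_DURATION = 1000;  // milliseconds, as in the first post

// Modulo approach: fire inside a 4 ms window, re-arm inside another 4 ms window.
int countModuloTicks(uint32_t stepMs, uint32_t totalMs) {
    bool loopDone = false;
    int ticks = 0;
    for (uint32_t now = 0; now < totalMs; now += stepMs) {  // 'now' plays millis()
        if ((now % LOOP_DURATION) < 4 && !loopDone) {
            loopDone = true;
            ++ticks;  // once-per-second work would go here
        }
        if ((now % LOOP_DURATION) > LOOP_DURATION - 4 && loopDone) {
            loopDone = false;
        }
    }
    return ticks;
}

// Division approach: fire whenever the whole-second count changes.
int countDivisionTicks(uint32_t stepMs, uint32_t totalMs) {
    bool first = true;
    uint8_t secsPrev = 0;
    int ticks = 0;
    for (uint32_t now = 0; now < totalMs; now += stepMs) {
        const uint8_t secs = static_cast<uint8_t>(now / 1000);
        if (first || secs != secsPrev) {
            ++ticks;  // once-per-second work would go here
            secsPrev = secs;
        }
        first = false;
    }
    return ticks;
}
```

    With a 1 ms step both variants fire once per simulated second. With a 7 ms step the modulo variant can skip both the trigger window and the re-arm window, so it fires less often, while the division variant still fires on every change of the seconds counter.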


  • Mod

    @yveaux you could probably use the MSB of secsPrev for the "first" boolean and save a byte of ram?


  • Mod

    @mfalkvidd

    void loop()
    {
        static int8_t secsPrev = -1;
        const int8_t secs = int8_t(millis() / 1000) & 0x7F;
        if ((secsPrev < 0) or (secs != secsPrev))
        {
            // ... code that runs every second...
            secsPrev = secs;
        }
    }
    

    For AVR:
    Flash: 595 bytes vs 634
    Ram: 11 bytes vs 10 (almost there 😉 )

    Wheeeeeeeeeee!
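
    The sentinel trick above can be checked off-target. This is a rough host-side sketch, assuming the time is passed in rather than read from millis(); `SecondTicker` and `poll` are hypothetical names. It masks before narrowing to `int8_t`, a small variation on the snippet above that keeps the value in 0..127 without relying on how an out-of-range conversion behaves.

```cpp
#include <cstdint>

// Hypothetical helper, not from the thread: one int8_t holds both the
// "never ran" marker (negative) and the last seen second (0..127).
struct SecondTicker {
    int8_t secsPrev = -1;

    // Returns true on the first call and whenever the second count changed.
    bool poll(uint32_t nowMs) {
        // Mask first so the narrowed value always fits in 0..127.
        const int8_t secs = static_cast<int8_t>((nowMs / 1000) & 0x7F);
        if (secsPrev < 0 || secs != secsPrev) {
            secsPrev = secs;
            return true;
        }
        return false;
    }
};
```

    Because only a change is tested, the counter wrapping from 127 back to 0 still counts as a new second.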


  • Mod

    An alternative is to move the counter to another hardware resource (a timer, for instance):

    void setup() {
      cli();
      TCCR1A = 0;// set entire TCCR1A register to 0
      TCCR1B = 0;// same for TCCR1B
      TCNT1  = 0;//initialize counter value to 0
      // set compare match register for 1 Hz increments: 15624 for 16 MHz, 7812 for 8 MHz
      OCR1A = 15624;// = (16*10^6) / (1*1024) - 1 (must be <65536)
      // turn on CTC mode
      TCCR1B |= (1 << WGM12);
      // Set CS10 and CS12 bits for 1024 prescaler
      TCCR1B |= (1 << CS12) | (1 << CS10);
      // enable timer compare interrupt
      TIMSK1 |= (1 << OCIE1A);
      sei();
    }
    
    volatile uint8_t shouldRun = 1;
    
    ISR(TIMER1_COMPA_vect) {
      shouldRun = 1;
    }
    
    void loop() {
      // put your main code here, to run repeatedly:
      if (shouldRun) {
        // ... code that runs every second...
        shouldRun = 0;
      }
    }
    

    Flash 570 bytes
    Ram: 11 bytes
    This solution also uses 1 byte less on the stack (or is it 3 bytes, since it also needs 2 fewer pointers?), which might make a difference. But it reserves Timer1, which might or might not be OK depending on your application.
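
    The compare value follows from the formula in the code comment. A small sketch of that arithmetic, assuming a helper name (`timer1CompareValue`) that is made up for illustration and not part of any Arduino or avr-libc API:

```cpp
#include <cstdint>

// Hypothetical helper: compare-match value for a CTC timer that should
// fire targetHz times per second with the given clock and prescaler.
constexpr uint32_t timer1CompareValue(uint32_t cpuHz, uint32_t prescaler,
                                      uint32_t targetHz) {
    return cpuHz / (prescaler * targetHz) - 1;
}

// Timer1 is 16 bits wide, so the result has to stay below 65536.
static_assert(timer1CompareValue(16000000UL, 1024, 1) < 65536,
              "compare value must fit in OCR1A");
```

    For a 16 MHz clock with the 1024 prescaler this gives 15624. For 8 MHz the exact quotient is 7812.5, so the integer division yields 7811 rather than the 7812 quoted in the comment; either value is within about 0.01% of one second.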


  • Mod

    @mfalkvidd Nice one!

    To quote @mfalkvidd : "What I would worry about is the maintainability / readability of the code" 😉



  • Wow, what a response! Amazing stuff!

    thanks @mfalkvidd for the explanation.

    @Yveaux that millis() / 1000 is very elegant. That's a very interesting direction.

    For now though, looking at some of the creations.. I think I'll stick with my modulo system for readability 😄 Very cool though.


  • Admin

    @Yveaux @mfalkvidd Now it gets ugly - and we are not talking ASM yet... 😜

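    ISR (WDT_vect) {
      WDTCSR = _BV(WDCE) | _BV(WDE);      // start the timed sequence that unlocks the watchdog settings
      WDTCSR = _BV(WDIF) | _BV(WDIE) | 6; // clear flag, interrupt mode (no reset), prescaler 6 = 1s
      EEARL = 1;                          // flag "a second has passed" in the EEPROM address register
    } 
    
    void setup() {
      WDT_vect();                         // run the handler once by hand to configure the watchdog
    }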
    
    void loop() {
        if(EEARL) {
          EEARL = 0;
          // ... code that runs every second...
        }
    }
    

    Flash: 502 bytes
    Ram: 9 bytes



  • @tekka Whoa 🙂 Can you elaborate a little on what your code voodoo does?

    For a balance of readability, I was pondering how to make this:

    • Divide millis() by 1000
    • round that down
    • get the last bit of that rounded down variable
    • if that last bit is different than before, a second has passed.
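
    The four steps above can be sketched as follows. This is a minimal host-testable version, assuming the time and the stored bit are passed in; `secondElapsed` is a hypothetical name. On the Arduino you would call it with millis() and a `static bool` declared in loop().

```cpp
#include <cstdint>

// Hypothetical helper: compares the lowest bit of the whole-second count
// with the bit remembered from the previous call.
bool secondElapsed(uint32_t nowMs, bool &lastBit) {
    // Integer division already rounds down; keep only the lowest bit.
    const bool bit = ((nowMs / 1000) & 1) != 0;
    if (bit != lastBit) {
        lastBit = bit;
        return true;   // the seconds count changed parity
    }
    return false;
}
```

    Two caveats with this sketch: a `bool` still occupies a full byte of RAM on AVR, and if exactly two seconds (or any even number) pass between calls, the bit looks unchanged and the tick is missed. Starting with `lastBit = false`, the first tick comes at the 1 s mark rather than immediately.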

  • Mod

    @alowhum that's exactly what my first code snippet does (apart from testing the lowest bit, but you don't need that. Change in seconds is sufficient)



  • @Yveaux I know, I really like it. But I was thinking about shaving off another byte somehow 🙂


  • Admin

    @alowhum Instead of a timer, the watchdog interrupt is used to set a flag in an EEPROM address register (EEARL) that is not otherwise used in this sketch... certainly not a generic approach, but functional 😃
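
    The `6` in the second WDTCSR write selects the watchdog prescaler. As a rough sketch of where the "1s" comes from, assuming the ATmega328P's nominal 128 kHz watchdog oscillator and a base period of 2048 cycles that doubles with each prescaler step (the helper names are made up for illustration):

```cpp
#include <cstdint>

// Hypothetical helpers: nominal watchdog period for prescaler setting wdp (0..9).
constexpr uint32_t wdtCycles(uint8_t wdp) {
    return static_cast<uint32_t>(2048UL << wdp);  // 2K cycles at wdp = 0, doubling each step
}

constexpr uint32_t wdtTimeoutMs(uint8_t wdp) {
    return wdtCycles(wdp) * 1000UL / 128000UL;    // assumes a 128 kHz oscillator
}
```

    Setting 6 works out to 131072 cycles, nominally 1024 ms. The watchdog oscillator is not a precision clock, so treat this as an approximate second.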


  • Mod

    @tekka so it wouldn't work in a MySensors sketch? That's cheating 😉


  • Admin

    @mfalkvidd Strictly speaking, only @Yveaux's solution would work without modifications to the MySensors core 🙂 But the challenge was weakly defined, so no cheating in that sense 😅


  • Mod

    @tekka would it? I verified that my solution worked on a serial gateway. But maybe there was some aspect that I didn't test, that would fail.


  • Mod

    @tekka I couldn't agree more 😉
    IMHO readability is the differentiator here.



  • Hi @mfalkvidd,

    Problem is, when your code can't compile because you are one (or several) bytes short of ram, nothing else matters.

    As to readability, I assume you know about that rarely used compiler feature called 'comments'? I hear they use zero Arduino RAM and even less ROM memory. 😳.

    Ok, I'm just being cheeky, so don't flame me, it just seemed a good opportunity for a reminder to everybody. Point being we are all guilty of not using enough comments in our code.

    You said you are worried about readability and maintainability - it's just like code backups, it's a problem only because we only worry about them 'after' a drive crash, or in the case of comments, two years later when we are trying to remember what the hell this weird code does.. THE REALITY: If we are really worried, we would add liberal comments and do regular backups - otherwise I say we're not really 'that' worried.

    Cheers,

    Paul


 
