Over the air updates

JeJ

@ToSa I have my gateway connected via the GPIO and I have followed the steps in the Readme.md.
I will try to use a USB-Rs232 cable and see what happens.

ToSa

@JeJ one potential reason is that the port is already in use. I mentioned somewhere that the startup script doesn't yet stop the NodeJsController correctly. Maybe you already have a NodeJsController process running? Try "sudo killall node" and then try starting it again. To check if the port itself is working you can try to open a simple terminal (minicom etc.) and reset the gateway.

Zeph

@ToSa said:

With some software improvements over time and minor version changes during development 16bit for version seems to be the better fit as well.

Hmm. That seems like overkill, if I'm understanding correctly. (So maybe I am not understanding).

What I heard was:

Each sensactuator node has a "node type" and a "version" within that node type. Each combination of sensors and pin assignments has a unique "node type" (within a given wireless network). A node can only be OTA updated to a newer (higher) version of the same "node type" of the current firmware, and all nodes of that "node type" will be updated.

An extra byte for "version" isn't a big deal tho.

Will there be one or two bytes for "node type"?

Damme

@Zeph 16-bit calculations on an 8-bit MCU will always come at a price. IMO we should try to keep things to 8 bits as much as possible, but I don't know if it's possible to shave another 900 bytes off the bootloader so that it fits in the next smaller size (1024 words instead of 2048 words). It might be if we make a mini version of mysensors/mymessage.

ToSa

@Zeph said:

Each combination of sensors and pin assignments has a unique "node type" (within a given wireless network).

Actually that's part of the question - as @hek mentioned there is a desire to sell MySensors hardware - at some point there might be not just generic pinhead PCBs but real fit-for-use devices. Ideally these would have a unique node type assigned not just within a given network. New firmware could be published on mysensors.org (or via codebender or...) and based on the unique (but common across networks) node type less tech-savvy people could be protected from sending a firmware that doesn't fit the hardware... I know - a LOT of "IF"s...

@Damme
you are right - probably not the full 900 bytes but additional space could be used for encryption etc. so every reduced byte is beneficial at this point. I'll check later how much can be saved by using CRC8 instead of CRC16.
I'm already using a mini version of mysensors / mymessage: not using the cpp code files at all but just the headers and if you have a look at the "#ifdef __cplusplus" statements just added for that purpose, there is almost nothing left (the MyMessage class is stripped down to a struct and the MySensors class removed completely / enums and #defines should not consume space after compilation)
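For readers who haven't seen the trick, here is a minimal sketch of a header that compiles as a class under C++ and as a bare struct under C. The names (`DemoMessage`, the field names, the 7 + 25 byte split) are made up for illustration and are not copied from the MySensors headers.

```c
#include <stdint.h>

/* Hypothetical sizes: a 32-byte radio frame split into header and payload. */
#define DEMO_HEADER_SIZE 7
#define DEMO_MAX_PAYLOAD 25

#ifdef __cplusplus
/* C++ library build: a class that can carry methods. */
class DemoMessage {
public:
    DemoMessage();              /* only visible to C++ code */
#else
/* Plain C bootloader build: the same bytes as a bare struct. */
typedef struct {
#endif
    uint8_t last;               /* header fields (illustrative names) */
    uint8_t sender;
    uint8_t destination;
    uint8_t version_length;
    uint8_t command_flags;
    uint8_t type;
    uint8_t sensor;
    uint8_t data[DEMO_MAX_PAYLOAD];
#ifdef __cplusplus
};
#else
} DemoMessage;
#endif
```

Both builds see the same memory layout, so the C side can fill in or read a frame without pulling in any of the C++ code.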

Zeph

@ToSa

I'm realizing how similar the implementations of your model of updates and mine might be. This is just an early inspiration, not fully thought out.

uint8_t   node_type_id;  // same for multiple nodes
uint16_t version;   // loaded version for given node_type_id
... 
if(new_version > version) {  // test for OTA update needed

versus

uint8_t   node_id;   // unique per node
uint16_t  progmem_crc;  // calculated from PROGMEM
... 
if(new_progmem_crc != progmem_crc) { // test for OTA update needed

This might mean that I could (eventually) use a relatively minor fork of the OTA programming code to get the per-node flexibility that I seek.

ToSa

@Zeph
yes, that's what I meant - you might not even need any fork of the bootloader itself and just a slight adjustment on the controller end - because the nodeID is contained in the packet (not in the payload but in the header as sender address) so you have all you need for your setup

Zeph

@ToSa
The other half is testing inequality between the computed CRC of the application firmware in PROGMEM, with the CRC of the available replacement (rather than comparing for higher version number).

An example use case for the ability to load arbitrary new code into any given node: if I were diagnosing some kind of interference, I might temporarily replace the sensor firmware in some nodes (of varying node type) with a custom radio test firmware, then later restore each with its original sensor node firmware.

Suppose we have:

 node 5, node type 17, version 2, PROGMEM CRC 0x4567  // attic
 node 6, node type 3, version 5, PROGMEM CRC 0xABCD  // crawlspace
 node 7, node type 3, version 5, PROGMEM CRC 0xABCD  // living room

And I want to temporarily replace the firmware in node 5 and 6, but keep 7 still running as a sensor.

I make RF test code available on the server, with CRC 0x7E57. This is not type 17 or type 3.

I edit the server's table of firmware assignments:

node 5, 0x7E57
node 6, 0x7E57
node 7, 0xABCD  // unchanged

This causes node 5 and 6 (formerly of different types) to load the test firmware when reload is triggered.

Then when testing is done, I edit the table back:

node 5, 0x4567   // back to its old type and version
node 6, 0xABCD  // back to the same type and version as node 7
node 7, 0xABCD  // still unaffected

This causes the normal sensor firmware (type and version) to be loaded back in on the next reload.

There could be more than just a CRC to identify the firmware (in order to avoid the birthday paradox); this is just an example.

An alternate use case is loading in my Halloween firmware to the front yard nodes (but not other nodes) for a week or two, then back..

Or a beta version of type 3, version 6, which I'd like to load on some type 3 nodes for in-situ testing (eg: in the crawlspace), but not all of the type 3 nodes because I want most of the system to continue functioning normally while I test. If the beta is bad, I may revert the test nodes to version 5; once the new version is good, I may convert all type 3 nodes to version 6.

These are some of the reasons I'd like to be able to use OTA programming of any arbitrary firmware into any given node, without being constrained to:

  Only upgrades of the same node type
  Only upgrades to higher version numbers
  Only upgrades of all nodes of the same type or none

And so that's why inequality testing of the PROGMEM signature on a per-node basis is attractive, not just testing for a higher version number. For similar complexity, we can upgrade to a higher version number, downgrade to a different version number, or change the node type back and forth.

The type and version dynamics (which certainly IS a common use case) can be handled on the server. For example, the server can know what type every node is (kind of a good idea anyway), and can change the node -> signature entry for every node of type 3 to the signature of the next version, and then let it proceed as above to get them all updated. But that's just one option, centrally controlled.
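The per-node assignment table from this post could be sketched roughly like this. It is written in C only to match the other snippets in the thread; the names (`fw_assignment_t`, `set_assignment`, `needs_reload`) are invented for illustration, and a real controller would probably identify firmware by more than a 16-bit CRC.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical server-side table: which firmware each node should be running. */
typedef struct {
    uint8_t  node_id;
    uint16_t fw_crc;
} fw_assignment_t;

fw_assignment_t assignments[] = {
    { 5, 0x4567 },   /* attic */
    { 6, 0xABCD },   /* crawlspace */
    { 7, 0xABCD },   /* living room */
};
#define N_ASSIGNMENTS (sizeof assignments / sizeof assignments[0])

/* Look up a node's entry; returns NULL for a node the table doesn't know. */
fw_assignment_t *find_assignment(uint8_t node_id) {
    for (size_t i = 0; i < N_ASSIGNMENTS; i++) {
        if (assignments[i].node_id == node_id)
            return &assignments[i];
    }
    return NULL;
}

/* Edit the table: point one node at a different firmware. Returns 1 on success. */
int set_assignment(uint8_t node_id, uint16_t fw_crc) {
    fw_assignment_t *a = find_assignment(node_id);
    if (a == NULL)
        return 0;
    a->fw_crc = fw_crc;
    return 1;
}

/* A node reloads when the CRC it reports differs from the one assigned to it. */
int needs_reload(uint8_t node_id, uint16_t reported_crc) {
    const fw_assignment_t *a = find_assignment(node_id);
    if (a == NULL)
        return 0;    /* unknown node: leave it alone */
    return a->fw_crc != reported_crc;
}
```

Calling `set_assignment(5, 0x7E57)` and `set_assignment(6, 0x7E57)` would mark nodes 5 and 6 for the test firmware while node 7 keeps reporting a matching CRC; setting the old values again reverses it.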

ToSa

@Zeph
from the MyOtaBootloader.c:

if (firmwareConfigResponse->version == fc.version)
	if (firmwareConfigResponse->blocks == fc.blocks)
		if (firmwareConfigResponse->crc == fc.crc)

so as long as you send the same version / blocks / crc back to the node as what is currently installed, no update is started. As soon as one of the three elements differs an update is loaded. It's completely in control of the server if (and which) firmware is bootloaded.
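Pulled out into a standalone function, the decision described above might look like the sketch below. The struct name `fw_config_t`, the function name and the 16-bit field widths are assumptions made for the example; only the three field names come from the bootloader excerpt.

```c
#include <stdint.h>

/* Assumed layout: the three values compared in the excerpt above. */
typedef struct {
    uint16_t version;
    uint16_t blocks;
    uint16_t crc;
} fw_config_t;

/* Returns 1 if the offered firmware differs from the installed one in any
 * of the three fields, i.e. an update would be started. Note that this is
 * an inequality test, so a lower version number triggers a load as well. */
int update_required(const fw_config_t *installed, const fw_config_t *offered) {
    return installed->version != offered->version
        || installed->blocks  != offered->blocks
        || installed->crc     != offered->crc;
}
```

Because nothing here checks for "newer", the server decides entirely by what it sends back: echo the installed values and nothing happens, send anything else and the node reloads.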

Zeph

@ToSa

OK, so version is tested for != rather than for > ? Downgrading is OK?

And CRC is used as well (and block count?) where CRC is based on what's in PROGMEM now?

Cool.

Then I think all that would be needed is for the server to be able to potentially feed back a different firmwareConfigResponse to each node. In my above example (which has been edited for clarity recently BTW, so re-read it), node 6 could receive a different response than node 7 (even tho they both have the same type initially). And thus nodes 5 and 6 (but not 7) could be told to load the test firmware and then later to go back to the old version. Etc.

Is that correct?

It would be a nice enhancement if we could query the node for the CRC (and block count?) of the current PROGMEM, just to help the server stay in sync with what's out there (eg: after a node joins the network). That could be done in the application code, so we don't even have to invoke the bootloader. Then the server could figure out which nodes need to be bootloaded and trigger just those to go into the bootloader (possibly one at a time). These two together support what I call push dynamics.
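As a rough idea of what the application-side query would have to compute, here is a portable sketch of a CRC-16 and block count over a firmware image held in a byte buffer. The thread doesn't say which CRC-16 variant the bootloader uses, so the polynomial (0xA001, the same recurrence as avr-libc's `_crc16_update`), the start value and the block size are all assumptions; on the node itself the bytes would be read from flash with `pgm_read_byte` rather than from a RAM buffer.

```c
#include <stdint.h>
#include <stddef.h>

/* One step of a bitwise CRC-16 with reflected polynomial 0xA001.
 * Assumption: the bootloader's actual CRC may use different parameters. */
uint16_t crc16_update(uint16_t crc, uint8_t byte) {
    crc ^= byte;
    for (int i = 0; i < 8; i++) {
        if (crc & 1)
            crc = (uint16_t)((crc >> 1) ^ 0xA001);
        else
            crc = (uint16_t)(crc >> 1);
    }
    return crc;
}

/* CRC over a whole image, starting from a caller-chosen initial value. */
uint16_t firmware_crc(const uint8_t *image, size_t len, uint16_t init) {
    uint16_t crc = init;
    for (size_t i = 0; i < len; i++)
        crc = crc16_update(crc, image[i]);
    return crc;
}

/* Number of fixed-size blocks needed to carry len bytes (rounded up). */
uint16_t firmware_blocks(size_t len, size_t block_size) {
    return (uint16_t)((len + block_size - 1) / block_size);
}
```

The node would report these two numbers in a normal application message, and the server would compare them with its table to decide which nodes to send into the bootloader.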

Damme

@Zeph I've been working on a read / write EEPROM address thing in MQTT to be able to reset a node and stuff. But it seems there are more uses for it then. This might be coded into mysensors instead (utilizing c_internal or something, as the protocol is today).

ToSa

@Damme said:

@Zeph I've been working on a read / write EEPROM address thing in MQTT to be able to reset a node and stuff. But it seems there are more uses for it then. This might be coded into mysensors instead (utilizing c_internal or something, as the protocol is today).

Good idea - that would allow checking the current value in normal operation - not just during bootloading.

@Zeph
If you urgently want to have the CRC of the current firmware submitted during bootloading, we can add this as a third parameter to the FirmwareConfigRequest message. Actually I was thinking about getting rid of request/response and use the same format for both which would mean crc would be included anyways.

Damme

I deleted my last message because I thought I made a big mistake..

I've been working on an SD <-> OTA loader node and got most of it working, but got stuck on the last piece, which is communication. (I'll release it when I'm finished. I've made a small change in myotabootloader: on line ~156 I added msg.destination = OTAGATEWAY; to configure a custom OTA address.)

I can't figure the following out.
Just ignore the contents of the packets; they're not relevant.

Node: (Ota<->sd loader)

read: 34-0-254 s=255,c=4,t=0,pt=6,l=4:FFFFFFFF
send: 254-254-0-34 s=255,c=4,t=1,pt=8,l=4,st=ok:0100020000304200

GW:
0;0;3;0;9;read: 34-34-0 s=255,c=3,t=7,pt=0,l=0:
0;0;3;0;9;send: 0-0-34-34 s=255,c=3,t=8,pt=1,l=1,st=ok:0
0;0;3;0;9;read: 34-34-0 s=255,c=3,t=7,pt=0,l=0:
0;0;3;0;9;send: 0-0-34-34 s=255,c=3,t=8,pt=1,l=1,st=ok:0
0;0;3;0;9;read: 34-34-254 s=255,c=4,t=0,pt=6,l=4:FFFFFFFF
0;0;3;0;9;send: 34-0-254-254 s=255,c=4,t=0,pt=6,l=4,st=ok:FFFFFFFF
0;0;3;0;9;read: 254-254-34 s=255,c=4,t=1,pt=6,l=8:0100020000304200
0;0;3;0;9;send: 254-0-0-34 s=255,c=4,t=1,pt=6,l=8,st=fail:0100020000304200
OTA bootloader:
Go
<- 34,34,0,2,3,7,255,
<- 34,34,0,2,3,7,255,
-> 0,0,34,10,35,8,255,0,
<- 34,34,254,34,196,0,255,255,255,255,255,

What am I missing? The packet from 254 to 34 won't get delivered.
I've also noticed that when 254 tries to send, it doesn't receive the next message transmitted by the OTA bootloader. The one after that is received.
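For anyone trying to follow logs like the ones above, here is a small sketch that splits a `read:` debug line into its fields. It assumes the three leading numbers are sender-last-destination (which matches the addresses in this thread, but isn't spelled out here) and that `s`, `c`, `t`, `pt` and `l` are sensor, command, type, payload type and length; the struct and function names are made up.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one "read:" debug line (field meanings are an assumption). */
typedef struct {
    unsigned sender, last, destination;
    unsigned sensor, command, type, payload_type, length;
    char payload[65];   /* hex/text payload as printed, may be empty */
} log_read_t;

/* Parses a line containing "read: a-b-c s=..,c=..,t=..,pt=..,l=..:payload".
 * Any prefix such as "0;0;3;0;9;" is skipped. Returns 1 on success, 0 if
 * the line is not a "read:" line or is malformed. */
int parse_read_line(const char *line, log_read_t *out) {
    const char *p = strstr(line, "read: ");
    if (p == NULL)
        return 0;
    out->payload[0] = '\0';
    int n = sscanf(p, "read: %u-%u-%u s=%u,c=%u,t=%u,pt=%u,l=%u:%64s",
                   &out->sender, &out->last, &out->destination,
                   &out->sensor, &out->command, &out->type,
                   &out->payload_type, &out->length, out->payload);
    /* 8 numeric fields are required; the payload is optional (l=0). */
    return n >= 8;
}
```

Run over the gateway log above, this makes it easier to see at a glance which node each frame came from and where it was headed.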

ToSa

@Damme
I need to better understand the setup to think about what's going on. My take from the above:

You have three nodes:

Gateway (address 0)
SD OTA Loader Node ?!? (address 254)
Sensor Node (address 34)

Is that right?

Damme

@ToSa Yes, and I think I figured it out.. By mistake I changed BROADCAST_ADDRESS to GATEWAY_ADDRESS in the bootloader when I was playing around. Testing the correct version now.. :)

ToSa

@Damme
interesting setup :+1:
to make it work with non-static addressed nodes you should probably keep the destination set to GATEWAY_ADDRESS for the REQUEST_ID call and only change afterwards.

Never mind - looking at the line number you mentioned that's probably what you did :)

Damme

@ToSa Now I remember why I changed some things in there (broadcast to gateway).

From the beginning I had problems getting it to talk to the GW. It only sends out
<- 255,255,255,2,3,7,255, and gets no response. The GW tries to send but fails (weird..). I don't have any relay nodes.

This is with no modifications at all.

0;0;3;0;9;read: 255-255-255 s=255,c=3,t=7,pt=0,l=0:
0;0;3;0;9;send: 0-0-255-255 s=255,c=3,t=8,pt=1,l=1,st=fail:0
0;0;3;0;9;read: 255-255-255 s=255,c=3,t=7,pt=0,l=0:
0;0;3;0;9;send: 0-0-255-255 s=255,c=3,t=8,pt=1,l=1,st=fail:0
Other packets sent out work just fine (to other nodes),

and the OTA bootloader can receive other packets:
Go
<- 255,255,255,2,3,7,255,
-> 23,23,0,42,225,1,11,205,204,90,66,1,
<- 255,255,255,2,3,7,255,
<- 255,255,255,2,3,7,255,
<- 255,255,255,2,3,7,255,

(from a temp / hum node)

Any ideas how to fix this?

Damme

@ToSa I finally figured out why my OTA bootloader didn't read any answers from my GW (both for I_FIND_PARENT and I_ID_REQUEST): the answers came too quickly! First I tried hardcoding a 125 ms delay on the GW and it worked, so I changed the send code to the following and now all messages arrive. I've been testing it for a couple of reboots now. I'm using 5v (at 3.3v) and 16MHz.
Edit: I noticed it still misses packets sometimes, but nowhere near 100% like before; more like 5% now. I'll investigate further when I try to upload data.

  static uint8_t sendAndWait(uint8_t reqType, uint8_t resType) {
  	msg.type = reqType;
  	for (uint8_t i = 0; i < 10; i++) {
  		sendWrite(msg);
  		/* Poll for a reply in 1 ms steps (20 x 100 iterations) before resending. */
  		for (uint8_t j = 0; j < 20; j++) {
  			for (uint8_t k = 0; k < 100; k++) {
  				uint8_t pipe;
  				boolean avail = available(&pipe);
  				wdt_reset();
  				if (avail && pipe<=6) {
  					read(rmsg.array,pipe);
  					if(!(mGetVersion(rmsg) == PROTOCOL_VERSION))
  						continue;
  					if (rmsg.destination == nc.nodeId) {
  						if (mGetCommand(rmsg) == C_INTERNAL) {
  							if (rmsg.type == I_FIND_PARENT_RESPONSE) {
  								if (rmsg.data[0] < nc.distance - 1) {
  									nc.distance = rmsg.data[0] + 1;
  									nc.parentNodeId = rmsg.sender;
  									eeprom_write_byte((uint8_t*)EEPROM_PARENT_NODE_ID_ADDRESS, nc.parentNodeId);
  									eeprom_write_byte((uint8_t*)EEPROM_DISTANCE_ADDRESS, nc.distance);
  								}
  							}
  						}
  						if ((mGetCommand(rmsg) == mGetCommand(msg)) && (rmsg.type == resType))
  							return 1;
  					}
  				}
  				delaym(1);
  			}
  		}
  	}
  	return 0;
  }

Damme

I had to put my project in the trash bin.. There is not enough RAM in the ATmega328 to fit mysensors and the SD lib :) I tried 3 different versions. Too bad! I could only transmit one packet before SRAM was overrun.
