Over the air updates

Damme

@ToSa I've been working on getting OTA to work with MQTTgateway with some success.

But I do have a problem with some packages going missing, and I think the communication should look something like this:
The bootloader checks ID and version, and the server says there is an update (no change from today),
but then:

[bootloader] 0000 has CHK FF(just filler in first package) REQ 0000 type 01 version 01
[server] load 0000 from hex, send addr 0000 0C9428030C9447240C9474240C947605 C7
[bootloader] 0000 has CHK FF, REQ 0010 type 01 version 01
[server] (checksum mismatch) send addr 0000 0C9428030C9447240C9474240C947605 C7
[bootloader] 0000 has CHK C7, REQ 0010 type 01 version 01
[server] load 0010 from hex send addr 0010 0C94A3050C94D0050C9480100C945003 00
And so on.. :)
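
For what it's worth, the trailing byte on each server line in this trace (C7, then 00) is consistent with the standard Intel HEX record checksum, assuming record type 00 (data): the two's complement of the sum of the byte count, the two address bytes, the record type and the data bytes. A minimal sketch in C; `ihex_checksum` is a hypothetical helper name, not something from the MySensors code:

```c
#include <stddef.h>
#include <stdint.h>

/* Intel HEX record checksum: two's complement of the low byte of the sum of
 * the byte count, both address bytes, the record type and all data bytes.
 * ihex_checksum is a hypothetical helper name used only for illustration. */
uint8_t ihex_checksum(uint16_t addr, uint8_t type,
                      const uint8_t *data, uint8_t len)
{
    uint8_t sum = len;

    sum = (uint8_t)(sum + (addr >> 8));    /* address high byte */
    sum = (uint8_t)(sum + (addr & 0xFF));  /* address low byte  */
    sum = (uint8_t)(sum + type);           /* record type       */
    for (uint8_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return (uint8_t)(0x100 - sum);         /* two's complement  */
}
```

With the 16 data bytes from the first server line, address 0x0000 and type 0x00, this yields 0xC7; the second line (address 0x0010) yields 0x00.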

What do you think about this? The total package is 32 bytes and the MySensors header is 7 bytes, so this layout would need 19 bytes from server to bootloader.
I have also seen some Intel HEX files that are not in order (0010, 0020, 0030, etc.) but jump between addresses. I do not think the Arduino IDE does this, but you never know.

EDIT:
I haven't read this one yet, but I guess there is a lot of good stuff in it :)
http://www.nordicsemi.com/eng/nordic/download_resource/10878/2/94069421

ToSa

@JeJ @mikeones : how is your serial gateway connected? using a USB-Rs232 cable or via the GPIO pins on the RPi? Did you check the Readme.md in the NodeJsController directory?

Zeph

@Damme said:

What do you think about this? The total package is 32 bytes and the MySensors header is 7 bytes, so this layout would need 19 bytes from server to bootloader.

16 bit offset, 16 data bytes, one byte checksum, right?
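
Spelled out, that would be 2 + 16 + 1 = 19 bytes, which fits in the 32 - 7 = 25 bytes left after the header. A rough sketch of packing such a payload; the function name, the constant names and the little-endian byte order of the offset are my assumptions, not anything taken from the existing bootloader:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    PACKET_SIZE  = 32,                 /* total package size, per the thread    */
    HEADER_SIZE  = 7,                  /* MySensors header size, per the thread */
    BLOCK_SIZE   = 16,                 /* firmware bytes carried per package    */
    PAYLOAD_SIZE = 2 + BLOCK_SIZE + 1  /* offset + data + checksum = 19         */
};

/* Hypothetical helper: serialize one firmware block into a payload buffer.
 * The offset is written low byte first (an assumption). Returns the number
 * of payload bytes written. */
size_t fw_block_pack(uint8_t out[PAYLOAD_SIZE], uint16_t offset,
                     const uint8_t data[BLOCK_SIZE], uint8_t checksum)
{
    out[0] = (uint8_t)(offset & 0xFF);
    out[1] = (uint8_t)(offset >> 8);
    memcpy(&out[2], data, BLOCK_SIZE);
    out[2 + BLOCK_SIZE] = checksum;
    return PAYLOAD_SIZE;
}
```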

I have also seen some Intel HEX files that are not in order (0010, 0020, 0030, etc.) but jump between addresses. I do not think the Arduino IDE does this, but you never know.

I see that your descriptions say "0010 from hex" etc, but I thought you would be fetching from a binary blob to satisfy requests from the bootloader. As in:

Server reads the Intel hex and uses it to fill in an array of bytes. (one time, or each time a given file is requested)
Server sends requested 16 byte chunks of that array to bootloader

In that case, it doesn't matter what order the original hex lines are in, or even if they are 16 or 32 bytes wide (or less than 16 bytes at the end).
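
A rough sketch of that idea in C, just to make it concrete. All names here are hypothetical; only type 00 (data) records are stored, and the caller is assumed to pre-fill the image with 0xFF (erased flash) so that gaps between records stay defined:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16u

static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Parse two hex digits; returns -1 if either digit is malformed. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    int lo = (hi < 0) ? -1 : hex_nibble(s[1]);
    return (lo < 0) ? -1 : ((hi << 4) | lo);
}

/* Copy one ":LLAAAATT<data>CC" record into the flat image.
 * Returns 0 on success, -1 on a malformed line, bad checksum or
 * out-of-range address. Record order and record width do not matter. */
int image_load_record(uint8_t *image, size_t image_size, const char *line)
{
    size_t chars = strlen(line);
    if (chars < 11 || line[0] != ':') return -1;

    int len  = hex_byte(line + 1);
    int ah   = hex_byte(line + 3);
    int al   = hex_byte(line + 5);
    int type = hex_byte(line + 7);
    if (len < 0 || ah < 0 || al < 0 || type < 0) return -1;
    if (chars < (size_t)(11 + 2 * len)) return -1;

    uint8_t buf[255];
    uint8_t sum = (uint8_t)(len + ah + al + type);
    for (int i = 0; i < len; i++) {
        int b = hex_byte(line + 9 + 2 * i);
        if (b < 0) return -1;
        buf[i] = (uint8_t)b;
        sum = (uint8_t)(sum + b);
    }
    int chk = hex_byte(line + 9 + 2 * len);
    if (chk < 0 || (uint8_t)(sum + chk) != 0) return -1;

    if (type != 0) return 0;               /* ignore non-data records */
    size_t addr = ((size_t)ah << 8) | (size_t)al;
    if (addr + (size_t)len > image_size) return -1;
    memcpy(image + addr, buf, (size_t)len);
    return 0;
}

/* Return a pointer to the 16-byte chunk at `offset`, or NULL if it would
 * run past the end of the image. */
const uint8_t *image_block(const uint8_t *image, size_t image_size,
                           uint16_t offset)
{
    return ((size_t)offset + BLOCK_SIZE <= image_size) ? image + offset : NULL;
}
```

The server would call `image_load_record` once per line of the .hex file and then answer every bootloader request with `image_block`, regardless of how the original rows were laid out.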

Damme

@Zeph True (array), and yes, so the node can request the same address twice (after a timeout, say) and verify the checksum on every 16-byte block.

ToSa

@Damme said:

@ToSa I've been working on getting OTA to work with MQTTgateway with some success.

great!

[bootloader] 0000 has CHK FF(just filler in first package) REQ 0000 type 01 version 01
[server] load 0000 from hex, send addr 0000 0C9428030C9447240C9474240C947605 C7
[bootloader] 0000 has CHK FF, REQ 0010 type 01 version 01
[server] (checksum mismatch) send addr 0000 0C9428030C9447240C9474240C947605 C7
[bootloader] 0000 has CHK C7, REQ 0010 type 01 version 01
[server] load 0010 from hex send addr 0010 0C94A3050C94D0050C9480100C945003 00
And so on.. :)

What do you think about this? The total package is 32 bytes and the MySensors header is 7 bytes, so this layout would need 19 bytes from server to bootloader.

If I understand correctly, you would send the CRC of the previous block back to the server together with the request for the next package; the server would then send the next block if the CRC is correct, or resend the previous block if it is not.
I'm wondering how the bootloader would ever run into that situation. The package itself is already checksummed and wouldn't be treated as a correctly received package if the checksum is incorrect. The next block is requested only if the previous block was received correctly. If that doesn't happen within a given amount of time, the same block is requested again.
Can you explain a bit further what issues you are running into?
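
To illustrate the flow I'm describing (this is a simplified sketch, not the actual bootloader code; the callback types, the retry limit and the demo transport are made up for the example):

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 16u

/* Hypothetical transport hook: send a request for `block` and wait up to one
 * timeout for a reply that passed the packet checksum. Returns false on a
 * timeout or a corrupted package. */
typedef bool (*request_block_fn)(uint16_t block, uint8_t out[BLOCK_SIZE],
                                 void *ctx);
/* Hypothetical sink that stores a verified block (e.g. into a page buffer). */
typedef void (*store_block_fn)(uint16_t block, const uint8_t data[BLOCK_SIZE],
                               void *ctx);

/* Stop-and-wait: the next block is requested only after the current one was
 * received correctly; otherwise the same block is requested again. */
bool fetch_firmware(uint16_t blocks, uint8_t max_retries,
                    request_block_fn request, store_block_fn store, void *ctx)
{
    uint8_t data[BLOCK_SIZE];

    for (uint16_t block = 0; block < blocks; block++) {
        uint8_t attempts = 0;
        while (!request(block, data, ctx)) {
            if (++attempts > max_retries) return false;  /* give up */
        }
        store(block, data, ctx);
    }
    return true;
}

/* Demo transport for experimenting: every odd-numbered request is "lost". */
typedef struct { unsigned requests; unsigned stored; } demo_link_t;

bool demo_request(uint16_t block, uint8_t out[BLOCK_SIZE], void *ctx)
{
    demo_link_t *link = ctx;
    if (++link->requests % 2 == 1) return false;
    for (uint8_t i = 0; i < BLOCK_SIZE; i++) out[i] = (uint8_t)(block + i);
    return true;
}

void demo_store(uint16_t block, const uint8_t data[BLOCK_SIZE], void *ctx)
{
    demo_link_t *link = ctx;
    (void)block;
    (void)data;
    link->stored++;
}
```

In this model a lost or corrupted package simply looks like a timeout to the node, so the server never needs to be told the checksum of the previous block.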

@Damme said:

I have also seen some Intel HEX files that are not in order (0010, 0020, 0030, etc.) but jump between addresses. I do not think the Arduino IDE does this, but you never know.

Yes, I've seen that as well. Codebender actually does that sometimes, and the "HEX file loader" function on the NodeJsController would need to be adjusted to handle it (I'm not reading from the .hex file when a package arrives but from a byte array in Mongo).
Probably the best approach for this one would be to add the next address the bootloader should request to the previous package (e.g. "here is the data for 0x0000 - next request should be for 0x0010") and to allow variable block length, because Codebender sometimes uses <16 byte rows in the hex (not just at the end). The code on the bootloader side would get a little more complex, as the flash is still written in pages and these might no longer line up with the 16-byte blocks.

@Damme said:

I haven't read this one yet, but I guess there is a lot of good stuff in it :)
http://www.nordicsemi.com/eng/nordic/download_resource/10878/2/94069421

That chip is a totally different animal - same RF but 8051 MCU. I'll have a look at the AppNote tomorrow.

ToSa

@Damme said:

@ToSa I've been looking through the OTA bootloader and noticed there are a lot of uint16_t variables which can be replaced with uint8_t. That saves 128 bytes of code. It still needs to shrink by ~900 bytes to fit a 1024-word bootloader, but it makes more space for other stuff :)

I'll have a look. I've taken the code from an earlier project and adjusted to MySensors - didn't review the variable types that much. I'm using CRC16 as well where CRC8 might be sufficient...

EDIT: got it down a little from 0x0E18 to 0x0DD0 (72 bytes) by changing a few loop counters from uint16 to uint8. I don't want to change type to 8 bits, looking at the large number of sensors people are asking for / working on. As for version, I'm planning to keep some of these running for a long time with as little maintenance as possible. With some software improvements over time and minor version changes during development 16bit for version seems to be the better fit as well.

mikeones

@ToSa I use a Mini-B USB cable between my RPi and my gateway.

ToSa

@mikeones said:

dev/ttyUSB0

Then /dev/ttyUSB0 is correct. /dev/ttyAMA0 would only be valid for the on-board serial port on the GPIO pinheads.
Are you running into any issues once you set the port in NodeJsGateway accordingly?

JeJ

@ToSa I have my gateway connected via the GPIO and I have followed the steps in the Readme.md.
I will try to use a USB-Rs232 cable and see what happens.

ToSa

@JeJ one potential reason is that the port is already in use. I mentioned somewhere that the startup script doesn't yet stop the NodeJsController correctly. Maybe you already have a NodeJsController process running? Try "sudo killall node" and then try starting it again. To check if the port itself is working you can try to open a simple terminal (minicom etc.) and reset the gateway.

Zeph

@ToSa said:

With some software improvements over time and minor version changes during development 16bit for version seems to be the better fit as well.

Hmm. That seems like overkill, if I'm understanding correctly. (So maybe I am not understanding).

What I heard was:

Each sensactuator node has a "node type" and a "version" within that node type. Each combination of sensors and pin assignments has a unique "node type" (within a given wireless network). A node can only be OTA updated to a newer (higher) version of the same "node type" of the current firmware, and all nodes of that "node type" will be updated.

An extra byte for "version" isn't a big deal tho.

Will there be one or two bytes for "node type"?

Damme

@Zeph 16-bit calculations on an 8-bit MCU will always come at a price. IMO we should try to keep things to 8 bits as much as possible, but I don't know if it's possible to shave another 900 bytes off the bootloader to fit the next smaller boot section size (1024 words instead of 2048 words). It might be, if we make a mini version of mysensors/mymessage.

ToSa

@Zeph said:

Each combination of sensors and pin assignments has a unique "node type" (within a given wireless network).

Actually that's part of the question - as @hek mentioned, there is a desire to sell MySensors hardware - at some point there might be not just generic pin-header PCBs but real fit-for-use devices. Ideally these would have a unique node type assigned not just within a given network. New firmware could be published on mysensors.org (or via codebender or...) and, based on the unique (but common across networks) node type, less tech-savvy people could be protected from sending a firmware that doesn't fit the hardware... I know - a LOT of "IF"s...

@Damme
you are right - probably not the full 900 bytes but additional space could be used for encryption etc. so every reduced byte is beneficial at this point. I'll check later how much can be saved by using CRC8 instead of CRC16.
I'm already using a mini version of mysensors / mymessage: not using the cpp code files at all but just the headers and if you have a look at the "#ifdef __cplusplus" statements just added for that purpose, there is almost nothing left (the MyMessage class is stripped down to a struct and the MySensors class removed completely / enums and #defines should not consume space after compilation)

Zeph

@ToSa

I'm realizing how similar the implementations of your model of updates and mine might be. This is just an early inspiration, not fully thought out.

uint8_t   node_type_id;  // same for multiple nodes
uint16_t version;   // loaded version for given node_type_id
... 
if(new_version > version) {  // test for OTA update needed

versus

uint8_t   node_id;   // unique per node
uint16_t  progmem_crc;  // calculated from PROGMEM
... 
if(new_progmem_crc != progmem_crc) { // test for OTA update needed

This might mean that I could (eventually) use a relatively minor fork of the OTA programming code to get the per-node flexibility that I seek.

ToSa

@Zeph
yes, that's what I meant - you might not even need any fork of the bootloader itself and just a slight adjustment on the controller end - because the nodeID is contained in the packet (not in the payload but in the header as sender address) so you have all you need for your setup

Zeph

@ToSa
The other half is testing inequality between the computed CRC of the application firmware in PROGMEM, with the CRC of the available replacement (rather than comparing for higher version number).

An example use case for the ability to load arbitrary new code into any given node: if I were diagnosing some kind of interference, I might temporarily replace the sensor firmware in some nodes (of varying node type) with a custom radio test firmware, then later restore each with its original sensor node firmware.

Suppose we have:

 node 5, node type 17, version 2, PROGMEM CRC 0x4567  // attic
 node 6, node type 3, version 5, PROGMEM CRC 0xABCD  // crawlspace
 node 7, node type 3, version 5, PROGMEM CRC 0xABCD  // living room

And I want to temporarily replace the firmware in node 5 and 6, but keep 7 still running as a sensor.

I make RF test code available on the server, with CRC 0x7E57. This is not type 17 or type 3.

I edit the server's table of firmware assignments:

node 5, 0x7E57
node 6, 0x7E57
node 7, 0xABCD  // unchanged

This causes node 5 and 6 (formerly of different types) to load the test firmware when reload is triggered.

Then when testing is done, I edit the table back:

node 5, 0x4567   // back to its old type and version
node 6, 0xABCD  // back to the same type and version as node 7
node 7, 0xABCD  // still unaffected

This causes the normal sensor firmware (type and version) to be loaded back in on the next reload.
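
Roughly, the server side of this could look like the sketch below. The struct, the function name and the idea of keying purely on a 16-bit CRC are only illustrative assumptions on my part, not existing controller code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the server's (hypothetical) firmware assignment table. */
typedef struct {
    uint8_t  node_id;       /* unique per node                          */
    uint16_t firmware_crc;  /* firmware the server wants on that node   */
} assignment_t;

/* A node needs reloading when the CRC it reports for its PROGMEM differs
 * from the CRC assigned to it. Unknown nodes are left alone. */
bool needs_update(const assignment_t *table, size_t rows,
                  uint8_t node_id, uint16_t installed_crc)
{
    for (size_t i = 0; i < rows; i++) {
        if (table[i].node_id == node_id) {
            return table[i].firmware_crc != installed_crc;
        }
    }
    return false;
}
```

With the edited table from the example (nodes 5 and 6 assigned 0x7E57, node 7 assigned 0xABCD), nodes 5 and 6 would be flagged for reload and node 7 would not.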

There could be more than just a CRC to identify the firmware (in order to avoid the birthday paradox), this is just an example.

An alternate use case is loading in my Halloween firmware to the front yard nodes (but not other nodes) for a week or two, then back..

Or a beta version of type 3, version 6, which I'd like to load on some type 3 nodes for in-situ testing (eg: in the crawlspace), but not all of the type 3 nodes because I want most of the system to continue functioning normally while I test. If the beta is bad, I may revert the test nodes to version 5; once the new version is good, I may convert all type 3 nodes to version 6.

These are some of the reasons I'd like to be able to use OTA programming of any arbitrary firmware into any given node, without being constrained to:

  Only upgrades of the same node type
  Only upgrades to higher version numbers
  Only upgrades of all nodes of the same type or none

And so that's why inequality testing of the PROGMEM signature on a per-node basis is attractive, not just testing for a higher version number. For similar complexity, we can upgrade to a higher version number, downgrade to a different version number, or change the node type back and forth.

The type and version dynamics (which certainly IS a common use case) can be handled on the server. For example, the server can know what type every node is (kind of a good idea anyway), and can change the node -> signature entry for every node of type 3 to the signature of the next version, and then let it proceed as above to get them all updated. But that's just one option, centrally controlled.

ToSa

@Zeph
from the MyOtaBootloader.c:

if (firmwareConfigResponse->version == fc.version)
	if (firmwareConfigResponse->blocks == fc.blocks)
		if (firmwareConfigResponse->crc == fc.crc)

so as long as you send the same version / blocks / crc back to the node as what is currently installed, no update is started. As soon as one of the three elements differs, an update is loaded. It's completely under the server's control whether (and which) firmware is bootloaded.
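
Written out as a standalone function, the decision amounts to something like this. The field names follow the snippet above; the struct name, the function name and the 16-bit field widths are assumptions made for the example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Field names as in the snippet above; the widths are assumed. */
typedef struct {
    uint16_t version;
    uint16_t blocks;
    uint16_t crc;
} firmware_config_t;

/* An update is started as soon as any one of the three fields differs.
 * The test is for inequality, so a lower version number also triggers it. */
bool update_required(const firmware_config_t *installed,
                     const firmware_config_t *response)
{
    return response->version != installed->version
        || response->blocks  != installed->blocks
        || response->crc     != installed->crc;
}
```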

Zeph

@ToSa

OK, so version is tested for != rather than for > ? Downgrading is OK?

And CRC is used as well (and block count?) where CRC is based on what's in PROGMEM now?

Cool.

Then I think all that would be needed is for the server to be able to potentially feed back a different firmwareConfigResponse to each node. In my above example (which has been edited for clarity recently BTW, so re-read it), node 6 could receive a different response than node 7 (even tho they both have the same type initially). And thus nodes 5 and 6 (but not 7) could be told to load the test firmware and then later to go back to the old version. Etc.

Is that correct?

It would be a nice enhancement if we could query the node for the CRC (and block count?) of the current PROGMEM, just to help the server stay in sync with what's out there (eg: after a node joins the network). That could be done in the application code, so we don't even have to invoke the bootloader. Then the server could figure out which nodes need to be bootloaded and trigger just those to go into the bootloader (possibly one at a time). These two together support what I call push dynamics.
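
For the application-side CRC, something along these lines might do. I don't know which CRC16 variant the bootloader actually uses, so this sketch assumes the same polynomial as avr-libc's `_crc16_update` (0xA001, reflected) and an initial value of 0xFFFF; on a real node the bytes would be read from flash (e.g. with `pgm_read_byte`) rather than from a RAM buffer:

```c
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-16 step using polynomial 0xA001 (reflected form).
 * Whether this matches the bootloader's CRC is an assumption. */
uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (uint8_t i = 0; i < 8; i++) {
        if (crc & 1) {
            crc = (uint16_t)((crc >> 1) ^ 0xA001);
        } else {
            crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* CRC over a firmware image held in a buffer; the 0xFFFF start value is an
 * assumption as well. firmware_crc is a hypothetical name. */
uint16_t firmware_crc(const uint8_t *image, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc = crc16_update(crc, image[i]);
    }
    return crc;
}
```

The node could report this value (plus the block count) in a normal message, and the server would compare it against its table to decide which nodes to trigger.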

Damme

@Zeph I've been working on a read / write EEPROM address thing in MQTT to be able to reset a node and stuff. But it seems there are more uses for it, then. This might be coded into mysensors instead (utilizing c_internal or something, as the protocol is today).

ToSa

@Damme said:

@Zeph I've been working on a read / write EEPROM address thing in MQTT to be able to reset a node and stuff. But it seems there are more uses for it, then. This might be coded into mysensors instead (utilizing c_internal or something, as the protocol is today).

Good idea - that would allow checking the current value in normal operation, not just during bootloading.

@Zeph
If you urgently want to have the CRC of the current firmware submitted during bootloading, we can add this as a third parameter to the FirmwareConfigRequest message. Actually I was thinking about getting rid of request/response and use the same format for both which would mean crc would be included anyways.
