Option non blocking registration at gateway

Bogdan Brezuica

I have the same issue with a gas sensor. I would really like the buzzer to sound even if the gateway is unreachable. Letting the home automation know there is a gas leak so it can take other measures would be optimal, but sounding the alarm is mandatory.
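
A minimal sketch of that priority, written as plain C++ so it can be tried off-device. The class `GasAlarm`, the threshold value and the `ReportFn` callback are hypothetical names for illustration; in a real node the callback would presumably wrap the library's send call and pass on its success flag.

```cpp
#include <cassert>

// Stand-in for whatever reports to the gateway (hypothetical);
// returns false if the report did not go out.
typedef bool (*ReportFn)(bool alarmActive);

class GasAlarm {
public:
    explicit GasAlarm(int threshold)
        : threshold_(threshold), buzzerOn_(false), failedReports_(0) {}

    // Call once per loop() iteration with the latest sensor reading.
    void update(int gasLevel, ReportFn report) {
        // Local safety action first: it never depends on the network.
        buzzerOn_ = gasLevel >= threshold_;
        // Reporting is best effort; a failure only bumps a counter.
        if (!report(buzzerOn_)) {
            ++failedReports_;
        }
    }

    bool buzzerOn() const { return buzzerOn_; }
    int failedReports() const { return failedReports_; }

private:
    int threshold_;
    bool buzzerOn_;
    int failedReports_;
};
```

The buzzer state is decided before the report is attempted, so an unreachable gateway can only affect the failure counter, never the alarm itself.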

hek

The node only does a handshake with the parent node during startup (which can be disabled using MY_TRANSPORT_DONT_CARE_MODE in the dev branch). During normal operation the library never blocks, even if the communication link is down.

tekka

Important: enabling MY_TRANSPORT_DONT_CARE_MODE requires setting MY_PARENT_NODE_ID
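
As a rough illustration of that requirement, the top of a sketch might look like the fragment below. The parent ID 0 is an assumption (the gateway as parent), and the `#error` guard is a local check added for illustration, not something the library is known to provide.

```cpp
// Hypothetical excerpt from the top of a node sketch (development branch).
#define MY_TRANSPORT_DONT_CARE_MODE   // disable the startup handshake with the parent
#define MY_PARENT_NODE_ID 0           // assumption: the gateway (ID 0) is the parent

// Local guard, not part of the library: fail the build if the parent ID is missing.
#if defined(MY_TRANSPORT_DONT_CARE_MODE) && !defined(MY_PARENT_NODE_ID)
#error "MY_TRANSPORT_DONT_CARE_MODE requires MY_PARENT_NODE_ID"
#endif

// #include <MySensors.h>   // in a real sketch the include follows the MY_* defines
```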

eric_smid

Great this option is available!
Is it possible to let the node use the last used parent, instead of configuring the parent?

tekka

@eric_smid No, not in the current implementation

TheoL

@hek Just curious. How important is the handshake? If it's important, I think it would be a good option to let nodes do a delayed handshake when the initial one fails. That way they can continue, but just won't use the MySensors part until a delayed handshake is done.

A delayed handshake would only be needed when there's a power cut.

Heizelmann

Coming from this thread, where I learned that an implementation of this feature is in progress.

In my opinion it is an essential architectural feature. Reasons for this you can see above.

Here I summarize what I think is necessary:

General requirements

Devices should work independently from the gateway
Enter loop() even if gateway is not reachable
Buffering of messages is not necessary. It's up to the user to implement this.

Needed functionality

retrieve available gateways (gets a list of available gateways)
request to register ( gets success or failure with reason)
de-register ( gets success or failure with reason)
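
On the buffering point above: if a user does want it, one possible approach is a small fixed-size queue that keeps the most recent unsent readings and overwrites the oldest when full. This is only a sketch; `PendingQueue` is a hypothetical name, and the drop-oldest policy is an assumption about what a sensor node would want.

```cpp
#include <cassert>
#include <cstddef>

// Fixed-size FIFO for readings that could not be sent.
// When full, the oldest entry is overwritten by the newest one.
template <typename T, std::size_t N>
class PendingQueue {
public:
    PendingQueue() : head_(0), count_(0) {}

    void push(const T& value) {
        // Next free slot; when full this is the slot of the oldest entry.
        items_[(head_ + count_) % N] = value;
        if (count_ < N) {
            ++count_;
        } else {
            head_ = (head_ + 1) % N;  // full: the oldest entry was dropped
        }
    }

    // Removes the oldest entry into 'out'; returns false if the queue is empty.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = items_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    T items_[N];
    std::size_t head_;   // index of the oldest entry
    std::size_t count_;  // number of stored entries
};
```

A node could push a reading whenever a send fails and pop entries once the link is back, which keeps memory use constant on a small microcontroller.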

DavidZH

@Heizelmann

I've been thinking about this as well. Probably for different reasons though.

I find myself with time to spare and my laptop at hand quite often, but I'm not home at those moments. I'd love to be able to write code and test sensors at those times, but that's impossible because there's no gateway.

For that I'd like to propose a "Test Mode".

test port to Transport.
NO connection to gateway.
simulate the Transport calls in the sketch with just a line in the debug with ACK.
when possible: send "messages" to node with the Arduino serial monitor.

I am aware that this might pose some biiiiiiig changes in the lib. So this might be something for the 2.5 or 3.0 release.
Thoughts?

Heizelmann

My node is now finished and I need the non-blocking feature urgently. Is there an estimate of when this feature will arrive, or can anyone provide a workaround?

hek

If you use the development branch and want to skip the registration wait, add

#define MY_TRANSPORT_WAIT_READY_MS 1

@tekka, correct me if I'm wrong please ;)

tekka

@hek correct :+1:
@Heizelmann Would be good to get some feedback on this feature, please let us know how it works in your hands.
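
For anyone trying this: a minimal placement sketch, assuming the usual convention that MY_* settings are defined before the library include. The value 1 is the one suggested above; other values are a matter of testing.

```cpp
// Must appear before the library include so the setting takes effect.
#define MY_TRANSPORT_WAIT_READY_MS 1   // stop waiting for the transport after 1 ms

// #include <MySensors.h>   // in a real sketch the include follows the MY_* defines
```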

Heizelmann

Here is my report after a quick test.

To say it short: Generally I am happy with it, it works very well. Only one small problem.

In detail:
When the connection is initially off, the sketch enters the loop after the defined wait.

before()
5 TSM:INIT
7 TSF:WUR:MS=1000
13 !TSM:INIT:TSP FAIL
15 TSM:FAIL:CNT=1
17 TSM:FAIL:PDT
1008 MCO:BGN:STP
Setup()
1510 MCO:BGN:INIT OK,TSP=0

If the sketch tries to send from within the loop, a message is logged.

145312 !MCO:SND:NODE NOT REG

If connection comes back the sketch connects to the gateway on the fly within the defined wait time.

244251 TSM:FPAR:OK
244252 TSM:ID
244254 TSM:ID:OK
244255 TSM:UPL
244259 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=24,pt=1,l=1,sg=0,ft=0,st=OK:1
244265 TSF:MSG:READ,0-0-42,s=255,c=3,t=25,pt=1,l=1,sg=0:1
244271 TSF:MSG:PONG RECV,HP=1
244274 TSM:UPL:OK
244276 TSM:READY:ID=42,PAR=0,DIS=1
244281 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=15,pt=6,l=2,sg=0,ft=0,st=OK:0100
244288 TSF:MSG:READ,0-0-42,s=255,c=3,t=15,pt=6,l=2,sg=0:0100
244295 TSF:MSG:SEND,42-42-0-0,s=255,c=0,t=17,pt=0,l=10,sg=0,ft=0,st=OK:2.1.0-beta
244304 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=6,pt=1,l=1,sg=0,ft=0,st=OK:0
244321 TSF:MSG:READ,0-0-42,s=255,c=3,t=6,pt=0,l=1,sg=0:M
Presentation()
244329 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=11,pt=0,l=22,sg=0,ft=0,st=OK:My Node
244339 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=12,pt=0,l=3,sg=0,ft=0,st=OK:1.2
244347 TSF:MSG:SEND,42-42-0-0,s=1,c=0,t=1,pt=0,l=0,sg=0,ft=0,st=OK:
244356 TSF:MSG:SEND,42-42-0-0,s=2,c=0,t=1,pt=0,l=0,sg=0,ft=0,st=OK:
244364 TSF:MSG:SEND,42-42-0-0,s=3,c=0,t=3,pt=0,l=0,sg=0,ft=0,st=OK:
244370 MCO:REG:REQ
244373 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=26,pt=1,l=1,sg=0,ft=0,st=OK:2
244380 TSF:MSG:READ,0-0-42,s=255,c=3,t=27,pt=1,l=1,sg=0:1
244385 MCO:PIM:NODE REG=1

The only problem is heavy logging traffic: if the node is in the loop and the network connection is lost again, the following is logged continuously for as long as the connection stays down.

605458 TSF:MSG:READ,1-1-1,s=1,c=1,t=1,pt=0,l=0,sg=0:
605463 !TSF:MSG:LEN,0!=7
605465 TSF:MSG:READ,1-1-1,s=1,c=1,t=1,pt=0,l=0,sg=0:
605470 !TSF:MSG:LEN,0!=7
605472 TSF:MSG:READ,1-1-1,s=1,c=1,t=1,pt=0,l=0,sg=0:
605477 !TSF:MSG:LEN,0!=7
...

tekka

@Heizelmann Thanks for your testing. I couldn't reproduce this behavior, can you post your sketch for further analysis?

Heizelmann

@tekka Maybe it is the way I tested. To get quick results I simply plugged and unplugged the radio module (NRF24L01+) from the socket in my sensor node. I know this is not a real use case, but it was easy and I didn't think there would be a difference in behaviour. How do you switch the network connection off and on in your tests?

tekka

@Heizelmann yes, unplugging the radio results in what you observed. Simply unplug the GW (and repeaters)...

Heizelmann

@tekka OK, unplugging the GW leads to different behaviour.
First, the node doesn't detect the failure as long as it is not trying to send.
When it tries to send, it fails:

131268 !TSF:MSG:SEND,42-42-0-0,s=1,c=1,t=16,pt=0,l=1,sg=0,ft=0,st=NACK:1

and after some more failures this appears:

238494 !TSM:READY:UPL FAIL,SNP
238497 TSM:FPAR
238534 TSF:MSG:SEND,42-42-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=6,st=OK:
238995 !TSF:SND:TNR
240002 !TSF:SND:TNR
240542 !TSM:FPAR:NO REPLY
240544 TSM:FPAR
240581 TSF:MSG:SEND,42-42-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
246683 !TSM:FPAR:FAIL
246685 TSM:FAIL:CNT=1
246687 TSM:FAIL:PDT

and after a while mostly only

609748 !TSF:SND:TNR

When gateway comes back a proper reinitialization takes place:

620572 TSM:FAIL:RE-INIT
620574 TSM:INIT
620581 TSM:INIT:TSP OK
620583 TSM:INIT:STATID=42
620587 TSF:SID:OK,ID=42
620589 TSM:FPAR
620625 TSF:MSG:SEND,42-42-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
Current: 0.00
0:0:0
Current: 0.00
0:0:0
621443 TSF:MSG:READ,0-0-42,s=255,c=3,t=8,pt=1,l=1,sg=0:0
621448 TSF:MSG:FPAR OK,ID=0,D=1
Current: 0.00
0:0:0
Current: 0.00
0:0:0
622633 TSM:FPAR:OK
622635 TSM:ID
622637 TSM:ID:OK
622638 TSM:UPL
622642 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=24,pt=1,l=1,sg=0,ft=0,st=OK:1
622652 TSF:MSG:READ,0-0-42,s=255,c=3,t=25,pt=1,l=1,sg=0:1
622657 TSF:MSG:PONG RECV,HP=1
622660 TSM:UPL:OK
622662 TSM:READY:ID=42,PAR=0,DIS=1

From now the sensor can send again without errors. :smiley:
So far my tests.
Except for the case where the sensor's radio went off and caused a lot of logging, all test cases were satisfactory.
I think this is really an edge case and not so serious.
From this point of view I would welcome the release of this beta development version.
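
To spot failed sends such as the NACK line above in a long debug log, a small helper along these lines might be useful. It is a sketch only: the struct and function names are hypothetical, the field names simply mirror the keys in the log, and reading the four dash-separated IDs as sender-last-next-destination is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Fields of a "TSF:MSG:SEND" debug line, named after the keys in the log.
struct SendLog {
    int sender, last, next, destination;  // assumed meaning of "42-42-0-0"
    int s, c, t, pt, l, sg, ft;
    char st[8];                           // status text, e.g. "OK" or "NACK"
};

// Returns true if the line contains a complete TSF:MSG:SEND record.
bool parseSendLog(const char* line, SendLog& out) {
    const char* p = std::strstr(line, "TSF:MSG:SEND,");
    if (p == nullptr) return false;
    p += std::strlen("TSF:MSG:SEND,");
    int n = std::sscanf(
        p, "%d-%d-%d-%d,s=%d,c=%d,t=%d,pt=%d,l=%d,sg=%d,ft=%d,st=%7[^:]",
        &out.sender, &out.last, &out.next, &out.destination,
        &out.s, &out.c, &out.t, &out.pt, &out.l, &out.sg, &out.ft, out.st);
    return n == 12;  // all twelve fields were read
}

// True if the parsed record reports a send that was not acknowledged.
bool isNack(const SendLog& log) {
    return std::strcmp(log.st, "NACK") == 0;
}
```

Lines that are not send records, such as `!TSF:SND:TNR`, are simply rejected by `parseSendLog`.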

eric_smid

I tested my node with the "#define MY_TRANSPORT_WAIT_READY_MS 1" feature.
It seems to work fine!
Tnx for the work.

Heizelmann

I detected another problem. From the debug log it looks like the loop() function starts before initialization is finished. This can cause problems when trying to send data in loop() very early.

Here is a log example:

0 MCO:BGN:INIT NODE,CP=RNNNA--,VER=2.1.0-beta
4 MCO:BGN:BFR

5 before() begin
5 before() end

5 TSM:INIT
9 TSF:WUR:MS=1000
16 TSM:INIT:TSP OK
17 TSM:INIT:STATID=42
20 TSF:SID:OK,ID=42
21 TSM:FPAR
58 TSF:MSG:SEND,42-42-255-255,s=255,c=3,t=7,pt=0,l=0,sg=0,ft=0,st=OK:
651 TSF:MSG:READ,0-0-42,s=255,c=3,t=8,pt=1,l=1,sg=0:0
656 TSF:MSG:FPAR OK,ID=0,D=1
1010 MCO:BGN:STP

1011 Setup() begin
1512 Setup() end

1512 MCO:BGN:INIT OK,TSP=0

1516 loop() begin
2017 loop() end

2017 loop() begin

2065 TSM:FPAR:OK
2066 TSM:ID
2067 TSM:ID:OK
2069 TSM:UPL
2072 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=24,pt=1,l=1,sg=0,ft=0,st=OK:1
2078 TSF:MSG:READ,0-0-42,s=255,c=3,t=25,pt=1,l=1,sg=0:1
2083 TSF:MSG:PONG RECV,HP=1
2086 TSM:UPL:OK
2087 TSM:READY:ID=42,PAR=0,DIS=1
2093 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=15,pt=6,l=2,sg=0,ft=0,st=OK:0100
2100 TSF:MSG:READ,0-0-42,s=255,c=3,t=15,pt=6,l=2,sg=0:0100
2107 TSF:MSG:SEND,42-42-0-0,s=255,c=0,t=17,pt=0,l=10,sg=0,ft=0,st=OK:2.1.0-beta
2116 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=6,pt=1,l=1,sg=0,ft=0,st=OK:0
2126 TSF:MSG:READ,0-0-42,s=255,c=3,t=6,pt=0,l=1,sg=0:M

2131 Presentation() begin
2135 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=11,pt=0,l=22,sg=0,ft=0,st=OK:Sink Pump Control Node
2145 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=12,pt=0,l=3,sg=0,ft=0,st=OK:1.2
2153 TSF:MSG:SEND,42-42-0-0,s=1,c=0,t=1,pt=0,l=0,sg=0,ft=0,st=OK:
2161 TSF:MSG:SEND,42-42-0-0,s=2,c=0,t=1,pt=0,l=0,sg=0,ft=0,st=OK:
2168 TSF:MSG:SEND,42-42-0-0,s=3,c=0,t=3,pt=0,l=0,sg=0,ft=0,st=OK:
2174 Presentation() end

2176 MCO:REG:REQ
2181 TSF:MSG:SEND,42-42-0-0,s=255,c=3,t=26,pt=1,l=1,sg=0,ft=0,st=OK:2
2187 TSF:MSG:READ,0-0-42,s=255,c=3,t=27,pt=1,l=1,sg=0:1
2192 MCO:PIM:NODE REG=1

2518 loop() end

2518 loop() begin
 ...

tekka

@Heizelmann I'm not sure I understood your concern.

Initialization is finished here, right before the main loop is entered:

1512 MCO:BGN:INIT OK,TSP=0

Since your timeout is set to 1sec (which obviously is too short to fully establish the transport link) the node will enter the main loop() after the timeout. However, since the link is not established, only init messages are sent at that point (i.e. library init, signing preferences, presentation). Sensor data will not be sent until the node is registered.
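
The timing can be pictured with a small conceptual model. This is not library code: `simulateStartup` and `StartupResult` are hypothetical, the model ignores the time spent in before() and setup(), and it assumes the link would become ready at the same moment regardless of the timeout.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of the startup wait in this simplified model.
struct StartupResult {
    uint32_t loopStartMs;     // when loop() is first entered
    bool readyAtLoopStart;    // was the transport link up at that moment?
};

// readyAtMs: time at which the transport link becomes ready.
// timeoutMs: startup wait in the spirit of MY_TRANSPORT_WAIT_READY_MS (> 0 here).
StartupResult simulateStartup(uint32_t readyAtMs, uint32_t timeoutMs) {
    StartupResult result;
    if (readyAtMs <= timeoutMs) {
        // Link came up in time: loop() starts as soon as the transport is ready.
        result.loopStartMs = readyAtMs;
        result.readyAtLoopStart = true;
    } else {
        // Timeout hit first: loop() starts with the link still being established.
        result.loopStartMs = timeoutMs;
        result.readyAtLoopStart = false;
    }
    return result;
}
```

With the numbers from the log above (transport ready at about 2087 ms, timeout 1000 ms) the model enters loop() before the link is ready; with a hypothetical timeout of 3000 ms it would wait until the link is up.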
