Bug Report


WilliamM
06-28-2008, 02:11 PM
Summary of functionality for xSleep2Portal / xSleep2Node:

From xSleep2Portal the user can call sleepAll() to put all of the nodes to sleep (at this point simultaneously and including the bridge). xSleep2Portal also features interruptSleep(), which will tell the nodes to wake up the next time they check in (~every 3 min). If the bridge is not reset (therefore asleep), or out of communication with a particular node, that node should time out in trying to get a response from Portal (10 sec given), and go back to sleep. But when this is the case, I am getting the following problem. If bridge is awake (in communication), it also happens sometimes, but seemingly not EVERY time.

Issue #1: In the script "xSleep2Node" as it currently is (as attached), after executing the sub getSleepValue() (called from timer), the node will go into receiveSleepValue(sleep) - the NEXT sub, which it has no right to do! This sub must be (and IS) only called by Portal! If it is not called by Portal, the node should timeout and go back to sleep.

I was suspicious and decided to try and change the order of getSleepValue() and receiveSleepValue(sleep) in the script. This fixed it! Meaning that there is a bug somewhere which allows the nodes (and possibly Portal) to read into a sub that ISN'T called!

Issue #2: I still can't make compilation Portal scripts. If you look, my "xSleep2Node" script imports "xMsgNode". When I try to do the same for "xSleep2Portal" with "xMsgPortal", it will show printMessage() in the node info window, but printMessage() won't work. Try it. So I had to include printMessage() explicitly in "xSleep2Portal".

kbanks
06-30-2008, 08:55 AM
> Issue #1: In the script "xSleep2Node" as it currently is (as attached), after executing the sub getSleepValue() (called from timer), the node will go into receiveSleepValue(sleep) - the NEXT sub, which it has no right to do! This sub must be (and IS) only called by Portal! If it is not called by Portal, the node should timeout and go back to sleep.
>
> I was suspicious and decided to try and change the order of getSleepValue() and receiveSleepValue(sleep) in the script. This fixed it! Meaning that there is a bug somewhere which allows the nodes (and possibly Portal) to read into a sub that ISN'T called!

I added two print statements to your xSleep2Node.py script, as shown below:


def getSleepValue():
    print "test1"
    rpc(myPortalAddr, 'tellSleepValue', localAddr())

def receiveSleepValue(sleep):
    print "test2"
    if sleep == False:
        sendMessage("Sleep interupted")
        onWakeup()


I intercepted STDOUT from Portal. I then invoked getSleepValue() from Portal.

I only saw the "test1" message (no "test2" message), indicating that the SNAPpy interpreter is not "falling through" to the next subroutine.

Next I took a look at the Portal script xSleep2Portal.py. It contains the following routine:


def tellSleepValue(whoAsked):
    rpc(whoAsked, 'receiveSleepValue', sleep)


So, it looks to me like your SNAPpy script called a Portal function which called a SNAPpy function.

I was able to verify this by temporarily loading a different Portal script. With no tellSleepValue() function in Portal, only "test1" prints. With your tellSleepValue() function present, you get both messages.

Your scripts are doing what you told them to do.
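The round trip described above can be sketched in plain Python 3 (not SNAPpy). The function names getSleepValue(), tellSleepValue() and receiveSleepValue() come from the scripts in this thread; the `rpc()` stand-in, the `handlers` table and `run_check_in()` are hypothetical and only imitate dispatch by name. The sketch also assumes that a call to a function the target does not define simply produces no reply, which matches the behavior observed when a different Portal script was loaded.

```python
def run_check_in(portal_has_tell_sleep_value):
    """Simulate one node check-in and return the messages it would print."""
    printed = []
    handlers = {}             # (target, function name) -> callable
    portal_state = {"sleep": True}

    def rpc(target, func_name, *args):
        # Hypothetical stand-in for rpc(): look the function up by name.
        # Assumption: an unknown function name produces no reply at all.
        handler = handlers.get((target, func_name))
        if handler is not None:
            handler(*args)

    def getSleepValue():                  # runs on the node
        printed.append("test1")
        rpc("portal", "tellSleepValue", "node")

    def tellSleepValue(whoAsked):         # runs in Portal
        rpc(whoAsked, "receiveSleepValue", portal_state["sleep"])

    def receiveSleepValue(sleep):         # runs on the node, invoked by Portal
        printed.append("test2")

    handlers[("node", "receiveSleepValue")] = receiveSleepValue
    if portal_has_tell_sleep_value:
        handlers[("portal", "tellSleepValue")] = tellSleepValue

    getSleepValue()
    return printed
```

Calling `run_check_in(False)` models a Portal script without tellSleepValue(), and `run_check_in(True)` models the xSleep2Portal case, where the second message appears because Portal calls back into the node rather than because the interpreter falls through.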

kbanks
06-30-2008, 09:03 AM
> Issue #2: I still can't make compilation Portal scripts. If you look, my "xSleep2Node" script imports "xMsgNode". When I try to do the same for "xSleep2Portal" with "xMsgPortal", it will show printMessage() in the node info window, but printMessage() won't work. Try it. So I had to include printMessage() explicitly in "xSleep2Portal".

I was able to reproduce this. We will look into it.

(Follow-up - The bug you have reported has been entered into our online tracking system)

WilliamM
06-30-2008, 02:55 PM
Regarding Issue #1, I forgot that the bridge, even if not reset, will check-in just like all the other nodes (and during more-or-less the same 10 sec timespan). So you're right, the nodes are receiving the receiveSleepValue(sleep) call from Portal. But this doesn't really explain the discrepancy I am seeing between when I reset the bridge (so that it is always awake) and when I let it sleep and check-in with the others.

If I reset it, [see next reply].

If I let it sleep and check-in, all of the nodes behave as though the value of sleep is False, even though interruptSleep() was never run (the variable sleep should be True unless interruptSleep() is run).

Try running the routine sleepAll(hours) from xSleep2Portal, rather than running getSleepValue() directly on the node. Don't wake the bridge up (by resetting), and see if you can reproduce what I'm talking about (it will take ~5 min before they try to "check-in").

Is it possible that my initialization of sleep at the top level of xSleep2Portal is being executed again after I run sleepAll()? That's the only way sleep could return to False w/o my knowing it. However it doesn't explain why this doesn't happen if I reset the bridge.

WilliamM
06-30-2008, 03:26 PM
By the way, if the bridge node is reset after the nodes have been put to sleep, the other nodes seem to accurately receive the value of sleep when they check-in and behave as expected (even in the version of xSleep2Node I attached). Of course, all of these nodes are in relative proximity to the bridge and have no problem communicating with it. My concern is that in a setting with more nodes (hundreds), many will be too far away to communicate with the bridge directly when they check-in. Thus they should timeout and return to sleep until communication is established (when the in-between nodes are awake).

kbanks
07-01-2008, 10:58 AM
> Is it possible that my initialization of sleep at the top level of xSleep2Portal is being executed again after I run sleepAll()? That's the only way sleep could return to False w/o my knowing it.

I think that is what is happening. In the version you have, Portal sometimes reloads the current Portal script. When it does, any "top level" code in the script runs again, so sleep is reset to False.

You should be able to confirm this is happening by adding a print statement after the top-level initialization in your xSleep2Portal.py:

global sleep
sleep = False
print "sleep set to False" # add this line!


You should be able to work around this issue by moving the initialization of non-constant variables into a subroutine that you call manually.


global sleep

def init():
    global sleep
    sleep = False


You do have to remember to call init() every time you load the script.

The "reload issue" will be fixed in the next release of Portal. In the meantime, this is one work-around you can use.
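The effect of a reload on top-level code can be reproduced in plain Python 3 as a rough analogy. This is only a sketch: `importlib.reload()` stands in for whatever Portal does internally, the two script strings are simplified stand-ins for xSleep2Portal.py, and `sleep_after_reload()` is a hypothetical helper. It also relies on Python's reload keeping the module's existing variables, which may or may not match Portal's behavior.

```python
import importlib
import os
import sys
import tempfile

# Simplified stand-ins for the Portal script, written as plain Python 3.
TOP_LEVEL_SCRIPT = """
sleep = False  # top-level initialization: runs on every load and reload

def sleepAll():
    global sleep
    sleep = True
"""

INIT_SCRIPT = """
def init():
    global sleep
    sleep = False  # runs only when init() is called explicitly

def sleepAll():
    global sleep
    sleep = True
"""

def sleep_after_reload(source, module_name):
    """Load a script, call sleepAll(), reload it, and return `sleep`."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            if hasattr(module, "init"):
                module.init()          # manual initialization after loading
            module.sleepAll()          # sleep is now True
            importlib.reload(module)   # imitates the script being reloaded
            return module.sleep
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
```

With the top-level version, the reload silently resets `sleep` to False; with the init() version, the value set by sleepAll() survives the reload because nothing at the top level assigns to it.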

kbanks
07-01-2008, 11:01 AM
My other recommendation is that you don't have the "bridge" node sleep.

Also, as your network grows (not in number of units but in number of hops), you are going to start needing to synchronize your nodes.

Please look at the "Sleepy Mesh" application note (and scripts) for ways of doing this.

WilliamM
07-05-2008, 07:56 PM
> My other recommendation is that you don't have the "bridge" node sleep.
>
> Also, as your network grows (not in number of units but in number of hops), you are going to start needing to synchronize your nodes.

In my newer version, the bridge does all of the broadcasting and therefore does not sleep. I also added some code to de-synchronize all of the nodes by randomizing how long they wait before going to sleep after receiving a sleep command. The effect I'm hoping to get is that they check in at random (lowering traffic), and go back to sleep unless told to do otherwise. If a sleep interrupt is initiated, they will wake up more-or-less in "waves" (of hops) radially outward from the bridge.

mgenti
07-07-2008, 08:33 AM
> In my newer version, the bridge does all of the broadcasting and therefore does not sleep. I also added some code to de-synchronize all of the nodes by randomizing how long they wait before going to sleep after receiving a sleep command. The effect I'm hoping to get is that they check-in at random (lowering traffic), and go back to sleep unless told to do otherwise. If a sleep interrupt is initiated, they will wake up more-or-less in "waves" (of hops) radially outward from the bridge.

Having the nodes report in at varying intervals is a good idea; however, you need to make sure that the nodes along the route are not sleeping, so that those messages can get back to the bridge.

WilliamM
07-09-2008, 03:31 PM
> however you need to make sure that nodes are not sleeping for those messages to get back to the bridge.

I probably didn't explain this very well in my previous post. The idea is that if the intermediate nodes ARE sleeping, the node "checking in" will time out and return to sleep on its own (by default). Only the nodes in communication with the bridge will be able to successfully check in and receive a sleep interrupt command. After they wake up, the nodes one hop further from the bridge will be able to receive the interrupt, and so on. The result is that they wake up in "waves" (of hops) radially outward from the bridge.
Of course, they won't be perfect waves due to the randomization mentioned earlier.
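The wave behavior can be illustrated with a small plain-Python 3 simulation. This is a sketch under strong simplifying assumptions that are not taken from the actual scripts: the network is a single chain with one node per hop, every sleeping node checks in exactly once per cycle in a random order, and a check-in succeeds only if every node closer to the bridge is already awake at that moment. `simulate_wakeup()` and its parameters are hypothetical names.

```python
import random

def simulate_wakeup(num_hops, seed=0):
    """Return {hop: check-in cycle in which the node at that hop woke up}."""
    rng = random.Random(seed)
    awake = set()            # hops whose node has received the interrupt
    woke_in_cycle = {}
    cycle = 0
    while len(awake) < num_hops:
        cycle += 1
        sleeping = [hop for hop in range(1, num_hops + 1) if hop not in awake]
        rng.shuffle(sleeping)    # randomized check-in order within a cycle
        for hop in sleeping:
            # The check-in reaches the bridge only if every closer node is awake.
            route_is_awake = all(h in awake for h in range(1, hop))
            if route_is_awake:
                awake.add(hop)
                woke_in_cycle[hop] = cycle
            # Otherwise the node times out and goes back to sleep.
    return woke_in_cycle
```

In this model a node at hop h wakes no later than cycle h and never before its upstream neighbor, so the interrupt spreads outward from the bridge. Because of the random check-in order, a node can occasionally wake in the same cycle as the node before it, which is the "not perfect waves" effect.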

kbanks
07-09-2008, 03:57 PM
Now I understand what you are doing... that is an interesting approach!

As more and more nodes get the "stop taking naps" notification, more and more nodes can get the notification.