Wednesday, April 4, 2018

Building an indoor test network and testing LBARD bug fixes

After the successful bug hunting last week, this week I have turned my attention to setting up to easily verify whether the bugs have been fixed. This means setting up a multi-hop Mesh Extender network in a convenient location. 

The building I am based in was purpose-built for the College of Science & Engineering, and we were able to have a number of specific provisions included during the design phase to make this kind of test network easier to deploy, and we are now finally able to start making fuller use of that.  More specifically, we have 13 points around the building where we have copper and fibre directly to our lab, and also have some funky mounting brackets, on which we can fit radio hardware, all nicely protected from the outside weather.  

So the obvious approach was to fit several Mesh Extenders to some of those points, and form a multi-hop UHF network.  Here are some photos from various angles showing the units:

First up, we have Mesh Extender serial #23 with SID BCA306B0*, which is at point NE1. Note the extra long 6dB antenna we fitted to this one to make sure it could talk to the one at N4 (1A15ED32*). Even with that antenna, the link is quite poor, often getting only one packet every 5 to 10 seconds.  While on the one hand annoying, this is actually quite helpful for testing the behaviour of Mesh Extenders in marginal link conditions.



Mesh Extender serial #34 is at point N4, and has SID 1A15ED32*:




Note that we have removed the Wi-Fi antenna from this one. This was necessary to make sure that it can't communicate via Wi-Fi with any of the other units, which it was otherwise able to do intermittently, because it is relatively near both the one in the lab, which is on the same floor, and the one on level 5, to which it almost has line of sight via the staircase.

Mesh Extender serial #50 is at NW5 and has SID 5F39F182*:





And finally, Mesh Extender serial #17 (6ABE0326*) is sitting on the rack in our lab:



Mesh Extender #34 (1A15ED32*) is the only node that can see all the others, which we confirmed by using the debug display on LBARD:

I was still fiddling around with things at this point, but you can see already that the RSSI for BCA306B0* is quite a lot lower than for the others.  Although here it looks like the link is healthy, it really does come and go a lot, and can go several seconds at a time without a single packet arriving.  

The last of the four rows, the one showing 39F1821A*, is actually a known bug manifesting. This is a ghost connection, caused by receiving a packet from 5F39F1821*, with the first byte having gone walkabout for some reason. We have yet to work out the cause of this problem.  It's on our bug list to find and fix.


Now, to give a better idea of how the network is laid out, here are a couple of shots from the outside of the building.  The first is looking along the northern facade to give a sense of the length of the building.  It is from memory about 50m long.


And here is the whole facade:

And here is the facade showing, in red, the locations of the three units that are located on the mounting points just behind the three layers of metal-film coated glass that act as a partial Faraday cage. The black spot shows the approximate position of the lab, as projected onto the northern facade, although it is in reality on the southern side of the building, approximately 25 metres south of the northern facade. Green lines indicate strong UHF links, and the yellow line indicates the poor link between the first and fourth floors.


The glass treatment means that very little signal can actually get out of the building, bounce off something, and find its way back in. We estimate the attenuation to be about 15dB at 923MHz, so a total of 30dB loss to pass out of the windows, and back in again, not counting any free space losses etc.  Thus we suspect that the links between the units are due to internal reflections in the building, and perhaps a bit of common mode radiation along metal parts of the building interior.  It is almost certainly not due to transmission through the ~30 - 50 cm thick reinforced concrete floor decks of the building.

Given that we are in a building with such poor RF propagation properties, it is rather pleasing that we are able to get any link at all between the first and fourth floors, given the multiple concrete decks and other obstacles that the link has to face.

So, with the network in place, it was time to do a little initial testing with it.  For this, I had a phone running Serval Chat in the lab, and I went down to level 1 by BCA306B0 and sent text messages, using the automatic delivery confirmation mechanism to know that a message has got to the other end, and indeed that a bundle has made its way back containing the acknowledgement.

First, we tested using the old version of LBARD, and were able to see the same failure of messages to deliver that we had seen in the field.  Then it was a visit to all the Mesh Extenders to update LBARD (we will in the near future supply them all with ethernet connections back to the lab, so that we can remotely upgrade them). The result after upgrading LBARD was that messages were reliably delivered, taking between about 1 and 4 minutes to get the delivery confirmation.  This was of course a very pleasing and comforting result, as the inability to reliably deliver messages was the most critical issue we were seeing in Vanuatu.

We expect the large variation in time to be simply due to the very high rates of packet loss that we see on the link between BCA306B0 and 1A15ED32 -- something that we will confirm as we continue testing. The main thing is that we now have a test environment that is both convenient, and realistic enough, for us to reproduce bugs, and confirm that they have been fixed.

Sunday, April 1, 2018

Fixing bugs and structure of LBARD

During our trips to Vanuatu last year we identified a number of bugs with LBARD, which have been sitting on the queue to be fixed for a while. 

At the same time, we have had a couple of folks who have tried to add support for additional radio types to LBARD, but have struggled due to the lack of structure and documentation in the LBARD source code.

Together, these have greatly increased the priority of fixing a variety of things in LBARD.  So over the past few days I have started pulling LBARD apart, taking a look at some preliminary refactoring by Lars from Germany, and trying to improve the understandability, maintainability and correctness of LBARD.  This has focussed on a few separate areas:

Generally improving the structure of the source code

The most obvious change here is that the source files now live in a set of reasonably appropriately named sub-directories, to make everything easier to find.  This is supported by the factoring out of a bunch of functionality into the radio drivers and message handlers described below.

Drivers for different packet radio types are now MUCH easier to write.  

Previously you had to hunt through the source code to find the various places where the radio-specific code existed, hoping you found them all, and worked out how they hang together. This also made maintenance of radio drivers more fragile, because of the multiple files that had to be maintained and that could give rise to merge conflicts.

In contrast, the process now consists of creating a header file and a source file in src/drivers/ that implement just a few functions, plus a single special header line that is used by the Makefile to create the necessary structures to make the driver usable. 

The header file is the simplest: It just has to have prototypes for all of the functions you create in the source file.

The source file is not much more complex, excepting that you have to implement the actual functionality.  There are a few assumptions, however, primarily that the radios will all be controlled via a UART.  Using the RFD900 driver as an example, let's quickly go through what is involved, beginning with the magic comment at the start:


/*
The following specially formatted comments tell the LBARD build environment about this radio.
See radio_type for the meaning of each field.
See radios.h target in Makefile to see how this comment is used to register support for the radio.

RADIO TYPE: RFD900,"rfd900","RFDesign RFD900, RFD868 or compatible",rfd900_radio_detect,rfd900_serviceloop,rfd900_receive_bytes,rfd900_send_packet,always_ready,0

*/


In order for the compilation to detect the driver and make it available, it requires a line that begins with RADIO TYPE:  and is followed by the elements for a struct radio_type record, the details of which are found in include/radio_type.h. You should provide here:

1. the unique radio type suffix (this will get RADIOTYPE_ prefixed to it, and #defined to a unique value by the build system).
2. The short name of the radio type to appear in various messages.
3. A longer description of the radio type.
4. An auto-detect routine for the radio.
5. The main service loop routine for the radio. LBARD will call this for you on a regular basis. You have to decide when the radio is ready to send a packet, as well as to manage any session control, e.g., link establishment for radio types that require it.
6. A routine that is called whenever bytes are ready to be received from the UART the radio is connected to.
7. A routine that when called transmits a given packet.
8. A routine that returns 1 when the radio is ready to transmit a packet.
9. The turn-around delay for HF and other similar radios that take a considerable time to switch which end is transmitting.
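
To make the nine fields a little more concrete, here is a minimal sketch of what such a record and its table entry could look like. This is an illustration only: the real definition lives in include/radio_type.h, and the struct name, field names, function signatures and the "demo" driver below are all assumptions made for the example, not LBARD's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the nine fields listed above.  The real struct is
   defined in include/radio_type.h and may differ in names and signatures. */
struct radio_type_sketch {
  int id;                            /* 1. value #defined as RADIOTYPE_xxx */
  const char *name;                  /* 2. short name used in messages */
  const char *description;           /* 3. longer description */
  int (*autodetect)(int fd);         /* 4. auto-detect routine */
  int (*serviceloop)(int fd);        /* 5. called regularly by the main loop */
  int (*receive_bytes)(unsigned char *bytes, int count);        /* 6. */
  int (*send_packet)(int fd, unsigned char *packet, int length); /* 7. */
  int (*is_ready)(void);             /* 8. returns 1 when ready to transmit */
  int turnaround_delay;              /* 9. for HF and similar radios */
};

#define RADIOTYPE_DEMO 1             /* assumed value, for illustration */

/* Stub routines standing in for a real driver's functions */
static int demo_radio_detect(int fd) { (void)fd; return 0; }
static int demo_serviceloop(int fd) { (void)fd; return 0; }
static int demo_receive_bytes(unsigned char *bytes, int count)
{ (void)bytes; return count; }
static int demo_send_packet(int fd, unsigned char *packet, int length)
{ (void)fd; (void)packet; return length; }
static int demo_always_ready(void) { return 1; }

/* One row per RADIO TYPE: line; in this sketch the table is written by hand */
static const struct radio_type_sketch radio_types[] = {
  { RADIOTYPE_DEMO, "demo", "Demonstration radio (not a real driver)",
    demo_radio_detect, demo_serviceloop, demo_receive_bytes,
    demo_send_packet, demo_always_ready, 0 },
  { -1, NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0 }  /* end marker */
};

/* Look up a radio type by its short name; returns NULL if unknown */
const struct radio_type_sketch *find_radio_type(const char *name)
{
  assert(name != NULL);
  for (int i = 0; radio_types[i].id != -1; i++)
    if (strcmp(radio_types[i].name, name) == 0) return &radio_types[i];
  return NULL;
}
```

With a table like this, the rest of the program only ever needs to call through the function pointers, which is what lets each driver stay confined to its own pair of files.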

These drivers are actually quite simple to write in the grand scheme of things. Even the RFD900 driver, which implements a congestion control scheme and an over-complicated transmit power selection scheme, is still less than 500 lines of C.  The current proof-of-concept drivers for the Barrett and Codan HF radios are less than 300 lines each.

What hasn't been mentioned so far is that if you want your radio driver to get exercised by LBARD's test suite, you also need to add support for it to fakecsmaradio, LBARD's built-in radio simulator.  This simulator implements a number of useful features, such as the ability to implement unicast and broadcast transmission, the ability to detect packet collisions and cause loss of packets, adjustable random packet loss and a flexible firewall language that makes it fairly easy to define even quite complex network topologies.  The drivers for this are placed in src/drivers/fake_radiotype.h, and will be described in more detail in a future blog post.

LBARD Messages are now each defined in a separate source file

Whereas previously the code to produce messages for insertion into packets, and the code to parse and interpret them were found scattered all over the place, they now all live in a source file each in src/messages/. For further convenience, these files are automatically discovered, along with the message types they contain, and hooked into the packet parsing code.  This makes creation of message parsing functions quite trivial: One just needs to create a function called message_parser_XX, where XX is the upper-case hexadecimal value of the message type, i.e., the first byte of the message when encoded in the packet.  Where the same routine can be used to decode multiple message types, #define can be used to allow re-use of the function.  Apart from performing their internal functions, these routines need only return the number of bytes consumed during parsing. For example, here is the routine that handles four related message types that are used to acknowledge progress during a transfer:

#define message_parser_46 message_parser_41
#define message_parser_66 message_parser_41
#define message_parser_61 message_parser_41

int message_parser_41(struct peer_state *sender,char *prefix,
                      char *servald_server, char *credential,
                      unsigned char *message,int length)
{
  sync_parse_ack(sender,message,prefix,servald_server,credential);
  return 17;  // length of ACK message consumed from input
}
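
As a rough illustration of how parsers of this shape can be driven, the following sketch dispatches on the first byte of each message and advances by the number of bytes the parser reports as consumed. It is not LBARD's generated code: the simplified parser signature, the hand-written table, and the names register_demo_parsers, demo_parse_ack and parse_packet_body are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified parser signature for this sketch; the real parsers also take
   the sender, prefix, servald server and credential arguments. */
typedef int (*message_parser_fn)(const unsigned char *message, int length);

/* Hypothetical stand-in for a parser of the 17-byte ACK message */
static int demo_parse_ack(const unsigned char *message, int length)
{
  (void)message;
  return (length >= 17) ? 17 : -1;   /* -1: message is truncated */
}

/* One slot per possible message type byte; NULL means "unknown type" */
static message_parser_fn parsers[256];

/* Hook the same routine up to four related message types, mirroring the
   effect of the #define aliases in the example above */
void register_demo_parsers(void)
{
  parsers[0x41] = demo_parse_ack;
  parsers[0x46] = demo_parse_ack;
  parsers[0x61] = demo_parse_ack;
  parsers[0x66] = demo_parse_ack;
}

/* Walk a packet body, dispatching on the first byte of each message.
   Returns the number of messages parsed, or -1 on an unknown message type
   or a message that runs past the end of the packet. */
int parse_packet_body(const unsigned char *body, int length)
{
  int offset = 0, count = 0;
  assert(body != NULL);
  while (offset < length) {
    message_parser_fn fn = parsers[body[offset]];
    if (!fn) return -1;
    int consumed = fn(&body[offset], length - offset);
    if (consumed < 1) return -1;
    offset += consumed;
    count++;
  }
  return count;
}
```

Returning the consumed length is what makes this loop possible: the dispatcher needs no knowledge of any individual message format.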

Fixing LBARD transfer bugs

In addition to the structural improvements described above (and in many regards facilitated by them), I have also found and fixed quite a few bugs with Rhizome bundle transfers.  The automated tests now all pass, with the exception of the auto-detection of the HF radios, which doesn't work yet because we are still writing the drivers for fakecsmaradio for them. Here are a pair of the more important bugs fixed:

1. The code for tracking and acknowledging transfers using acknowledgement bitmaps now actually works.  This is important for real-world situations where packet loss can be very high.  In contrast to Wi-Fi, which tries to hide packet loss through repeated low-latency retransmissions, we have to conserve our bandwidth, and so we keep track of what a recipient says that they have received, and only retransmit that content when required. 

This also plays an important role when there are multiple senders, as they listen to each other's transmissions, and use that information to try very hard to make sure that they don't send any data that anyone else has recently sent.  This helps tremendously when there are multiple radios in range of one another.

This bug is likely responsible for some of the randomness in transfer time we were seeing in Vanuatu, where some bundles would transfer in a few seconds, while others would take minutes.

2. There were some edge-cases that could cause transmission to get stuck resending the last few bytes of a bundle over and over and over, even though they had already been received, and other parts of the bundle still needed sending.

This particular bug was tickled whenever the length of a bundle, when rounded up to the nearest multiple of 64 bytes, was an odd multiple of 64 bytes. In that case, it would mistakenly think that there were 128 bytes there to send, and in trying to be as efficient as possible, send that instead of any single 64 byte piece that was outstanding (since it results in reduced amortised header cost).  However, the mishandling of the bundle length meant it would keep sending the last 64 bytes forever, until a higher priority bundle was encountered. 

It is quite possible that this bug was responsible for some of the rest of the randomness in latencies we were seeing, and also the "every other MeshMS message never delivers" bug, where sending one MeshMS would work, sending a 2nd wouldn't, but sending a 3rd would cause both the 2nd and 3rd to be delivered -- precisely because each additional message had a high probability of flipping the parity of the length of the bundle in 64 byte blocks.
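
The two fixes above can be sketched together as follows. This is my own minimal reconstruction of the idea, not the code in LBARD: it keeps one acknowledgement bit per 64 byte block, prefers a 128 byte piece only when two adjacent blocks are both still outstanding, and never reports more bytes than the bundle actually holds. The struct, the function names and the 16 KiB size limit are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64
#define MAX_BLOCKS 256   /* sketch limit: bundles of up to 16 KiB */

struct transfer_sketch {
  int bundle_length;               /* bytes in the bundle being sent */
  uint8_t acked[MAX_BLOCKS / 8];   /* one bit per 64 byte block */
};

/* Number of 64 byte blocks, rounding the bundle length up */
static int block_count(const struct transfer_sketch *t)
{
  return (t->bundle_length + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

void transfer_init(struct transfer_sketch *t, int bundle_length)
{
  assert(bundle_length >= 0 && bundle_length <= MAX_BLOCKS * BLOCK_SIZE);
  memset(t, 0, sizeof *t);
  t->bundle_length = bundle_length;
}

/* Record that the recipient says it has received this block */
void mark_acked(struct transfer_sketch *t, int block)
{
  assert(block >= 0 && block < block_count(t));
  t->acked[block / 8] |= (uint8_t)(1u << (block % 8));
}

static int is_acked(const struct transfer_sketch *t, int block)
{
  return (t->acked[block / 8] >> (block % 8)) & 1;
}

/* Choose the next piece to send.  Returns 0 when every block has been
   acknowledged, otherwise 1 with *offset and *length filled in. */
int next_piece(const struct transfer_sketch *t, int *offset, int *length)
{
  int blocks = block_count(t);
  int single = -1;                 /* first lone outstanding block, if any */
  for (int b = 0; b < blocks; b++) {
    if (is_acked(t, b)) continue;
    if (b + 1 < blocks && !is_acked(t, b + 1)) {
      /* Two adjacent outstanding blocks: send them as one larger piece */
      *offset = b * BLOCK_SIZE;
      *length = 2 * BLOCK_SIZE;
      if (*offset + *length > t->bundle_length)
        *length = t->bundle_length - *offset;
      return 1;
    }
    if (single < 0) single = b;
  }
  if (single < 0) return 0;
  *offset = single * BLOCK_SIZE;
  *length = BLOCK_SIZE;
  if (*offset + *length > t->bundle_length)
    *length = t->bundle_length - *offset;
  return 1;
}
```

For a 300 byte bundle (five blocks, so an odd multiple of 64 bytes once rounded up), this sketch offers the final block as a single 44 byte piece rather than pretending there are 128 bytes at the tail, and it reports completion once every block has been acknowledged.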

So the net result is that the various tests we had implemented in the test suite, including delivering bundles over multiple (simulated) UHF hops, holding a MeshMS conversation in the face of 75% packet loss, transferring bundles to many recipients or receiving a bundle efficiently from many nearby senders all work just fine. 

The proof of the pudding is in the real-world testing, so I am hoping that we will be able to get outside with a bunch of Mesh Extenders in the coming week and set up some nice multi-hop topologies, and confirm whether we can now reliably deliver MeshMS and MeshMB traffic over multiple UHF radio hops. 

In parallel to this, we will work on the drivers for the HF radios, and also for the RFM96-based LoRa-compatible radios that are legal to use in Europe, and get that all merged in and tested. But meanwhile, the current work is all there to see at https://github.com/servalproject/lbard/.

Wednesday, March 21, 2018

Early access to Serval Mesh Extender hardware

A quick post in response to a few enquiries we have had recently, where folks have been asking about getting their hands on some prototype Serval Mesh Extender devices.

So far, we have only produced 50 of these units, which have our (hopefully) IP65/IP66 injection-moulded case, IP67 power/external radio cable and, if you wish to operate them off-grid, an integrated solar panel controller and battery charger.   Most of those are currently deployed in Vanuatu, or are in use in the lab here or with a few other partner organisations.

Although we are still in the process of testing them, they have already been in some interesting places in Vanuatu and beyond, as the images at the end of the post show.

Thus, we would like to gauge the interest in us having another production run, so that folks who would like to work with these experimental units, including to help us identify, document and fix any problems with them, can do so.

A reminder and warning: The devices would be experimental, and have known software issues that need to be resolved before they can be sensibly deployed.  It is also possible that hardware issues will be found in the process, and that these units may well never be deployable by you in any useful way, i.e., cannot be considered as a finished product or fit for merchantability. Rather, you are being given the opportunity to purchase hardware that would allow you to participate in the ongoing development of these devices.

If you want to buy Mesh Extenders to use in a serious way, this is not the offer you are looking for. As soon as we are ready to make them commercially available, we will be sure to let everyone know, and will update this post as well.


Because of the small production size that we would anticipate for this, the per-unit price will likely be quite high -- potentially as high as AU$700 each (and you really need a pair for them to be useful).  If we were able to get 100 or more manufactured in a single run, then this cost would come down, quite likely a lot.  (But please note, this is not at all representative of the final price of these units when we hopefully bring them to market -- we intend the end price to be a LOT lower than this.  Again, the purpose of this current offer is to provide early access that would support the finalisation of these devices, and allow us to get a few extra devices made for our internal use at the same time.)

So, given the above, is there anyone else out there who would also like to get their hands on one or more Mesh Extenders, in your choice of Wholly White, Rather Red or Generally Grey?

(We realise the price at this stage is rather high, and that this might preclude some folks from participating. During this pre-production stage, we have a couple of options to reduce the cost a little: First, if you already own an RFD900 or RFD868 radio, or want a Wi-Fi-only Mesh Extender, you can subtract AU$100 from the price.  Second, if you want to make an offer contingent on a production run of at least 100 units, which should get the price down by AU$100 - AU$200 per unit (subject to confirming with our suppliers), you are welcome to indicate this.  We will in any case come back to the community before we proceed with any production run.)







Thursday, February 22, 2018

Getting back up to speed for 2018

It's been a while since the last post, in part due to summer holidays here in Australia, and in part due to various other things taking my time and energy.  But, we are now ramping back up for 2018.  There are a few interesting things going on that are worth commenting on:

1. Porting Serval to Ubuntu Touch

A group has formed to port Serval to Ubuntu Touch, in their own words:

"The Serval Mesh for UBports group is dedicated to the task of launching a fully functional Serval app for Ubuntu Touch and aiding in the expansion of community-built mesh networks."

and

"Hi! We are busy making Serval Mesh work on UT and it's going WELL (they say)"

In just a couple of days, they already have servald running on the phone (this is one of the nice things about Ubuntu Touch basically being a port of Ubuntu Linux: it brings with it many benefits of convergence, i.e., where something made for one platform works on the other, possibly without any changes).  Here are some early screenshots showing their current progress (which, given they have been working on it for only two days, is very nice):



In that last one, you can also just see the cat's ears poking up in the bottom of the list of apps, as well :)

Currently the list of devices that support Ubuntu Touch is relatively limited. That said, my FairPhone 2 is on the supported list, but I am not sure if I will switch over just yet.  I might just get a second FP2 for testing for now.  The FP2 is of course the perfect phone for Ubuntu Touch in my view, as it is an open-hardware phone as far as is possible, and allows for end-user servicing -- very much in line with the Linux ethos.

2. Outernet integration

This is a project supported by the Humanitarian Innovation Fund (HIF) out of the UK.  While we have a couple of roadblocks to solve, as a result of Outernet moving back from L-band (~1.5GHz) to Ku-Band (~12GHz), the end result will be VERY exciting, in part because the bandwidth will be higher, and in part because it looks like we will be able to incorporate the entire receiver into the existing Mesh Extender cases, thus offering a single unit that does both functions.  The new Ku receiver antennae are also nicely shaped to allow for strapping to a tree pointing in the general direction of the satellite. 

More to come on this as it progresses.

3. We have our next cohort of excellent students from INSA Lyon, France, arriving as I type.  They will be working on a number of very useful sub-projects, including bedding down the HF radio support we first demonstrated a couple of years ago, fixing bugs we identified in Vanuatu last year and so on.  I am VERY excited about seeing progress on these fronts, so that Serval Mesh Extenders can be usefully deployed by our various partners in a variety of ways.

Paul.

Thursday, November 23, 2017

Android has been a bit naughty with its location tracking

I was pointed to this article today:

https://qz.com/1131515/google-collects-android-users-locations-even-when-location-services-are-disabled/

Basically it points out that Android has been tracking the location of phones for the past year or so, even when location tracking is disabled.  More specifically, it tells Google whenever you come in range of a cell tower.  Doing this for each cell tower a phone can hear can provide a fairly good location, especially if you integrate it over time.

The use of spyware in mobile devices is a topic we have talked about previously, both for people living in dangerous places, as well as for victims of domestic violence and other contexts where being able to locate someone further compounds their vulnerability and tips the power-imbalance in the favour of an abusive person, organisation or otherwise.

The really naughty part in this current situation is that this was happening even without a SIM card in the phone, and even when location services were disabled in Android: There was no way to know it was happening, and no way to disable it, even if you knew.  In fact, Google more or less conceded that it was naughty by phasing it out almost immediately once they had been called out on it.

This leads me to a topic that we have been quietly working on in the background for the past couple of years, that is, how can we trust modern computers and communications devices, when they are so complex that it almost requires accidental discovery by dedicated researchers to find these significant privacy and safety damaging functions, which have been silently introduced to our devices -- often through software updates long after the initial purchase.

Our response to this is to explore the creation of "simply secure" communications devices, i.e., communications devices so simple, that their security can be quickly and confidently audited by a reasonably determined user, rather than requiring a team of researchers to explore.  Such devices should also make it much easier to be assured that the device cannot communicate with the outside world -- including getting a location fix -- when you don't want it to. 

Such devices are easy to make. After all, a brick is a secure communications device, in that there isn't really any way to subvert the function of a lump of burnt clay.   But it isn't useful.  This is the opposite extreme from current devices, that are almost omnipotent, but are so easy to subvert.

The challenge is to design and create devices that sit on some sweet spot in the middle, where they are still simple enough to be confident in their correct function, yet not so simple as to be practically useless.

This is exactly the kind of device that we are currently designing, in the form of a specialised smart-phone, that will still be capable of secure email, telephone calls, SMS and so on, while being much more resistant to attack or subversion, due to its simplicity and transparent auditability. 

For example, it will have physical switches to power off the cellular modem, and the cellular modem will be completely sandboxed from the rest of the phone -- including the GPS receiver, microphone and so on. Many of these modules will also be completely removable.

It will also allow full out-of-band memory inspection of the entire system, transparent to, and independent of the processor, and provide a secure compartmentalised architecture that allows a paranoid process, for example an email decryption program, to be sure that even the hypervisor cannot interrupt it to exfiltrate private information.

We know that there are some other folks active in similar spaces, including the excellent folks at Purism. We love what they are doing, and see our thinking in this space as complementary.  The Purism laptops (and soon phone) use all open-hardware, so that if you need a full-function computer, it is as trust-worthy as possible.  What we are looking to do is a little different: We want to see how simple we can go, while preserving enough function to be useful. We are expecting the core operating system to fit in kilo-bytes of memory, not mega-bytes, and applications to be tens to hundreds of kilo-bytes, not mega-bytes. 

There are lots of questions unanswered, not least whether the thing will actually be useful enough for anyone, but we are exploring, and all going well, hope to be able to produce a few prototype devices by the end of 2018.  We have also secured the necessary defence-related export clearance for such a device, precisely because its combined security measures place it at risk of tipping over into the category of dual-use equipment, so we have a green light there.

So my questions for all of you reading:


  1. Would any of you buy a "phone for the paranoid" along the lines of what I am describing?
  2. What are the absolute core functions that you would require, compared to the list below:
    • Make and receive telephone calls (en clair, and quite possibly end-to-end encrypted).
    • Send and receive SMS messages (en clair or encrypted).
    • Send and receive Email, including GPG or similar encrypted.
    • Very basic web browsing, using a purposely cut-down browser.
    • Ability to run 3rd-party apps in a sand-box environment.



Tuesday, October 17, 2017

Setting up Mesh Extender capability within NZ Red Cross

I am briefly in Wellington, NZ, visiting NZ Red Cross on my way to the Global Humanitarian Technology Conference where we have a bunch of papers to present at the end of the week.

One purpose of the visit was to update the firmware on the Mesh Extenders we had previously provided NZ Red Cross with, and to transfer the knowledge of how to flash the Mesh Extenders to their IT & Telecomms Emergency Response Unit (IT&T ERU), so that they can do it themselves in the future.

As the ERU does not normally carry laptops running Linux, we found an old disused laptop, and installed Ubuntu on it, and replicated the build and flashing environment from my laptop.

The important parts were to set up a TFTP server on the laptop, copy the firmware files in there, clone the Mesh Extender openwrt-packages repository from GitHub, check out the MeshExtender2.0 branch, and compile the auto-flash program.

After that, it is just a case of running the auto-flash program with a USB to serial adapter connected to a specially made adapter cable, and connecting the Mesh Extenders and watching the output of auto-flash to see when a unit has been flashed.

Natalie from the ERU was super-helpful being our guinea pig, and also in documenting the process.  Hopefully we will get the documentation up on the wiki in the near future, at which point I will link to it from this post.

But in the meantime, the following photo shows the completed kit, with the USB serial adapter cable, ethernet cable for TFTP, USB memory stick with Ubuntu so that it can be cloned to other laptops in the future, all in a fashionable marigold laptop case.  The Cat may object to wearing gold and marigold at the same time, but we are quite happy with the result for now.




Sunday, October 8, 2017

Pandanus cable ties and a Mesh Extender tree

We are now in the process of installing Mesh Extenders into Epau, the second of the two villages we are targeting here on the island of Efaté in Vanuatu.  As with Pang Pang, the community have been very gracious and enthusiastic in working with us.

It is always interesting and educational to watch how the folks here go about installing Mesh Extenders, in ways that are appropriate for them, rather than what we might naturally think of in an infrastructure-rich first-world context.

This series of images follows the process of installing a Mesh Extender in Epau.  After talking with the community, they decided that this Mesh Extender would be better mounted in a tree next to the house, than on the house itself.

First step: Attach the Mesh Extender to a long bamboo pole, which is a convenient locally available material:

Second step: If you don't have enough cable ties, make a make-shift cable-tie from a dried Pandanus leaf (the same leaves that the Vanuatuans use to weave mats, bags and other useful things):


Here you can see it closer up, with the light-brown Pandanus leaf around the lower part of the Mesh Extender:


Third step: Explore the tree to work out how to get the > 6 metre long bamboo pole up there, and firmly attached:


It didn't take very long for said person to disappear even higher up the tree:


Then the pole was passed up, and it wasn't long until the Mesh Extender tree flowered: 


The power lead was fed back down through the tree:


And the Mesh Extender lofted a bit higher, to ensure it was well clear of the foliage (and hopefully will remain so for a few months before needing adjustment, due to tree growth):


The tree itself is not that small, either: The crown would be at least six metres tall, on top of a few metres of relatively bare trunk below:


With about 1.5 metres of clearance, I'd estimate that this Mesh Extender is about 10 metres above the ground:


What I really love about this installation, is that it was made using local skills, local materials and knowledge: I have every confidence that the community can take it down when a cyclone comes, and then put it back up again, and that they won't be stymied by the lack of things that they need to order from somewhere.  

All week my children have been accusing me of making "Dad jokes", so I'll continue in that vein here, by calling this installation a blooming success.

Hopefully in the next few days we will be able to get this community installing and using the app on their phones, and get a few more Mesh Extenders installed around the village.