Bluetooth and Wi-Fi sensors - A short update on smartphone tracking

09-12-2016

We often get questions about how we use smartphone tracking at the Alexandra Institute, and about its future prospects. In particular, smartphone manufacturers have recently introduced better privacy features, and this article analyses the consequences.

1 Background

Tracking users, in particular determining their position and counting them, is desirable in a number of domains, such as location-based services, anthropological studies, urban planning and marketing. Nowadays, a very common way to do so is via their smartphones.

There are two main types of approaches[1] to positioning smartphones: active and passive.

  1. In active positioning, the users themselves want to be positioned, or have installed an app that can monitor their position. This is the case, for instance, when using a map application. The telephone calculates its position from e.g. GPS, from its position relative to telephony masts (GSM) whose positions are known (triangulation), and from known Wi-Fi access points (fingerprinting). In some scenarios, apps running on the telephone can also get additional position clues from e.g. Bluetooth beacons, especially indoors.
  2. In passive positioning, users do not have a direct interest in having their position monitored, and often do not know that it is being monitored. There are two sub-categories of scenarios, the first being the more powerful (accurate):
    1. When the tracking is done by the owners of the communication platform used by the telephone, e.g. the GSM network, or a Wi-Fi network registered on the telephone.
    2. When the tracking is done by third parties, e.g. a municipality that is interested in finding out whether urban changes have made a place more or less crowded.

Those different approaches have their respective strengths and weaknesses:

| | Spatial precision (*) | Temporal precision | Hardware costs | Software costs | Scalability (larger areas, many users) |
| --- | --- | --- | --- | --- | --- |
| 1. Do we have an app installed on the users’ phone? | Medium | Medium | Low | High | Medium |
| 2a. Do we control the Wi-Fi network used by the users? | High | High | Variable (**) | Medium | Low |
| 2b. Do we have a Wi-Fi sensor near the areas of interest? | High | Low | Medium | Low | High |

(*) The spatial precision depends mainly on the distance from the smartphone to the sensor. At best, the position is known to within a sphere about 1 metre in diameter.

(**) High costs if one needs to deploy the Wi-Fi infrastructure, but low costs if one can re-use an existing one.

We will focus on the last point (2b), which is when we have the possibility of installing a Wi-Fi sensor close to each area of interest, without having the possibility of installing an app or controlling the Wi-Fi network used by the persons we would like to monitor.

2 Wi-Fi monitoring hardware

A Wi-Fi sensor can be implemented with various types of hardware such as repurposed Wi-Fi access points/routers, desktop/laptop computers with an appropriate Wi-Fi antenna, etc.

Inexpensive Wi-Fi routers (e.g. from TP-Link) have been popular for this task[2], some even running on battery. They are repurposed by replacing (“flashing”) their internal software with a custom operating system such as OpenWrt, after which they can be programmed like a basic Linux-based device.

For the past few years, the Raspberry Pi has been our platform of choice, due to an attractive price/performance ratio, while being easier to program and to integrate into IoT platforms than the repurposed routers.

Any model of Raspberry Pi can be used (the Raspberry Pi B+ currently being the cheapest for this task) as long as an appropriate Wi-Fi antenna is used.

Indeed, both the Wi-Fi antenna hardware and its corresponding driver must support the “Monitor mode”, which is not always the case.

Different qualities of antennas can be used, depending on the desired sensitivity and dimension constraints.

3 Wi-Fi monitoring software

Several software packages can be used to inspect the Wi-Fi traffic nearby. Various open source solutions are available, such as Tcpdump, Horst and Kismet, which can be used with different settings depending on the use case.

For our purpose, we process the outputs of those sniffing utilities on the sensor itself to remove the noise and reduce the amount of data, before logging the data locally and/or uploading it to an IoT platform for further processing.

During this local processing, we also include an important data anonymisation stage. To reduce the privacy issues of the collected data, we use a one-way cryptographic function (hashing) that transforms potentially personally identifiable data into less sensitive random-looking identifiers. The details of this anonymisation process (e.g. how often the hashing parameters are re-randomised) can have a big impact on the actual privacy level.
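
As an illustration of such an anonymisation stage, here is a minimal sketch. It is not the exact procedure used on our sensors: the choice of HMAC-SHA256, the `MacAnonymiser` class and its method names are all assumptions made for this example. A keyed hash is used because the space of MAC addresses is small enough that an unkeyed hash could be reversed by simply trying all candidates. Rotating the key corresponds to re-randomising the hashing parameters: identifiers produced before and after a rotation can no longer be linked.

```python
import hashlib
import hmac
import secrets


class MacAnonymiser:
    """Replace MAC addresses with keyed one-way hashes (hypothetical helper)."""

    def __init__(self) -> None:
        # Assumption: the secret key lives only in the sensor's memory
        # and is never logged or uploaded.
        self._key = secrets.token_bytes(32)

    def rotate_key(self) -> None:
        """Draw a fresh key; old and new identifiers become unlinkable."""
        self._key = secrets.token_bytes(32)

    def anonymise(self, mac: str) -> str:
        # Normalise so "AA-BB-..." and "aa:bb:..." give the same identifier.
        normalised = mac.strip().lower().replace("-", ":")
        digest = hmac.new(self._key, normalised.encode("ascii"), hashlib.sha256)
        # Truncation only shortens the logged identifier.
        return digest.hexdigest()[:16]
```

How often `rotate_key` is called is the trade-off: frequent rotation improves privacy, but makes it impossible to recognise a returning device across rotations.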

4 Wi-Fi sniffed information

When the Wi-Fi monitoring is running on a sensor, we are interested in some special packets emitted by the smartphones, sometimes called “Probe requests”.

Those probe requests are emitted by smartphones at regular time intervals to discover potential Wi-Fi access points in their vicinity, in particular when the smartphone is not connected to a Wi-Fi network. Probe requests contain the unique[3] MAC address of the smartphone. This MAC address (if not faked) stays the same even if the user changes the SIM card of the telephone. The first digits of the MAC address indicate the brand[4] and sometimes the model of the telephone.

Probe requests also often contain the names (SSIDs) of the Wi-Fi networks the telephone is trying to find. A user generally saves several Wi-Fi networks, such as one for home (e.g. “Tietgenkollegiet”), one for work, a few hotels, etc. The names of these networks are sent in plain text. This list of Wi-Fi names is also a powerful fingerprint and can often point at one single individual.

Finally, the radio signal strength (RSSI) of those probe requests can be measured by the Wi-Fi sensors, which provides a vague indication of the distance between the telephone and the sensor. Repeated measurements can improve the reliability, when telephones stay in the same area for some time.
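
To give an idea of how a signal strength can be turned into a rough distance, the sketch below uses the common log-distance path-loss model. This model, the function names and the default constants (an RSSI of -40 dBm at 1 metre and a path-loss exponent of 3.0) are assumptions chosen for illustration. In practice they depend heavily on the antenna, the telephone and the environment, and must be calibrated on site.

```python
from statistics import median


def estimate_distance_m(rssi_dbm: float,
                        rssi_at_1m_dbm: float = -40.0,
                        path_loss_exponent: float = 3.0) -> float:
    """Rough distance in metres from one RSSI reading.

    Log-distance model: rssi = rssi_at_1m - 10 * n * log10(distance).
    Both default constants are illustrative assumptions.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))


def smoothed_distance_m(rssi_samples: list[float]) -> float:
    """Use the median of repeated readings to dampen outliers."""
    return estimate_distance_m(median(rssi_samples))
```

Taking the median over several readings reflects the point made above: repeated measurements of a telephone that stays in the same area give a more reliable estimate than any single reading.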

4.1 Temporal precision

The amount of time between probes is an important indicator of precision: the longer the interval between two probes, the lower the chance of knowing the exact location of the telephone at a given moment. There is not much public information documenting how often a phone sends out these probes without user intervention, but we can provide results based on observations we made during the autumn of 2016.

Our measurements were made in three different Danish cities over a period stretching from the beginning of September to the end of November 2016. In one city, we have measurements from almost the entire period, while work in the other cities was undertaken during discrete time spans.

The numbers were extracted using some filters. Smartphones typically send several probe requests in a burst, so we only counted one observation per telephone within a time window of 5 seconds. Similarly, we used a threshold of 5 minutes to filter out cases where a telephone had been out of reach, e.g. when a person stepped away from the sensor and came back later.
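
The two filters can be sketched as follows. This is one possible reading of the rules above, not our actual processing code: the function name, the input format (pairs of timestamp in seconds and anonymised device identifier) and the exact handling of the window boundaries are assumptions.

```python
from collections import defaultdict

BURST_WINDOW_S = 5.0         # probes closer than this count as one observation
ABSENCE_THRESHOLD_S = 300.0  # longer gaps mean the phone was out of reach


def average_delay_s(observations):
    """Average delay between observations of the same device.

    `observations` is an iterable of (timestamp_s, device_id) pairs.
    Returns None when no delay could be measured.
    """
    by_device = defaultdict(list)
    for timestamp_s, device_id in observations:
        by_device[device_id].append(timestamp_s)

    gaps = []
    for timestamps in by_device.values():
        timestamps.sort()
        last_counted = None
        for t in timestamps:
            if last_counted is None:
                last_counted = t
                continue
            gap = t - last_counted
            if gap < BURST_WINDOW_S:
                continue  # same burst: not a new observation
            if gap <= ABSENCE_THRESHOLD_S:
                gaps.append(gap)  # genuine delay between observations
            last_counted = t  # gaps above the threshold restart the count
    return sum(gaps) / len(gaps) if gaps else None
```

In this sketch, a gap above the threshold is discarded rather than counted as a delay, so a person who leaves and returns later does not inflate the average.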

| | Average delay between observations (seconds) |
| --- | --- |
| City 1 | 38.09 |
| City 2 | 35.91 |
| City 3 | 32.55 |

All cities show similar results. On average, a sensor picks up a probe from a given device every ~35 seconds, i.e. just under 2 observations per minute.

It is important to note that we observe a standard deviation of about 200 seconds. This relatively high number is expected: at one extreme, phones send many probes during a short time span (such as when they are actively used); at the other, phones may be only borderline reachable by the sensors, so that not every probe request is recorded. In the average worst case, we can expect about 3 minutes between consecutive observations of a device.

When a higher frequency of observations is needed, there are a number of techniques[5] to prompt additional transmissions. These techniques are more aggressive and have more side effects, so their legitimacy is more questionable.

5 Privacy rules

If not used carefully and with clear boundaries, Wi-Fi monitoring is a technology that bears the risk of infringing on users’ privacy, and we are very careful with that. There are some clear rules, both at Danish and European level. In general, it is best not to collect any data that is not strictly necessary for the purpose of the experiment. It is more acceptable if the experiment has a short duration, has a non-profit nature such as academic research, and when it benefits the public. Furthermore, the data should be anonymised at the source (as explained before), not archived for a long period (destroyed after analysis), while ensuring that it does not leak. In case of doubt, it can be a good idea to explicitly ask the Danish Data Protection Agency (“Datatilsynet”), which we often do.

Our first project with Bluetooth tracking was carried out at Copenhagen Airport in 2006, and from the very beginning we have been aware that privacy is a key aspect. In particular, the objective matters: there is a big difference between collecting data for commercial use, e.g. for advertising, and collecting it on a non-profit basis, e.g. to learn about a service. It is a balance between how much personal data you give out and what you get in return. If it is too unbalanced, it does not work.

6 Wi-Fi privacy improvements

In recent iOS and Android versions, a strategy has been implemented to improve the users’ privacy regarding Wi-Fi sniffing. Instead of using the telephone’s true Wi-Fi MAC address (also called “universally administered address”) for the probe requests, the telephone uses a random “locally administered address”.

A Wi-Fi sensor can tell whether the observed address is a real or a fake one by looking at a special bit of the MAC address, which indicates whether the address is “universal” or “local”.
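
In IEEE 802 MAC addresses, this is the second-least-significant bit of the first octet: 0 for a universally administered address, 1 for a locally administered one. The check can be sketched as below. The function name is ours, and the addresses in the usage example are made up for illustration.

```python
def is_locally_administered(mac: str) -> bool:
    """True if the address has the 'locally administered' bit set.

    Accepts "aa:bb:cc:dd:ee:ff" or "AA-BB-CC-DD-EE-FF" notation.
    """
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    # Bit 0x02 of the first octet is the universal/local flag.
    return bool(first_octet & 0x02)


# Made-up example addresses:
print(is_locally_administered("da:a1:19:00:00:01"))  # True (0xda has bit 0x02 set)
print(is_locally_administered("3c:15:c2:00:00:01"))  # False (0x3c does not)
```

Note that the check must run on the raw address, i.e. before the anonymisation stage, since the bit is no longer recoverable from a hashed identifier.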

6.1 Percentage of devices using Wi-Fi privacy measures

Based on the above-mentioned experiments, we have collected the percentage of telephones using those privacy measures over a period of three months. This gives an estimate of the adoption rate of these strategies. We found that devices running iOS 9.3.2 (published in May 2016) do utilise these privacy measures (which we call "spoofing"). Tests on common Android devices running version 6.0.1 seemed to indicate that common Android devices running this release or earlier do not try to hide their true MAC address. There are numerous applications available in the Google Play Store[6] that facilitate spoofing, but they all require a rooted device, so it is fair to expect that this mode is not prevalent among everyday users.

As mentioned above in the Temporal Precision paragraph, we have gathered data over a three-month period in different Danish cities. The table below reports the percentage of observations in which smartphones used a spoofed MAC address. This percentage should approximately be the percentage of smartphones using those privacy measures.

| | September 2016 | October 2016 | November 2016 |
| --- | --- | --- | --- |
| City 1 | 12% | 14.4% | 15.6% |
| City 2 | n/a | 14.1% | n/a |
| City 3 | n/a | n/a | 15.8% |

One can observe that the numbers are fairly similar across the cities, and that the percentage of spoofed MAC addresses increased slightly over the three months. We detected the first spoofing measures on iOS devices from May 2016, i.e. ~5 months before the reported measurements. Based on the Gemius Ranking of “operating systems that are used by persons connecting from Denmark with Danish web sites”[7], we expect an upper bound of around 43% as long as iOS is the only operating system that actively spoofs the MAC address. This also indicates that the adoption rate of the newest iOS with this feature is not as high as we would expect at this moment.

7 Increased precision with trilateration

With the basic antennas we are using, which are omnidirectional, we get a vague distance estimation from the observed telephone to the sensor (thanks to the signal strength), but we do not get a direction, i.e. we do not know whether the detected telephone is e.g. north or south, right or left. This is what we call a “zone-based” setup, in which a sensor is deployed near the centre of each area of interest.

If more precise positioning is needed, it is possible to deploy more sensors. It is also possible to take advantage of multiple overlapping sensors, in which case a given telephone will be observed by several sensors at the same time. Simple processing of the data based on geometric rules (trilateration) will then provide a much better {x, y} estimation of position.
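
The geometric principle can be sketched for three sensors as follows. This is an illustration only, as our recent deployments do not use it. The function name and the coordinate convention (metres in a local {x, y} plane) are assumptions. Subtracting the circle equation of the first sensor from those of the other two gives two linear equations in x and y, which are solved directly.

```python
import math


def trilaterate(sensors, distances):
    """Estimate (x, y) from three sensor positions and three distances.

    `sensors` is [(x1, y1), (x2, y2), (x3, y3)]; `distances` is [r1, r2, r3].
    """
    (x1, y1), (x2, y2), (x3, y3) = sensors
    r1, r2, r3 = distances

    # Linear system obtained by subtracting circle 1 from circles 2 and 3.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("sensors are collinear; position is ambiguous")

    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With distances derived from signal strength, the three circles rarely meet in a single point, so the result is only as good as the distance estimates. With more than three sensors, a least-squares fit over all of them would be the natural extension.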

All the recent experiments we have done with Wi-Fi monitoring were “zone-based” (without using trilateration), because the costs of trilateration (additional hardware, configuration, deployment, energy and data processing) were not justifiable for the use cases. Setups using a “zone-based” approach are very much plug-and-play, and reusable from site to site with almost no need for customisation.

8 Relation with Bluetooth

Another technology we are using at the Alexandra Institute is based on Bluetooth, and Bluetooth beacons in particular. There is a bit of overlap between the possibilities offered by Wi-Fi monitoring and Bluetooth beacons, but they are complementary and have their respective strengths.

Again, there are multiple ways of deploying Bluetooth-based positioning systems, which may be the topic for another article. The three Bluetooth setups that are the closest to the Wi-Fi monitoring described above are the following:

  • Bluetooth “classic”: the telephones must have Bluetooth enabled and be “visible”/“discoverable”. In this case, a telephone can be tracked much as with Wi-Fi monitoring, with the advantage of a much higher temporal precision (more observations per minute). This used to be a very common approach until around 2009. Since then, most telephones no longer have Bluetooth “visible” by default, making this approach impractical unless we know the Bluetooth MAC addresses of the telephones to monitor.
  • Mobile Bluetooth beacon: the persons to follow are given a Bluetooth beacon to carry. Here too, the temporal precision is much higher than with Wi-Fi monitoring (configurable up to many observations per second), and the system is more reliable, but it only works in scenarios where a beacon can be given to users (e.g. for a conference, a museum visit, documentation of building inspection).
  • Bluetooth wearables: in this scenario, the users to follow must wear a Bluetooth Low Energy device such as an activity tracker. Many such devices can be used like beacons, although with a lower temporal precision. The penetration rate (the percentage of users with whom it will work) is, however, much lower than with Wi-Fi.

9 Relation with GSM

Tracking of telephones using GSM (2G, 3G…) is much more powerful than the approaches we have presented, but it is also more regulated[8], more complicated, and offers a lower spatial resolution. It is a technology used e.g. by the police[9], and it is not something we are doing at the Alexandra Institute.

10 Conclusion

The focus of this article, namely tracking smartphones thanks to the fact that they broadcast their Wi-Fi MAC address, is a method that will become obsolete within 2-3 years, as users replace their devices with newer iOS and Android models that ship with the new privacy measures. In the meantime, it is still a very valuable technique, provided it is used responsibly. Furthermore, this method will be gradually complemented and then replaced by other approaches such as exploiting SSIDs (Wi-Fi network names), as explained above, among other strategies.

How others have used us

Tracking experiment at a conference

The tracking experiment worked by giving a Bluetooth beacon to the participants who wished to take part. The participants' movements then showed whether interesting data patterns emerged along the way, such as whether some talks or booths were more popular than others. The movements were registered by 19 specially configured Raspberry Pi sensors placed at specific locations in Musikhuset. The data went through an initial filtering on the sensors themselves to remove noise, and was then forwarded to the Alexandra Institute's IoT platform, where more advanced processing was performed. The data was then published via a web server showing both live and historical data.

Movement patterns of shoppers in town - shops in Ringe and Næstved

Since August, the Alexandra Institute has had 15 Wi-Fi sensors installed around Ringe and on its approach roads to get an impression of the movement patterns of shoppers in the town. At the same time, DELTA has used three different types of sensors to measure how the Tøjeksperten store in Ringe is used.

  • The 15 Wi-Fi sensors are very inexpensive and purpose-built from a cheap minicomputer (Raspberry Pi), a Wi-Fi antenna and a 4G connection to a data collection platform.
  • The Wi-Fi sensors register when people with mobile phones come within range, provided these phones have Wi-Fi switched on. If the same phone is later registered by another sensor, one gets an impression of how its owner moves around the town.
  • No personally identifiable data about the phones' owners is recorded.
  • At Midtfyns Fritidscenter, WizeFloor has been set up so that visitors can explore material about Ringe and its shops, and can take part in a discount quiz.

We have also worked with sensors in collaboration with:

  • Copenhagen Airport: Bluetooth tracking and location-based services, e.g. so that they know how far passengers have got.
  • Roskilde Festival: a mix of Bluetooth and Wi-Fi to find out where people are. In collaboration with students from ITU and DTU.
  • Consultancy assignment: queue measurement at polling stations using Wi-Fi tracking.
  • Novicel: Wi-Fi sensors in their office building in Viby. What are the possibilities of the technology for wayfinding, but also for sales purposes?
  • Vester Voldgade Tree.0: Raspberry Pi-based Wi-Fi sensors running on battery, first tested in Vester Voldgade.
  • Ry: first large deployment of Wi-Fi sensors collecting data locally. SD cards with the data are collected. Recorded data: MAC address (hashed/anonymised) and signal strength.
  • Ringe: Wi-Fi sensors that are online all the time and can therefore also upload data live. A move from collecting data from the device to viewing the data in real time and seeing whether the devices have gone down. In collaboration with IBIZ.
  • Serviceplatform reuses the setup from Ry and Ringe in Vejle and Næstved.

[1] Verbree, E., Zlatanova, S., Van Winden, K., Van der Laan, E. B., Makri, A., Taizhou, L., & Haojun, A. (2013, December). To localise or to be localised with Wifi in the Hubei museum? In Acquisition and Modelling of Indoor and Enclosed Environments 2013, Cape Town, South Africa, 11-13 December 2013, ISPRS Archives Volume XL-4/W4. ISPRS. http://dx.doi.org/10.5194/isprsarchives-XL-4-W4-31-2013

[3] The MAC address might be faked/spoofed, as explained in the section on privacy improvements

[4] The first three octets of the MAC address are the Organizationally Unique Identifier (OUI) and refer to the organisation that issued the device

[5] Musa, A. B. M., & Eriksson, J. (2012, November). Tracking Unmodified Smartphones Using Wi-Fi Monitors. In Proceedings of the 10th ACM conference on embedded network sensor systems (pp. 281-294). ACM. https://doi.org/10.1145/2426656.2426685

[7] http://rankings.dk/en/rankings/operating-systems.html (as of 13 Nov. 2016)

 

 