
Strava Heatmap Implications

Update (2018-01-29): Whitequark on Twitter made a tool to scrape the Strava data.[1]

Strava, a company that operates a fitness tracking application compatible with the Fitbit, has released a heatmap tool that shows global usage of its app. People have found military bases[2] using the heatmap, and people are concerned. In my opinion, secret military bases are the least interesting use for that data.

Finding military bases is a fun adventure. Finding people around abandoned airstrips[3] is a neat trick. These are interesting finds for civilians far away from those bases, but military personnel and civilians in the area already know about them. Most military organizations are fairly open about where their "secret" bases are; Area 51, for instance, is no mystery. Not to mention that countries have access to satellite imagery and drones.

What is interesting is that you can use the data from the heatmap tool to identify individuals[4]. You can also scrape that data, and it is easy to work with.[5]

Strava paths around GCHQ
Pictured: Heatmap paths around GCHQ's main building. Source: noclandor.

This data is not anonymized, as Sarah Jamie Lewis points out. Sarah is an expert on anonymity and privacy. Her thread explains why location data is personally identifiable information. To anonymize any data you need to de-identify it, surround it with a large volume of other data, and remove the context from that data.[6] If someone can map your movements, they can use that information to positively identify you.
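
To make the "surround it with a large volume of data" point concrete, here is a minimal sketch in Python. It is not from Sarah's thread. The records, the grid-cell names, and the function names are all made up for illustration. It counts how many users share the same coarse (home, work) pair and flags anyone whose pair is unique, since a pair shared with nobody else effectively identifies that user.

```python
from collections import Counter

# Hypothetical, made-up records: (user_id, home_cell, work_cell).
# A "cell" stands for a coarse grid square, not an exact address.
records = [
    ("u1", "A1", "C3"),
    ("u2", "A1", "C3"),
    ("u3", "A2", "C3"),
    ("u4", "B7", "D4"),
    ("u5", "A1", "C3"),
]


def anonymity_set_sizes(rows):
    """Count how many users share each (home_cell, work_cell) pair."""
    return Counter((home, work) for _, home, work in rows)


def uniquely_identified(rows):
    """Return users whose (home, work) pair is shared with nobody else."""
    sizes = anonymity_set_sizes(rows)
    return sorted(user for user, home, work in rows if sizes[(home, work)] == 1)


unique_users = uniquely_identified(records)
print(unique_users)
```

In this toy data, three users hide in the same crowd and two do not. Real location traces carry far more than two points per person, so the crowd tends to shrink quickly.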

I wonder what you can do with a large set of definitely-not-anonymous location data on people who work at various intelligence agencies. Once you have the data, you have all the time in the world to run analysis on it. If you know what kind of patterns you are looking for, this data arms you with many tools.

Pattern Recognition

Say I have a huge database of "de-identified" location data and I want to find potential assets. We know where major intelligence agencies are located, and we know we can use location data to identify individual users. With a bit of work we can find the identities of people who work there or frequent the area. We've scraped the data; it isn't going anywhere. There is no time pressure on this.

Because we have spent the time scraping and compiling the data, we can start to branch out from individual people. Metadata is powerful. Metadata is also trivial to analyze[7].

Let's start with the easy stuff. We have positive IDs for a bunch of people working at intelligence offices, or people who are often in close proximity to one. We can search for anyone who has crossed their paths, but people love their routines, so there will be a lot of matches. We can find people's family members and close friends by frequent proximity. We can identify them easily too, with the same set of data. Now we have some leverage, but again, this is the trivial stuff.
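
Here is a rough sketch of what "frequent proximity" could look like in code, assuming the traces have already been reduced to (user, grid cell, hour bucket) sightings. The data, the names, and the `min_shared` threshold are all hypothetical, and a real analysis would need to be far more careful about noise and routine overlap.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sightings: (user_id, grid_cell, hour_bucket).
sightings = [
    ("alice", "park", 1), ("bob", "park", 1),
    ("alice", "cafe", 2), ("bob", "cafe", 2),
    ("alice", "gym", 3), ("carol", "gym", 3),
    ("alice", "park", 4), ("bob", "park", 4),
]


def colocation_counts(rows):
    """Count, per pair of users, how many (cell, hour) buckets they share."""
    present = defaultdict(set)
    for user, cell, hour in rows:
        present[(cell, hour)].add(user)
    counts = Counter()
    for users in present.values():
        for pair in combinations(sorted(users), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(rows, min_shared=3):
    """Return pairs seen together in at least min_shared buckets."""
    counts = colocation_counts(rows)
    return sorted(pair for pair, n in counts.items() if n >= min_shared)


pairs = frequent_pairs(sightings)
print(pairs)
```

A single shared bucket is just a crossed path. The threshold is what separates strangers with similar routines from people who keep turning up together.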

As we scrape through someone's history we can look for other patterns. Does someone come into contact with another person multiple times, infrequently, but at the same location? Does someone come into contact with another person on a regular basis, but never at the same location?

Hell, we have exact path information. Does someone make regular stops for coffee or dinner, but never with family, friends, or coworkers? Now I have some locations to watch because they might be meeting an asset. Or they might be seeing a side-piece, but that's still leverage.

Say someone we're looking at goes dark. They leave their tracker at home or turn it off so there are periods of time where there is no location data available. Does this happen on a schedule? We can shadow them and find out if they are seeing assets.
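
Checking whether the silence "happens on a schedule" is simple to sketch. The example below assumes one tracker's pings are available as plain timestamps. It finds stretches with no data and tallies them by weekday. The ping data is synthetic, and the three-hour threshold is an arbitrary assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_gaps(timestamps, min_gap=timedelta(hours=3)):
    """Return (start, end) pairs where consecutive pings are far apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= min_gap]


def gap_weekdays(gaps):
    """Count gaps by the weekday on which they start (Monday is 0)."""
    return Counter(start.weekday() for start, _ in gaps)


# Synthetic data: hourly pings for two weeks, with the tracker
# silent from 13:00 through 16:00 on Thursdays (weekday 3).
pings = []
first_day = datetime(2018, 1, 1)  # a Monday
for offset in range(14):
    current = first_day + timedelta(days=offset)
    for hour in range(24):
        silent = current.weekday() == 3 and 13 <= hour <= 16
        if not silent:
            pings.append(current.replace(hour=hour))

gaps = find_gaps(pings)
print(gap_weekdays(gaps))
```

If the gaps pile up on one weekday, the absence of data has become a data point of its own.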

Man, it is really weird that James from the New York FBI field office spends about half an hour to an hour at various coffee shops four times a month. Also he really likes this strip club, but that's during off-hours. His wife is an avid jogger and seems to go everywhere with their tracker but the romance must be dead because they haven't had coffee together in months. I wonder if I should look into that?

Conclusion

I am not a professional analyst or a data scientist. This post sounds alarmist, but this is stuff we already know. This data stands out to me because of how long-term the implications are. My examples here are narrow but this kind of location data can be weaponized by almost anyone who is willing to spend the time to analyze it. Nation-states and all those wonderful people are aware of the implications of this data and nobody should be surprised if they have already been tapping into it from Strava and other companies.

Metadata is powerful. Location data is not anonymous. Exercise was a mistake.

Support the Author

NotAwful is a student studying networking and information security. You can support their studies monthly via Patreon (USD), or directly via PayPal. If you found this post useful, feel free to leave a tip.


  1. Whitequark archives Strava data. Twitter.

  2. Nrg8000 finds military bases. Twitter.

  3. Heupchurch finds an occupied, 'abandoned' looking airstrip. Twitter.

  4. Paulmd199 identifies a soldier's home. One step from positive identification. Twitter.

  5. Paulmd199 identifies users that shared a route. Twitter.

  6. Sarah Jamie Lewis explains how data is anonymized. Twitter.

  7. Sarah Jamie Lewis, on metadata analysis.
