Using Social Media to Locate Eyewitnesses: The Twitter API

Tweet about President Obama taking the stage at Selma

This is the second of a two-part series on using social media to locate eyewitnesses to important events. In part one, I showed you how to use the Instagram API to find eyewitnesses to Macklemore's live video shoot in Seattle. In this part, we'll use the Twitter API to find attendees of President Obama's speech at the Edmund Pettus Bridge in Selma.

You can download code for both episodes by using the GitHub repository link in the sidebar. You may also be interested in my Tuts+ series, Building With the Twitter API.

Twitter's geosearch capabilities are more limited and therefore require somewhat more detailed code. Geotagged posts on Twitter can only be searched for the last seven days, and only by date (not time), so you have to filter the API results for precision.

I do participate in the discussions below. If you have a question or topic suggestion, please post a comment below. You can also reach me on Twitter @reifman or email me directly. 

What We Covered in Part One

The phones we carry in our pockets record our every move, sharing it with cell providers and often third-party software companies whose motivations generally focus on profit. 

Many people don't realize that they've left geotagging on their social media apps, fully publicizing their location with every social media post. This included GOP Congressman Aaron Schock. The AP used his Instagram account's geotags to expose his use of taxpayer funds for extravagant private flights and other luxurious activities. 

Congressman Aaron Schock in South America

So, geotagging can be used for good. In this series, we're exploring how journalists or law enforcement might locate potential eyewitnesses to important events such as a crime or accident scene using social media.

However, geotagging can also be used abusively. Berkeley computer scientists and educators built the Ready or Not? app to showcase how geotagging in Twitter and Instagram records our every move.

Here's Apple co-founder Steve Wozniak's Twitter account in the app:

Ready or Not App Steve Wozniak Twitter Geolocation History

The geotagging on Instagram and Twitter is accurate enough to allow someone to easily determine your residence, place of work and travel routine.

In this episode, I'll guide you through using the Twitter API. I've provided a GitHub repository (the link is in the sidebar) for you to download to try out the code. My "Eyewitness app" is written in the Yii Framework for PHP, which you can learn more about in my Programming With Yii2 series for Tuts+. 

If you don't wish to share your location for the public to see—or to leave a historical trail of your travels—the Ready or Not? app offers links and guides to turning these features off (look for the link on its home page). Frankly, I've turned mine off and I encourage you to do so too.

If you're a law enforcement agency or media entity that would like more information, please feel free to contact me directly. I would also be interested in any successful uses of this code (for good)—they'd make an interesting follow-up story.

What We Did With Instagram

Last episode, we used the Instagram API to find eyewitnesses to Macklemore's live 2013 video shoot for White Cadillac. Quite easily, we managed to find Instagram member Joshua Lewis's photo of Macklemore stepping out of his vehicle (cool, huh?):

Macklemore Arrives Found with the Instagram API

Now, let's get started using the Twitter API.

Using the Twitter API

As with Instagram, you need to sign in to your Twitter account and register a developer application. You should register an app like this:

Twitter App Registration

Twitter will show you your application details:

Twitter App Eyewitness Details

Here's the settings page:

Twitter App Details

Here are the keys and access tokens for the application. Make note of these.

Twitter App Keys and Access Tokens

Then, scroll down and create access tokens for your account. Make note of these too.

Twitter App Account Access Tokens

Add all four of these configuration keys and secrets to your /var/secure/eyew.ini file:
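As a minimal sketch, the file could hold four entries like the ones below and be read with PHP's parse_ini_file. The key names (twitter_consumer_key and so on) and the helper loadTwitterKeys are hypothetical, chosen for illustration; the names in the repository may differ. The sketch parses an inline string with parse_ini_string so that it runs without touching /var/secure/eyew.ini.

```php
<?php
// Hypothetical layout for the Twitter entries in /var/secure/eyew.ini.
// The key names are assumptions for illustration, not the repository's names.
$sampleIni = <<<INI
twitter_consumer_key = "YOUR_CONSUMER_KEY"
twitter_consumer_secret = "YOUR_CONSUMER_SECRET"
twitter_access_token = "YOUR_ACCESS_TOKEN"
twitter_access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"
INI;

/**
 * Return the four Twitter credentials, failing loudly if one is missing or empty.
 */
function loadTwitterKeys(array $config): array
{
    $required = [
        'twitter_consumer_key',
        'twitter_consumer_secret',
        'twitter_access_token',
        'twitter_access_token_secret',
    ];
    $keys = [];
    foreach ($required as $name) {
        if (empty($config[$name])) {
            throw new InvalidArgumentException("Missing config entry: $name");
        }
        $keys[$name] = $config[$name];
    }
    return $keys;
}

// In the app you would call parse_ini_file('/var/secure/eyew.ini') instead.
$keys = loadTwitterKeys(parse_ini_string($sampleIni));
echo count($keys), " Twitter credentials loaded\n";
```

Keeping the file outside the web root and outside the repository means the secrets never end up in version control.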

Then, we'll create an Active Record migration to create our Twitter model. This will store the tweets we receive from the API calls.

Just as we did in part one, you need to run the migration:

Then, I used Yii2's code generator, Gii, to create the model and CRUD controllers for the Twitter model. If you get the latest GitHub repository code using the sidebar link on this tutorial, you'll have the code as well.

Create a New Moment

Because Twitter limits geolocation searches to the past week, I eventually chose President Obama's Selma 50th Anniversary speech at the Edmund Pettus Bridge.

I used Google Maps again to get the GPS coordinates for the bridge:

Edmund Pettus Bridge Selma Alabama Geolocation

Then, I created a Moment for the speech to search. I updated it a few times to tweak the geographic radius of the search (it's a bridge) and the time range:

Edmund Pettus Bridge Selma Alabama Geolocation Moment

Search Using the Twitter API

One limitation of the Twitter API is that it only lets you search by date, e.g. 2015-03-07, whereas Instagram is indexed by precise Unix timestamps. Therefore, we have to begin our Twitter search a full day past the end of our time range and search backwards.
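Here's a small sketch of that date arithmetic. It assumes the moment's end time is stored as a Unix timestamp and that the search uses the until parameter of search/tweets, which takes a YYYY-MM-DD date and returns tweets created before it. I'm also assuming the date is interpreted as UTC. The function name untilDateFor and the example end time are hypothetical.

```php
<?php
/**
 * Return the date one full day past the end of the range, as YYYY-MM-DD (UTC).
 */
function untilDateFor(int $endTimestamp): string
{
    // Add 24 hours, then keep only the date part.
    return gmdate('Y-m-d', $endTimestamp + 86400);
}

// Example: a range ending on 2015-03-07 at 22:00 UTC.
$end = gmmktime(22, 0, 0, 3, 7, 2015);
echo untilDateFor($end), "\n"; // prints 2015-03-08
```

Searching up to the following day guarantees that the whole end date is covered, at the cost of fetching some tweets that later have to be discarded.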

Since we're likely to obtain a lot of tweets outside our desired time range, we have to make repeated calls to the Twitter API. Twitter returns up to 100 tweets per API request, and allows 180 requests per 15-minute window.

I'm using James Mallison's Twitter API Library for PHP. Here's how we set up the library to make calls:
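The following is a hedged sketch of that setup, based on the library's documented TwitterAPIExchange class and its setGetfield, buildOauth, and performRequest methods. The credential placeholders, the approximate bridge coordinates, the one-mile radius, and the helper buildGeoQuery are illustrative assumptions. The request is only sent if the library is loaded, so the snippet also runs on its own and simply prints the query string.

```php
<?php
// These four settings keys are the ones the library's constructor expects.
$settings = [
    'oauth_access_token'        => 'YOUR_ACCESS_TOKEN',
    'oauth_access_token_secret' => 'YOUR_ACCESS_TOKEN_SECRET',
    'consumer_key'              => 'YOUR_CONSUMER_KEY',
    'consumer_secret'           => 'YOUR_CONSUMER_SECRET',
];

$url = 'https://api.twitter.com/1.1/search/tweets.json';

/**
 * Build the query string for a geocoded search (hypothetical helper).
 * $radius needs a unit suffix, e.g. "1mi" or "2km".
 * No q term is included; add one if the API requires it for your search.
 */
function buildGeoQuery(float $lat, float $lng, string $radius, string $until, int $count = 100): string
{
    return '?result_type=recent'
        . "&geocode=$lat,$lng,$radius"
        . "&until=$until"
        . "&count=$count";
}

// Approximate coordinates for the Edmund Pettus Bridge (illustrative values).
$getfield = buildGeoQuery(32.4055, -87.0186, '1mi', '2015-03-08');
echo $getfield, "\n";

// Only call Twitter when the library has actually been loaded.
if (class_exists('TwitterAPIExchange')) {
    $twitter = new TwitterAPIExchange($settings);
    $json = $twitter->setGetfield($getfield)
        ->buildOauth($url, 'GET')
        ->performRequest();
    $result = json_decode($json, true);
}
```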

Initially, we request 100 results from Twitter at our GPS coordinates up to a specific date. 

We only record tweets within our precise time range, ignoring the other results. As we process these, we make note of the lowest tweet ID received.

Then we loop, making repeated requests to Twitter (up to 179 more times), requesting additional records that are earlier than the previous batch's lowest tweet ID. In other words, on subsequent requests, instead of querying up to a specific date, we query to the max_id of the lowest tweet ID that we've received.

We stop when fewer than 100 records are returned or when returned tweets are earlier than our actual range.

If you need access to more than 18,000 tweets (180 requests of up to 100 tweets each per 15-minute window), you'll need to implement a background task to call the Twitter API, as we've done in our other Twitter API series.

As we process API results, we need to filter tweets, only recording those that fall within our actual start time and end time.
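The paging and filtering steps above can be sketched as a single loop. This is a simplified illustration, not the repository's code: collectTweets and the $fetchPage callable are hypothetical names, and the fetcher is faked so the example runs without calling Twitter. The sketch asks for IDs up to the lowest ID minus one, because max_id is inclusive and the lowest tweet would otherwise be returned again. It assumes 64-bit PHP, since tweet IDs don't fit in 32-bit integers.

```php
<?php
/**
 * Page backwards through search results, keeping tweets inside [$startTime, $endTime].
 * $fetchPage receives null (first request) or a max_id and returns an array of tweets.
 */
function collectTweets(callable $fetchPage, int $startTime, int $endTime, int $maxRequests = 180, int $pageSize = 100): array
{
    $kept = [];
    $maxId = null;
    for ($request = 0; $request < $maxRequests; $request++) {
        $statuses = $fetchPage($maxId);
        if (count($statuses) === 0) {
            break;
        }
        $lowestId = null;
        $oldest = PHP_INT_MAX;
        foreach ($statuses as $tweet) {
            $created = strtotime($tweet['created_at']);
            // Record only tweets inside the precise time range.
            if ($created >= $startTime && $created <= $endTime) {
                $kept[] = $tweet;
            }
            $lowestId = $lowestId === null ? $tweet['id'] : min($lowestId, $tweet['id']);
            $oldest = min($oldest, $created);
        }
        // Stop on a short page, or once results are earlier than the range.
        if (count($statuses) < $pageSize || $oldest < $startTime) {
            break;
        }
        $maxId = $lowestId - 1;
    }
    return $kept;
}

// Demo with fake data: 250 tweets, one per minute, newest first.
$newest = gmmktime(23, 0, 0, 3, 7, 2015);
$fake = [];
for ($i = 0; $i < 250; $i++) {
    $fake[] = ['id' => 1000 - $i, 'created_at' => gmdate('D M d H:i:s O Y', $newest - 60 * $i)];
}
$fetchPage = function (?int $maxId) use ($fake): array {
    $matches = array_filter($fake, function ($t) use ($maxId) {
        return $maxId === null || $t['id'] <= $maxId;
    });
    return array_slice(array_values($matches), 0, 100);
};

$start = gmmktime(20, 0, 0, 3, 7, 2015);
$end = gmmktime(22, 0, 0, 3, 7, 2015);
$tweets = collectTweets($fetchPage, $start, $end);
echo count($tweets), " tweets in range\n";
```

In the demo, the loop stops after the second page because that page already contains tweets older than the start time. With the real API, the short-page check is less reliable than it looks, as the note that follows explains.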

Note: The Twitter API has a lot of frustrating quirks that make paging more difficult than it should be. Quite frequently, Twitter returns no results without an error code. Other times, I found it returning a small number of results, but that didn't mean another request would not return more. There is no clear way to know when Twitter is done returning results to you; it's inconsistent. Thus, you may notice my code has a few interesting workarounds in it, e.g. examine $count_max_repeats.
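A workaround of this kind might look like the sketch below, which repeats a request a few times before treating an empty response as final. The names fetchWithRepeats and $maxRepeats are hypothetical, and the repository's handling around $count_max_repeats may well differ. The fetcher here is faked so the example runs without calling Twitter.

```php
<?php
/**
 * Call $fetch until it returns a non-empty array, up to $maxRepeats attempts.
 * Returns an empty array if every attempt comes back empty.
 */
function fetchWithRepeats(callable $fetch, int $maxRepeats = 3, int $pauseSeconds = 0): array
{
    for ($attempt = 1; $attempt <= $maxRepeats; $attempt++) {
        $statuses = $fetch();
        if (count($statuses) > 0) {
            return $statuses;
        }
        // Optionally wait before asking again.
        if ($pauseSeconds > 0 && $attempt < $maxRepeats) {
            sleep($pauseSeconds);
        }
    }
    return [];
}

// Demo: a fake fetcher that comes back empty twice before returning one tweet.
$calls = 0;
$flaky = function () use (&$calls): array {
    $calls++;
    return $calls < 3 ? [] : [['id' => 1]];
};

$result = fetchWithRepeats($flaky);
echo count($result), " result(s) after $calls call(s)\n";
```

Keep in mind that every repeat counts against the 180-request window, so the repeat limit should stay small.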

One of the first results returned included the tweet below by Fred Davenport showing President Obama on stage:

The First Results for Selma 50th via the Twitter Search API

Here it is on Twitter:

President Obama in Selma via the Twitter API

Then, as you browse the results further, you can find many more people present tweeting about Obama—including the media:

More Selma Twitter API Results

Now, let's do a more local search.

A Second, More Local Search

Key Arena is Seattle's large concert and sports arena. This past weekend they held the Pac-12 Women's Basketball Tournament:

Key Arena Calendar of Events

Let's get our GPS coordinates for Key Arena from Google Maps:

Key Arena Geolocation on Google Maps

Then, I created a moment and tweaked it to cover a longer time range, spanning the weekend's tweets:

Create a moment to search Key Arena Basketball Tournament

And, here are some of the results. My favorite is:

"I wanna leave this basketball game. I hate basketball."
Twitter Search Results for Key Arena Tournament

For the most part, it seems to me that Instagram's API is far more powerful than Twitter's and yields generally more intriguing results. However, it depends on the kind of person that you're looking for. If you just want to identify people who were there, either API works well.

What We've Learned

I hope you've enjoyed this series. I found it fascinating and was impressed by the results. And it highlights the concerns we should all have about our level of privacy in this interconnected digital age.

The APIs for Instagram and Twitter are both incredibly powerful services for finding social media users who were near certain places at certain times. This information can be used for good and it can be abused. You should probably consider turning off your geolocation posting—follow the links at the Ready or Not? app.

You may also want to check out my Building With the Twitter API series, also on Tuts+.

Please feel free to post your questions and comments below. You can also reach me on Twitter @reifman or email me directly. I'd especially appreciate hearing from journalists and law enforcement that make use of these examples.

You can also browse my Tuts+ instructor page to see other tutorials I've written. 
