Building With the Twitter API: Analyzing Your Followers


Introduction

Welcome to the latest episode of our Twitter API series. In our last episode, I built Twixxr.com, which lets you discover influential women on Twitter for your account to follow. Today, I'm going to turn the focus inward to look at my own followers.

While I haven't really used Facebook since 2013, I've remained active on Twitter, even as it pumped my feed with ads and annoyed me by trying to algorithmically optimize it.

Recently, I was verified and started to gather followers at a slightly faster rate. I was hopeful that I might see more response to my tweets. Generally, I've been surprised at how little response there usually is on Twitter for the average person.

Figure: reifman profile on Twitter with verified checkmark

I have nearly 1,900 followers, but people rarely comment on or retweet pieces that I think are important and of general interest. For example, not a single person shared my piece on the sharp spike in rape reports in Seattle or my commentary on Bill Gates at his most outrageously hypocritical.

For a long time I've wanted to look more closely at my Twitter followers and answer some questions: Who exactly is following me? And why aren't they more interactive? Is it possible that only 10% of my followers are real people? 

Twitter's been having trouble finding a buyer, and maybe this has something to do with it.

The Twitter API is a good tool for investigating this. Yet it has a ton of rate limits that make even something simple like analyzing your followers quite complex. In today's episode, I'll show you how I worked within the rate limits to assess my followers and build a scoreboard of them.

If you have any questions or feedback, please post them below in the comments or reach out to me on Twitter @reifman.

Analyze Our Twitter Followers

Figure: Initial scoreboard

Just above, you can see the basic scoreboard I've created. Today's episode will focus mostly on the infrastructure and approach I took to create this. I hope I get a chance to write more about improving the scoring mechanism.

And yes, as you can see above, renowned gay rights leader and sex advice columnist Dan Savage follows me but never retweets anything I share. If there's time today, we'll analyze this to answer important questions like: is he real, a bot, or just following me for personal sex advice? What can we learn from his account to determine whether he's likely ever to interact with me on Twitter or, for that matter, any of my other followers?

The scoreboard code is mostly a prototype that I've built on top of the Twixxr code from the last episode, but it's not a live demo for people to use. I'm sharing it so you can learn from it and build on it yourself.

Here are the basic elements of the code:

  • Creating the database to store my followers and related data.
  • Downloading my followers in pages of 20 followers each.
  • Tracking the cursors for the pages as I download 15 pages per rate-limited window.
  • Storing data collected about my followers in the database.
  • Building a prototype scoring algorithm to score all of the followers.
  • Building a view to browse the scoreboard.

Diving Into the Code

Creating the Database Table Migrations

I created three different tables to store all the data and help me work with the Twitter API rate limiting. If you're not familiar with Yii database migrations, please see How to Program With Yii2: Working With the Database and Active Record.

First, I extended the SocialProfile table to record a lot more data from the followers' accounts, such as whether they are verified, their location, and how many items they've favorited:

Then, I built an indexing table called SocialFriend to track followers for specific accounts. If I decide to formalize this service publicly, I'll need this. It links the User table with the user's followers in the SocialProfile table.

Next, the Twitter API makes you page through followers, 20 at a time by default. To request the next page, you have to track the cursors, essentially tags that mark where the next page begins.

Since you're only allowed to make 15 requests for followers every 15 minutes, you have to store these cursors in the database. The table is called SocialCursor:

Eventually, I'll build background cron tasks to manage all this, but for today's prototype, I'm running these tasks by hand.
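To see why the cursors have to survive between runs, here is a rough back-of-the-envelope calculation in plain PHP. The function and parameter names are my own for illustration and are not part of the tutorial's codebase; the inputs are the figures mentioned in this article (roughly 1,900 followers, 20 per page, 15 requests per 15-minute window).

```php
<?php
declare(strict_types=1);

/**
 * Estimate how many pages and rate-limit windows a full follower download needs.
 * All names here are illustrative assumptions, not from the tutorial's code.
 */
function estimateFollowerDownload(
    int $followers,
    int $perPage = 20,
    int $requestsPerWindow = 15,
    int $windowMinutes = 15
): array {
    $pages = (int) ceil($followers / $perPage);
    $windows = (int) ceil($pages / $requestsPerWindow);
    // The first window starts immediately; each later window costs one full wait.
    $waitMinutes = max(0, $windows - 1) * $windowMinutes;

    return ['pages' => $pages, 'windows' => $windows, 'waitMinutes' => $waitMinutes];
}

$estimate = estimateFollowerDownload(1900);
printf(
    "%d pages, %d windows, at least %d minutes of waiting\n",
    $estimate['pages'],
    $estimate['windows'],
    $estimate['waitMinutes']
);
// prints "95 pages, 7 windows, at least 90 minutes of waiting"
```

With about 95 pages spread over seven windows, a single script run cannot finish the job, which is why the position in the list has to be stored somewhere durable.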

Collecting the Followers and Their Account Data

Next, I created a method Twitter::getFollowers() to make the request. Here are the basics of the code:

It gets the next_cursor and repeatedly asks for followers with $followers = $this->connection->get("followers/list",['cursor'=>$next_cursor]) until it hits the rate limit.
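The general shape of that loop can be sketched without the framework. This is not the actual Twitter::getFollowers() method: $fetchPage is a hypothetical stand-in for the real API call, and the fake fetcher below just serves numbers from an array so the example runs on its own. It assumes the API convention that -1 requests the first page and a next_cursor of 0 means there are no more pages.

```php
<?php
declare(strict_types=1);

/**
 * Page through followers until the request budget for this window is spent
 * or the API reports that there are no more pages.
 *
 * $fetchPage is a hypothetical stand-in for the real API call. It takes a
 * cursor and returns ['users' => [...], 'next_cursor' => int].
 */
function downloadFollowerPages(callable $fetchPage, int $cursor = -1, int $maxRequests = 15): array
{
    $users = [];
    $requests = 0;
    // -1 asks for the first page; a next_cursor of 0 marks the end of the list.
    while ($cursor !== 0 && $requests < $maxRequests) {
        $page = $fetchPage($cursor);
        $requests++;
        foreach ($page['users'] as $user) {
            $users[] = $user;
        }
        $cursor = $page['next_cursor'];
    }

    // The returned cursor is the value to persist so the next window can resume.
    return ['users' => $users, 'next_cursor' => $cursor, 'requests' => $requests];
}

// Fake API for demonstration: 50 followers, 20 per page, cursors are plain offsets.
$fakeFollowers = range(1, 50);
$fakeFetch = function (int $cursor) use ($fakeFollowers): array {
    $offset = $cursor === -1 ? 0 : $cursor;
    $next = $offset + 20 < count($fakeFollowers) ? $offset + 20 : 0;

    return ['users' => array_slice($fakeFollowers, $offset, 20), 'next_cursor' => $next];
};

// Use a budget of 2 requests per run to imitate hitting the rate limit.
$firstRun = downloadFollowerPages($fakeFetch, -1, 2);
$secondRun = downloadFollowerPages($fakeFetch, $firstRun['next_cursor'], 2);

printf("first run: %d users, resume at cursor %d\n", count($firstRun['users']), $firstRun['next_cursor']);
printf("second run: %d users, next cursor %d\n", count($secondRun['users']), $secondRun['next_cursor']);
```

In a real run, the cursor returned at the end of one window would be written to the cursor table and read back when the next window opens.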

The output looks something like this as it runs through each page of 20 results:

The data is stored by the $sp->add($user_id,$u) calls. The SocialProfile::add() method is a different version of the fill() method from the Twixxr tutorial. It stores more data and manages the SocialFriend index:

It's written to save new records or update old records so that in the future you could track your follower data and update it regularly, overwriting old data.

The last section of the method makes sure there is a SocialFriend index entry linking the User table and the SocialProfile table.
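The save-or-update behavior and the index check can be illustrated with a small in-memory sketch. This is not SocialProfile::add() itself: the FollowerStore class, its properties, and the choice of stored fields are assumptions made for the example, with plain arrays standing in for the database tables. The field names (id_str, screen_name, followers_count, friends_count, verified) are those of the Twitter user object.

```php
<?php
declare(strict_types=1);

/**
 * In-memory stand-in for the SocialProfile and SocialFriend tables.
 * Class, property, and method names are illustrative assumptions.
 */
final class FollowerStore
{
    /** @var array<int|string, array> profiles keyed by Twitter user id */
    public array $profiles = [];

    /** @var array<string, bool> links keyed as "userId:twitterId" */
    public array $friendLinks = [];

    /** Insert or update a profile, then make sure the user-to-follower link exists. */
    public function add(int $userId, array $twitterUser): string
    {
        $twitterId = (string) $twitterUser['id_str'];
        $result = isset($this->profiles[$twitterId]) ? 'updated' : 'created';

        // New data overwrites whatever was stored before for this account.
        $this->profiles[$twitterId] = [
            'screen_name' => $twitterUser['screen_name'],
            'followers_count' => $twitterUser['followers_count'],
            'friends_count' => $twitterUser['friends_count'],
            'verified' => $twitterUser['verified'],
        ];

        // Setting the same key twice is harmless, so re-running never duplicates links.
        $this->friendLinks[$userId . ':' . $twitterId] = true;

        return $result;
    }
}

$store = new FollowerStore();
$payload = [
    'id_str' => '12345',
    'screen_name' => 'example_user',
    'followers_count' => 10,
    'friends_count' => 20,
    'verified' => false,
];

$first = $store->add(1, $payload);
$payload['followers_count'] = 11; // the same account seen again later
$second = $store->add(1, $payload);

echo $first, ' then ', $second, "\n"; // prints "created then updated"
```

Because both the profile and the link are keyed by id, running the download again refreshes the data instead of piling up duplicates.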

Scoring Twitter Followers

Figure: Accounts with huge numbers of friends unlikely to read my tweets

I had a handful of goals for my Twitter scoring:

  • Eliminate accounts that follow everyone who follows them. For example, they have 12,548 followers and follow 12,392 people (see above).
  • Eliminate accounts following more than, say, 1,500 accounts, since they are unlikely to ever see what I share. For example, Dan Savage follows 1,536 people.
  • Eliminate accounts that have very few posts or follow very few accounts; these are likely abandoned.
  • Eliminate accounts with few favorites; these are likely bots that aren't really using the app.

Similarly, I wanted to highlight some positive aspects:

  • Accounts that are verified
  • Accounts that have lots of followers 
  • Accounts that follow fewer than 1,000 people, a sweet spot to me

Here's some rough basic code from SocialProfile::score() that highlights some of the positives:

Here's some code that eliminates some of the bad accounts:

Obviously, there's a lot to play with here and a variety of ways to improve this. I hope I get a chance to spend more time on this.
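To make the goals above concrete, here is a minimal standalone sketch of how such rules might be combined. It is not the SocialProfile::score() method: the only numbers taken from this article are the 1,500 and 1,000 friend cutoffs, and every other threshold and point value is a placeholder to experiment with. The array keys follow the field names in the Twitter user object.

```php
<?php
declare(strict_types=1);

/**
 * Prototype follower score. Apart from the 1,500 and 1,000 friend cutoffs,
 * every threshold and point value is an illustrative guess.
 */
function scoreFollower(array $p): int
{
    $followers = $p['followers_count'];
    $friends = $p['friends_count'];

    // Follow-back accounts: large, with follower and friend counts within 10% of each other.
    if ($followers > 1000 && abs($followers - $friends) < 0.1 * $followers) {
        return 0;
    }
    // Follows too many accounts to be likely to notice any one of them.
    if ($friends > 1500) {
        return 0;
    }
    // Very few posts or very few friends: likely abandoned.
    if ($p['statuses_count'] < 10 || $friends < 10) {
        return 0;
    }
    // Few favorites: rarely interacts, possibly a bot.
    if ($p['favourites_count'] < 5) {
        return 0;
    }

    $score = 10; // baseline for any account that survives the filters
    if ($p['verified']) {
        $score += 50;
    }
    if ($followers > 10000) {
        $score += 20;
    } elseif ($followers > 1000) {
        $score += 10;
    }
    if ($friends < 1000) {
        $score += 20; // the "sweet spot" of following fewer than 1,000 people
    }

    return $score;
}

$followBack = ['followers_count' => 12548, 'friends_count' => 12392,
    'statuses_count' => 900, 'favourites_count' => 40, 'verified' => false];
$engaged = ['followers_count' => 5000, 'friends_count' => 400,
    'statuses_count' => 3000, 'favourites_count' => 200, 'verified' => true];

printf("follow-back: %d, engaged: %d\n", scoreFollower($followBack), scoreFollower($engaged));
// prints "follow-back: 0, engaged: 90"
```

Returning 0 early keeps the eliminations separate from the bonuses, which makes it easier to tune one set of rules without disturbing the other.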

As the method runs, its output looks like this, and it updates the SocialProfile table with scores as it goes:

Displaying the Scoreboard

Yii's default grid makes it pretty easy to display the SocialProfile table and customize the scoreboard columns.

Here's SocialProfileController::actionIndex():

And here's the grid view customized:

Here's what the top scores look like with my initial algorithm:

Figure: Top Scoring Accounts

There are so many ways to improve and tune the scoring. I look forward to playing with it more.

And there's more I'd like to code and more of the API I'd like to use, for example:

  • Use the PHP Gender extension to help separate companies from people (companies don't interact much).
  • Look up the frequency of posts that people have made and the last time they used Twitter.
  • Use Twitter's search API to see which followers have actually ever interacted with my content.
  • Provide feedback to the scoring to tune it.
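As one possible starting point for the posting-frequency idea, here is a hedged sketch that derives rough activity figures from data the user object already carries. It assumes the created_at and statuses_count fields, plus the embedded status (latest tweet) when the API includes it. The function names are hypothetical, and the sample dates are made up for the demonstration.

```php
<?php
declare(strict_types=1);

// Format of Twitter's created_at strings, e.g. "Wed Oct 10 20:19:24 +0000 2018".
const TWITTER_DATE_FORMAT = 'D M d H:i:s O Y';

function parseTwitterDate(string $value): DateTimeImmutable
{
    $date = DateTimeImmutable::createFromFormat(TWITTER_DATE_FORMAT, $value);
    if ($date === false) {
        throw new InvalidArgumentException("Unrecognized date: $value");
    }

    return $date;
}

/**
 * Rough activity figures for one follower (hypothetical helper).
 * Returns average tweets per day and days since the latest tweet, if known.
 */
function activitySummary(array $user, DateTimeImmutable $now): array
{
    // Treat accounts younger than a day as one day old to avoid dividing by zero.
    $accountAgeDays = max(1, $now->diff(parseTwitterDate($user['created_at']))->days);
    $lastTweetAt = $user['status']['created_at'] ?? null;

    return [
        'tweets_per_day' => (float) ($user['statuses_count'] / $accountAgeDays),
        'days_since_last_tweet' => $lastTweetAt === null
            ? null
            : $now->diff(parseTwitterDate($lastTweetAt))->days,
    ];
}

// Made-up sample data for the demonstration.
$now = parseTwitterDate('Sat Oct 20 20:19:24 +0000 2018');
$user = [
    'created_at' => 'Wed Oct 10 20:19:24 +0000 2018',
    'statuses_count' => 50,
    'status' => ['created_at' => 'Thu Oct 18 20:19:24 +0000 2018'],
];

$summary = activitySummary($user, $now);
printf("%.1f tweets/day, last tweet %d days ago\n", $summary['tweets_per_day'], $summary['days_since_last_tweet']);
// prints "5.0 tweets/day, last tweet 2 days ago"
```

Figures like these could feed into the score, for example by penalizing accounts that haven't tweeted in months.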

Looking Ahead

I hope you find the scoring approach intriguing. There's so much more that can be done to improve this. Please feel free to play with it and post your ideas below.

If you have any questions or suggestions, please post them in the comments. If you'd like to keep up on my future Envato Tuts+ tutorials and other series, please visit my instructor page or follow @reifman. Definitely check out my startup series and Meeting Planner.
