In [1]:
%matplotlib inline

import json
import codecs

Topic 2: Collecting Social Media Data

This notebook contains examples of using web-based APIs (Application Programming Interfaces) to download data from social media platforms. Our examples will include:

  • Reddit
  • Facebook
  • Twitter

For most services, we need to register with the platform in order to use their API. Instructions for the registration processes are outlined in each specific section below.

We will use APIs for three reasons: they can be much faster than manually copying and pasting data from the web site, they provide uniform methods for accessing resources (searching by keyword, place, or date), and using them should conform to the platform's terms of service (important for partnering and publications). Note, however, that each of these platforms enforces strict access limits: e.g., requests per hour, search history depth, and maximum number of items returned per request.
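
To make the idea of an access limit concrete, here is a minimal sketch of a helper that spaces out requests. The `Throttle` class and the budget of 60 requests per minute are our own illustrative assumptions, not values published by any of the platforms; check each platform's documentation for its real limits.

```python
import time


class Throttle:
    """Space out calls so that at most `max_calls` happen per `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls  # seconds between consecutive calls
        self.clock = clock    # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Hypothetical budget: 60 requests per minute, i.e., one request per second.
throttle = Throttle(max_calls=60, period=60.0)
```

Calling `throttle.wait()` immediately before each API request keeps a loop under the chosen budget. The libraries used below may already do some rate limiting of their own, so treat this as a fallback rather than a requirement.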


Topic 2.1: Reddit API

Reddit's API used to be the easiest to use because it did not require credentials to access data on its subreddit pages. Unfortunately, this has changed, and developers now need to create a Reddit application on Reddit's app page (https://www.reddit.com/prefs/apps/).

In [3]:
# For our first piece of code, we need to import the package
# that connects to Reddit. PRAW is a thin wrapper around Reddit's
# web APIs and works well

import praw

Creating a Reddit Application

Go to https://www.reddit.com/prefs/apps/. Scroll down to "create application", select "web app", and provide a name, description, and URL (which can be anything).

After you press "create app", you will be redirected to a new page with information about your application. Copy the unique identifiers below "web app" and beside "secret". These are your client_id and client_secret values, which you need below.

In [4]:
# Now we specify a "unique" user agent for our code.
# Reddit uses it to identify the client making requests, and
# user agents associated with bad actors may be blocked.
redditApi = praw.Reddit(client_id='xxx',
                        client_secret='xxx',
                        user_agent='crisis_informatics_v01')
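
Pasting secrets directly into a notebook makes it easy to share them by accident. As an alternative, here is a minimal sketch that reads them from environment variables. The function `load_credentials` and the variable names `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` are our own hypothetical choices, not something Reddit or PRAW prescribes.

```python
import os


def load_credentials(prefix, names, environ=os.environ):
    """Read PREFIX_NAME variables from the environment into a dict.

    Raises KeyError naming every missing variable, so a typo fails
    early rather than at the first API call.
    """
    values, missing = {}, []
    for name in names:
        key = f"{prefix}_{name}".upper()
        if environ.get(key):
            values[name] = environ[key]
        else:
            missing.append(key)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return values


# Usage sketch (assumes the two hypothetical variables are set):
# creds = load_credentials("reddit", ["client_id", "client_secret"])
# redditApi = praw.Reddit(user_agent='crisis_informatics_v01', **creds)
```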

Capturing Reddit Posts

Now for a given subreddit, we can get the newest posts to that sub. Post titles are generally short, so you could treat them as something similar to a tweet.

In [5]:
subreddit = "worldnews"

targetSub = redditApi.subreddit(subreddit)

submissions = targetSub.new(limit=10)
for post in submissions:
    print(post.title)
Space shuttle relic to be resurrected as deep-space habitat
Indian Muslim Woman Raped And Murdered In London 'Honour Killing' After Dating An Arab Muslim
Polish president signs 1 of 3 contested laws on judiciary:
Study shows India can integrate 175 GW of renewable energy into its electricity grid
Trump savages 'very weak' Attorney General Jeff Sessions - BBC News
An Indian IT firm is building a million-dollar empire with an army of high school graduates
Invasions by alien species and global warming form a “deadly duo”, scientists have warned, with the march of Argentine ants in the UK a new example.
China is using technology to predict who is going to commit a crime
Study: Indian monsoons have strengthened over past 15 years - A 50-year dry spell has reversed, with more rain to come.
This woman says she was trafficked by a diplomat. And it happens all the time.
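
Printing titles is fine for a quick look, but for later analysis we usually want them on disk. Here is a minimal sketch that writes one JSON object per line using the `json` and `codecs` modules imported at the top of the notebook. The helper `save_posts` is our own, and it assumes each post object exposes `id` and `title` attributes; the stand-in objects below are hypothetical and only mimic that shape.

```python
import codecs
import json
import os
import tempfile
from types import SimpleNamespace


def save_posts(posts, path):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    count = 0
    with codecs.open(path, "w", encoding="utf-8") as handle:
        for post in posts:
            record = {"id": post.id, "title": post.title}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Stand-in objects (hypothetical ids) that mimic the attributes we rely on.
demo_posts = [
    SimpleNamespace(id="demo1", title="Space shuttle relic to be resurrected as deep-space habitat"),
    SimpleNamespace(id="demo2", title="Invasions by alien species and global warming form a “deadly duo”"),
]
demo_path = os.path.join(tempfile.mkdtemp(), "posts.jsonl")
written = save_posts(demo_posts, demo_path)
```

One record per line means the file can be appended to across runs and read back a line at a time with `json.loads`.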

Leveraging Reddit's Voting

Getting the new posts gives us the most up-to-date information. You can also get the "hot" posts, "top" posts, etc., which should be of higher quality, at least in theory. Caveat emptor.

In [6]:
subreddit = "worldnews"

targetSub = redditApi.subreddit(subreddit)

submissions = targetSub.hot(limit=5)
for post in submissions:
    print(post.title)
Japan aims to reduce suicides by 30% in 10 years with measures to curb overwork
Iranian TV host and proponent of Islamic dress code caught drinking beer, without hijab in Switzerland
New blood test can check for 13 types of cancers
Pope Francis shuts off Vatican fountains amid Italy drought - BBC News
Saudi lobby pays $138,000 for anti-Qatar ads in the US
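
Votes can also be used as an explicit filter on whatever listing we pull. Here is a minimal sketch that keeps only posts at or above a score threshold. It assumes each post object carries a numeric `score` attribute; the helper `filter_by_score`, the threshold, and the stand-in posts are hypothetical.

```python
from types import SimpleNamespace


def filter_by_score(posts, min_score):
    """Keep posts whose score is at least `min_score`, highest first."""
    def score_of(post):
        # Treat a missing score as zero rather than failing
        return getattr(post, "score", 0)

    kept = [post for post in posts if score_of(post) >= min_score]
    return sorted(kept, key=score_of, reverse=True)


# Stand-in posts with made-up scores, for illustration only.
demo = [
    SimpleNamespace(title="low", score=5),
    SimpleNamespace(title="high", score=120),
    SimpleNamespace(title="mid", score=40),
]
popular_titles = [post.title for post in filter_by_score(demo, min_score=40)]
```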

Following Multiple Subreddits

Reddit has a mechanism called "multireddits" that essentially allows you to view multiple subreddits together as though they were one. To do this, concatenate your subreddits of interest using the "+" sign.

In [7]:
subreddit = "worldnews+aww"

targetSub = redditApi.subreddit(subreddit)
submissions = targetSub.new(limit=10)
for post in submissions:
    print(post.title)
I just wanted to share my rescue pup with everyone
Funny Cats Smile
Space shuttle relic to be resurrected as deep-space habitat
2 big loaves
Indian Muslim Woman Raped And Murdered In London 'Honour Killing' After Dating An Arab Muslim
Dogs in Istanbul almost gave me a heart attack!
Polish president signs 1 of 3 contested laws on judiciary:
They fought over this toy for half an hour and eventually tired themselves out
Study shows India can integrate 175 GW of renewable energy into its electricity grid
I've never passed up a kiss, but I've thought about it!
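
If the list of subreddits comes from a file or from user input, it helps to build the "+"-joined string programmatically. Here is a minimal sketch; the helper name `multireddit_name` and its cleanup rules (stripping whitespace, a leading "r/", and case-insensitive duplicates) are our own assumptions.

```python
def multireddit_name(names):
    """Join subreddit names with '+', dropping blanks, 'r/' prefixes, and duplicates."""
    seen, cleaned = set(), []
    for name in names:
        name = name.strip()
        if name.lower().startswith("r/"):
            name = name[2:]
        key = name.lower()
        if name and key not in seen:
            seen.add(key)
            cleaned.append(name)
    return "+".join(cleaned)


combined = multireddit_name(["worldnews", "r/aww", "worldnews"])
```

The resulting string can be passed to `redditApi.subreddit(...)` exactly as in the cell above.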

Accessing Reddit Comments

While you're never supposed to read the comments, for certain live streams or new and rising posts, the comments may provide useful insight into events on the ground or people's sentiment. New posts may not have comments yet though.

Comments are attached to the post title, so for a given submission, you can pull its comments directly.

Note that Reddit returns comments in pages to prevent server overload, so you will not get all comments at once and will have to write extra code to fetch more than the first batch. PRAW marks the not-yet-loaded remainder with placeholder objects such as MoreComments.

In [8]:
subreddit = "worldnews"

breadthCommentCount = 5

targetSub = redditApi.subreddit(subreddit)
submissions = targetSub.hot(limit=1)
for post in submissions:
    print (post.title)
    
    post.comment_limit = breadthCommentCount
    
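    # Get the top few comments. Note that .list() flattens the whole
    # comment tree, so a reply is printed once under its parent and
    # again as its own entry (visible in the output below).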
    for comment in post.comments.list():
        if isinstance(comment, praw.models.MoreComments):
            continue
        
        print ("---", comment.name, "---")
        print ("\t", comment.body)
        
        for reply in comment.replies.list():
            if isinstance(reply, praw.models.MoreComments):
                continue
            
            print ("\t", "---", reply.name, "---")
            print ("\t\t", reply.body)
Japan aims to reduce suicides by 30% in 10 years with measures to curb overwork
--- t1_dkorzb1 ---
	 This looks very promising. From what Iv seen the largest problem in work place is over work and bullying bosses. So its good they are trying to address it. 
	 --- t1_dkp0psq ---
		 >bullying bosses

Ugh Konami can take a big hint. Hear the company is so morally bad, they would go after former employees that found new jobs in a non-competing company. Like a TV studio got contacted because they have an ex-kon working for them. https://arstechnica.com/business/2017/06/konami-reportedly-blacklisting-ex-employees-across-japanese-video-game-industry/

How Konami can do this is just shocking.
--- t1_dkoxq9c ---
	 Ha. One of the measures I've seen is Premium Friday (tl;dr: leave by 3:00 pm on the last Friday of the month). 

Of the people I've asked, only one of my friends and acquaintances CAN choose to take it. 
	 --- t1_dkoz7sh ---
		 Note that premium Friday is a suggestion by the government. Not a law. They solved people working themselves to death by suggesting to companies to let people leave early one day a month. 


I work in Japan and my company doesn't offer premium Friday, then again I have zero overtime (i work exactly 40h per week) so I don't care. Still so far all these "solutions" by the government can be called bandaids at best. 
--- t1_dkoxare ---
	 Suicide rates have been going down already but still high compared to the rest of the world.

http://www.japantimes.co.jp/news/2017/05/30/national/social-issues/preventive-efforts-seen-helping-2016-saw-another-decline-suicides-japan-21897/
--- t1_dkp0psq ---
	 >bullying bosses

Ugh Konami can take a big hint. Hear the company is so morally bad, they would go after former employees that found new jobs in a non-competing company. Like a TV studio got contacted because they have an ex-kon working for them. https://arstechnica.com/business/2017/06/konami-reportedly-blacklisting-ex-employees-across-japanese-video-game-industry/

How Konami can do this is just shocking.
--- t1_dkoz7sh ---
	 Note that premium Friday is a suggestion by the government. Not a law. They solved people working themselves to death by suggesting to companies to let people leave early one day a month. 


I work in Japan and my company doesn't offer premium Friday, then again I have zero overtime (i work exactly 40h per week) so I don't care. Still so far all these "solutions" by the government can be called bandaids at best. 

Other Functionality

Reddit has a deep comment structure, but the code above prints only two levels of indentation (a comment and its replies), so deeper nesting is not visible in the output. You can view PRAW's additional functionality, replete with examples, on its website: http://praw.readthedocs.io/
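
To show nesting at any depth, one option is a small recursive walk. The sketch below is ours, not part of PRAW: `walk_comments` assumes each comment has a `body` attribute and an iterable `replies` attribute, and it skips anything without a `body` (our assumption about how "load more" placeholders look). The stand-in objects reuse two comment names from the output above with placeholder bodies.

```python
from types import SimpleNamespace


def walk_comments(comments, depth=0):
    """Yield (depth, comment) pairs in depth-first order."""
    for comment in comments:
        if not hasattr(comment, "body"):
            continue  # placeholder object, nothing to print
        yield depth, comment
        yield from walk_comments(getattr(comment, "replies", []), depth + 1)


# Stand-in comment tree for illustration.
reply = SimpleNamespace(name="t1_dkp0psq", body="a reply", replies=[])
top = SimpleNamespace(name="t1_dkorzb1", body="a top-level comment", replies=[reply])
rows = [(depth, comment.name) for depth, comment in walk_comments([top])]
```

Iterating over the top-level comments only and recursing through `replies` visits each comment once, and the depth value can drive the indentation when printing.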


Topic 2.2: Facebook API

Getting access to Facebook's API is slightly easier than Twitter's in that you can go to the Graph API explorer, grab an access token, and immediately start playing around with the API. The access token isn't good forever though, so if you plan on doing long-term analysis or data capture, you'll need to go the full OAuth route and generate tokens using the approved paths.

In [10]:
# As before, the first thing we do is import the Facebook
# wrapper

import facebook

Connecting to the Facebook Graph

Facebook has a "Graph API" that lets you explore its social graph. Because of privacy concerns, however, Facebook's Graph API is extremely limited in the kinds of data it exposes. For instance, Graph API applications can now only view profiles of people who have already installed that particular application. These restrictions make it quite difficult to see much of Facebook's data.

That being said, Facebook does have many popular public pages (e.g., BBC World News), and articles or messages posted by these public pages are accessible. In addition, many posts and comments made in reply to these public posts are also publicly available for us to explore.

To connect to Facebook's API, though, we need an access token. Fortunately, for research and testing purposes, getting an access token is very easy.

Acquiring a Facebook Access Token

  1. Log in to your Facebook account
  2. Go to Facebook's Graph Explorer (https://developers.facebook.com/tools/explorer/)
  3. Copy the long string out of the "Access Token" box and paste it in the code cell below

In [11]:
fbAccessToken = "xxx"

Now we can use the Facebook Graph API with this temporary access token (it expires after a short time, perhaps 15 minutes).

In [12]:
# Connect to the graph API, note we use version 2.5
graph = facebook.GraphAPI(access_token=fbAccessToken, version='2.5')

Parsing Posts from a Public Page

To get a public page's posts, all you need is the name of the page. Then we can pull the page's feed, and for each post on the page, we can pull its comments and the name of the comment's author. While it's unlikely that we can get more user information than that, author name and sentiment or text analytics can give insight into bursting topics and demographics.

In [13]:
# What page to look at?
targetPage = "nytimes"

# Other options for pages:
# nytimes, bbc, bbcamerica, bbcafrica, redcross, disaster

maxPosts = 10 # How many posts should we pull?
maxComments = 5 # How many comments for each post?

feed = graph.get_object(id=targetPage + '/feed')

# For each post, print its message content and its ID
for v in feed["data"][:maxPosts]:
    print("---")
    # Not every post has a "message" field (e.g., photo-only posts)
    print(v.get("message", ""), v["id"])

    # For each comment on this post, print its number,
    # the name of the author, and the message content
    print("Comments:")
    comments = graph.get_object(id='%s/comments' % v["id"])
    for (i, comment) in enumerate(comments["data"][:maxComments]):
        print("\t", i, comment["from"]["name"], comment["message"])
---
"Now we're not only going to make it permissible, we're going to celebrate this behavior." 5281959998_10151243335634999
Comments:
	 0 Arif Javed Tbell should introduce some kinda special packaging tho so the Lyft drivers cars don't smell like tacos forever tho
	 1 Nate Downs Absolute power move on their part. People are buzzed, need a safe ride home and need something to eat. Makes complete sense
	 2 Dick Carbone Taco bell is the most authentic Mexican food I've ever had .
	 3 Arif Javed Beautiful
	 4 David McRoberts ???  thats not a thing already??
---
Republicans are making another attempt to try to repeal and possibly replace the Affordable Care Act. But will they have the votes this time? 5281959998_1894448467439333
Comments:
	 0 The New York Times Read more: http://nyti.ms/2uu8pm1
	 1 Brian Velazquez Apparently, Trump thinks his supporters don't want healthcare, better jobs, and a future for their children. They just want anger from Trump. And fortunately for Trump, anger is the only thing he has to offer.
	 2 Michelle Lambeau This is their last best chance to redistribute nearly a trillion dollars from our national treasury to the 1% by taking healthcare away from ordinary Americans, so it is likely Republicans will go all out. The New York Times, it would be useful to get an estimate of how much money each individual Senator stands to pocket should this succeed. Because, make no mistake, today's vote is win-win for Congress' millionaires (listed below): Either their tax cut bill succeeds and they pocket hundreds of thousands in tax cuts - or it fails and, like John McCain, they get to keep the benefits of Obamacare, shielded by their personal wealth from any of its lingering side effects - and interest in making it better for the rest of us.
https://en.wikipedia.org/wiki/List_of_current_members_of_the_United_States_Congress_by_wealth
	 3 Maria Kondonijakos Stupid and shortsighted.  All Americans want and need access to good and comprehensive healthcare.  Hope the voters remember this vote.
	 4 Donna Straley Hopefully, the few remaining Republicans with a conscience will continue to hold fast against their colleagues who are so determined to inflict harm on millions of Americans.
---
A study of the brains of 111 NFL players shows 110 had chronic degenerative injuries. "It is no longer debatable," said the pathologist. 5281959998_10151243442539999
Comments:
	 0 John Arruda Real simple...Get rid of the helmets and guys will stop using their head as a weapon...The game will become more like rugby..Less head injuries..But better between wistle fist fights.....Problem solved.
	 1 Joe Pankowski If the NFL is to survive in the coming decades, there will need to either be a significant improvement in head protection or a major change in the rules.  For example, imagine a league where any use of the head during play by an offensive or defensive player would result in immediate ejection.
	 2 Karen Malin If I had a kid, they'd never go near a football field!
	 3 Adrian Nikonian I'm here to learn from the many brain experts found on FB.....
	 4 Beth Adams But but what about all the American football fans that love to see men smash into each other and throw each other around?  What about those poor souls?  Don't they matter.
---
Breaking News: Paul Manafort spoke to one Senate panel about the June 2016 meeting with a Russian lawyer at Trump Tower, while another subpoenaed him. 5281959998_10151243415929999
Comments:
	 0 Michael R Friend More witch hunting
	 1 Stacy Jean If Captain Crazypants fires Mueller, we protest at our local courthouses. Immediately.
	 2 Samuel Mason There needs to be indictment and soon, because its time to end the reign of the Mad King.
	 3 Joaquín Cervantes Trump surrogates six months ago: “There were no meetings with any Russians.”
Trump surrogates three months ago:  “They just forgot to report the meetings with Russians.”
Trump surrogates three weeks ago: "There's no evidence of collusion with Russia!!"
Trump surrogates two weeks ago: "Who cares if the Trump campaign colluded with Russia?"
Trump surrogates last week: “It’s Obama’s fault for not stopping the meetings with Russians.” 
Trump’s lawyer Jay Sekulow last week: “It’s the Secret Service’s fault for not stopping the meeting with Russians.”   
Trump in a meeting with his aides a few days ago: “Can I pardon myself?” 
Trump tweet early Saturday morning: "While all agree the U. S. President has the complete power to pardon…”
Trump supporters tomorrow: “Трамп собирается сделать Россию великой снова!”
	 4 Walter Moreano If Trump was innocent he would've been quiet and focused on his job. But he's guilty as hell. Which explains why he still brings up Clinton and repeatedly denounces the Russian investigation. His top campaign people and family are all GUILTY OF COLLUSION.
---
Some experts fear these policies reinforce dated stereotypes. 5281959998_10151243020039999
Comments:
	 0 Ryan Kobos Gets additional days off. Demands the same pay. Makes sense to me.
	 1 Jessica Benjamin Women who have been victims of Female Genital Mutilation (FGM), which is sometimes practiced in India and the U.S., they have much more painful periods than most women.
	 2 Rua Uí Raighan I'm sure most women would only take time off if they were in a bad way with it, which is fair enough. If you've hired the right person in the first place, I doubt compassionate rules like these would harm business.
	 3 An American in Canada If you're in pain, then you should be able to take the day off.  Period!
	 4 Tara Tyrrell Period pain, childbirth, mental health issues, simply needing a day to yourself...I would settle for judgement-free paid sick leave in general, for anyone needing it.
---
How smugglers get drugs into the United States. Read more: http://nyti.ms/2vXeHIA 5281959998_1894418574108989
Comments:
	 0 Larry Monahan Invade Mexico and wipe out the drug cartels. Do the same at point of drug production facilities all over the world. In conjunction with the country of origin. But then we would solve drug crime and cut incarceration related jobs and subsidies so that won't happen. WAR on drugs. Nope. Too many people bribed with too much money. Corruption wins. The largest military force in the world is neutered to stop the enormously horrific affects of the flow of illegal drugs into America.
	 1 Casey Ross I was under the impression that all drugs were just lobbed over the border wall one kilo at time? Wait that's totally stupid and only a halfwit would think that?
Huh..
	 2 Jr Raphael Quevedo Or you could just loosen immigration laws and legalize weed. That would solve all of this overnight. Or we could just keep dying from opioid abuse and police brutality. You're choice!
	 3 Daniel Sergent Mine the approaches, robotic sentry turrets at regular intervals, smugglers must consume the entirety of their cargo personally, anyone caught Manning a tunnel will be buried alive in it. It won't stop it, but it will definitely slow things way down.
	 4 Eduardo Landero Gutierrez I live in México,  the thing is, if the powerfull nations keep asking drugs, it always are gonna be who sales it. Double moral.
---
Ticks are spreading more pathogens and putting more Americans at risk for rare illnesses. 5281959998_10151243061474999
Comments:
	 0 Carrie Appel "The best known threat is Lyme disease. Cases in the United States increased from about 12,000 annually in 1995 to nearly 40,000 in 2015. Experts say the real number of infections is likely closer to 300,000." Thank your local antivaxxers for the increase, they're directly responsible for campaigning against a Lyme disease vaccine.
	 1 Pedram Bagheri And nobody is recognizing warming climate and global warming as a culprit?
	 2 Holly Jo And it's high time the CDC started taking seriously the stories of people whose (often severe) symptoms persist long after treatment ends. My husband and I are now in year two of battling a bout of Lyme disease that nearly killed him last August. Lyme disease (and other tick borne illnesses) are nothing to trifle with.
	 3 Brad Scott Release hundreds of guinea hens and opossums, they eat tons of ticks. Give each family fowl or opossums, they'll keep your yards clear.
	 4 Alan Hill Yeah people should be scared of ticks when they are dying bankrupt and homeless will cancer under trumpdontcare...  no cancer screening, so arrive incurable stage 4 at emergency ...see what good that does you....  I got a $222,000 bill yesterday. My wife got a $45,000 bill for 2 hours at OUTPATIENTS with a kidney stone....  My mother in England, got a full hysterectomy at age 92 with no bill. Two days wait and a great job...  Do Americans even know how things are in the 1st world ? They have been told a truck load of lies and they believe them...
---
Here's a C-SPAN watcher's guide to what's about to happen in the Senate. 5281959998_10151243123984999
Comments:
	 0 Brian M. Painter Driver "If Jesus did anything, he offered health care wherever he went — and he never charged a leper a co-pay...Like most Americans, I know the Affordable Care Act is not perfect...However we address this question, taking health care away from millions who currently have it cannot be the answer. We know that for every million people without access to health care, five thousand people will die needlessly — not because God called them home, but because those entrusted by God with the responsibility of governance failed to defend the widow, the orphan and the poor, and instead succumbed to the temptations of greed. Whatever your political philosophy or party affiliation, God’s Word is clear about the responsibility of governance: Woe to those who make unjust laws, to those who issue oppressive decrees, to deprive the poor of their rights and withhold justice from the oppressed of my people, making widows their prey and robbing the fatherless. (Isaiah 10: 1–3)" Rev. Dr. William Barber, Greenleaf Christian Church (Disciples of Christ). #MoralResistance #RepairersoftheBreach #LoveThyNeighbor #PoorPeoplesCampaign
	 1 Daniel Betts PRAY FOR YOUR LEADERS .....that light would come into their minds and that loving kindness would come into their hearts 

PRAY FOR YOUR LEADERS to become more than this ....you and I deserve it 

BUT MOST OF ALL .....pray for your leaders .....to become more than this 

THE NATION DESERVES IT B| 

~ daniel 2017
	 2 Jane Martin When the people take charge of their lives.... and start saving their own money in their own bank.....and stop giving their money to an Insurance company bank .... to save for them...When your Insurance company pays ,,,,your medical bill.,.that is only a loan to you.... you must pay that loan back...how many have had their premiums go up.... after you made a claim for your money.....you have been paying every month...if the bill is over.... what you have paid in --that insurance company bank is going to get their money back.....one way or another....If you pay your self that premium ...every month,,, would not take long for every one to be very wealthy.....
	 3 Bill Hall Never watched C span? For all the pretensions, posing, and posturing, nothing happens in the Senate.
	 4 Badri R. Narayanan What debate? What are we debating? Can we see the text of whatever it is that is being debated?
---
About 2 million pounds of illegal drugs were seized last year. Here are some of the ways they are smuggled into the U.S. 5281959998_10151243082524999
Comments:
	 0 Karen Colleen Agena Decimalizing drugs and treating addiction as part healthcare could prevent a lot of the violence.
	 1 Dave Weinberger Kill the demand and there'll be no supply problem. It would put a lot of narcs out of work and make an agency or two redundant though.
	 2 Ellen Garrison As long as there is a demand for drugs, smugglers will do anything to get them here. The Trump Wall is not going to stop the flow of drugs into the US.
	 3 Ana Silva we have a part of the Berlin wall here in Portugal to symbolize what was ... one day i hope you send us a part of yours to show the meaning of walls in future History !! Walls mean captive !!! that wall is a false symbolsm of security the real meaning is CONTROL !! you will see !!
	 4 Bill Hall Smuggling drugs? Really? When any drug can be made or produced here.
Or just go to any doctor for a prescription that's partially or wholly covered by your insurance. Smuggled drugs. What? Do they burn whale oil in lanterns in their homes for light?
---
Employees at a Wisconsin tech company can choose to have a chip injected between their thumb and index finger. 5281959998_10151243172474999
Comments:
	 0 Allison Whodat Roohi I'd do it. So I don't have to find my swipe card for the lift or the turnstile or to charge my coffee in the shop to my department. You can always have it removed and it's not mandatory.
	 1 Joanna Hatcher Kraus I had a friend in high school, whose Mom used to watch a bunch of TV evangelists.. she thought her Mom was a little nuts, but my friend would say to me.. if anyone ever tries to micro chip you.. run for the mountains.. that was 30 years ago.. 

Not a good idea
	 2 Evan Downie Here is where the real separation of individual liberty from corporate/government monitoring comes into a clearly defined position.
The GOP, and all those who (supposedly) support "freedom" should come down firmly against this. So where will they actually stand?
	 3 Roberto Moreno The only article I've seen that states that this is a voluntary program. Everyone is freaking out cause they can't do proper research. If it's optional and the employees want to do it then who are we to stop them? Is it strange? Yes. I wouldn't do it personally cause that's what employee ID cards are for.
	 4 Kelly McLaughlin This sort of 'convenience' technology makes you a virtual slave to your employer. No.
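
For text analytics it is usually more convenient to collect rows than to print them. Here is a minimal sketch that turns the two response shapes used above (a feed with `data` entries holding `id` and `message`, and a comments list with `from.name` and `message`) into tuples, tolerating missing fields. The helper names and the sample dictionaries are hypothetical.

```python
def post_rows(feed, max_posts=10):
    """Turn a feed response into (post_id, message) pairs."""
    rows = []
    for item in feed.get("data", [])[:max_posts]:
        rows.append((item.get("id"), item.get("message", "")))
    return rows


def comment_rows(comments, max_comments=5):
    """Turn a comments response into (author_name, message) pairs."""
    rows = []
    for item in comments.get("data", [])[:max_comments]:
        author = item.get("from", {}).get("name", "(unknown)")
        rows.append((author, item.get("message", "")))
    return rows


# Made-up responses that mimic the structure, for illustration only.
sample_feed = {"data": [{"id": "123_1", "message": "First post"}, {"id": "123_2"}]}
sample_comments = {"data": [{"from": {"name": "Alice"}, "message": "Hello"}, {"message": "No author"}]}
posts = post_rows(sample_feed)
replies = comment_rows(sample_comments)
```

Lists of tuples like these can be written out with the `json` module or loaded into whatever analysis tool comes next.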


Topic 2.3: Twitter API

Twitter's API is probably the most useful and flexible but takes several steps to configure. To get access to the API, you first need to have a Twitter account and have a mobile phone number (or any number that can receive text messages) attached to that account. Then, we'll use Twitter's developer portal to create an "app" that will give us the keys and tokens (essentially IDs and passwords) we need to connect to the API.

So, in summary, the general steps are:

  1. Have a Twitter account,
  2. Configure your Twitter account with your mobile number,
  3. Create an app on Twitter's developer site, and
  4. Generate consumer and access keys and secrets.

We will then plug these four strings into the code below.

In [14]:
# For our first piece of code, we need to import the package 
# that connects to Twitter. Tweepy is a popular and fully featured
# implementation.

import tweepy

Creating Twitter Credentials

The walkthrough below gives more in-depth instructions for creating a Twitter account and configuring it to generate the credentials used by the following code.

First, we assume you already have a Twitter account. If you do not, either create one now or just follow along. See the attached figures.

  • Step 1. Create a Twitter account. If you haven't already done this, do so now at Twitter.com.

  • Step 2. Set your mobile number. Log into Twitter and go to "Settings." From there, click "Mobile" and fill in an SMS-enabled phone number. You will be asked to confirm this number once it's set, and you'll need to do so before you can create any apps in the next step.

  • Step 3. Create an app on Twitter's dev site. Go to apps.twitter.com and click the "Create New App" button. Fill in the "Name," "Description," and "Website" fields, leaving the callback field blank (we're not going to use it). Note that the website must be a fully qualified URL, so it should look like: http://test.url.com. Then scroll down, read the developer agreement, check the box to agree, and finally click "Create your Twitter application."

  • Step 4. Generate keys and tokens with this app. After your application has been created, you will see a summary page like the one below. Click "Keys and Access Tokens" to view and manage keys. Scroll down and click "Create my access token." After a moment, your page should refresh and show you four long strings of letters and numbers: a consumer key, a consumer secret, an access token, and an access secret (note these are case-sensitive!). Copy and paste these four strings into the quotes in the code cell below.

In [15]:
# Use the strings from your Twitter app webpage to populate these four
# variables. Be sure to put the strings BETWEEN the quotation marks
# to make each one a valid Python string.

consumer_key = "xxx"
consumer_secret = "xxx"
access_token = "xxx"
access_secret = "xxx"
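
A common slip is to run the rest of the notebook with one of the four values still set to its placeholder. Here is a minimal sketch of a check that can run before connecting; the helper `find_placeholders` is our own, and treating "xxx" as the placeholder simply mirrors the cell above.

```python
def find_placeholders(credentials, placeholder="xxx"):
    """Return the names of credentials that are empty or still the placeholder."""
    return sorted(
        name
        for name, value in credentials.items()
        if not value or value.strip() == placeholder
    )


# Example with two values filled in (made-up strings) and two left untouched.
unset = find_placeholders({
    "consumer_key": "made-up-key",
    "consumer_secret": "xxx",
    "access_token": "made-up-token",
    "access_secret": "",
})
```

If the returned list is not empty, it names exactly which strings still need to be pasted in.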

Connecting to Twitter

Once we have the authentication details set, we can connect to Twitter using the Tweepy OAuth handler, as below.

In [16]:
# Now we use the configured authentication information to connect
# to Twitter's API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

print("Connected to Twitter!")
Connected to Twitter!

Testing our Connection

Now that we are connected to Twitter, let's do a brief check that we can read tweets by pulling the first few tweets from our own timeline (or the account associated with your Twitter app) and printing them.

In [17]:
# Get tweets from our timeline
public_tweets = api.home_timeline()

# print the first five authors and tweet texts
for tweet in public_tweets[:5]:
    print (tweet.author.screen_name, tweet.author.name, "said:", tweet.text)
NASAhistory NASA History Office said: #OTD in 1909, Louis Blériot made his daring crossing of the English Channel in his Blériot XI aircraft. https://t.co/Ti37GEuHnZ
StateDept Department of State said: When you travel abroad, know that Jeff at @usembassysweden and @statedept have your back. https://t.co/UGh7cfgAx1 #AmericanHeroesWeek
richardbranson Richard Branson said: A wonderful example of what can happen when business works together with the not for profit sector… https://t.co/LToRbvlNa8
smod4real Sweet Meteor O'Death said: cc: @foxandfriends https://t.co/T6j63GTkmk
ENERGY Energy Department said: Learn how @NNSANews helps power ⚡ @USNavy ships ⚓ like the #USSGeraldFord ➡ https://t.co/ynjYZcdWn7 https://t.co/k4ysUiCkF6
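
Each tweet carries much more than its text. Here is a minimal sketch that pulls a few fields out of a raw tweet payload, assuming it is a plain dict with `created_at` and `entities` keys. The helper `summarize_tweet` is our own; the sample values are copied from the Energy Department tweet in this notebook and trimmed to the fields the helper reads.

```python
from datetime import datetime


def summarize_tweet(payload):
    """Pull a few commonly used fields out of a raw tweet dict."""
    entities = payload.get("entities", {})
    return {
        # Timestamps look like 'Tue Jul 25 15:35:05 +0000 2017'
        "created_at": datetime.strptime(payload["created_at"], "%a %b %d %H:%M:%S %z %Y"),
        "hashtags": [tag["text"] for tag in entities.get("hashtags", [])],
        "mentions": [user["screen_name"] for user in entities.get("user_mentions", [])],
        "urls": [url["expanded_url"] for url in entities.get("urls", [])],
    }


sample = {
    "created_at": "Tue Jul 25 15:35:05 +0000 2017",
    "entities": {
        "hashtags": [{"text": "USSGeraldFord"}],
        "user_mentions": [{"screen_name": "NNSANews"}, {"screen_name": "USNavy"}],
        "urls": [{"expanded_url": "https://nnsa.energy.gov/ourmission/poweringnavy"}],
    },
}
summary = summarize_tweet(sample)
```

Parsing `created_at` into a timezone-aware `datetime` makes it straightforward to sort tweets or bucket them by hour later on.
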
In [19]:
# Inspect the raw JSON payload of the last tweet from the loop above
tweet._json
Out[19]:
{'contributors': None,
 'coordinates': None,
 'created_at': 'Tue Jul 25 15:35:05 +0000 2017',
 'entities': {'hashtags': [{'indices': [59, 73], 'text': 'USSGeraldFord'}],
  'media': [{'display_url': 'pic.twitter.com/k4ysUiCkF6',
    'expanded_url': 'https://twitter.com/ENERGY/status/889871635234160640/photo/1',
    'id': 889861601142206465,
    'id_str': '889861601142206465',
    'indices': [100, 123],
    'media_url': 'http://pbs.twimg.com/media/DFlsaVtXcAExoCt.jpg',
    'media_url_https': 'https://pbs.twimg.com/media/DFlsaVtXcAExoCt.jpg',
    'sizes': {'large': {'h': 1367, 'resize': 'fit', 'w': 2048},
     'medium': {'h': 801, 'resize': 'fit', 'w': 1200},
     'small': {'h': 454, 'resize': 'fit', 'w': 680},
     'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
    'type': 'photo',
    'url': 'https://t.co/k4ysUiCkF6'}],
  'symbols': [],
  'urls': [{'display_url': 'nnsa.energy.gov/ourmission/pow…',
    'expanded_url': 'https://nnsa.energy.gov/ourmission/poweringnavy',
    'indices': [76, 99],
    'url': 'https://t.co/ynjYZcdWn7'}],
  'user_mentions': [{'id': 48148030,
    'id_str': '48148030',
    'indices': [10, 19],
    'name': 'NNSA',
    'screen_name': 'NNSANews'},
   {'id': 54885400,
    'id_str': '54885400',
    'indices': [34, 41],
    'name': 'U.S. Navy',
    'screen_name': 'USNavy'}]},
 'extended_entities': {'media': [{'display_url': 'pic.twitter.com/k4ysUiCkF6',
    'expanded_url': 'https://twitter.com/ENERGY/status/889871635234160640/photo/1',
    'id': 889861601142206465,
    'id_str': '889861601142206465',
    'indices': [100, 123],
    'media_url': 'http://pbs.twimg.com/media/DFlsaVtXcAExoCt.jpg',
    'media_url_https': 'https://pbs.twimg.com/media/DFlsaVtXcAExoCt.jpg',
    'sizes': {'large': {'h': 1367, 'resize': 'fit', 'w': 2048},
     'medium': {'h': 801, 'resize': 'fit', 'w': 1200},
     'small': {'h': 454, 'resize': 'fit', 'w': 680},
     'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
    'type': 'photo',
    'url': 'https://t.co/k4ysUiCkF6'}]},
 'favorite_count': 12,
 'favorited': False,
 'geo': None,
 'id': 889871635234160640,
 'id_str': '889871635234160640',
 'in_reply_to_screen_name': None,
 'in_reply_to_status_id': None,
 'in_reply_to_status_id_str': None,
 'in_reply_to_user_id': None,
 'in_reply_to_user_id_str': None,
 'is_quote_status': False,
 'lang': 'en',
 'place': None,
 'possibly_sensitive': False,
 'possibly_sensitive_appealable': False,
 'retweet_count': 5,
 'retweeted': False,
 'source': '<a href="https://studio.twitter.com" rel="nofollow">Media Studio</a>',
 'text': 'Learn how @NNSANews helps power ⚡ @USNavy ships ⚓ like the #USSGeraldFord ➡ https://t.co/ynjYZcdWn7 https://t.co/k4ysUiCkF6',
 'truncated': False,
 'user': {'contributors_enabled': False,
  'created_at': 'Tue Jul 13 18:17:40 +0000 2010',
  'default_profile': False,
  'default_profile_image': False,
  'description': 'Building the new energy economy. Reducing nuclear dangers & environmental risks. Expanding the frontiers of knowledge via innovative scientific research.',
  'entities': {'description': {'urls': []},
   'url': {'urls': [{'display_url': 'energy.gov',
      'expanded_url': 'http://energy.gov',
      'indices': [0, 23],
      'url': 'https://t.co/5NskeFjlam'}]}},
  'favourites_count': 2856,
  'follow_request_sent': False,
  'followers_count': 749323,
  'following': True,
  'friends_count': 538,
  'geo_enabled': False,
  'has_extended_profile': False,
  'id': 166252256,
  'id_str': '166252256',
  'is_translation_enabled': False,
  'is_translator': False,
  'lang': 'en',
  'listed_count': 6995,
  'location': 'Washington, DC',
  'name': 'Energy Department',
  'notifications': False,
  'profile_background_color': '464646',
  'profile_background_image_url': 'http://pbs.twimg.com/profile_background_images/378800000138649492/AAX2Itip.jpeg',
  'profile_background_image_url_https': 'https://pbs.twimg.com/profile_background_images/378800000138649492/AAX2Itip.jpeg',
  'profile_background_tile': False,
  'profile_banner_url': 'https://pbs.twimg.com/profile_banners/166252256/1489099959',
  'profile_image_url': 'http://pbs.twimg.com/profile_images/875374673470664704/Hr9dUs2u_normal.jpg',
  'profile_image_url_https': 'https://pbs.twimg.com/profile_images/875374673470664704/Hr9dUs2u_normal.jpg',
  'profile_link_color': '61AD00',
  'profile_sidebar_border_color': '000000',
  'profile_sidebar_fill_color': 'B3DB86',
  'profile_text_color': '333333',
  'profile_use_background_image': True,
  'protected': False,
  'screen_name': 'ENERGY',
  'statuses_count': 14478,
  'time_zone': 'Eastern Time (US & Canada)',
  'translator_type': 'none',
  'url': 'https://t.co/5NskeFjlam',
  'utc_offset': -14400,
  'verified': True}}
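
Most analyses only need a handful of the fields in a dictionary like the one above. The sketch below pulls out the author, hashtags, mentions, and expanded links from a trimmed-down copy of that output. The helper name `summarize_status` and the choice of fields are assumptions made for illustration and are not part of Tweepy.

```python
# A trimmed-down copy of the status dictionary shown above
sample_status = {
    "created_at": "Tue Jul 25 15:35:05 +0000 2017",
    "id_str": "889871635234160640",
    "retweet_count": 5,
    "favorite_count": 12,
    "entities": {
        "hashtags": [{"indices": [59, 73], "text": "USSGeraldFord"}],
        "user_mentions": [
            {"screen_name": "NNSANews", "name": "NNSA"},
            {"screen_name": "USNavy", "name": "U.S. Navy"},
        ],
        "urls": [{"expanded_url": "https://nnsa.energy.gov/ourmission/poweringnavy"}],
    },
    "user": {"screen_name": "ENERGY", "followers_count": 749323},
}


def summarize_status(status):
    """Hypothetical helper: pull a few commonly used fields out of a status dictionary."""
    entities = status.get("entities", {})
    return {
        "author": status["user"]["screen_name"],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        "links": [u["expanded_url"] for u in entities.get("urls", [])],
        "retweets": status.get("retweet_count", 0),
    }


summary = summarize_status(sample_status)
print(summary["author"], summary["hashtags"], summary["mentions"])
```

Using `.get()` with a default for the entity lists keeps the helper from failing on tweets that have no hashtags, mentions, or links.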

Searching Twitter for Keywords

Now that we're connected, we can search Twitter for specific keywords, much as you would with Twitter's search box. Although this search only goes back 7 days or 1,500 tweets (whichever limit is reached first), it can be powerful if the event you want to track has just started.

Note that you might have to deal with paging if you get lots of data. Twitter returns results one page at a time, with up to 100 tweets per page.

In [20]:
# Our search string
queryString = "earthquake"

# Perform the search
matchingTweets = api.search(queryString)

print ("Searched for:", queryString)
print ("Number found:", len(matchingTweets))

# For each tweet that matches our query, print the author and text
print ("\nTweets:")
for tweet in matchingTweets:
    print (tweet.author.screen_name, tweet.text)
Searched for: earthquake
Number found: 15

Tweets:
2010BestGreen How earthquake scientists eavesdrop on North Korea’s nuclear blasts https://t.co/a2NDP6JmlY https://t.co/E76rMpr9a8
mes200000 RT @HaikuVikingGal: Crude awakening: 37 years of oil spills in Alberta. And they wonder why BC opposes pipelines in an earthquake zone: htt…
IEgg_95 #일본지진 01:07:16 震度1 茨城 埼玉 東京 千葉  #earthquake https://t.co/XXBhK3M7JQ
twtaka_jp 【地震情報】 01:07:16 震度1 茨城 埼玉 東京 千葉  #地震 #jishin #earthquake #saigai #速報値 https://t.co/rdVaGEdG6A
kazkktn 01:07:16 震度1 茨城 埼玉 東京 千葉  #earthquake #Japan #地震 #terremoto https://t.co/E01eS6N0mX
EqMSNet [ #強震モニタ  ] #地震 #earthquake
01:08:02検出 独自第4報

推定震度: 1.3
10都道府県で検出: 茨城県 15.8gal,栃木県 3.2gal,群馬県 3.2gal,埼玉県 3.2gal… https://t.co/zjhMtjRJx6
jp_kizuna #強震モニタ  「 #地震情報 」🔊 01:07:16 震度1 茨城 埼玉 東京 千葉  #速報 #地震 #災害 #jishin #saigai  #津波  #earthquake #鹿児島県 #熊本県  #避難 https://t.co/vfXfxtT5JN
jaro_EarthQuake .@YuRi_jaro 01:07:17 震度1 茨城 埼玉 東京 千葉  #地震
yokuvariAlice 地震速報 01:07:17 震度1 茨城 埼玉 東京 千葉  #地震 #earthquake https://t.co/9kJaDCijob
Strong_monitor 強震速報【サブPCより配信中】 01:07:16 震度1 茨城 埼玉 東京 千葉  #地震 #jishin #earthquake https://t.co/qdFdpLZG9I
nao2577_B 【地震速報】 01:07:16 震度1 茨城 埼玉 東京   #地震 #earthquake https://t.co/ogTXPiec1I
earthquake_all 【極微小地震速報 北海道2/13】
2017/07/26 0:35:33 JST, 
日本 北海道 上士幌町役場の北西24km, 
M0.8, TNT239.0g, 深さ5.4km, 
MAP https://t.co/NRUC8p41Os 1524
ReadytoEvacuat 緊急地震速報 強い揺れに備えてください [地震情報https://t.co/Y7rsP9N78G] 01:07:15 震度1 茨城 #地震 https://t.co/fVr03K1sn6
earthquake_pro #earthquake #地震 01:07:15 震度1 茨城県 笠間 つくば南 下妻 常北 #地震 https://t.co/LH24MuoSmN
Linusdaddy2 自動:ゆれ感知 01:07:15 震度1 茨城 #地震 #jishin #earthquake https://t.co/tTlpWX9JTa

More Complex Queries

Twitter's Search API exposes many capabilities, like filtering for media, links, mentions, geolocations, dates, etc. We can access these capabilities directly with the search function.

For a list of operators Twitter supports, go here: https://dev.twitter.com/rest/public/search
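
Query strings with several operators are easy to mistype, so it can help to assemble them programmatically. The following is a minimal sketch; `build_query` is a hypothetical helper and not part of Tweepy. It assumes the `filter:`, `OR`, `-word`, and `from:` operators behave as described on the operator page linked above.

```python
def build_query(keyword, filters=(), exclude=(), from_user=None):
    """Hypothetical helper: assemble a Twitter search string from its parts."""
    parts = [keyword]

    # Several filters are OR-ed together and wrapped in parentheses
    if filters:
        joined = " OR ".join("filter:" + f for f in filters)
        parts.append("(" + joined + ")" if len(filters) > 1 else joined)

    # A leading minus sign excludes tweets containing that word
    parts.extend("-" + word for word in exclude)

    # Restrict results to a single author
    if from_user:
        parts.append("from:" + from_user)

    return " ".join(parts)


queryString = build_query("earthquake", filters=["media", "links"])
print(queryString)
```

The resulting string can be passed to the search call in the same way as a hand-written query.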

In [21]:
# Let's find only media or links about earthquakes
queryString = "earthquake (filter:media OR filter:links)"

# Perform the search
matchingTweets = api.search(queryString)

print ("Searched for:", queryString)
print ("Number found:", len(matchingTweets))

# For each tweet that matches our query, print the author and text
print ("\nTweets:")
for tweet in matchingTweets:
    print (tweet.author.screen_name, tweet.text)
Searched for: earthquake (filter:media OR filter:links)
Number found: 9

Tweets:
Takumi_lovesong RT @earthquake_jp: [気象庁情報]26日 00時50分頃 福島県沖(N37.3/E141.6)にて 最大震度2(M4.4)の地震が発生。 震源の深さは40km。( https://t.co/q5GRe2vnSA ) #saigai #jishin #earth…
Keith_Event I think Event: earthquake has occurred but I can't find where
Tue Jul 25 11:09:25 2017 CDT https://t.co/x15EhpfN5Q
LarkinWarren Opinion | The Playboy President and Women’s Health https://t.co/PRONQGUBq6   "Trump is a tremor compared to the earthquake coming."
bca027761 RT @_7777777: Earthquake 01:07:56 震度1 茨城 #地震 #earthquake2017 https://t.co/kg4wqZXXVB
earthquake_all 【微小地震速報 福島県3/14】
2017/07/26 0:36:27 JST, 
日本 福島県 富岡町役場の北東59km, 
M1.8, TNT7.6kg, 深さ34.5km, 
MAP https://t.co/NT8vLbajoD 1524
earthquake_pro #earthquake #地震 [波形]  01:07:15 茨城県 取手, 千葉県 松戸, 埼玉県 川口, 東京都 新宿 府中 #地震 https://t.co/TCJ9LLSuKy
MagicalIslandW RT @eew_jp: 地震速報 2017/07/26 00:50頃、福島県沖の深さ50kmでマグニチュード4.7の地震が発生しました。予想される最大震度は震度3です。 https://t.co/GRdSP8nROk #jishin #earthquake
cutie14377 RT @gagrulenet: Magnitude 3.0 earthquake strikes Armenia https://t.co/Tn1hfWhfAc https://t.co/vZ1fo9Yqgt
taylor_atx An awesome tutorial visualizing earthquake data with Python and Bokeh from @joannecheng 📈 https://t.co/qaK8dXoX4L

Dealing with Pages

As mentioned, Twitter serves results in pages. To get all results, we can use Tweepy's Cursor implementation, which handles this iteration through pages for us in the background.
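
To make the idea of paging concrete, here is a rough sketch of what a cursor does conceptually. It is not Tweepy's actual implementation. It assumes an ID-based scheme in which each request asks for items older than the last one seen, and `fake_search` is a hypothetical stand-in for a real API call so that the example runs without credentials.

```python
def fake_search(max_id=None, count=3):
    """Hypothetical stand-in for one API call: returns one page of tweet IDs,
    newest first, containing only IDs at or below max_id."""
    all_ids = [110, 109, 108, 107, 106, 105, 104, 103, 102, 101]
    eligible = [i for i in all_ids if max_id is None or i <= max_id]
    return eligible[:count]


def iterate_items(fetch_page, limit):
    """Yield up to `limit` items, requesting a new page whenever one runs out."""
    yielded = 0
    max_id = None
    while yielded < limit:
        page = fetch_page(max_id=max_id)
        if not page:
            return  # no more results available
        for item in page:
            if yielded >= limit:
                return
            yield item
            yielded += 1
        # Next time, ask only for items strictly older than the last one seen
        max_id = page[-1] - 1


collected = list(iterate_items(fake_search, limit=7))
print(collected)
```

The caller only sees a single stream of items; the page boundaries stay hidden inside the generator, which is the convenience a cursor provides.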

In [22]:
# Let's find only media or links about earthquakes
queryString = "earthquake (filter:media OR filter:links)"

# How many tweets should we fetch? Upper limit is 1,500
maxToReturn = 100

# Perform the search, and for each tweet that matches our query, 
# print the author and text
print ("\nTweets:")
for status in tweepy.Cursor(api.search, q=queryString).items(maxToReturn):
    print (status.author.screen_name, status.text)
Tweets:
avanzadarescate #SISMO  M 5.2, South of Kermadec Islands https://t.co/T1X2Q5NNJn #Quake #earthquake  #Jishin
travelmoneyfind Turkey: Summary - removal of information on the earthquake off the coast of Bodrum on 21 July 2017 #TravelTuesday https://t.co/9vjNYXc3yl
Gelatonilove411 RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
fripSide_pad RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
vw0234 RT @tenkijp_jishin: 26日0時50分頃、宮城県・福島県で最大震度2を観測する地震がありました。震源地は福島県沖、M4.4。この地震による津波の心配はありません。 https://t.co/mFaJHYjWeH #jishin
angelicscorn I added a video to a @YouTube playlist https://t.co/q6Z59djxuV 7/24/2017 -- English Channel Earthquake struck as expected -- New
TomoK0827 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
Moerunpa RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
_Millhiore_F_39 RT @tenkijp_jishin: 26日0時50分頃、宮城県・福島県で最大震度2を観測する地震がありました。震源地は福島県沖、M4.4。この地震による津波の心配はありません。 https://t.co/mFaJHYjWeH #jishin
kasumisou33 RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
filmloader RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
emic_maihime RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
Prioris_EVI 【震源・震度情報】
26日 1時7分頃、茨城県南部を震源とするM3.2の地震がありました。
この地震による津波の心配はありません。
https://t.co/7JfvIwMkJk https://t.co/ox8DhfIZKC
nonbirishiteru RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
ricka_may RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
hot0402 RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
YouichiroS RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
key_w_corculum RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
Yumemi_1203 RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
yossiknm RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
Lfhsbnovr RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
east_nakano_fun RT @earthquake_jp: [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #eart…
nmzu 【確定情報】26日1時7分頃に茨城県南部で震度2(M3.2)の地震が発生。震源の深さは50km。 https://t.co/hP7efKBa0f #jishin #earthquake #eqjp
toride_bot RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
earthquake_all 【微小地震速報 千葉県1/15】
2017/07/26 0:36:49 JST, 
日本 千葉県 銚子市役所の東南東25km, 
M1.2, TNT951.5g, 深さ14.9km, 
MAP https://t.co/6l9E7HeC7f 1524
earthquake_jp [気象庁情報]26日 01時07分頃 茨城県南部(N36.1/E139.9)にて 最大震度2(M3.2)の地震が発生。 震源の深さは50km。( https://t.co/eRjxavSwEf ) #saigai #jishin #earthquake
MrTAKA4 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
yabattfit1962 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
kashimonome32 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
fairy_m_s_7 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
earthquake_all5 【小地震速報 茨城県南部】
26日01時07分頃 JST, 
最大震度2, 
震源地 茨城県 坂東市役所の北6km, 
Mj3.2, TNT951.5kg, 
深さ 約50km, https://t.co/TCZbZdUhK5
0807u_u RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
ame_inu RT @earthquake_jp: [気象庁情報]26日 00時50分頃 福島県沖(N37.3/E141.6)にて 最大震度2(M4.4)の地震が発生。 震源の深さは40km。( https://t.co/q5GRe2vnSA ) #saigai #jishin #earth…
SciSeekFeed How earthquake scientists eavesdrop on North Korea’s nuclear blasts https://t.co/2lrckiP5X5
Sakura_Rin0208 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
twinews2 【震】[震源地] 茨城県南部 [最大震度] 震度2 (2017年7月26日 01時07分頃発生) - goo天気 https://t.co/fBMXJFjarZ
earthquake_mlt 2017年7月26日 1時07分ごろ、茨城県南部を震源とする地震がありました。震源の深さは50km、地震の規模はM3.2程度、最大震度は2と推定されています。この地震による津波の心配はありません。… https://t.co/szWMSAZTj3
supasio2267 RT @tenkijp_jishin: 26日0時50分頃、宮城県・福島県で最大震度2を観測する地震がありました。震源地は福島県沖、M4.4。この地震による津波の心配はありません。 https://t.co/mFaJHYjWeH #jishin
6thFleetHQ RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
filmloader RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
love_flower009 RT @tenkijp_jishin: 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
NewScienceWrld How earthquake scientists eavesdrop on North Korea’s #Nuclear blasts https://t.co/i3w9E4GCHb
tenkijp_jishin 26日1時7分頃、茨城県で最大震度2を観測する地震がありました。震源地は茨城県南部、M3.2。この地震による津波の心配はありません。 https://t.co/ZebB5jjzxy #jishin
yabattfit1962 RT @tenkijp_jishin: 26日0時50分頃、宮城県・福島県で最大震度2を観測する地震がありました。震源地は福島県沖、M4.4。この地震による津波の心配はありません。 https://t.co/mFaJHYjWeH #jishin
ichoosejoynow Have Fun Learning w/ Historical Stories of Survival- Great #SanFrancisco #Earthquake of 1906 - https://t.co/QAAc7TjUva #ihsnet #homeschool
mo_uuu6 RT @ReadytoEvacuat: 緊急地震速報 強い揺れに備えてください [地震情報https://t.co/Y7rsP9N78G] 01:07:15 震度1 茨城 #地震 https://t.co/fVr03K1sn6
Takumi_lovesong RT @earthquake_jp: [気象庁情報]26日 00時50分頃 福島県沖(N37.3/E141.6)にて 最大震度2(M4.4)の地震が発生。 震源の深さは40km。( https://t.co/q5GRe2vnSA ) #saigai #jishin #earth…
Keith_Event I think Event: earthquake has occurred but I can't find where
Tue Jul 25 11:09:25 2017 CDT https://t.co/x15EhpfN5Q
LarkinWarren Opinion | The Playboy President and Women’s Health https://t.co/PRONQGUBq6   "Trump is a tremor compared to the earthquake coming."
bca027761 RT @_7777777: Earthquake 01:07:56 震度1 茨城 #地震 #earthquake2017 https://t.co/kg4wqZXXVB
earthquake_all 【微小地震速報 福島県3/14】
2017/07/26 0:36:27 JST, 
日本 福島県 富岡町役場の北東59km, 
M1.8, TNT7.6kg, 深さ34.5km, 
MAP https://t.co/NT8vLbajoD 1524
earthquake_pro #earthquake #地震 [波形]  01:07:15 茨城県 取手, 千葉県 松戸, 埼玉県 川口, 東京都 新宿 府中 #地震 https://t.co/TCJ9LLSuKy
MagicalIslandW RT @eew_jp: 地震速報 2017/07/26 00:50頃、福島県沖の深さ50kmでマグニチュード4.7の地震が発生しました。予想される最大震度は震度3です。 https://t.co/GRdSP8nROk #jishin #earthquake
cutie14377 RT @gagrulenet: Magnitude 3.0 earthquake strikes Armenia https://t.co/Tn1hfWhfAc https://t.co/vZ1fo9Yqgt
taylor_atx An awesome tutorial visualizing earthquake data with Python and Bokeh from @joannecheng 📈 https://t.co/qaK8dXoX4L
_7777777 Earthquake 01:07:56 震度1 茨城 #地震 #earthquake2017 https://t.co/kg4wqZXXVB
nao2577_B 【地震速報】 [波形] 01:07 茨城県 つくば南 取手、千葉県 市川北 松戸、東京都 奥戸 東白鬚  #地震 #earthquake https://t.co/6xFBLoc403
akinco0403 RT @earthquake_jp: [速報LV1]26日 00時50分頃 福島県沖(N37.3/E141.7)(推定)にて M4.5(推定)の地震が発生。 震源の深さは推定30km。( https://t.co/pwtJb9DGvu ) #saigai #jishin #ea…
pcalleypbvusd RT @mom2teachk1: Having fun exploring NGSS today.  #GAFE4Littles Build structure to withstand an earthquake. #NTI https://t.co/fls2hCD1uK
DCIVIL_NET RT @GeoSciTweeps: The Dogger Bank earthquake of 7th June 1931 was, and still is, the largest earthquake ever recorded in the United Kingdom…
7kappe7 RT @tenkijp_jishin: 26日0時50分頃、宮城県・福島県で最大震度2を観測する地震がありました。震源地は福島県沖、M4.4。この地震による津波の心配はありません。 https://t.co/mFaJHYjWeH #jishin
NZAlert #NZ #Earthquake M3.2 quake causing weak shaking near Seddon https://t.co/9I4AWWSjnk … #CTCorp https://t.co/jdc7unvMne
2010BestGreen How earthquake scientists eavesdrop on North Korea’s nuclear blasts https://t.co/a2NDP6JmlY https://t.co/E76rMpr9a8
mes200000 RT @HaikuVikingGal: Crude awakening: 37 years of oil spills in Alberta. And they wonder why BC opposes pipelines in an earthquake zone: htt…
IEgg_95 #일본지진 01:07:16 震度1 茨城 埼玉 東京 千葉  #earthquake https://t.co/XXBhK3M7JQ
twtaka_jp 【地震情報】 01:07:16 震度1 茨城 埼玉 東京 千葉  #地震 #jishin #earthquake #saigai #速報値 https://t.co/rdVaGEdG6A
kazkktn 01:07:16 震度1 茨城 埼玉 東京 千葉  #earthquake #Japan #地震 #terremoto https://t.co/E01eS6N0mX
EqMSNet [ #強震モニタ  ] #地震 #earthquake
01:08:02検出 独自第4報

推定震度: 1.3
10都道府県で検出: 茨城県 15.8gal,栃木県 3.2gal,群馬県 3.2gal,埼玉県 3.2gal… https://t.co/zjhMtjRJx6
jp_kizuna #強震モニタ  「 #地震情報 」🔊 01:07:16 震度1 茨城 埼玉 東京 千葉  #速報 #地震 #災害 #jishin #saigai  #津波  #earthquake #鹿児島県 #熊本県  #避難 https://t.co/vfXfxtT5JN
yokuvariAlice 地震速報 01:07:17 震度1 茨城 埼玉 東京 千葉  #地震 #earthquake https://t.co/9kJaDCijob
Strong_monitor 強震速報【サブPCより配信中】 01:07:16 震度1 茨城 埼玉 東京 千葉  #地震 #jishin #earthquake https://t.co/qdFdpLZG9I
nao2577_B 【地震速報】 01:07:16 震度1 茨城 埼玉 東京   #地震 #earthquake https://t.co/ogTXPiec1I
earthquake_all 【極微小地震速報 北海道2/13】
2017/07/26 0:35:33 JST, 
日本 北海道 上士幌町役場の北西24km, 
M0.8, TNT239.0g, 深さ5.4km, 
MAP https://t.co/NRUC8p41Os 1524
ReadytoEvacuat 緊急地震速報 強い揺れに備えてください [地震情報https://t.co/Y7rsP9N78G] 01:07:15 震度1 茨城 #地震 https://t.co/fVr03K1sn6
earthquake_pro #earthquake #地震 01:07:15 震度1 茨城県 笠間 つくば南 下妻 常北 #地震 https://t.co/LH24MuoSmN
Linusdaddy2 自動:ゆれ感知 01:07:15 震度1 茨城 #地震 #jishin #earthquake https://t.co/tTlpWX9JTa
Earthquake_nyan 地震検知2017/07/26 01:07:59
第3報
時刻:01:07:50
計測震度:1.3
地震検知都道府県数:9
埼玉県 3.2gal,茨城県 15.8gal,栃木県 3.2gal,東京都 0.6gal… #地震 https://t.co/2J8DPohwwj
EqMSNet [ #強震モニタ  ] #地震 #earthquake
01:07:47検出 独自第3報

推定震度: 1.3
9都道府県で検出: 茨城県 15.8gal,栃木県 3.2gal,群馬県 3.2gal,埼玉県 3.2gal… https://t.co/7zLgmTKcRG
walkileaks Australian Survivor filmed during an earthquake https://t.co/ckS0ji1wTZ https://t.co/PGl2IbgoZO
Earthquake_nyan 地震検知2017/07/26 01:07:44
第2報
時刻:01:07:35
計測震度:1.3
地震検知都道府県数:7
埼玉県 3.2gal,茨城県 15.8gal,栃木県 3.2gal,東京都 0.6gal… #地震 https://t.co/AqGGvXyQpN
EqMSNet [ #強震モニタ  ] #地震 #earthquake
01:07:32検出 独自第2報

推定震度: 1.3
7都道府県で検出: 茨城県 15.8gal,栃木県 3.2gal,群馬県 3.2gal,埼玉県 3.2gal… https://t.co/0MYpKcLzP3
PropertyscoutsN M3.2 quake causing weak shaking near Seddon https://t.co/LzwouV6Ip1
ReadytoEvacuat 緊急地震速報 強い揺れに備えてください [地震情報https://t.co/Y7rsP9N78G] 01:07:15 群馬 栃木 茨城 東京 神奈川 予想震度1 #地震
Earthquake_nyan 地震検知2017/07/26 01:07:29
第1報
時刻:01:07:19
計測震度:0.1
地震検知都道府県数:1
埼玉県 0.4gal #地震 https://t.co/Pb6mphicUE
EqMSNet [ #強震モニタ  ] #地震 #earthquake
01:07:16検出 独自第1報

推定震度: 0未満
1都道府県で検出: 茨城県 3.2gal https://t.co/KDSZHNTbAE
77717771xyz RT @earthquake_jp: [速報LV4]26日 00時50分頃 福島県沖(N37.3/E141.6)(推定)にて M4.2(推定)の地震が発生。 震源の深さは推定44km。( https://t.co/7KYrKoZyDE ) #saigai #jishin #ea…
Mojo4Melo RT @seattletimes: A magnitude 2.7 earthquake rattled Bremerton area just after 2 a.m. https://t.co/opGq4hux8P https://t.co/q6tMvI9zNy
earthquake_all 【微小地震速報 鹿児島県3/12】
2017/07/26 0:31:01 JST, 
日本 鹿児島県 西之表市役所の東48km, 
M1.6, TNT3.8kg, 深さ28.7km, 
MAP https://t.co/b6I3kXmETb 1522
Earthquake_Lake RT @BleacherReport: CTE found in more than 99% of deceased NFL players' brains in study https://t.co/0TeGBXdVeK https://t.co/QXTDUpwks1
Sciencestweet New: How earthquake scientists eavesdrop on North Korea’s nuclear blasts https://t.co/q9JrqeUoKV
donapioii RT @USGSBigQuakes: Prelim M5.6 earthquake off the east coast of Honshu, Japan Jul-23 15:35 UTC, updates https://t.co/B8QRsZbDI2
donapioii RT @USGSBigQuakes: Prelim M5.5 earthquake Minahasa, Sulawesi, Indonesia Jul-23 07:55 UTC, updates https://t.co/SmrX9L7XcJ
kimunco545 RT @earthquake_jp: [気象庁情報]26日 00時50分頃 福島県沖(N37.3/E141.6)にて 最大震度2(M4.4)の地震が発生。 震源の深さは40km。( https://t.co/q5GRe2vnSA ) #saigai #jishin #earth…
PtrSerg RT @shizu__: @lana_liss @kimukimukimu81 @sushilpershad @mujahidgrw @Misultan75 @0226baba @SrgPtr @bcavalli3333 @AliAmohse @schettanya @PtrS…
science___news #Science How earthquake scientists eavesdrop on North Korea’s nuclear blasts https://t.co/sM3fDmHOqX
catrisky Earthquake breaks water fissure seal at GEM's Ghagoo mine - Mining MX https://t.co/PlgkfhngBG https://t.co/bfqLq21a7H
Tuned64 26日00:50 [ 最大震度 ] 震度 2 [ 震源地 ] 福島県沖 https://t.co/3DxxinD0fQ
5914f38359c44d2 RT @RatanSharda55: Intelligent Mom. Knows #RahulBabs speaking would create EarthQuake - of laughter 😁 https://t.co/BVw5RCuCun
VolcanoWatching RT @geonet: M3.2 quake causing weak shaking near Seddon https://t.co/j39jM6KwHk
PintoBeanz11 RT @mom2teachk1: Having fun exploring NGSS today.  #GAFE4Littles Build structure to withstand an earthquake. #NTI https://t.co/fls2hCD1uK

Other Search Functionality

The Tweepy wrapper and Twitter API are pretty extensive. You can do things like pull the last 3,200 tweets from other users' timelines, find all retweets of your account, get follower lists, search for users matching a query, etc.

More information on Tweepy's capabilities is available on its documentation page: (http://tweepy.readthedocs.io/en/v3.5.0/api.html)

Other information on the Twitter API is available here: (https://dev.twitter.com/rest/public/search).

Twitter Streaming

Up to this point, all of our work has been retrospective. An event has occurred, and we want to see how Twitter responded over some period of time.

To follow an event in real time, Twitter and Tweepy support Twitter streaming. Streaming is a bit complicated, but it essentially lets us track a set of keywords, places, or users.

To keep things simple, I will provide a small listener class and show how to print the first few tweets. Larger solutions exist specifically for handling Twitter streaming.

You could easily extend this code, though, to write data to a file rather than the console. I've marked where that code could be inserted.
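
One common way to store streamed tweets is to write one JSON object per line, which makes the file easy to read back later. The sketch below shows that round trip with an in-memory buffer standing in for a real file. The helper names `write_status` and `read_statuses` and the sample dictionaries are invented for illustration.

```python
import io
import json


def write_status(file_obj, status_dict):
    """Hypothetical helper: append one tweet dictionary as a single JSON line."""
    file_obj.write(json.dumps(status_dict) + "\n")


def read_statuses(file_obj):
    """Hypothetical helper: read the tweets back, skipping any blank lines."""
    return [json.loads(line) for line in file_obj if line.strip()]


# An in-memory buffer stands in for a file opened on disk
log = io.StringIO()
write_status(log, {"id_str": "1", "text": "first tweet"})
write_status(log, {"id_str": "2", "text": "地震 #earthquake"})

# Rewind and load everything back
log.seek(0)
restored = read_statuses(log)
print(len(restored), restored[1]["text"])
```

Serializing with `json.dumps` rather than Python's default string formatting matters here: the printed form of a dictionary uses single quotes and `None`/`True`/`False`, which `json.loads` cannot parse.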

In [ ]:
# First, we need to create our own listener for the stream
# that will stop after a few tweets
class LocalStreamListener(tweepy.StreamListener):
    """A simple stream listener that breaks out after X tweets"""
    
    # Max number of tweets
    maxTweetCount = 10
    
    # Set current counter
    def __init__(self):
        tweepy.StreamListener.__init__(self)
        self.currentTweetCount = 0
        
        # For writing out to a file
        self.filePtr = None
        
    # Create a log file
    def set_log_file(self, newFile):
        if ( self.filePtr ):
            self.filePtr.close()
            
        self.filePtr = newFile
        
    # Close log file and forget it, so later statuses are not
    # written to a closed file
    def close_log_file(self):
        if ( self.filePtr ):
            self.filePtr.close()
            self.filePtr = None
    
    # Pass data up to parent then check if we should stop
    def on_data(self, data):

        print (self.currentTweetCount)
        
        tweepy.StreamListener.on_data(self, data)
            
        if ( self.currentTweetCount >= self.maxTweetCount ):
            return False

    # Increment the number of statuses we've seen
    def on_status(self, status):
        self.currentTweetCount += 1
        
        # Could write this status to a file instead of to the console
        print (status.text)
        
        # If we have specified a file, write the status to it as one
        # JSON object per line (json is imported at the top of the notebook)
        if ( self.filePtr ):
            self.filePtr.write("%s\n" % json.dumps(status._json))
        
    # Error handling below here
    def on_exception(self, exc):
        print (exc)

    def on_limit(self, track):
        """Called when a limitation notice arrives"""
        print ("Limit", track)
        return

    def on_error(self, status_code):
        """Called when a non-200 status code is returned"""
        print ("Error:", status_code)
        return False

    def on_timeout(self):
        """Called when stream connection times out"""
        print ("Timeout")
        return

    def on_disconnect(self, notice):
        """Called when twitter sends a disconnect notice
        """
        print ("Disconnect:", notice)
        return

    def on_warning(self, notice):
        """Called when a disconnection warning message arrives"""
        print ("Warning:", notice)

Now we set up the stream using the listener above.

In [ ]:
listener = LocalStreamListener()
localStream = tweepy.Stream(api.auth, listener)
In [ ]:
# Stream based on keywords
localStream.filter(track=['earthquake', 'disaster'])
In [ ]:
listener = LocalStreamListener()
localStream = tweepy.Stream(api.auth, listener)

# List of screen names to track
screenNames = ['bbcbreaking', 'CNews', 'bbc', 'nytimes']

# Twitter stream uses user IDs instead of names
# so we must convert
userIds = []
for sn in screenNames:
    user = api.get_user(sn)
    userIds.append(user.id_str)

# Stream based on users
localStream.filter(follow=userIds)
In [ ]:
listener = LocalStreamListener()
localStream = tweepy.Stream(api.auth, listener)

# Specify coordinates for a bounding box around area of interest
# In this case, we use San Francisco
swCornerLat = 36.8
swCornerLon = -122.75
neCornerLat = 37.8
neCornerLon = -121.75

boxArray = [swCornerLon, swCornerLat, neCornerLon, neCornerLat]

# Say we want to write these tweets to a file
listener.set_log_file(codecs.open("tweet_log.json", "w", "utf8"))

# Stream based on location
localStream.filter(locations=boxArray)

# Close the log file
listener.close_log_file()
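
The bounding box above lists longitude before latitude, starting with the southwest corner, which is easy to get backwards. A small helper can build the list from named arguments and sanity-check it. This is only a sketch: `make_bounding_box` and `box_contains` are hypothetical names, and the checks assume a box that does not cross the antimeridian.

```python
def make_bounding_box(sw_lat, sw_lon, ne_lat, ne_lon):
    """Hypothetical helper: return [sw_lon, sw_lat, ne_lon, ne_lat] after basic checks."""
    if not (-90 <= sw_lat <= 90 and -90 <= ne_lat <= 90):
        raise ValueError("latitude must be between -90 and 90")
    if not (-180 <= sw_lon <= 180 and -180 <= ne_lon <= 180):
        raise ValueError("longitude must be between -180 and 180")
    if sw_lat >= ne_lat or sw_lon >= ne_lon:
        raise ValueError("southwest corner must lie south and west of the northeast corner")
    return [sw_lon, sw_lat, ne_lon, ne_lat]


def box_contains(box, lat, lon):
    """Hypothetical helper: check whether a point falls inside the box."""
    sw_lon, sw_lat, ne_lon, ne_lat = box
    return sw_lat <= lat <= ne_lat and sw_lon <= lon <= ne_lon


boxArray = make_bounding_box(36.8, -122.75, 37.8, -121.75)
print(boxArray)

# A point in downtown San Francisco should fall inside the box
print(box_contains(boxArray, 37.77, -122.42))
```

Swapping the two corners raises a `ValueError` immediately instead of producing a stream that silently matches nothing.
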
In [ ]: