Twitter Data Retrieval

Mridu Bhatnagar
Jul 18, 2018


Sample User Timeline on Twitter

TASK

  1. Fetch 100 tweets by a user.
  2. Retrieve tweet_id, tweet_text, created_at time.
  3. Store the retrieved data in a CSV file.

CONSTRAINTS

  • Use object-oriented programming.
  • Try using magic methods to make the UserTweets object iterable.

STEP-01

To fetch data from Twitter, access tokens need to be generated first.
Go to https://apps.twitter.com/ and create a new app.

Click on Create New App in the top right corner.

STEP-02

Fill in the required details and click on Create Twitter Application.

Details to be filled up in order to generate access tokens

STEP-03

Select the application you have created. Note the consumer key. Click on generate access token. Add all the generated tokens to a separate file, config.py. These tokens are used to access Twitter through tweepy, a Python API client for Twitter.
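
As a rough sketch, config.py could look something like this. The variable names are an assumption (the file itself is not shown here), and the values are placeholders to be replaced with the keys and tokens from the application page. Four values are listed because OAuth 1.0a authentication in tweepy takes a consumer key and secret plus an access token and secret.

```python
# config.py
# Hypothetical variable names; replace the placeholder values with the
# keys and tokens shown on your application's page.
CONSUMER_KEY = "your-consumer-key"
CONSUMER_SECRET = "your-consumer-secret"
ACCESS_TOKEN = "your-access-token"
ACCESS_TOKEN_SECRET = "your-access-token-secret"
```

Since this file holds secrets, it is usually a good idea to keep it out of version control.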

STEP-04

Authenticate using tweepy.

Authentication of the application
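
A minimal sketch of what the authentication step can look like with tweepy's OAuth 1.0a handler. The function name build_api and its parameter names are hypothetical, chosen only for illustration; the tweepy calls (OAuthHandler, set_access_token, API) are the library's own.

```python
def build_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Return an authenticated tweepy API client (OAuth 1.0a)."""
    # tweepy is a third-party package (pip install tweepy). It is imported
    # inside the function so the function can be defined even on a machine
    # where tweepy is not installed yet.
    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)


# Possible usage, assuming a config.py module that defines these four names:
# import config
# api = build_api(config.CONSUMER_KEY, config.CONSUMER_SECRET,
#                 config.ACCESS_TOKEN, config.ACCESS_TOKEN_SECRET)
```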

THE CODE

  • Tweets were to be fetched based on the Twitter handle. Since I was using Tweepy, I went through the API documentation to figure out which method fetches the user timeline and whether it has any limitations. The task was to fetch 100 tweets.
  • I came across the user_timeline() method, which returns tweets from the given Twitter handle. There is a small catch here: it does not return all the tweets by the user. By default, it returns only 20 tweets.
  • To fetch more than 20 tweets, the count parameter needs to be passed along with the Twitter handle.
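
The points above can be sketched as a small helper. The name fetch_timeline is hypothetical, and api is assumed to be an already authenticated tweepy.API instance. As far as I know, the standard timeline endpoint accepts a count of up to 200 per request, so 100 fits in a single call.

```python
def fetch_timeline(api, handle, count=100):
    """Fetch up to `count` recent tweets posted by `handle`.

    `api` is expected to be an authenticated tweepy.API instance.
    """
    # Without `count`, user_timeline() returns only 20 tweets.
    return api.user_timeline(screen_name=handle, count=count)
```
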
Class UserTweets

The get_tweets() method fetches the tweets from the user’s timeline, based on the handle and TWEET_COUNT, and stores them. It iterates over the returned tweet_obj.
Out of the complete data structure, all I needed was tweet_id, created_at and the tweet text. I fetched those and stored them in a named tuple.

These fields were to be stored for each tweet, so I appended each tuple to a list.
Finally, self._tweets stores the 100 tweets.
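
A rough sketch of this step, written as a standalone function for brevity rather than as a method on the class. The Tweet named tuple and its field names are assumptions for illustration; id, created_at and text are attributes that tweepy status objects expose.

```python
from collections import namedtuple

# One record per tweet, holding only the three fields of interest.
Tweet = namedtuple("Tweet", ["tweet_id", "created_at", "text"])

TWEET_COUNT = 100


def get_tweets(api, handle, count=TWEET_COUNT):
    """Return a list of Tweet named tuples for `handle`."""
    tweets = []
    for tweet_obj in api.user_timeline(screen_name=handle, count=count):
        # Pull the needed attributes off the status object instead of
        # storing the whole object.
        tweets.append(Tweet(tweet_obj.id, tweet_obj.created_at, tweet_obj.text))
    return tweets
```

In a class-based version, the returned list would be what ends up in self._tweets.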

Fetched tweets were to be written to CSV
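
One way to write the stored tweets out, sketched with the standard library csv module. The function name write_tweets_to_csv and the header names are hypothetical; tweets is assumed to be a list of (tweet_id, created_at, text) tuples or named tuples.

```python
import csv


def write_tweets_to_csv(tweets, path):
    """Write one row per tweet to `path`, preceded by a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["tweet_id", "created_at", "text"])
        writer.writerows(tweets)
```

The csv writer takes care of quoting, so tweet text that contains commas or line breaks still lands in a single column.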

The last requirement was to implement the magic methods __len__() and __getitem__() to make the UserTweets object iterable.
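
A minimal sketch of the two magic methods, reduced to the container part only (fetching is left out, and the constructor taking a ready-made list is an assumption for illustration). Python falls back to calling __getitem__() with 0, 1, 2, ... until an IndexError is raised, which is why a for loop works even without __iter__().

```python
class UserTweets:
    """Container part only; fetching is left out of this sketch."""

    def __init__(self, tweets):
        self._tweets = list(tweets)

    def __len__(self):
        # Enables len(user_tweets).
        return len(self._tweets)

    def __getitem__(self, position):
        # Enables indexing, slicing and iteration by delegating to the list.
        return self._tweets[position]


user_tweets = UserTweets(["first", "second", "third"])
print(len(user_tweets))    # 3
print(user_tweets[0])      # first
for tweet in user_tweets:  # iterates via __getitem__
    print(tweet)
```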

Throughout, it was a good recap of concepts.

LEARNINGS

  1. A good use case for applying object-oriented programming concepts.
  2. Implementation of magic methods.
  3. For storing tweets, I used named tuples within a list. Named tuples help in accessing elements by name instead of by position.
  4. A recap of how named tuples are initialised. Wrong initialisation was leading to tweet objects being stored instead of a named tuple with the required values. It took multiple iterations to fix this.
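
To illustrate points 3 and 4, here is a small sketch of initialising a named tuple and reading it by name. The Tweet type, its field names and the sample values are made up for the example. Passing the values as keyword arguments makes it harder to store the wrong thing in a field.

```python
from collections import namedtuple

Tweet = namedtuple("Tweet", ["tweet_id", "created_at", "text"])

# Each field gets the extracted value, not the whole tweet object.
tweet = Tweet(tweet_id=1, created_at="2018-07-18", text="hello")

print(tweet.text)  # hello (access by name)
print(tweet[2])    # hello (access by position still works)
```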

POSSIBLE EXTENSIONS

  • Data cleaning
  • Sentiment analysis
  • Similarity between timelines
  • Fetch some more data points, analyse the data and plot graphs. Customer feedback for services and products can also be observed.
  • And many more…

Here’s the complete code.
