Twitter’s favourite swear words*

* From a sample of 3.2 million tweets from 2927 users.


If you’re likely to get even the slightest bit offended, DO NOT READ the rest of this post.


As a minor diversion from some more serious/formal research, I thought it’d be interesting to see what the most popular swear words were, using the 400+ available here and looking through the 3.2ish million tweets in my CouchDB corpus. Those 3.2m tweets came from the public tweet streams of 2927 people.
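A count like this can be sketched roughly as follows. This is only an illustration, not the script actually used: it assumes Python 3, assumes the tweets have already been pulled out of CouchDB into a list of strings, and uses two mild placeholder words in place of the real 400+ word list. The names `count_listed_words`, `sample_words` and `sample_tweets` are made up for the example.

```python
import re
from collections import Counter


def count_listed_words(tweets, word_list):
    """Count how often each word from word_list appears across all tweets."""
    wanted = set(word.lower() for word in word_list)
    counts = Counter()
    for text in tweets:
        # Lower-case the tweet and split it into simple alphabetic tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# Hypothetical stand-ins for the real word list and the CouchDB corpus.
sample_words = ["darn", "heck"]
sample_tweets = ["Darn it, darn it all", "What the heck", "Nothing to see here"]

counts = count_listed_words(sample_tweets, sample_words)
for word, total in counts.most_common():
    print(word, total)
```

The resulting word/count pairs are the sort of input a tag cloud generator would take.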

I parsed the tweets and ended up with ass. Enjoy.

Tag Cloud of Twitter Swear words


Getting Twitter User Information with Tweepy and jsonpickle

Yesterday I shared some simple Python code to dump a Twitter user’s information. See post here.

I managed to recreate this in Tweepy using jsonpickle (see code snippet below).

For the interested: I still haven’t managed to figure out the Tweepy Python object navigation. There’s a <cough> remarkably </cough> similar problem posted on Stack Overflow here. Thanks to MarcW for summarizing what I’m attempting to do as: “so basically, I want to loop through user.__getstate__() and if I find an object which requires further iteration, loop through that too”.
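The kind of traversal MarcW describes could be sketched roughly like this. It is a guess at the general shape, not a tested recipe for Tweepy: it assumes Python 3, walks `vars(obj)` rather than calling `__getstate__()`, and uses two made-up stand-in classes (`FakeUser`, `FakeStatus`) instead of real Tweepy objects. The helper name `to_plain` is also hypothetical.

```python
import json


def to_plain(obj):
    """Recursively turn an object into dicts, lists and simple values."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if isinstance(obj, dict):
        return {str(key): to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_plain(item) for item in obj]
    if hasattr(obj, "__dict__"):
        # An object that "requires further iteration": walk its attributes,
        # skipping private ones (names starting with an underscore).
        return {name: to_plain(value)
                for name, value in vars(obj).items()
                if not name.startswith("_")}
    # Anything else is stored as its string form.
    return str(obj)


# Hypothetical stand-ins for the nested objects a Twitter library might return.
class FakeStatus(object):
    def __init__(self, text):
        self.text = text


class FakeUser(object):
    def __init__(self):
        self.screen_name = "example_user"
        self.followers_count = 42
        self.status = FakeStatus("hello world")
        self._client = object()  # private attribute, skipped by to_plain


plain = to_plain(FakeUser())
print(json.dumps(plain, indent=4, sort_keys=True))
```

Note that this sketch does not guard against objects that reference each other in a cycle, which would make it recurse forever.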

Ultimately I’ll want a nice JSON object to pipe into CouchDB, so I found jsonpickle.  Some short playing around and I have a solution (I’m still curious about the object iteration/navigation FWIW as I’m sure it’s something stupid I’m missing).

Here’s the code. It converts the Python object to JSON, so there’s no need to patch Tweepy.

# -*- coding: utf-8 -*-

import sys

import tweepy
import json
import jsonpickle
from pprint import pprint

api = tweepy.API()

def main():
    print "Starting."
    user = api.get_user('TheSuggmeister',include_entities=1)

    print "================ type ================="
    print type(user)

    print "================ dir ================="
    print dir(user) 

    print "================ user as JSON ================="
    pickled = jsonpickle.encode(user)
    print(json.dumps(json.loads(pickled), indent=4, sort_keys=True))   #you could just print pickled, but this makes it pretty
    print "================= end ================="

if __name__ == "__main__":
    main()


Simple Python code to get Twitter user information

I’d been searching high and low for a simple script to get Twitter user information and spit out the JSON object. Tweepy felt like a sledgehammer to crack a nut. More specifically, navigating the Tweepy class object returned is going to take me some more work.

So if you’re looking for a quick way to dump a user’s information, here’s some code that works. I’ll start putting these on GitHub shortly…

# -*- coding: utf-8 -*-

import sys
import json
import urllib2

# A simple script to get twitter user information
# Authentication not required
# Twitter API call ref at
# Author: Chris Sumner 27th Feb 2012

def main():
    print 'starting'
    # Use urllib2 to make our Twitter API call
    # For more information on urllib2, take a look at 'The Missing Manual' at
    req = urllib2.Request('')
    response = urllib2.urlopen(req)
    # Read the raw response body (a JSON string)
    the_page = response.read()

    # print JSON object
    print json.loads(the_page) 

    # Neater way to print Twitter JSON data
    # From :

    print(json.dumps(json.loads(the_page), indent=4, sort_keys=True))

if __name__ == "__main__":
    main()
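To see what the pretty-printing step does on its own, without calling the Twitter API, here is a small sketch assuming Python 3. The `raw` string is a made-up stand-in for a response body, not real Twitter output.

```python
import json

# Hypothetical response body standing in for the data returned by the API call.
raw = '{"screen_name": "example_user", "status": {"text": "hello"}, "id": 1}'

data = json.loads(raw)  # parse the JSON text into Python dicts and lists
pretty = json.dumps(data, indent=4, sort_keys=True)  # re-serialize, indented and key-sorted
print(pretty)
```

With `sort_keys=True` the keys come out alphabetically ("id", "screen_name", "status"), which makes two dumps easier to compare by eye.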
