In late 2012, we collected data on the Twitter accounts of 248 English Premier League football players and clubs. These accounts were manually assigned to 20 disjoint "ground truth" communities, each corresponding to a different Premier League club: Arsenal, Aston Villa, Chelsea, Everton, Fulham, Liverpool, Manchester City, Manchester Utd, Newcastle, Norwich, QPR, Reading, Southampton, Spurs, Stoke, Sunderland, Swansea, West Brom, West Ham, Wigan.

In total, we collected ~351k tweets, ~8k user lists, and ~4k follower links within the set of 248 users. From the tweets, we extracted mention and retweet information. By combining these different "views" of the data using a rank aggregation method, we constructed a "unified" graph representation of the relations between the Twitter accounts, which preserves the most informative underlying associations between users in the original views. A detailed description of the methodology is provided in the accompanying paper (linked below).
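To give a rough idea of how several views can be merged by rank aggregation, the sketch below uses a simple Borda count. This is an illustrative assumption, not the method described in the paper: the function name `borda_aggregate`, the parameter `k`, the toy views, and the choice of Borda scoring are all hypothetical. Each view maps a user to weighted neighbours; neighbours are ranked per view, the ranks are summed across views, and the top `k` neighbours are kept for the unified graph.

```python
from collections import defaultdict


def borda_aggregate(views, k=2):
    """Combine several views into one neighbour list per user.

    views: list of dicts mapping user -> {neighbour: weight}.
    Returns a dict mapping user -> list of the top-k neighbours.
    Borda scoring is an assumption made for this sketch only.
    """
    users = set()
    for view in views:
        users.update(view)

    unified = {}
    for user in users:
        scores = defaultdict(float)
        for view in views:
            neighbours = view.get(user, {})
            # Rank by descending weight; break ties alphabetically.
            ranked = sorted(neighbours, key=lambda n: (-neighbours[n], n))
            for position, neighbour in enumerate(ranked):
                # The best-ranked neighbour receives the most points.
                scores[neighbour] += len(ranked) - position
        # Keep the k neighbours with the highest combined score.
        unified[user] = sorted(scores, key=lambda n: (-scores[n], n))[:k]
    return unified


# Toy views (invented data): mention counts and follower links.
mentions = {"a": {"b": 5, "c": 1}}
follows = {"a": {"b": 1, "c": 1, "d": 1}}
unified_graph = borda_aggregate([mentions, follows], k=2)
```

In this toy example, user `a` keeps `b` and `c` as neighbours, because both appear in the two views, while `d` appears in only one and is dropped.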

Below is a visualization of the unified graph representation for the users in the data, produced using Gephi and sigma.js. Users are coloured according to their community (i.e. club). The size of each node is proportional to its in-degree (i.e. number of incoming links).
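The in-degree that determines node size can also be recomputed directly from a GEXF file. The following is a minimal sketch using only the Python standard library. The inline sample graph and its labels are invented for illustration, and the GEXF 1.2 draft namespace is an assumption: the downloadable file may declare a different version, in which case `GEXF_NS` would need adjusting.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed namespace (GEXF 1.2 draft); the real file may differ.
GEXF_NS = {"g": "http://www.gexf.net/1.2draft"}

# Invented three-node sample, standing in for the real file.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="0" label="player_a"/>
      <node id="1" label="player_b"/>
      <node id="2" label="club_x"/>
    </nodes>
    <edges>
      <edge id="0" source="0" target="2"/>
      <edge id="1" source="1" target="2"/>
      <edge id="2" source="2" target="0"/>
    </edges>
  </graph>
</gexf>"""


def in_degrees(gexf_text):
    """Return a dict mapping each node label to its number of incoming edges."""
    root = ET.fromstring(gexf_text)
    labels = {
        node.get("id"): node.get("label")
        for node in root.iterfind(".//g:node", GEXF_NS)
    }
    # Count how often each node id appears as an edge target.
    counts = Counter(
        edge.get("target") for edge in root.iterfind(".//g:edge", GEXF_NS)
    )
    return {label: counts.get(node_id, 0) for node_id, label in labels.items()}


degrees = in_degrees(SAMPLE_GEXF)
```

Here `club_x` has two incoming links, so it would be drawn larger than `player_a` (one) and `player_b` (none).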

Hovering over a node with the mouse displays the corresponding Twitter screen name and hides all nodes and edges except those connected to the highlighted node. Left-clicking on a node opens the user's Twitter page in a new window.


[Download GEXF File]   [Datasets]   [Paper]