Irish Economic Sentiment Dataset

We provide a new text collection for sentiment analysis, produced from three Irish online news sources: RTE, The Irish Times, and The Irish Independent.


Summary

This collection was produced using the UCD Sentiment Analyzer system (no longer online) during a three-month period (July to October 2009). A subset of the collected articles was annotated by a group of 33 volunteer users, who were encouraged to label each article as positive, negative, or irrelevant.


Structure

The collection consists of two datasets, each provided as user annotations and term frequencies:

Warm-up: The first month constituted a "warm-up" period, which allowed us to train the relevance classifier. This provided an initial dataset containing 3858 articles, with 2693 user annotations covering 354 individual articles.

Main: For the latter two months of the experiment, we collected a dataset for evaluating the machine learning questions arising from the sentiment analysis task. This second "main" dataset comprises 12469 documents, with 6910 user annotations resulting in 1306 labeled articles.
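To put the counts above in perspective, the following minimal sketch computes how densely each dataset was annotated. It uses only the figures quoted in this section; the class and function names (DatasetCounts, summarize) are illustrative and are not part of the distributed files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetCounts:
    """Counts quoted in the Structure section (hypothetical container)."""
    articles: int          # all articles collected in the period
    annotations: int       # individual user annotations
    labeled_articles: int  # distinct articles with at least one annotation

    @property
    def annotations_per_labeled_article(self) -> float:
        return self.annotations / self.labeled_articles

    @property
    def labeled_fraction(self) -> float:
        return self.labeled_articles / self.articles


# Figures taken directly from the dataset description above.
COUNTS = {
    "warm-up": DatasetCounts(articles=3858, annotations=2693, labeled_articles=354),
    "main": DatasetCounts(articles=12469, annotations=6910, labeled_articles=1306),
}


def summarize(counts):
    """Return per-dataset annotation density and labeled coverage."""
    return {
        name: {
            "annotations_per_labeled_article": c.annotations_per_labeled_article,
            "labeled_fraction": c.labeled_fraction,
        }
        for name, c in counts.items()
    }


if __name__ == "__main__":
    for name, row in summarize(COUNTS).items():
        print(
            f"{name}: {row['annotations_per_labeled_article']:.2f} annotations "
            f"per labeled article, {row['labeled_fraction']:.1%} of articles labeled"
        )
```

On these figures, each labeled article received on average roughly 7.6 annotations in the warm-up dataset and roughly 5.3 in the main dataset, while about 9% and 10% of the collected articles, respectively, carry at least one annotation.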


Download

This dataset is made available for non-commercial and research purposes only. The text of the news articles is provided in pre-processed sparse format. All rights, including copyright, in the content of the original articles are owned by the original authors.

>> Download Irish Economic Sentiment Collection (November 2009) (16MB)