Datasets

A collection of novel and benchmark datasets produced by members of the Machine Learning Group and used in their experimental work:

A collection of Twitter datasets for evaluating multi-view analysis methods.

A collection of Twitter datasets for evaluating criteria for Twitter user list curation.

A dataset collected to support the investigation of contemporary spam comment activity.

Detecting Grand Tours of Europe with Geo-Tags: Supplementary data for an analysis of tourist behaviour, based on a collection of 95 million Flickr photos for which precise geographic coordinates (geo-tags) are known.

Irish Economic Sentiment Collection: A new text sentiment analysis collection, produced from three Irish online news sources.

A multi-view text corpus, constructed from news articles from three online news services.

A set of synthetic text datasets for the evaluation of multi-view learning algorithms.

A new text corpus, mined from biomedical literature, covering the terms used to describe S. cerevisiae ORFs.

The network constructed from the publications of the CBR conference series (1993-2008).

Two text corpora consisting of news articles, particularly suited to evaluating cluster analysis techniques.

An image dataset for multi-label image classification using active learning with SVMs.

A large number of artificially constructed text datasets.

A dataset for training recommendation systems on bronchiolitis treatment.