An Analysis of Research Themes in the CBR Conference Literature

Derek Greene, Jill Freyne, Barry Smyth, and Pádraig Cunningham

Presentation slides: 15 Years of CBR Conferences (4MB PDF)


After fifteen years of CBR conferences, this research sets out to examine the themes that have evolved in CBR research as revealed by the implicit and explicit relationships between the conference papers. We have examined a number of metrics for demonstrating connections between papers and between authors and have found that a clustering based on co-citation of papers appears to produce the most meaningful organisation. We have employed an Ensemble Non-negative Matrix Factorisation (NMF) approach that produces a “soft” hierarchical clustering, where papers can belong to more than one cluster. This is useful as papers can naturally relate to more than one research area. We have produced timelines for each of these clusters that highlight influential papers and illustrate the life-cycle of research themes over the last fifteen years. The insights afforded by this analysis are presented in detail. In addition to the analysis of the sub-structure of CBR research, this research also presents some global statistics on the CBR conference literature.
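As an illustration of the co-citation idea mentioned above, the following minimal Python sketch counts, for each pair of papers, how many other papers cite both. The function name, the input layout (a mapping from citing paper to the papers it references) and the toy identifiers are all hypothetical and are not taken from the CBR dataset. The paper may also weight or normalise these counts before clustering, which this sketch does not attempt.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations):
    """Count, for each unordered pair of papers, how many papers cite both.

    `citations` maps a citing paper ID to an iterable of cited paper IDs
    (an assumed layout, chosen only for this illustration).
    """
    counts = Counter()
    for cited in citations.values():
        # set() drops duplicate references; sorted() gives every pair a
        # canonical (a, b) order so (a, b) and (b, a) share one counter.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: citing paper -> papers it references.
toy_citations = {
    "p4": ["p1", "p2"],
    "p5": ["p1", "p2", "p3"],
    "p6": ["p2", "p3"],
}

counts = cocitation_counts(toy_citations)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Pairs with high counts are frequently cited together, which is the kind of signal a co-citation clustering could group into a common research theme.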


To examine the results of our experiments in detail, we developed the NMF Tree Browser tool, a cross-platform Java application for visually inspecting a soft hierarchy as produced by the Ensemble NMF algorithm. In the main window, the clustering is graphically arranged in a tree view, where the user can click on any node (i.e. cluster) to reveal its contents, in terms of relevant papers, authors and descriptive terms.

A number of different views of the clustering are provided, including the "Leaf Node Browser" (accessible from the menu item Tools->Leaf Node Browser), which allows the user to investigate the content of the leaf clusters of the tree.

The software is made freely available for research purposes (the application requires Java 1.5 or higher, and includes the JFreeChart, JCommon and MTJ packages):

>> Download NMF Tree Browser for CBR (November 2008)

An open-source C implementation of the Ensemble NMF algorithm itself is available here.

Data & Results

We make the CBR dataset used in our experiments publicly available for research purposes. This dataset consists of a total of 672 papers from the CBR conference series, published by 828 individual authors. The following archive also contains the tree files produced by the Ensemble NMF procedure, which can be viewed with the Tree Browser tool:

>> Download CBR dataset and tree files (November 2008)

Please note that the file cbr.tree includes a truncated tree highlighting the clusters discussed in the paper, while the file cbr.full.tree includes the full tree generated using the ensemble algorithm.