Ensemble Non-negative Matrix Factorization

This page contains supplementary material for the paper:
Greene, D., Cagney, G., Krogan, N. and Cunningham, P. (2008), "Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions", Bioinformatics, 24, 15: 1722-1728.
[PDF] [BibTeX]


Summary

We propose a new algorithm for aggregating a diverse collection of matrix factorizations to produce a superior clustering solution, which takes the form of a "soft" hierarchy of clusters. We apply the proposed Ensemble NMF algorithm to a high-quality assembly of binary protein interactions derived from two proteome-wide studies in yeast. Our experimental evaluation demonstrates that the algorithm lends itself to discovering small localized structures in this data, which correspond to known functional groupings of complexes.
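The general idea of combining several factorizations can be illustrated with a small sketch. It is not the Ensemble NMF algorithm from the paper: it assumes standard multiplicative-update NMF for squared error and swaps in a simple co-association average as the aggregation step, rather than building a soft hierarchy. The function names (nmf, co_association) and the toy matrix are hypothetical and chosen only for illustration.

```python
import random

EPS = 1e-9  # guards against division by zero in the updates


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, k, iters=200, seed=0):
    """Approximate non-negative v (n x m) by w (n x k) times h (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + EPS for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + EPS for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + EPS) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + EPS) for j in range(k)]
             for i in range(n)]
    return w, h


def co_association(v, k, runs=10):
    """Fraction of randomly initialized NMF runs in which each pair of
    rows shares the same dominant factor (a simple stand-in aggregation)."""
    n = len(v)
    co = [[0.0] * n for _ in range(n)]
    for seed in range(runs):
        w, _ = nmf(v, k, seed=seed)
        labels = [max(range(k), key=lambda c: row[c]) for row in w]
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / runs
    return co


# Toy symmetric interaction matrix: two blocks of three items each.
toy = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
consensus = co_association(toy, k=2)
```

Each entry consensus[i][j] lies between 0 and 1, and pairs that are grouped together across most runs receive higher values. Varying the random seed is one simple way to obtain a diverse collection of factorizations; the paper should be consulted for the actual generation and aggregation procedure.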

Below we provide supplementary material (software, data and results) for this paper.


NMF Tree Browser

To examine the results of our experiments in detail, we developed the NMF Tree Browser tool, a cross-platform Java application for visually inspecting a soft hierarchy as produced by the Ensemble NMF algorithm. In the main window, the clustering is graphically arranged in a tree view, where the user can click on any node to reveal its contents, in terms of relevant papers, authors and descriptive terms.

The software is freely available, and the source is licensed under the GPL. The application requires Java 1.5 or higher, and includes the MTJ and Netlib libraries:

>> Download NMF Tree Browser

>> Manual for NMF Tree Browser (PDF)

Compiling the source version of the application requires the MTJ and Netlib libraries:  

>> Download NMF Tree Browser (source code)


Data and Results

We also provide the protein interaction dataset used in our experiments (from Collins et al., 2007), together with the tree files produced by the Ensemble NMF procedure, which can be viewed with the Tree Browser tool:

>> Download Collins dataset and tree files

The file collins.tree includes the hierarchy produced by the Ensemble NMF clustering as discussed in the paper, while the file collins.al.tree includes the tree generated using the average linkage hierarchical clustering algorithm.
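For readers unfamiliar with the baseline, the following is a minimal sketch of average linkage agglomerative clustering on a small distance matrix. It is only an illustration of the general technique, not the code used to generate collins.al.tree; the function name average_linkage and the toy distances are hypothetical.

```python
def average_linkage(dist):
    """Agglomerative clustering with average linkage.

    Returns the merges in the order performed, each as
    (members_a, members_b, average_distance)."""
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        merges.append((clusters[a], clusters[b], avg))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges


# Toy distances between four items: 0 and 1 are close, 2 and 3 are close.
toy_dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
merges = average_linkage(toy_dist)
```

The sequence of merges defines a binary tree in which every item belongs to exactly one branch at each level, in contrast to the soft hierarchy produced by Ensemble NMF.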

>> Download supplementary material (PDF)


Ensemble NMF Implementation

We also provide a C implementation of the Ensemble NMF algorithm described in the paper. This software is open source and is licensed under the GPL. Compiling the software requires a BLAS implementation, such as ATLAS, together with the CBLAS interface.

>> Download Ensemble NMF implementation (Linux binary - 32-bit and 64-bit)

>> Download Ensemble NMF implementation (Linux binary - 64-bit only)

>> Download Ensemble NMF implementation (source code)

>> Manual for Ensemble NMF (PDF)