Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions
Greene, D., Cagney, G., Krogan, N. and Cunningham, P. (2008), "Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions", Bioinformatics, 24(15): 1722-1728.
We propose a new algorithm for aggregating a diverse collection of matrix factorizations to produce a superior clustering solution, which takes the form of a "soft" hierarchy of clusters. We apply the proposed Ensemble NMF algorithm to a high-quality assembly of binary protein interactions derived from two proteome-wide studies in yeast. Our experimental evaluation demonstrates that the algorithm lends itself to discovering small localized structures in this data, which correspond to known functional groupings of complexes.
Below we provide supplementary material (software, data and results) for this paper.
NMF Tree Browser
Data and Results
We provide the protein interaction dataset used in our experiments (from Collins et al., 2007), together with the tree files produced by the Ensemble NMF procedure, which can be viewed with the Tree Browser tool.
The file collins.tree includes the hierarchy produced by the Ensemble NMF clustering as discussed in the paper, while the file collins.al.tree includes the tree generated using the average linkage hierarchical clustering algorithm.
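For readers unfamiliar with the baseline, the following is a minimal sketch of average linkage agglomerative clustering on a pairwise similarity matrix. It is an illustration only: the function name `average_linkage`, the toy similarity values and the merge-list output are assumptions made for this example, and the code neither reads nor writes the .tree files provided here.

```python
def average_linkage(sim):
    """Agglomerative clustering on a symmetric similarity matrix.

    Repeatedly merges the two clusters with the highest mean pairwise
    similarity. Returns a list of (cluster_a, cluster_b, score) merges;
    merged clusters receive new ids starting at len(sim).
    """
    clusters = {i: [i] for i in range(len(sim))}
    merges = []
    next_id = len(sim)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Average similarity over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        score, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, score))
        next_id += 1
    return merges


# Hypothetical toy similarities: items 0 and 1 are close, item 2 is not.
toy_sim = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
merges = average_linkage(toy_sim)
```

Each merge joins exactly two clusters, so the result is a strict binary tree in which every item belongs to a single branch. This contrasts with the "soft" hierarchy described in the paper.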
Ensemble NMF Implementation
We also provide a C implementation of the Ensemble NMF algorithm described in the paper. This software is open source and is licensed under the GPL. Compiling the software requires a BLAS implementation, such as ATLAS, together with the CBLAS interface.
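To convey the general idea of aggregating several factorizations without building the C code, here is a small pure-Python sketch. It is not the algorithm from the paper and does not reflect the interface of the C implementation. It runs standard NMF with multiplicative updates from several random initializations, then combines the runs into a simple co-association matrix, which stands in for the paper's soft hierarchy. All function names, parameters and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Approximate V by W*H (both non-negative) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def co_association(v, k, runs=10):
    """Fraction of runs in which two rows share their strongest factor.

    Simplified stand-in for ensemble aggregation (an assumption of this
    sketch, not the integration step described in the paper).
    """
    n = len(v)
    counts = [[0.0] * n for _ in range(n)]
    for seed in range(runs):
        w, _ = nmf(v, k, seed=seed)
        labels = [max(range(k), key=lambda c: row[c]) for row in w]
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0 / runs
    return counts


# Hypothetical toy interaction matrix: two groups of three proteins each.
toy = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
co = co_association(toy, k=2)
```

Entries of `co` near 1 mark pairs that the runs consistently place together. A matrix of this kind could then be passed to a hierarchical clustering step to obtain a tree.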