PICA - Software for integrating and exploring multiple connected datasets
The Parallel Integration Clustering Algorithm (PICA) is a new data integration approach for multi-view clustering in domains where two or more related datasets are available.
The algorithm was first described in the publication: Greene, D., Bryan, K. and Cunningham, P. (2008), "Parallel Integration of Heterogeneous Genome-Wide Data Sources", Proc. 8th International Conference on BioInformatics and BioEngineering (BIBE 2008).
Summary
To explore the models produced by the PICA algorithm, we developed the PICA Browser, a cross-platform Java application for visualizing a soft clustering produced by integrating data from multiple connected views. The application highlights the contribution of each view and how frequently each cluster appears (i.e. its reliability), with the aim of providing insight into the provenance of the cluster relationships in the model. The software is freely available for research purposes and makes use of the MTJ library.
PICA Browser
PICA - Social Network Analysis of CBR conference series data
>> Download CBR browser, data and results
PICA - Integration of diverse genome-wide biological data
>> Download Bio browser, data and results
PICA Browser
>> Generic browser only
PICA Implementation
Here we provide a Java-based implementation of the PICA framework.
The software is freely available for research purposes and makes use of the args4j and MTJ libraries. Please consult the included README.txt file for usage instructions.
>> Download PICA binary
Sample data files from the CBR dataset are provided here:
>> Download CBR data
Related Links
An Analysis of Research Themes in the CBR Conference Literature
Yeast Literature Corpus