Data Mining - Between Unsupervised and Supervised Techniques
Introduction

The DaM-BUST project focuses on real-life applications of machine learning that do not fit neatly into the traditional distinction between supervised and unsupervised techniques. The research specifically involves the analysis of data from three application areas: Bioinformatics (genome & proteomics), Multimedia (image & text) and Manufacturing. We will develop tools and techniques to help analyse this data.

Research Focus

A wide variety of techniques have been applied to the problems of interest here. In this project we will focus on two broad sets of techniques that complement each other:
Our research in The Knowledge Discovery Project and in the Muscle Network of Excellence indicates that these are the leading techniques for problems of this type. By focusing on just two related families of techniques we will establish a high level of expertise with these techniques.

Applications

The research in the DaM-BUST project will focus on applications of Machine Learning techniques in three distinct areas:
1. Image Indexing System: Many of the research problems that motivate this research proposal occur in the processing of multimedia content. For this reason we will develop a prototype system for organising personal digital photo collections that will act as a test-bed for our research. This image indexing application is representative of a large set of image analysis problems that arise with images from a range of sources. In the course of this research we will collaborate with HP in Galway on the application of these image annotation techniques to the annotation and analysis of medical images.

2. Microarray Data: The initial focus of the research in Bioinformatics will be on the cross-platform analysis of microarray data in collaboration with Prof. Des Higgins from the Conway Institute in UCD and Dr. Aedin Culhane from the Harvard School of Public Health and Dana-Farber Cancer Research Institute.

3. Inspection and Process Control: A number of interesting challenges for machine learning arise in industrial inspection and testing. For instance, in the automatic inspection of solder joints in electronic assembly there is a ready supply of images of good joints from working devices, but examples of bad joints are very scarce. A similar situation exists in process monitoring, where there is plenty of data on the process operating correctly but data on the manner in which the process might drift out of tolerance is harder to obtain. This situation also arises in the food industry, where determining the authenticity of food is an important consideration.

Research Structure

The research in the DaM-BUST project will be organised around three central themes:

Project Outputs

The major outputs of this research project are software systems, review documents and peer-reviewed research papers.
The subfield of machine learning identified here is distinctive in that it is motivated by rather unusual problem formulations that arise in real-life situations. Thus, if we find effective solutions to these problems, this research will have a considerable impact.

