Abstract
The analysis of gene expression makes it possible to study the functions of genes and their roles in different processes in the cells of a living system, including the cell cycle. Clustering is widely used in the analysis of high-throughput gene expression data to find patterns of similarity that enable related gene groups and functions to be identified. Clustering algorithms are very sensitive to the choice of initial conditions and to the chosen number of clusters. In this paper, we investigate the impact of metrics and cluster parametrisation for three clustering models and propose a method for optimisation of cluster parameters based on cluster compactness and separation. A case study presents the analysis of gene expression data for E. coli bacteria.
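
As an illustration of what selecting cluster parameters by compactness and separation can look like, the following is a minimal sketch and not the method proposed in the paper. It assumes hard k-means clustering with a deterministic farthest-point initialisation, and a simple validity index defined as the mean squared distance of points to their centroid (compactness) divided by the smallest squared distance between centroids (separation). The function names `kmeans`, `compactness_separation_index` and `select_k`, as well as the synthetic data, are hypothetical and introduced here only for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means with farthest-point (maximin) initialisation.

    The initialisation is an assumption made so the sketch is deterministic.
    """
    centroids = [X[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen centroid.
        d = ((X[:, None, :] - np.array(centroids)[None, :, :]) ** 2).sum(axis=2)
        centroids.append(X[np.argmax(d.min(axis=1))])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new

    # Final assignment consistent with the returned centroids.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids


def compactness_separation_index(X, labels, centroids):
    """Ratio of compactness to separation; lower values are better."""
    compactness = np.mean(((X - centroids[labels]) ** 2).sum(axis=1))
    sq = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    separation = sq[np.triu_indices(len(centroids), k=1)].min()
    return compactness / separation


def select_k(X, candidates):
    """Return the candidate k with the lowest index, plus all scores."""
    scores = {}
    for k in candidates:
        labels, centroids = kmeans(X, k)
        scores[k] = compactness_separation_index(X, labels, centroids)
    return min(scores, key=scores.get), scores


# Synthetic stand-in for expression profiles: three well-separated groups.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((30, 2)) for c in centres])

best_k, scores = select_k(X, range(2, 7))
print(best_k, scores)
```

On real expression data the distance metric, the clustering model and the index itself would all need to be chosen with care; the sketch only shows the general pattern of scanning candidate cluster numbers and keeping the one with the best compactness-to-separation trade-off.
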
Original language | English |
---|---|
Pages (from-to) | 1-11 |
Journal | Journal of Physics: Conference Series |
Volume | 128 |
Issue number | 1 |
DOIs | |
Publication status | Published - 2008 |