The Second International Symposium on Optimization and Systems Biology (OSB’08)
Abstract: Simple sequence repeats (SSRs) are DNA sequences composed of tandem repetitions of relatively short motifs. They are not only used as genetic markers but also play an important role in gene regulatory networks, the understanding of which is one of the greatest challenges of functional genomics. To identify key SSR regulators among functional gene sets, we have developed an efficient algorithm for SSR pattern retrieval and a verification mechanism based on cross-species comparison and statistical analysis. Cross-species comparison and orthologous relationships provide an added level of validation for identifying multiple genes that are likely to be regulated similarly. In addition, statistical analysis of appearance frequency rates is used to confirm that the SSRs retrieved from a given set of related genes are significant regulators. In this study, a target set of growth factor related genes was evaluated, and several well-known “CA”, “CCG” and “CAG” repeat patterns were successfully identified by the proposed system. Accordingly, the proposed pattern mining methods can be applied to large-scale genome datasets for any organisms and genes of interest, and the proposed mechanism can be used to retrieve important candidate SSR regulators for further biological experiments.
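To make the notions of SSR pattern retrieval and appearance frequency concrete, the following minimal Python sketch locates tandem repeats with a regular expression and reports the fraction of genes in a set that carry each motif. It is an illustration only and not the algorithm developed in this work: the function names `find_ssrs` and `motif_frequency`, the thresholds (motif length 2 to 6, at least 3 repeats), and the toy sequences are all assumptions chosen for the example.

```python
import re
from collections import Counter


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=3):
    """Return (start, motif, repeat_count) for each tandem repeat in seq.

    Hypothetical helper: thresholds are illustrative assumptions,
    not values taken from the paper.
    """
    seq = seq.upper()
    # Lazy motif group so the shortest repeating unit is tried first
    # (e.g. "CA" is tried before "CACA"); the backreference \1 then
    # requires at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


def motif_frequency(gene_seqs):
    """Fraction of genes in which each motif appears at least once.

    gene_seqs maps a gene name to its sequence (toy input, assumed).
    """
    counts = Counter()
    for seq in gene_seqs.values():
        # Count each motif once per gene, however often it recurs.
        counts.update({motif for _, motif, _ in find_ssrs(seq)})
    total = len(gene_seqs)
    return {motif: n / total for motif, n in counts.items()}


# Toy sequences invented for illustration; not real gene data.
genes = {
    "g1": "TTGCACACACACAGGT",
    "g2": "TTCAGCAGCAGCAGTT",
    "g3": "AAGCACACACATT",
}
freq = motif_frequency(genes)
```

Note that a scan of this kind reports the motif in whichever rotation it first encounters, so a “CAG” tract preceded by a G would be reported as “GCA”; a fuller implementation would normalize rotations and reverse complements before counting, and would compare the observed frequencies against a background model to assess significance.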