seGOsa: Software environment for gene ontology-driven similarity assessment. In 2010 IEEE International Conference on Bioinformatics and Biomedicine, Hong Kong, China, 18-21 December 2010. (Book Chapter)
In recent years there has been a growing trend towards the adoption of ontologies to support comprehensive, large-scale functional genomics research. This paper introduces seGOsa, a user-friendly cross-platform system to support large-scale assessment of Gene Ontology (GO)-driven similarity among gene products. Using information-theoretic approaches, the system exploits both topological features of the GO (i.e., between-term relationships in the hierarchy) and statistical features of the model organism databases annotated to the GO (i.e., term frequency) to assess functional similarity among gene products. Based on the assumption that the more information two terms share, the more similar they are, three GO-driven similarity measures (Resnik's, Lin's and Jiang's metrics) have been implemented to measure between-term similarity within each of the GO hierarchies. In addition, seGOsa offers two approaches (simple average and highest average similarity) for assessing the similarity between gene products by aggregating between-term similarities. The program is freely available for non-profit use on request from the authors.
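
The measures named in the abstract can be illustrated with a minimal sketch. This is not seGOsa's code: the toy hierarchy, the annotation counts, and every function name below are hypothetical, chosen only to show how information content (IC) derived from term frequency combines with the hierarchy. The sketch assumes the standard textbook definitions of the Resnik and Lin measures and of the Jiang distance; converting that distance to a similarity as 1 / (1 + d) is one common convention and may differ from what seGOsa does. The "highest average" aggregation is assumed here to mean averaging each term's best match in the other gene product, in both directions.

```python
import math
from itertools import product

# Hypothetical toy hierarchy: term -> set of parent terms.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 4, "A1": 1, "A2": 1}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return result


def information_content(term):
    """IC(t) = -log p(t), where p(t) counts annotations to t or its descendants."""
    total = sum(DIRECT_COUNTS.values())
    count = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(count / total)


def resnik(t1, t2):
    """IC of the most informative common ancestor of the two terms."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def lin(t1, t2):
    """Shared IC normalised by the IC of both terms (ranges from 0 to 1)."""
    denominator = information_content(t1) + information_content(t2)
    return 2 * resnik(t1, t2) / denominator if denominator > 0 else 0.0


def jiang(t1, t2):
    """Jiang distance mapped to a similarity via 1 / (1 + d) (assumed convention)."""
    distance = (information_content(t1) + information_content(t2)
                - 2 * resnik(t1, t2))
    return 1.0 / (1.0 + distance)


def simple_average(terms1, terms2, sim):
    """Mean of the similarities over all term pairs of two gene products."""
    pairs = list(product(terms1, terms2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def highest_average(terms1, terms2, sim):
    """Mean of each term's best match in the other gene product (assumed definition)."""
    best1 = sum(max(sim(a, b) for b in terms2) for a in terms1)
    best2 = sum(max(sim(a, b) for a in terms1) for b in terms2)
    return (best1 + best2) / (len(terms1) + len(terms2))


# Two hypothetical gene products, each described by its annotated terms.
gene1 = ["A1", "B"]
gene2 = ["A2"]
print(simple_average(gene1, gene2, lin))
print(highest_average(gene1, gene2, lin))
```

In this sketch the highest average is never lower than the simple average for the same term-level measure, because it replaces each term's mean similarity with its best match; this makes it less sensitive to unrelated annotations when a gene product has several distinct functions.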