Olfactory receptors (ORs) constitute the largest gene family in the vertebrate genome. We have attempted to provide a comprehensive view of the OR universe using diverse bioinformatics and computational biology tools. Among these, we have constructed the Human Olfactory Receptor Data Exploratorium (HORDE, http://bioportal.weizmann.ac.il/HORDE/), a free online resource that integrates information on ORs from different species. We studied the genomic organization of 853 human ORs and divided the repertoire into 135 clusters, accessible through our new cluster viewer feature. We provide an analysis of intact and pseudogenized ORs in different clusters, as well as of OR expression patterns, which is relevant to OR transcription control. Coding single nucleotide polymorphisms have been integrated for use in genotype-phenotype correlation studies. HORDE offers a unique opportunity for discerning structural and functional information about individual OR proteins. By applying novel data analysis strategies to the >3000 OR genes of mouse, dog and human within HORDE, we have generated a set of refined rhodopsin-based homology models for ORs. For model improvement, we employed a novel analysis of specific positions along the seven transmembrane helices at which prolines generate helix-breaking kinks. The models show family-specific structural features, including idiosyncratic kink patterns, which lead to significant differences in the inferred odorant binding site structure. Such analyses form a basis for a comprehensive sequence-based classification of OR proteins in terms of potential odorant binding specificities.
Key words: Olfactory receptor, HORDE, Computational data mining, Database, Homology modeling, Sequence analysis.