Marcos Catanho, Daniel Mascarenhas, Wim Degrave, Antonio Basílio de Miranda
Published: March 31, 2006
Genet. Mol. Res. 5 (1) : 115-126
Cite this Article:
M. Catanho, D. Mascarenhas, W. Degrave, A. Basílio de Miranda (2006). GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes. Genet. Mol. Res. 5(1): 115-126.
About the Authors
Corresponding author
A.B. de Miranda
E-mail: antonio@fiocruz.br
ABSTRACT
Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results of pairwise sequence alignments among all the predicted proteins encoded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair and provides, for each protein sequence, the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically according to user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Large-scale data search and retrieval are supported, and the results can be exported in several ways. GenoMycDB provides an online resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
Key words: Perl programming, Mycobacteria, Genome evolution, Functional classification, FASTA, MySQL.
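The kind of dynamic query described in the abstract (pairs of potential homologs between two selected genomes, filtered by similarity parameters and optionally by predicted subcellular localization) can be illustrated with a minimal sketch. This is not the GenoMycDB schema or code: the table names (`protein`, `alignment`), the column names, the thresholds and the toy rows are all hypothetical, and Python's built-in `sqlite3` module is used as a stand-in for MySQL so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            protein_id   TEXT PRIMARY KEY,
            genome       TEXT NOT NULL,
            description  TEXT,
            localization TEXT,   -- predicted subcellular localization
            strand       TEXT    -- DNA strand of the corresponding gene
        );
        CREATE TABLE alignment (
            query_id     TEXT NOT NULL REFERENCES protein(protein_id),
            subject_id   TEXT NOT NULL REFERENCES protein(protein_id),
            pct_identity REAL,
            e_value      REAL
        );
        """
    )
    # Toy rows, invented for illustration only (not real GenoMycDB data).
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?, ?, ?)",
        [
            ("A1", "genome_A", "hypothetical protein", "cytoplasmic", "+"),
            ("A2", "genome_A", "membrane transporter", "membrane", "-"),
            ("B1", "genome_B", "hypothetical protein", "cytoplasmic", "+"),
            ("B2", "genome_B", "membrane transporter", "membrane", "+"),
        ],
    )
    conn.executemany(
        "INSERT INTO alignment VALUES (?, ?, ?, ?)",
        [
            ("A1", "B1", 92.0, 1e-80),
            ("A1", "B2", 24.0, 0.5),
            ("A2", "B2", 71.5, 1e-40),
        ],
    )
    return conn


def find_potential_homologs(conn, genome_a, genome_b,
                            min_identity, max_e_value, localization=None):
    """Return aligned pairs between two genomes that pass the given cutoffs."""
    sql = """
        SELECT q.protein_id, s.protein_id, a.pct_identity, a.e_value
        FROM alignment AS a
        JOIN protein AS q ON q.protein_id = a.query_id
        JOIN protein AS s ON s.protein_id = a.subject_id
        WHERE q.genome = ? AND s.genome = ?
          AND a.pct_identity >= ? AND a.e_value <= ?
    """
    params = [genome_a, genome_b, min_identity, max_e_value]
    if localization is not None:
        # Optional restriction on the query protein's predicted localization.
        sql += " AND q.localization = ?"
        params.append(localization)
    sql += " ORDER BY a.pct_identity DESC"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in find_potential_homologs(demo, "genome_A", "genome_B", 50.0, 1e-10):
        print(row)
```

Storing one row per aligned pair keeps each user-defined criterion a simple `WHERE` condition, so several similarity parameters can be combined in a single query; the cutoffs shown here are arbitrary and would be chosen by the user.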