S.R.M. Oliveira, G.V. Almeida, K.R.R. Souza, D.N. Rodrigues, P.R. Kuser-Falcão, M.E.B. Yamagishi, E.H. Santos, F.D. Vieira, J.G. Jardine, G. Neshich
Published October 05, 2007
Genet. Mol. Res. 6 (4): 911-922 (2007)
Corresponding author
G. Neshich
E-mail: neshich@cbi.cnptia.embrapa.br
ABSTRACT
An effective strategy for managing protein databases is to provide mechanisms to transform raw data into consistent, accurate and reliable information. Such mechanisms will greatly reduce operational inefficiencies and improve researchers' ability to pursue scientific objectives and interpret research results. To achieve this challenging goal for the STING project, we introduce Sting_RDB, a relational database of structural parameters for protein analysis with support for data warehousing and data mining. In this article, we highlight the main features of Sting_RDB and show how a user can explore it with efficient and biologically relevant queries. Considering its importance for molecular biologists, effort has been made to advance Sting_RDB toward data quality assessment. To the best of our knowledge, Sting_RDB is one of the most comprehensive data repositories for protein analysis, now also capable of providing its users with a data quality indicator. This paper differs from our previous study in many respects. First, we introduce Sting_RDB, a relational database with mechanisms for efficient and relevant queries using SQL. Sting_RDB evolved from the earlier text-based (flat-file) database, in which data consistency and integrity were not guaranteed. Second, we provide support for data warehousing and data mining. Third, we introduce a data quality indicator. Finally, and probably most importantly, complex queries that could not be posed on a text-based database are now easily implemented. Further details are accessible at the Sting_RDB demo web page: http://www.cbi.cnptia.embrapa.br/StingRDB.
Key words: Sting database, Protein structure analysis, Data warehousing, Data mining, Data mart