Overview of Genome Database Systems
The study of genes and proteins, better known as genomics and proteomics, has
become an extremely important area of modern biology. Both fields work with
large volumes of biological data, so the databases that hold these data play a
vital role in biology and medicine. The term genome refers to the complete set
of genetic material present in the cells of an organism. Genomics has two
component areas: structural genomics and functional genomics. Genome databases
store this information and, unlike gene databases, contain both coding
sequences and non-coding intergenic sequences. Some examples of genome
databases are listed below.
- Saccharomyces Genome Database
- Mouse Genome Database
- Human Genome Database
- European Mutant Mouse Pathology Database
- MITOMAP
- Kyoto Encyclopedia of Genes and Genomes
Characteristics of Genome Database Systems
- Data are highly complex when compared with most other domains and applications.
Genome data are among the most complex data types that any application domain
has to handle. The MITOMAP database, which stores the human mitochondrial
genome, is a good example. “This single
genome is a small, circular piece of DNA encompassing information about 16,569
nucleotide bases; 52 gene encoding messenger RNA, ribosomal RNA, and transfer
RNA; 1000 known population variants; over 60 known disease associations.” Data
of this kind must be stored in a form that computers can process and that
biologists can still work with. At first, relational and object-oriented DBMS
approaches were used to model these data, and scientists then moved on to their
own representations. Currently, however, relational DBMSs are used for the sake
of long-term maintenance and ease of curation.
- Schemas change at a rapid pace.
Released databases therefore need features that support schema evolution and
data object migration, so that information keeps flowing as the schema changes.
Most relational and object databases have a fixed schema. To cope with this,
some databases, GenBank for example, publish a new schema release every two or
three years.
- Representations of the same data by different biologists will likely be different.
There should therefore be mechanisms that keep the database uniform. To achieve
this, queries that can interrelate and link the different schemas have been
used.
- The amount and range of variability in data is high. Hence, biological systems must be flexible in handling data types and values.
- Defining and executing complex queries.
These databases are used mainly by biologists, who often do not have detailed
technical knowledge of how queries are structured or how the data are stored.
Simple interfaces with built-in query templates should therefore be provided.
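To make two of the characteristics above a little more concrete, here is a minimal sketch of schema evolution and query templates using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the `variant` table, its columns, the template names and the sample rows are invented placeholders, not real MITOMAP data or the schema of any actual genome database.

```python
import sqlite3

# Hypothetical in-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variant ("
    "id INTEGER PRIMARY KEY, "
    "position INTEGER NOT NULL, "
    "base_change TEXT NOT NULL)"
)

# Invented placeholder rows (positions fall inside a 16,569-base genome).
conn.executemany(
    "INSERT INTO variant (position, base_change) VALUES (?, ?)",
    [(100, "A>G"), (2500, "C>T"), (16000, "G>A")],
)

# Schema evolution: add a new attribute to the existing table in place,
# then fill it in for the rows where the new information is known.
conn.execute("ALTER TABLE variant ADD COLUMN disease TEXT")
conn.execute(
    "UPDATE variant SET disease = ? WHERE position = ?",
    ("example disorder", 2500),
)

# Query templates: the SQL is written once by someone who knows the schema.
# A biologist only picks a template by name and supplies the parameters.
QUERY_TEMPLATES = {
    "variants_in_region": (
        "SELECT position, base_change, disease FROM variant "
        "WHERE position BETWEEN :start_pos AND :end_pos "
        "ORDER BY position"
    ),
    "variants_with_disease": (
        "SELECT position, base_change, disease FROM variant "
        "WHERE disease IS NOT NULL "
        "ORDER BY position"
    ),
}


def run_template(connection, name, **params):
    """Run a named query template with user-supplied parameters."""
    sql = QUERY_TEMPLATES[name]
    # Named placeholders keep user input out of the SQL text itself.
    return connection.execute(sql, params).fetchall()


region_rows = run_template(conn, "variants_in_region", start_pos=1, end_pos=3000)
disease_rows = run_template(conn, "variants_with_disease")
print(region_rows)
print(disease_rows)
```

The user never sees the SQL, only the template name and its parameters, which is the idea behind a simple interface with query templates. Real genome databases are far larger, and a schema change there usually also involves migrating existing data objects, which this sketch does not attempt.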
So guys, I hope this information is helpful if you are a bioinformatics enthusiast like me and are craving more insight into this field. In the next post I would like to discuss genome database systems in more detail. Till then, goodbye. Learn and empower yourselves.