Biological databases are stores of biological information.[1] The journal Nucleic Acids Research regularly publishes special issues on biological databases and has a list of such databases. The 2018 issue has a list of about 180 such databases and updates to previously described databases.[2] Omics Discovery Index can be used to browse and search several biological databases. Furthermore, the NIAID Data Ecosystem Discovery Portal developed by the National Institute of Allergy and Infectious Diseases (NIAID) enables searching across databases.
Meta-databases are databases of databases: they collect data about data to generate new data. They can merge information from different sources and make it available in a new and more convenient form, or with an emphasis on a particular disease or organism. A metadatabase is also a database model for metadata management, global query of independent databases, and distributed data processing. The word metadatabase is an addition to the dictionary; originally, metadata was a common term referring simply to data about data, such as tags, keywords, and markup headers.
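The merging step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the design of any particular meta-database: the two source lists, the identifiers, the field names, and the helper functions `merge_sources` and `filter_by_field` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two independent source databases.
# All identifiers and field values are invented for illustration.
SOURCE_A = [
    {"id": "GENE0001", "organism": "example organism", "sequence_length": 1200},
    {"id": "GENE0002", "organism": "example organism", "sequence_length": 860},
]
SOURCE_B = [
    {"id": "GENE0001", "disease": "example disease"},
    {"id": "GENE0003", "disease": "another example disease"},
]


def merge_sources(sources):
    """Merge records that share an "id" and record which sources supplied them."""
    merged = defaultdict(lambda: {"sources": []})
    for source_name, records in sources.items():
        for record in records:
            entry = merged[record["id"]]
            entry["sources"].append(source_name)
            for field, value in record.items():
                if field != "id":
                    # On conflicting fields, the source listed first wins.
                    entry.setdefault(field, value)
    return dict(merged)


def filter_by_field(merged, field):
    """Keep only merged entries that carry the given field (e.g. a disease focus)."""
    return {key: entry for key, entry in merged.items() if field in entry}


catalogue = merge_sources({"source_a": SOURCE_A, "source_b": SOURCE_B})
disease_view = filter_by_field(catalogue, "disease")
```

Here `catalogue` plays the role of the merged view across sources, while `disease_view` stands in for a view with an emphasis on a particular topic. A real meta-database would also have to reconcile differing identifier schemes and conflicting values, which this sketch sidesteps by assuming a shared `id` field.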
Model organism databases provide in-depth biological data for intensively studied organisms.
The primary databases make up the International Nucleotide Sequence Database (INSD). They include:
DDBJ (Japan), GenBank (USA) and European Nucleotide Archive (Europe) are repositories for nucleotide sequence data from all organisms. All three accept nucleotide sequence submissions, and then exchange new and updated data on a daily basis to achieve optimal synchronisation between them. These three databases are primary databases, as they house original sequence data. They collaborate with Sequence Read Archive (SRA), which archives raw reads from high-throughput sequencing instruments.
Secondary databases are:
Other databases
Generic gene expression databases
Microarray gene expression databases
These databases collect genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold many species genomes, or a single model organism genome.
(See also: List of proteins in the human body)
Several publicly available data repositories and resources have been developed to support and manage protein related information, biological knowledge discovery and data-driven hypothesis generation.[15] The databases in the table below are selected from the databases listed in the Nucleic Acids Research (NAR) databases issues and database collection and the databases cross-referenced in the UniProtKB. Most of these databases are cross-referenced with UniProt / UniProtKB so that identifiers can be mapped to each other.[15]
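To illustrate how cross-references allow identifiers to be mapped to each other, the following is a minimal sketch that uses an in-memory table keyed by a central accession. It does not call UniProt or any real service; every accession, database label, and identifier in it is invented, and `map_identifier` is a hypothetical helper.

```python
# Hypothetical cross-reference table: each central accession lists the
# identifiers used for the same protein in other databases.
# Every accession and identifier below is invented for illustration.
CROSS_REFERENCES = {
    "ACC00001": {
        "structure_db": ["STR1"],
        "family_db": ["FAM010"],
        "interaction_db": ["INT77"],
    },
    "ACC00002": {
        "structure_db": [],
        "family_db": ["FAM010", "FAM042"],
    },
}


def map_identifier(source_db, identifier, target_db, xrefs=CROSS_REFERENCES):
    """Map an identifier from one database to another via the central accession."""
    results = []
    for links in xrefs.values():
        if identifier in links.get(source_db, []):
            results.extend(links.get(target_db, []))
    # Drop duplicates while preserving first-seen order.
    return list(dict.fromkeys(results))


structures = map_identifier("family_db", "FAM010", "structure_db")
families = map_identifier("structure_db", "STR1", "family_db")
```

The mapping is many-to-many: one family identifier may point to several central accessions, and each accession may link to zero or more entries in the target database, so the function returns a list rather than a single value.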
Proteins in human:
There are about 20,000 protein-coding genes in the standard human genome (roughly 1,200 of them already have Wikipedia articles, collectively the Gene Wiki). Including splice variants, there could be as many as 500,000 unique human proteins.[16]
DB name | DB website | Provider | Data sources | Revenue/Sponsors sources | Integrates | Wiki article | Desc. | Size | DB type | Actively maintained |
---|---|---|---|---|---|---|---|---|---|---|
InterPro | http://www.ebi.ac.uk/interpro/ | ELIXIR infrastructure | European Bioinformatics Institute | EMBL, Wellcome Trust, BBSRC | CATH-Gene3D, CDD, HAMAP, MobiDB, PANTHER, Pfam, SMART, SUPERFAMILY, SFLD, TIGRFAMs | InterPro | classifies proteins into families and predicts the presence of domains and sites | Protein sequence databases | Yes | |
NextProt | https://www.nextprot.org/ | CALIPHO (a group at the SIB) | Swiss Institute of Bioinformatics | https://www.sib.swiss/about/funding-sources | UniProt, Cellosaurus, gnomAD, IntAct, SRAA Atlas, UniProt-GOA, Bgee, COSMIC, MassIVE, PeptideAtlas | neXtProt | a human protein-centric knowledge resource | Protein sequence databases | Yes | |
Wiki-pi | http://severus.dbmi.pitt.edu/wiki-pi/ | Madhavi K. Ganapathiraju | At present Wiki-Pi contains 48,419 unique interactions among 10,492 proteins. However, it is not clear whether these are unique proteins.[13] | Protein interaction database | ?? | |||||
Human Protein Reference Database | Institute of Bioinformatics (IOB), Bangalore, India | Human Protein Reference Database | One source claims 15,000 proteins,[17] but it is unclear how many of these are unique | |||||||
Sanger Institute | Pfam | protein families database of alignments and HMMs | Protein sequence databases | |||||||
Human Proteinpedia | Institute of Bioinformatics (IOB), Bangalore and Johns Hopkins University | Human Proteinpedia | Human Proteinpedia is based on HPRD (Human Protein Reference Database), which is a repository hosting over 30,000 human proteins. However, it is unclear how many of these are unique proteins | |||||||
Human Protein Atlas | The Swedish Government | Human Protein Atlas | It contains roughly 10 million IHC images for slightly fewer than 25,000 antibodies, but it is unclear how many of these are unique | |||||||
Manchester University | PRINTS | a compendium of protein fingerprints | Protein sequence databases | |||||||
PROSITE | database of protein families and domains | Protein sequence databases | ||||||||
Georgetown University Medical Center [GUMC] | Protein Information Resource | Protein sequence databases | ||||||||
SUPERFAMILY | library of HMMs representing superfamilies and database of (superfamily and family) annotations for all completely sequenced organisms | Protein sequence databases | ||||||||
Swiss Institute of Bioinformatics | Swiss-Prot | protein knowledgebase | Protein sequence databases | |||||||
NCBI | protein sequence and knowledgebase (National Center for Biotechnology Information) | Protein sequence databases | ||||||||
Protein Data Bank in Europe (PDBe),[18] Protein Data Bank in Japan (PDBj),[19] Research Collaboratory for Structural Bioinformatics (RCSB)[20] | Protein Data Bank | (PDB) | Protein structure databases | |||||||
Structural Classification of Proteins (SCOP) | Protein structure databases | |||||||||
Protein Structure Classification database | CATH | Protein structure databases | ||||||||
Sali Lab, UCSF | ModBase | database of comparative protein structure models | Protein model databases | |||||||
Similarity Matrix of Proteins | SIMAP | database of protein similarities computed using FASTA | Protein model databases | |||||||
SWISS-MODEL | server and repository for protein structure models | Protein model databases | ||||||||
AAindex | database of amino acid indices, amino acid mutation matrices, and pair-wise contact potentials | Protein model databases | ||||||||
Samuel Lunenfeld Research Institute | BioGRID | general repository for interaction datasets | Protein-protein and other molecular interactions | |||||||
RNA-binding protein database | Protein-protein and other molecular interactions | |||||||||
Univ. of California | Database of Interacting Proteins | Protein-protein and other molecular interactions | ||||||||
(EMBL-EBI) | IntAct[21] | open-source database for molecular interactions | Protein-protein and other molecular interactions | |||||||
STRING | an open source molecular interaction database to study interactions between proteins | Protein-protein and other molecular interactions | ||||||||
Human Protein Atlas | aims at mapping all the human proteins in cells, tissues and organs | Protein expression databases | ||||||||
ProteinModelPortal | Protein Model Portal of the PSI-Nature Structural Biology Knowledgebase | ?? | ?? | 3D structure protein databases | ||||||
SMR | Database of annotated 3D protein structure models | University of Basel | The Swiss government | 3D structure protein databases | ||||||
DisProt | Database of Protein Disorder | ELIXIR infrastructure | Indiana University School of Medicine, Temple University, University of Padua | funding from the European Union's Horizon 2020 | Swiss-Prot/UniProt, CATH, Pfam, Europe PMC, BITEM, ECO, Gene Ontology | DisProt | database of experimental evidence of disorder in proteins | 3D structure protein databases, Protein sequence databases | ||
MobiDB | Database of intrinsically disordered and mobile proteins | John Moult, Christine Orengo, Predrag Radivojac | University of Padua | Italian Government | MobiDB | database of intrinsic protein disorder annotation | 3D structure protein databases, Protein sequence databases | |||
ModBase | Database of Comparative Protein Structure Models | Ursula Pieper, Ben Webb, Narayanan Eswar, Andrej Sali, Roberto Sanchez | UCSF, Sali Lab | 3D structure protein databases | ||||||
PDBsum | Pictorial database of 3D structures in the Protein Data Bank | European Bioinformatics Institute 2013 | Wellcome Trust | 3D structure protein databases | ||||||
CCDS | The Consensus CDS protein set database | NCBI | ?? | Sequence databases | ||||||
DDBJ | DNA Data Bank of Japan | ?? | ?? | Sequence databases | ||||||
ENA | European Nucleotide Archive | ?? | ?? | Sequence databases | ||||||
GenBank | GenBank nucleotide sequence database | ?? | ?? | Sequence databases | ||||||
RefSeq | NCBI Reference Sequence Database | ?? | ?? | Sequence databases | ||||||
UniGene | Database of computationally identified transcripts from the same locus | ?? | ?? | Sequence databases | ||||||
UniProtKB | Universal Protein Resource (UniProt) | ?? | ?? | Sequence databases | ||||||
Swiss-Prot/UniProt | https://www.sib.swiss/swiss-prot and https://www.uniprot.org/ | SIB Swiss Institute of Bioinformatics | European Bioinformatics Institute (EMBL-EBI) | Swiss-Prot has collected over 81,000 variants in roughly 13,000 human protein sequence records from peer-reviewed literature. It is unclear how many unique protein types are present in the database. |
Numerous databases collect information about species and other taxonomic categories. The Catalogue of Life is a special case as it is a meta-database of about 150 specialized "global species databases" (GSDs) that have collected the names and other information on (almost) all described and thus "known" species.
Images play a critical role in biomedicine, ranging from images of anthropological specimens to zoology. However, there are relatively few databases dedicated to image collection, although some projects such as iNaturalist collect photos as a main part of their data. A special case of "images" is 3-dimensional images such as protein structures or 3D reconstructions of anatomical structures. Image databases include, among others:[22]
Original source: https://en.wikipedia.org/wiki/List of biological databases.