Virome refers to the assemblage of viruses[1][2] that is often investigated and described by metagenomic sequencing of viral nucleic acids[3] found in association with a particular ecosystem, organism or holobiont. The word is frequently used to describe environmental viral shotgun metagenomes. Viruses, including bacteriophages, are found in all environments, and studies of the virome have provided insights into nutrient cycling,[4][5] the development of immunity,[6] and the role of viruses as a major source of genes through lysogenic conversion.[7] The human virome has also been characterized in nine organs (colon, liver, lung, heart, brain, kidney, skin, blood, hair) of 31 Finnish individuals using qPCR and NGS methodologies.[8]
The first comprehensive studies of viromes used shotgun community sequencing,[9] which is frequently referred to as metagenomics. In the 2000s, the Rohwer lab sequenced viromes from seawater,[9][10] marine sediments,[11] adult human stool,[12] infant human stool,[13] soil,[14] and blood.[15] This group also sequenced the first RNA virome with collaborators from the Genome Institute of Singapore.[16] From these early works, it was concluded that most genomic diversity is contained in the global virome and that most of this diversity remains uncharacterized.[17] This view was supported by individual genome sequencing projects, particularly those on mycobacteriophages.[18]
By the late 2010s, advances in sequencing technologies had allowed deep probing of viromes.[19] The virome of the human gut in particular has gained increased attention as a result of these advancements.[20][21]
In order to study the virome, virus-like particles are separated from cellular components, usually using a combination of filtration, density centrifugation, and enzymatic treatments to remove free nucleic acids.[22] The nucleic acids are then sequenced and analyzed using metagenomic methods. Alternatively, recent computational methods discover viruses directly from assembled metagenomic sequences.[23]
The Global Ocean Viromes (GOV) is a dataset consisting of deep sequencing from over 150 samples collected across the world's oceans in two survey periods by an international team.[24]
Viruses are the most abundant biological entities on Earth, but challenges in detecting, isolating, and classifying unknown viruses have prevented exhaustive surveys of the global virome.[25] Over 5 Tb of metagenomic sequence data from 3,042 geographically diverse samples were used to assess the global distribution, phylogenetic diversity, and host specificity of viruses.[25]
In August 2016, the discovery of over 125,000 partial DNA viral genomes, including the largest phage yet identified, increased the number of known viral genes 16-fold.[25] A suite of computational methods was used to identify putative host–virus connections.[25] Host information from isolate viruses was projected onto their viral groups, resulting in host assignments for 2.4% of viral groups.[25]
The CRISPR–Cas prokaryotic immune system, which holds a "library" of genome fragments from phages (proto-spacers) that have previously infected the host, was then used.[25] Spacers from isolate microbial genomes with matches to metagenomic viral contigs (mVCs) were identified for 4.4% of the viral groups and 1.7% of singletons.[25] The hypothesis that viral transfer RNA (tRNA) genes originate from their host was also explored.[25]
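The idea of matching host spacers against viral contigs can be illustrated with a minimal Python sketch. It is not the pipeline used in the cited study: the function names, the tuple layout, and the one-mismatch tolerance are assumptions made for illustration, and a real analysis would rely on an alignment tool rather than a brute-force scan.

```python
from typing import Dict, List, Tuple

# Complement table for DNA; "N" is kept as-is.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_spacer_hits(
    spacers: List[Tuple[str, str]],   # (host name, spacer sequence)
    contigs: Dict[str, str],          # contig id -> contig sequence
    max_mismatches: int = 1,          # hypothetical tolerance, not from the source
) -> List[Tuple[str, str, int, str, int]]:
    """Scan every contig for near-exact spacer matches on either strand.

    Returns (host, contig id, 0-based position, strand, mismatches) tuples.
    """
    hits = []
    for host, spacer in spacers:
        forward = spacer.upper()
        queries = [("+", forward)]
        reverse = reverse_complement(forward)
        if reverse != forward:  # avoid reporting palindromic spacers twice
            queries.append(("-", reverse))
        for contig_id, contig in contigs.items():
            contig = contig.upper()
            for strand, query in queries:
                for pos in range(len(contig) - len(query) + 1):
                    window = contig[pos:pos + len(query)]
                    mismatches = count_mismatches(query, window)
                    if mismatches <= max_mismatches:
                        hits.append((host, contig_id, pos, strand, mismatches))
    return hits
```

Each hit links a host genome to a viral contig, which is the kind of evidence the spacer-based host assignment described above relies on. Searching both strands matters because a spacer may be recorded in either orientation relative to the assembled contig.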
Viral tRNAs identified in 7.6% of the mVCs were matched to isolate genomes from a single species or genus.[25] The specificity of tRNA-based host–virus assignment was confirmed by CRISPR–Cas spacer matches, which showed 94% agreement at the genus level. These approaches identified 9,992 putative host–virus associations, enabling host assignment to 7.7% of mVCs.[25] The majority of these connections were previously unknown, and they include hosts from 16 prokaryotic phyla for which no viruses had previously been identified.[25]
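A genus-level agreement figure of this kind can be computed by comparing the two sets of predictions on the contigs that both methods cover. The following Python sketch assumes that definition; the source does not state how the 94% figure was calculated, and the function name and dictionary layout are hypothetical.

```python
from typing import Dict, Optional


def genus_agreement(
    trna_hosts: Dict[str, str],     # contig id -> host genus predicted from tRNA matches
    crispr_hosts: Dict[str, str],   # contig id -> host genus predicted from spacer matches
) -> Optional[float]:
    """Fraction of contigs covered by both methods whose predicted genus agrees.

    Returns None when the two methods share no contigs, since agreement
    is undefined in that case.
    """
    shared = trna_hosts.keys() & crispr_hosts.keys()
    if not shared:
        return None
    agreeing = sum(1 for contig in shared if trna_hosts[contig] == crispr_hosts[contig])
    return agreeing / len(shared)
```

Contigs predicted by only one method are excluded from the denominator, so the result measures consistency between the methods rather than the coverage of either one.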
Many viruses specialize in infecting related hosts.[25] Viral generalists that infect hosts across taxonomic orders may exist.[25] Most CRISPR spacer matches were from viral sequences to hosts within one species or genus.[25] Some mVCs were linked to multiple hosts from higher taxa. A viral group composed of mVCs from human oral samples contained three distinct proto-spacers with nearly exact matches to spacers in Actinobacteria and Bacillota.[25]
In January 2017, the IMG/VR system,[26] the largest interactive public virus database, contained 265,000 metagenomic viral sequences and isolate viruses. This number had grown to over 760,000 by November 2018 (IMG/VR v.2.0).[27] The IMG/VR systems serve as a starting point for the sequence analysis of viral fragments derived from metagenomic samples.
The human virome encompasses the diverse viral communities residing in the body. Prior advances in high-throughput sequencing (HTS) revealed insights into their diversity, evolutionary dynamics, and genome integrations. However, due to shallow sequencing in the past, the genetic composition and diversity of tissue-resident viruses remained poorly characterized, hindering understanding of their roles in pathogenesis and viral evolution. In 2024, a study of the virome examined persistent viruses in multiple organs from individuals who died of non-viral causes, revealing that viral sequences were highly conserved within each person, indicating persistence from single dominant strains. Increased viral diversity in two cases suggested that reactivation may influence variability. The study also identified selective pressures from the host and unexpected viral genome integrations, including MCPyV truncations and novel links between herpesvirus 6B and mitochondrial DNA, even in non-cancerous individuals, offering new insights into tissue-resident viruses and their potential health impacts.[28]
Wegley L, Edwards R, Rodriguez-Brito B, Liu H, Rohwer F (November 2007). "Metagenomic analysis of the microbial community associated with the coral Porites astreoides". Environmental Microbiology. 9 (11): 2707–2719. Bibcode:2007EnvMi...9.2707W. doi:10.1111/j.1462-2920.2007.01383.x. PMID 17922755.