Gene transfer agents (GTAs) are DNA-containing virus-like particles that are produced by some bacteria and archaea and mediate horizontal gene transfer. Different GTA types have originated independently from viruses in several bacterial and archaeal lineages. These cells produce GTA particles containing short segments of the DNA present in the cell. After the particles are released from the producer cell, they can attach to related cells and inject their DNA into the cytoplasm. The DNA can then become part of the recipient cells' genome.[1][2][3][4]
GTAs are classified as viriforms in the ICTV taxonomy. Among the GTAs mentioned in this article, RcGTA and DsGTA are now in the family Rhodogtaviriformidae, BaGTA in Bartogtaviriformidae, and VSH-1 in Brachygtaviriformidae.[5] Dd1 and VTA do not yet have a classification.
The first GTA system was discovered in 1974, when mixed cultures of Rhodobacter capsulatus strains produced a high frequency of cells with new combinations of genes.[6] The factor responsible was distinct from known gene-transfer mechanisms in being independent of cell contact, insensitive to deoxyribonuclease, and not associated with phage production. Because of its presumed function it was named gene transfer agent (GTA, now RcGTA). More recently, other gene transfer agent systems have been discovered by incubating filtered (cell-free) culture medium with a genetically distinct strain.[3]
The genes specifying GTAs are derived from bacteriophage (phage) DNA that has integrated into a host chromosome. Such prophages often acquire mutations that make them defective and unable to produce phage particles. Many bacterial genomes contain one or more defective prophages that have undergone more or less extensive mutation and deletion. Gene transfer agents, like defective prophages, arise by mutation of prophages, but they retain functional genes for the head and tail components of the phage particle (structural genes) and the genes for DNA packaging. The phage genes specifying regulation and DNA replication have typically been deleted, and expression of the cluster of structural genes is under the control of cellular regulatory systems. Additional genes that contribute to GTA production or uptake are usually present at other chromosome locations. Some of these have regulatory functions, and others contribute directly to GTA production (e.g. the phage-derived lysis genes) or uptake and recombination (e.g. production of cell-surface capsule and DNA transport proteins). These GTA-associated genes are often under coordinated regulation with the main GTA gene cluster.[7] After particle assembly, phage-derived cell-lysis proteins (holin and endolysin) weaken the cell wall and membrane, allowing the cell to burst and release the GTA particles.
Some GTA systems appear to be recent additions to their host genomes, but others have been maintained for many millions of years. Where studies of sequence divergence have been done (dN/dS analysis), they indicate that the genes are being maintained by natural selection for protein function (i.e. defective versions are being eliminated).[8][9]
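The dN/dS comparison mentioned above can be illustrated with a minimal Python sketch. It assumes the simplest uncorrected form of the statistic (substitutions per site, with no correction for multiple substitutions at one site), and the counts passed in are hypothetical values chosen for illustration, not data from the cited studies. The function names and the tolerance parameter are likewise assumptions of this sketch.

```python
def dn_ds_ratio(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return the uncorrected dN/dS ratio for a pair of aligned coding sequences.

    dN = nonsynonymous substitutions per nonsynonymous site
    dS = synonymous substitutions per synonymous site
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


def interpret(ratio, tolerance=0.1):
    """Conventional reading of the ratio; the tolerance is an arbitrary choice."""
    if ratio < 1 - tolerance:
        return "purifying selection"
    if ratio > 1 + tolerance:
        return "positive selection"
    return "approximately neutral"


# Hypothetical counts for one GTA structural gene compared between two species.
ratio = dn_ds_ratio(nonsyn_subs=10, nonsyn_sites=700, syn_subs=40, syn_sites=300)
print(round(ratio, 3), interpret(ratio))
```

A ratio well below 1, as in this made-up example, is the pattern expected when amino acid changes are being removed by selection while silent changes accumulate.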
However, the nature of this selection is not clear. Although the discoverers of GTA assumed that gene transfer was the function of the particles, the presumed benefits of gene transfer come at a substantial cost to the population. Most of this cost arises because GTA-producing cells must lyse (burst open) to release their GTA particles, but there are also genetic costs associated with making new combinations of genes, because most new combinations will be less fit than the original combination.[10] One alternative explanation is that GTA genes persist because GTAs are genetic parasites that spread infectiously to new cells. However, this is ruled out because GTA particles are typically too small to contain the genes that encode them. For example, the main RcGTA cluster (see below) is 14 kb long, but RcGTA particles can contain only 4–5 kb of DNA.
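The size argument can be checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted above (a 14 kb cluster and a 4–5 kb particle capacity); the function name is a hypothetical helper introduced for illustration.

```python
import math


def particles_needed(cluster_kb, capacity_kb):
    """Minimum number of particles required to carry a cluster end to end."""
    return math.ceil(cluster_kb / capacity_kb)


CLUSTER_KB = 14.0  # main RcGTA cluster length quoted in the text

for capacity_kb in (4.0, 5.0):  # range of RcGTA particle capacity quoted in the text
    fits_in_one = capacity_kb >= CLUSTER_KB
    print(
        f"capacity {capacity_kb} kb: fits in one particle = {fits_in_one}, "
        f"particles needed = {particles_needed(CLUSTER_KB, capacity_kb)}"
    )
```

Because no single particle can hold the whole cluster, a recipient cell would need to receive and correctly reassemble several separate fragments to acquire the ability to make RcGTA, which is why infectious spread is considered implausible.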
Most bacteria have not been screened for the presence of GTAs, and many more GTA systems may await discovery. DNA-based surveys for GTA-related genes have found homologs in many genomes, but interpretation is hindered by the difficulty of distinguishing genes that encode GTAs from ordinary prophage genes.[8][9]
In laboratory cultures, production of GTAs is typically maximized by particular growth conditions that induce transcription of the GTA genes; most GTAs are not induced by the DNA-damaging treatments that induce many prophages. Even under maximally inducing conditions only a small fraction of the culture produces GTAs, typically less than 1%.[11][12]
The steps in GTA production are derived from those of phage infection. The structural genes are first transcribed and translated, and the proteins assembled into empty heads and unattached tails. The DNA packaging machinery then packs DNA into each head, cutting the DNA when the head is full, attaching a tail to the head, and then moving the newly created DNA end on to a new empty head. Unlike prophage genes, the genes encoding GTAs are not excised from the genome and replicated for packaging in GTA particles. The two best-studied GTAs (RcGTA and BaGTA) randomly package all of the DNA in the cell, with no overrepresentation of GTA-encoding genes.[11][13] The number of GTA particles produced by each cell is not known.
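The headful packaging steps described above can be sketched as a toy simulation in Python. This is an illustration under stated assumptions, not a model taken from the cited studies: the genome is treated as circular, each packaging series starts at a uniformly random position, every head takes exactly the same amount of DNA, and the genome length, head capacity, number of series, and heads per series are all hypothetical values.

```python
import random


def headful_starts(genome_len, head_capacity, n_runs, heads_per_run, seed=0):
    """Return the start position of every packaged fragment.

    Each run picks a random start, fills a head, cuts, and hands the new
    DNA end to the next empty head, so fragments within a run are contiguous.
    """
    rng = random.Random(seed)
    starts = []
    for _ in range(n_runs):
        pos = rng.randrange(genome_len)
        for _ in range(heads_per_run):
            starts.append(pos)
            pos = (pos + head_capacity) % genome_len  # cut when the head is full
    return starts


def coverage(starts, genome_len, head_capacity):
    """Count how many packaged fragments include each genome position."""
    cov = [0] * genome_len
    for start in starts:
        for offset in range(head_capacity):
            cov[(start + offset) % genome_len] += 1
    return cov


GENOME_LEN = 100_000   # toy genome length in bp (assumed)
HEAD_CAPACITY = 4_500  # bp per head, in the range reported for RcGTA

starts = headful_starts(GENOME_LEN, HEAD_CAPACITY, n_runs=200, heads_per_run=5)
cov = coverage(starts, GENOME_LEN, HEAD_CAPACITY)
print("fragments:", len(starts))
print("mean coverage:", sum(cov) / GENOME_LEN)
print("min/max coverage:", min(cov), max(cov))
```

With random starting points, coverage fluctuates around its mean without favoring any particular region, which is the pattern expected when no genes are preferentially packaged.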
Whether release of GTA particles leads to transfer of DNA to new genomes depends on several factors. First, the particles must survive in the environment; little is known about this, although particles are reported to be quite unstable under laboratory conditions.[14] Second, particles must encounter and attach to suitable recipient cells, usually members of the same or a closely related species. Like phages, GTAs attach to specific protein or carbohydrate structures on the recipient cell surface before injecting their DNA. Unlike phages, the well-studied GTAs appear to inject their DNA only across the first of the two membranes surrounding the recipient cytoplasm, and they use a different system, competence-derived rather than phage-derived, to transport one strand of the double-stranded DNA across the inner membrane into the cytoplasm.[15][16]
If the cell's recombinational repair machinery finds a chromosomal sequence very similar to the incoming DNA, it replaces the former with the latter by homologous recombination, mediated by the cell's RecA protein. If the sequences are not identical, this will produce a cell with a new genetic combination. However, if the incoming DNA is not closely related to DNA sequences in the cell, it will be degraded, and the cell will reuse its nucleotides for DNA replication.
The GTA produced by the alphaproteobacterium Rhodobacter capsulatus, named R. capsulatus GTA (RcGTA), is currently the best-studied GTA. When laboratory cultures of R. capsulatus enter stationary phase, a subset of the bacterial population induces production of RcGTA, and the particles are subsequently released from the cells through cell lysis.[12] Most of the RcGTA structural genes are encoded in a ~15 kb genetic cluster on the bacterial chromosome. However, other genes required for RcGTA function, such as the genes required for cell lysis, are located separately.[2][17] RcGTA particles contain 4.5 kb DNA fragments, with even representation of the whole chromosome except for a 2-fold dip at the site of the RcGTA gene cluster.
Regulation of GTA production and transduction has been best studied in R. capsulatus, where a quorum-sensing system and a CtrA phosphorelay control expression of not only the main RcGTA gene cluster, but also a holin/endolysin cell lysis system, particle head spikes, an attachment protein (possibly tail fibers), and the capsule and DNA processing genes needed for RcGTA recipient function. An uncharacterized stochastic process further limits expression of the gene cluster to only 0.1–3% of the cells.
RcGTA-like clusters are found in a large subclade of the alphaproteobacteria, although the genes also appear to be frequently lost by deletion. Recently, several members of the order Rhodobacterales have been demonstrated to produce functional RcGTA-like particles. Groups of genes with homology to the RcGTA are present in the chromosomes of various types of alphaproteobacteria.[8]
Dinoroseobacter shibae, like R. capsulatus, is a member of the order Rhodobacterales, and its GTA (DsGTA) shares a common ancestor and many features with RcGTA, including gene organization, packaging of short DNA fragments (4.2 kb) and regulation by quorum sensing and a CtrA phosphorelay.[18] However, its DNA packaging machinery has much more specificity, with sharp peaks and valleys of coverage suggesting that it may preferentially initiate packaging at specific sites in the genome. The DNA of the major DsGTA gene cluster is packaged very poorly.
Bartonella species are members of the Alphaproteobacteria like R. capsulatus and D. shibae, but BaGTA is not related to RcGTA and DsGTA.[19] BaGTA particles are larger than RcGTA and contain 14 kb DNA fragments. Although this capacity could in principle allow BaGTA to package and transmit its 14 kb GTA cluster, measurements of DNA coverage show reduced coverage of the cluster. An adjacent region of high coverage is thought to be due to local DNA replication.[13]
Brachyspira is a genus of spirochete; several species have been shown to carry homologous GTA gene clusters. Particles of the Brachyspira GTA, VSH-1, contain 7.5 kb DNA fragments. Production of VSH-1 is stimulated by the DNA-damaging agent mitomycin C and by some antibiotics. It is also associated with detectable cell lysis, indicating that a substantial fraction of the culture may be producing VSH-1.[20]
Desulfovibrio desulfuricans is a soil bacterium in the deltaproteobacteria; its GTA, Dd1, packages 13.6 kb DNA fragments. It is unclear which genes encode this GTA: there is one 17.8 kb region with phage-like structural genes in the bacterial genome, but their link to GTA production has not yet been experimentally proven.[21]
Methanococcus voltae is an archaeon; its GTA (VTA) is known to transfer 4.4 kb DNA fragments but has not been otherwise characterized,[22] although a defective provirus related to Methanococcus head-tailed viruses (Caudoviricetes) in the M. voltae A3 genome has been suggested to represent the GTA locus.[23] A possible terL terminase (D7DSG2) was again identified in 2019.[24]