Restriction sites, or restriction recognition sites, are specific sequences of nucleotides (4-8 base pairs in length[1]) on a DNA molecule that are recognized by restriction enzymes. These are generally palindromic sequences[2], meaning that they read the same in the 5' to 3' direction on both strands (because restriction enzymes usually bind as homodimers), and a particular restriction enzyme may cut the sequence between two nucleotides within its recognition site, or somewhere nearby.
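As an illustration of what palindromic means for a recognition sequence, the following minimal Python sketch checks whether a site equals its own reverse complement. The function names are chosen for this example only, and the sketch assumes unambiguous bases (A, C, G, T) with sequences written 5' to 3'.

```python
# Watson-Crick complement for each unambiguous DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindromic(site):
    """True if the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)

print(is_palindromic("GAATTC"))  # True: reads the same on both strands
print(is_palindromic("GAATTA"))  # False: reverse complement is TAATTC
```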
For example, the common restriction enzyme EcoRI recognizes the palindromic sequence GAATTC and cuts between the G and the A on both the top and bottom strands. This leaves an overhang (an end-portion of a DNA strand with no attached complement) of AATT, known as a sticky end,[2] on each of the two resulting ends. The overhang can then be used to ligate in (see DNA ligase) a piece of DNA with a complementary overhang (another EcoRI-cut piece, for example).
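The EcoRI cut described above can be sketched in a few lines of Python. This is a simplified model rather than a full digestion simulator: it tracks only the top strand of a linear molecule, and the names `digest`, `overhang` and `cut_offset` are hypothetical. The overhang calculation assumes a palindromic site whose cut lies at or before the midpoint of the site, as with EcoRI.

```python
SITE = "GAATTC"   # EcoRI recognition sequence
CUT_OFFSET = 1    # the cut falls after the first base: G^AATTC

def digest(seq, site=SITE, cut_offset=CUT_OFFSET):
    """Split the top strand of a linear sequence at every occurrence of site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset          # index where the top strand is cut
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])       # remainder after the last cut
    return fragments

def overhang(site=SITE, cut_offset=CUT_OFFSET):
    """Unpaired bases left on each end; an empty string means a blunt end."""
    return site[cut_offset:len(site) - cut_offset]

print(digest("TTGAATTCAA"))  # ['TTG', 'AATTCAA']
print(overhang())            # AATT
```

Because the bottom strand is cut at the mirror-image position, the bases between the two cuts (AATT) remain single-stranded on each fragment.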
Some restriction enzymes cut DNA at a restriction site in a manner that leaves no overhang, producing what is called a blunt end.[2] Blunt ends are much less likely to be ligated by a DNA ligase because they have no overhanging bases that can pair with a complementary overhang and hold the two ends together.[3] Sticky ends, by contrast, are more likely to be joined successfully with the help of a DNA ligase because of their exposed, unpaired nucleotides. For example, a sticky end with the overhang AATT is more likely to be ligated than a blunt end, in which the 5' and 3' strands are paired along their whole length. If the end instead carried AATTG already paired with its complement TTAAC, no unpaired bases would remain to anneal to another fragment, which would reduce the efficiency of the DNA ligase.[4]
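The difference between sticky and blunt ends can be expressed as a small compatibility check. The sketch below is illustrative only: `can_anneal` is a hypothetical helper that reports whether two overhangs, each written 5' to 3', can base-pair with one another. It says nothing about whether a ligase could still join the ends without that help, as happens with blunt ends at lower efficiency.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def can_anneal(overhang_a, overhang_b):
    """True if two single-stranded overhangs can base-pair with each other.

    Blunt ends are modelled as empty strings: with no unpaired bases,
    there is nothing to hold the two ends together.
    """
    if not overhang_a or not overhang_b:
        return False
    return overhang_a.upper() == reverse_complement(overhang_b)

print(can_anneal("AATT", "AATT"))  # True: two EcoRI-cut ends
print(can_anneal("AATT", "GATC"))  # False: overhangs do not match
print(can_anneal("", ""))          # False: two blunt ends
```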
Restriction sites are used in several applications in molecular biology, such as identifying restriction fragment length polymorphisms (RFLPs). They are also an important consideration when designing plasmids.
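The idea behind RFLP analysis can be illustrated with a short Python sketch: a sequence variant that destroys (or creates) a restriction site changes the lengths of the fragments produced by digestion. The two allele sequences below are invented for this example, and `fragment_lengths` is a hypothetical helper that models only the top strand of a linear molecule cut by EcoRI.

```python
def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Return the sorted fragment lengths after cutting at every site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))

# Made-up alleles: allele_b has a single-base change (A -> G) that
# destroys the second EcoRI site.
allele_a = "ATATAT" + "GAATTC" + "CGCGCGCG" + "GAATTC" + "TTTT"
allele_b = "ATATAT" + "GAATTC" + "CGCGCGCG" + "GAGTTC" + "TTTT"

print(fragment_lengths(allele_a))  # [7, 9, 14]
print(fragment_lengths(allele_b))  # [7, 23]
```

The differing fragment-length patterns are what distinguish the two alleles when the digested DNA is separated by size.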
Several databases exist for restriction sites and enzymes, of which the largest noncommercial database is REBASE.[5][6] It has been shown that statistically significant nullomers (i.e. short absent motifs that would be highly expected to exist) in virus genomes are restriction sites, indicating that viruses have probably lost these motifs to facilitate invasion of bacterial hosts.[7] The Nullomers Database contains a comprehensive catalogue of minimal absent motifs, many of which might be not-yet-known restriction motifs.