The most successful (and most widely used) RQM is relativistic quantum field theory (QFT), in which elementary particles are interpreted as field quanta. A unique consequence of QFT that has been tested against other RQMs is the failure of conservation of particle number, for example in matter creation and annihilation.[7]
Paul Dirac's work between 1927 and 1933 shaped the synthesis of special relativity and quantum mechanics.[8] His work was instrumental, as he formulated the Dirac equation and also originated quantum electrodynamics, both of which were successful in combining the two theories.[9]
Every particle has a non-negative spin quantum number s. The number 2s is an integer, odd for fermions and even for bosons. Each s has 2s + 1 z-projection quantum numbers; σ = s, s − 1, ... , −s + 1, −s.[a] This is an additional discrete variable the wavefunction requires; ψ(r, t, σ).
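As a small illustration of the counting rule above, the following sketch lists the 2s + 1 allowed projections σ for a given spin s and classifies the particle by the parity of 2s. The helper names `spin_projections` and `is_fermion` are hypothetical, introduced only for this example.

```python
from fractions import Fraction


def spin_projections(s):
    """Return the 2s + 1 projections sigma = s, s - 1, ..., -s as exact fractions."""
    s = Fraction(s)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("spin must be a non-negative integer or half-integer")
    return [s - k for k in range(int(2 * s) + 1)]


def is_fermion(s):
    """2s odd -> fermion, 2s even -> boson."""
    return int(2 * Fraction(s)) % 2 == 1


for s in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    kind = "fermion" if is_fermion(s) else "boson"
    sigmas = ", ".join(str(x) for x in spin_projections(s))
    print(f"s = {s}: {kind}, sigma in {{{sigmas}}}")
```

Exact fractions are used instead of floats so that half-integer spins such as 3/2 are represented without rounding.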
These equations are used together with the energy and momentum operators, which are respectively Ê = iħ ∂/∂t and p̂ = −iħ∇,
to construct a relativistic wave equation (RWE): a partial differential equation consistent with the energy–momentum relation, which is solved for ψ to predict the quantum dynamics of the particle. For space and time to be placed on equal footing, as in relativity, the orders of space and time partial derivatives should be equal, and ideally as low as possible, so that no initial values of the derivatives need to be specified. This is important for probability interpretations, exemplified below. The lowest possible order of any differential equation is the first (zeroth order derivatives would not form a differential equation).
The Heisenberg picture is another formulation of QM, in which the wavefunction ψ is time-independent, and the operators A(t) contain the time dependence, governed by the equation of motion:
dA/dt = (1/iħ)[A, Ĥ] + ∂A/∂t
This equation is also true in RQM, provided the Heisenberg operators are modified to be consistent with SR.[11][12]
A more modern approach to RWEs, first introduced while RWEs were being developed for particles of any spin, is to apply representations of the Lorentz group.
In classical mechanics and non-relativistic QM, time is an absolute quantity all observers and particles can always agree on, "ticking away" in the background independent of space. Thus in non-relativistic QM one has for a many-particle system ψ(r1, r2, r3, ..., t, σ1, σ2, σ3...).
In relativistic mechanics, the spatial coordinates and coordinate time are not absolute; any two observers moving relative to each other can measure different locations and times of events. The position and time coordinates combine naturally into a four-dimensional spacetime position X = (ct, r) corresponding to events, and the energy and 3-momentum combine naturally into the four-momentum P = (E/c, p) of a dynamic particle. Both, as measured in some reference frame, change according to a Lorentz transformation when measured in a different frame boosted and/or rotated relative to the original frame. The derivative operators, and hence the energy and 3-momentum operators, are also non-invariant and change under Lorentz transformations.
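The frame dependence of X and P described above can be made concrete with a short numerical sketch. It assumes a boost along the x-axis, units with c = 1, and arbitrary illustrative values for the mass, momentum, and event; the helper names `boost_x` and `minkowski_square` are hypothetical. The components change under the boost, while the Minkowski squares (the spacetime interval and (mc)²) do not.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, signature (+, -, -, -)


def boost_x(beta):
    """Lorentz boost along x with velocity v = beta * c."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L


def minkowski_square(v):
    """Invariant v . eta . v of a four-vector."""
    return v @ ETA @ v


c, m = 1.0, 1.0                          # illustrative units with c = 1
p = np.array([0.6, -0.2, 0.3])           # arbitrary 3-momentum
E = np.sqrt((p @ p) * c**2 + (m * c**2) ** 2)
P = np.concatenate(([E / c], p))         # four-momentum (E/c, p)
X = np.array([c * 2.0, 1.0, 0.5, -1.5])  # event (ct, r)

L = boost_x(0.8)
P_boosted, X_boosted = L @ P, L @ X

print("P :", P, "->", P_boosted)
print("P.P:", minkowski_square(P), "->", minkowski_square(P_boosted))
print("X.X:", minkowski_square(X), "->", minkowski_square(X_boosted))
```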
where D(Λ) is a finite-dimensional representation, in other words a (2s + 1) × (2s + 1) square matrix. Again, ψ is thought of as a column vector containing components with the (2s + 1) allowed values of σ. The quantum numbers s and σ as well as other labels, continuous or discrete, representing other quantum numbers are suppressed. One value of σ may occur more than once depending on the representation.
and substituting this into the above Schrödinger equation gives a non-relativistic QM equation for the wavefunction: the procedure is a straightforward substitution of a simple expression. By contrast this is not as easy in RQM; the energy–momentum equation is quadratic in energy and momentum, leading to difficulties. Naively setting Ĥ = √(c² p̂ · p̂ + (mc²)²)
is not helpful for several reasons. The square root of the operators cannot be used as it stands; it would have to be expanded in a power series before the momentum operator, raised to a power in each term, could act on ψ. As a result of the power series, the space and time derivatives are completely asymmetric: infinite-order in space derivatives but only first order in the time derivative, which is inelegant and unwieldy. Again, there is the problem of the non-invariance of the energy operator, equated to the square root which is also not invariant. Another problem, less obvious and more severe, is that it can be shown to be nonlocal and can even violate causality: if the particle is initially localized at a point r0 so that ψ(r0, t = 0) is finite and zero elsewhere, then at any later time the equation predicts delocalization ψ(r, t) ≠ 0 everywhere, even for |r| > ct which means the particle could arrive at a point before a pulse of light could. This would have to be remedied by the additional constraint ψ(|r| > ct, t) = 0.[15]
There is also the problem of incorporating spin in the Hamiltonian, which isn't a prediction of the non-relativistic Schrödinger theory. Particles with spin have a corresponding spin magnetic moment quantized in units of μB, the Bohr magneton:[16][17]
where g is the (spin) g-factor for the particle, and S the spin operator, so they interact with electromagnetic fields. For a particle in an externally applied magnetic field B, the interaction term[18]
has to be added to the above non-relativistic Hamiltonian. By contrast, a relativistic Hamiltonian introduces spin automatically as a requirement of enforcing the relativistic energy–momentum relation.[19]
Relativistic Hamiltonians are analogous to those of non-relativistic QM in the following respect: there are terms including rest mass and interaction terms with externally applied fields, similar to the classical potential energy term, as well as momentum terms like the classical kinetic energy term. A key difference is that relativistic Hamiltonians contain spin operators in the form of matrices, in which the matrix multiplication runs over the spin index σ, so in general a relativistic Hamiltonian Ĥ = Ĥ(r, t, p̂, Ŝ) is a function of space, time, and the momentum and spin operators.
The Klein–Gordon and Dirac equations for free particles
Substituting the energy and momentum operators directly into the energy–momentum relation may at first sight seem appealing, to obtain the Klein–Gordon equation:[20]
Ê²ψ = c² p̂ · p̂ ψ + (mc²)² ψ
This equation was discovered by many people because of the straightforward way of obtaining it, notably by Schrödinger in 1925 before he found the non-relativistic equation named after him, and by Klein and Gordon in 1927, who included electromagnetic interactions in the equation. It is relativistically invariant, yet this equation alone isn't a sufficient foundation for RQM for at least two reasons: one is that negative-energy states are solutions,[2][21] another is the density (given below). In addition, this equation as it stands is only applicable to spinless particles. This equation can be factored into the form:[22][23]
(Ê − cα · p̂ − βmc²)(Ê + cα · p̂ + βmc²)ψ = 0
where α = (α1, α2, α3) and β are not simply numbers or vectors, but 4 × 4 Hermitian matrices that are required to anticommute for i ≠ j:
αiβ = −βαi, αiαj = −αjαi, αi² = β² = I
so that terms with mixed second-order derivatives cancel while the second-order derivatives purely in space and time remain. The first factor:
(Ê − cα · p̂ − βmc²)ψ = 0, equivalently Ĥ = cα · p̂ + βmc²
is the Dirac equation. The other factor is also the Dirac equation, but for a particle of negative mass.[22] Each factor is relativistically invariant. The reasoning can be done the other way round: propose the Hamiltonian in the above form, as Dirac did in 1928, then pre-multiply the equation by the other factor of operators Ê + cα · p̂ + βmc², and comparison with the KG equation determines the constraints on α and β. The positive mass equation can continue to be used without loss of continuity. The matrices multiplying ψ suggest it isn't a scalar wavefunction as permitted in the KG equation, but must instead be a four-component entity. The Dirac equation still predicts negative energy solutions,[6][24] so Dirac postulated that negative energy states are always occupied, because according to the Pauli principle, electronic transitions from positive to negative energy levels in atoms would be forbidden. See Dirac sea for details.
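The constraints on α and β can be checked numerically. The sketch below assumes the standard (Dirac) representation built from the Pauli matrices, units with c = 1, and arbitrary illustrative values for the mass and momentum; other representations satisfy the same relations. It verifies the anticommutation rules and that the square of the free Hamiltonian cα · p + βmc² reproduces the energy–momentum relation, with both signs of the energy appearing as eigenvalues.

```python
import numpy as np

# Pauli matrices and 2x2 identity / zero blocks.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Assumed Dirac representation: alpha_i = [[0, sigma_i], [sigma_i, 0]], beta = diag(I, -I).
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])


def anticommutator(a, b):
    return a @ b + b @ a


def dirac_hamiltonian(p, m, c=1.0):
    """Free Hamiltonian c alpha.p + beta m c^2 acting on a momentum eigenstate p."""
    return c * sum(pi * ai for pi, ai in zip(p, alpha)) + beta * m * c**2


p = np.array([0.3, -1.2, 0.5])  # arbitrary illustrative momentum
m, c = 2.0, 1.0                 # arbitrary illustrative mass, units with c = 1
H = dirac_hamiltonian(p, m, c)
E_squared = c**2 * (p @ p) + (m * c**2) ** 2

print("H^2 = E^2 * I:", np.allclose(H @ H, E_squared * np.eye(4)))
print("eigenvalues of H:", np.linalg.eigvalsh(H))
print("sqrt(E^2):", np.sqrt(E_squared))
```

Each sign of the energy occurs twice among the four eigenvalues, which is consistent with ψ being a four-component object.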
where ∂μ is the four-gradient. Since the initial values of both ψ and ∂ψ/∂t may be freely chosen, the density can be negative.
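A short worked case shows how the sign problem arises. It assumes the standard form of the Klein–Gordon density (the normalization used elsewhere in this article may differ by a constant factor) and evaluates it on a plane wave of energy E:

```latex
% Assumed standard Klein–Gordon density:
\rho = \frac{i\hbar}{2mc^{2}}\left(\psi^{*}\,\partial_t\psi - \psi\,\partial_t\psi^{*}\right)

% Plane wave with energy E and momentum \mathbf{p}:
\psi = N\,e^{-i(Et - \mathbf{p}\cdot\mathbf{r})/\hbar}
\quad\Rightarrow\quad
\partial_t\psi = -\frac{iE}{\hbar}\,\psi , \qquad
\partial_t\psi^{*} = +\frac{iE}{\hbar}\,\psi^{*}

% Substituting into the density:
\rho = \frac{i\hbar}{2mc^{2}}\left(-\frac{2iE}{\hbar}\right)|N|^{2}
     = \frac{E}{mc^{2}}\,|N|^{2},
\qquad E = \pm\sqrt{c^{2}\,\mathbf{p}\cdot\mathbf{p} + (mc^{2})^{2}}
```

The density therefore carries the sign of E, so the negative-energy solutions give ρ < 0, which cannot be a probability density.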
Instead, what appears at first sight to be a "probability density" and "probability current" has to be reinterpreted as charge density and current density when multiplied by electric charge. Then, the wavefunction ψ is not a wavefunction at all, but is reinterpreted as a field.[15] The density and current of electric charge always satisfy a continuity equation, ∂ρ/∂t + ∇ · J = 0,
as charge is a conserved quantity. Probability density and current also satisfy a continuity equation because probability is conserved; however, this is only possible in the absence of interactions.
Spin and electromagnetically interacting particles
Including interactions in RWEs is generally difficult. Minimal coupling is a simple way to include the electromagnetic interaction. For one charged particle of electric charge q in an electromagnetic field, given by the magnetic vector potential A(r, t) defined by the magnetic field B = ∇ × A, and electric scalar potential ϕ(r, t), this is:[27]
Ê → Ê − qϕ, p̂ → p̂ − qA
In the non-relativistic limit, E − qϕ ≈ mc² and p ≈ mv; that is, the total energy of the particle is approximately the rest energy for small electric potentials, and the momentum is approximately the classical momentum.
In RQM, the KG equation admits the minimal coupling prescription:
(Ê − qϕ)²ψ = c²(p̂ − qA)²ψ + (mc²)²ψ
In the case where the charge is zero, the equation reduces trivially to the free KG equation, so nonzero charge is assumed below. This is a scalar equation that is invariant under the irreducible one-dimensional scalar (0,0) representation of the Lorentz group. This means that all of its solutions will belong to a direct sum of (0,0) representations. Solutions that do not belong to the irreducible (0,0) representation will have two or more independent components. Such solutions cannot in general describe particles with nonzero spin since spin components are not independent. Other constraints will have to be imposed for that, e.g. the Dirac equation for spin 1/2, see below. Thus if a system satisfies the KG equation only, it can only be interpreted as a system with zero spin.
The electromagnetic field is treated classically according to Maxwell's equations and the particle is described by a wavefunction, the solution to the KG equation. The equation is, as it stands, not always very useful, because massive spinless particles, such as the π-mesons, experience the much stronger strong interaction in addition to the electromagnetic interaction. It does, however, correctly describe charged spinless bosons in the absence of other interactions.
The KG equation is applicable to spinless charged bosons in an external electromagnetic potential.[2] As such, the equation cannot be applied to the description of atoms, since the electron is a spin 1/2 particle. In the non-relativistic limit the equation reduces to the Schrödinger equation for a spinless charged particle in an electromagnetic field:[18]
by means of the 2 × 2 Pauli matrices, and ψ is not just a scalar wavefunction as in the non-relativistic Schrödinger equation, but a two-component spinor field:
where the subscripts ↑ and ↓ refer to the "spin up" (σ = +1/2) and "spin down" (σ = −1/2) states.[b]
In RQM, the Dirac equation can also incorporate minimal coupling, rewritten from above;
and was the first equation to accurately predict spin, a consequence of the 4 × 4 gamma matrices γ0 = β, γ = (γ1, γ2, γ3) = βα = (βα1, βα2, βα3). There is a 4 × 4 identity matrix pre-multiplying the energy operator (including the potential energy term), conventionally not written for simplicity and clarity (i.e. treated like the number 1). Here ψ is a four-component spinor field, which is conventionally split into two two-component spinors in the form:[c]
The 2-spinor ψ+ corresponds to a particle with 4-momentum (E, p) and charge q and two spin states (σ = ±1/2, as before). The other 2-spinor ψ− corresponds to a similar particle with the same mass and spin states, but negative 4-momentum −(E, p) and negative charge −q, that is, negative energy states, time-reversed momentum, and negated charge. This was the first interpretation and prediction of a particle and corresponding antiparticle. See Dirac spinor and bispinor for further description of these spinors. In the non-relativistic limit the Dirac equation reduces to the Pauli equation (see Dirac equation for how). When applied to a one-electron atom or ion, setting A = 0 and ϕ to the appropriate electrostatic potential, additional relativistic terms include the spin–orbit interaction, electron gyromagnetic ratio, and Darwin term. In ordinary QM these terms have to be put in by hand and treated using perturbation theory. The positive energies do account accurately for the fine structure.
Within RQM, for massless particles the Dirac equation reduces to:
the first of which is the Weyl equation, a considerable simplification applicable for massless neutrinos.[28] This time there is a 2 × 2 identity matrix pre-multiplying the energy operator conventionally not written. In RQM it is useful to take this as the zeroth Pauli matrix σ0 which couples to the energy operator (time derivative), just as the other three matrices couple to the momentum operator (spatial derivatives).
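The Pauli and gamma matrices were introduced here, in theoretical physics, rather than pure mathematics itself. They have applications to quaternions and to the SO(2) and SO(3) Lie groups, because they satisfy the important commutator [ , ] and anticommutator [ , ]+ relations respectively:
[σa, σb] = 2i εabc σc, [σa, σb]+ = 2δab σ0
where εabc is the three-dimensional Levi-Civita symbol. The gamma matrices are connected to the components of the flat spacetime Minkowski metric ηαβ through the anticommutation relation:
[γα, γβ]+ = 2ηαβ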
(This can be extended to curved spacetime by introducing vierbeins, but is not the subject of special relativity).
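The algebraic relations for the Pauli and gamma matrices can be confirmed numerically. The sketch below assumes the standard relations [σa, σb] = 2i εabc σc, [σa, σb]+ = 2δab σ0 and [γα, γβ]+ = 2ηαβ with metric signature (+, −, −, −), and builds the gamma matrices as γ0 = β, γ = βα in the Dirac representation; the choice of representation and the helper names are assumptions of this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def commutator(a, b):
    return a @ b - b @ a


def anticommutator(a, b):
    return a @ b + b @ a


def levi_civita(a, b, c):
    """Three-dimensional Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


# Pauli matrices: [s_a, s_b] = 2i eps_abc s_c and {s_a, s_b} = 2 delta_ab I.
pauli_ok = all(
    np.allclose(
        commutator(sigma[a], sigma[b]),
        2j * sum(levi_civita(a, b, c) * sigma[c] for c in range(3)),
    )
    and np.allclose(anticommutator(sigma[a], sigma[b]), 2 * (a == b) * I2)
    for a in range(3)
    for b in range(3)
)

# Gamma matrices gamma^0 = beta, gamma^i = beta alpha_i (Dirac representation).
beta = np.block([[I2, Z2], [Z2, -I2]])
gamma = [beta] + [beta @ np.block([[Z2, s], [s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Clifford algebra: {gamma^mu, gamma^nu} = 2 eta^{mu nu} I.
gamma_ok = all(
    np.allclose(anticommutator(gamma[mu], gamma[nu]), 2 * eta[mu, nu] * np.eye(4))
    for mu in range(4)
    for nu in range(4)
)

print(pauli_ok, gamma_ok)
```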
In 1929, the Breit equation was found to describe two or more electromagnetically interacting massive spin 1/2 fermions to first-order relativistic corrections; one of the first attempts to describe such a relativistic quantum many-particle system. This is, however, still only an approximation, and the Hamiltonian includes numerous long and complicated sums.
where p is the momentum operator, S the spin operator for a particle of spin s, E is the total energy of the particle, and m0 its rest mass. Helicity indicates the orientations of the spin and translational momentum vectors.[29] Helicity is frame-dependent because of the 3-momentum in the definition, and is quantized due to spin quantization, which has discrete positive values for parallel alignment, and negative values for antiparallel alignment.
An automatic occurrence in the Dirac equation (and the Weyl equation) is the projection of the spin 1/2 operator on the 3-momentum (times c), σ · cp, which is proportional to the helicity (for the spin 1/2 case).
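A brief numerical sketch of the spin 1/2 case follows. It assumes the common normalization in which the helicity is the spin projection along the direction of motion, S · p/|p| with S = ħσ/2, and works in units of ħ with an arbitrary illustrative momentum; the function name `sigma_dot` is hypothetical.

```python
import numpy as np

# Pauli vector as a stack of three 2x2 matrices.
sigma = np.array(
    [
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ],
    dtype=complex,
)


def sigma_dot(p):
    """Projection sigma . p of the Pauli vector on a 3-momentum p."""
    return np.tensordot(p, sigma, axes=1)


p = np.array([1.0, 2.0, -2.0])      # arbitrary momentum with |p| = 3
p_norm = np.linalg.norm(p)

sp = sigma_dot(p)
helicity = sp / (2.0 * p_norm)      # assumed normalization: S.p/|p| in units of hbar

print("(sigma.p)^2 = |p|^2 I:", np.allclose(sp @ sp, p_norm**2 * np.eye(2)))
print("eigenvalues of sigma.p:", np.linalg.eigvalsh(sp))
print("helicity eigenvalues:", np.linalg.eigvalsh(helicity))
```

The two helicity eigenvalues have opposite signs, corresponding to spin parallel and antiparallel to the momentum.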
For massless particles the helicity simplifies to:
The Dirac equation can only describe particles of spin 1/2. Beyond the Dirac equation, RWEs have been applied to free particles of various spins. In 1936, Dirac extended his equation to all fermions; three years later Fierz and Pauli rederived the same equation.[30] The Bargmann–Wigner equations were found in 1948 using Lorentz group theory, applicable for all free particles with any spin.[31][32] Considering the factorization of the KG equation above, and more rigorously by Lorentz group theory, it becomes natural to introduce spin in the form of matrices.
where the expression on the right is the Hermitian conjugate. For a massive particle of spin s, there are 2s + 1 components for the particle, and another 2s + 1 for the corresponding antiparticle (there are 2s + 1 possible σ values in each case), altogether forming a 2(2s + 1)-component spinor field:
with the + subscript indicating the particle and − subscript for the antiparticle. However, for massless particles of spin s, there are only ever two-component spinor fields; one is for the particle in one helicity state corresponding to +s and the other for the antiparticle in the opposite helicity state corresponding to −s:
According to the relativistic energy-momentum relation, all massless particles travel at the speed of light, so particles traveling at the speed of light are also described by two-component spinors. Historically, Élie Cartan found the most general form of spinors in 1913, prior to the spinors revealed in the RWEs following the year 1927.
For equations describing higher-spin particles, the inclusion of interactions is nowhere near as simple as minimal coupling; they lead to incorrect predictions and self-inconsistencies.[33] For spin greater than ħ/2, the RWE is not fixed by the particle's mass, spin, and electric charge; the electromagnetic moments (electric dipole moments and magnetic dipole moments) allowed by the spin quantum number are arbitrary. (Theoretically, magnetic charge would contribute also). For example, the spin 1/2 case only allows a magnetic dipole, but for spin 1 particles magnetic quadrupoles and electric dipoles are also possible.[28] For more on this topic, see multipole expansion and (for example) Cédric Lorcé (2009).[34][35]
The Schrödinger/Pauli velocity operator can be defined for a massive particle using the classical definition p = mv, and substituting quantum operators in the usual way:[36]
v̂ = p̂/m
which has eigenvalues that take any value. In RQM, the Dirac theory, it is:
v̂ = (i/ħ)[Ĥ, r̂] = cα
whose components have eigenvalues ±c.
The Hamiltonian operators in the Schrödinger picture are one approach to forming the differential equations for ψ. An equivalent alternative is to determine a Lagrangian (really meaning Lagrangian density), then generate the differential equation by the field-theoretic Euler–Lagrange equation:
For some RWEs, a Lagrangian can be found by inspection. For example, the Dirac Lagrangian is:[37]
and the Klein–Gordon Lagrangian is:
This is not possible for all RWEs, which is one reason the Lorentz group theoretic approach is important and appealing: fundamental invariance and symmetries in space and time can be used to derive RWEs using appropriate group representations. The Lagrangian approach with field interpretation of ψ is the subject of QFT rather than RQM: Feynman's path integral formulation uses invariant Lagrangians rather than Hamiltonian operators, since the latter can become extremely complicated, see (for example) Weinberg (1995).[38]
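For the Klein–Gordon case, the step from a Lagrangian density to the wave equation can be written out explicitly. The following worked derivation assumes a complex scalar field, metric signature (+, −, −, −), and a particular overall normalization of the Lagrangian density; other normalizations differ by a constant factor and give the same field equation.

```latex
% Assumed Lagrangian density for a complex scalar field \psi:
\mathcal{L} = \partial_\mu\psi^{*}\,\partial^{\mu}\psi
            - \left(\frac{mc}{\hbar}\right)^{2}\psi^{*}\psi

% Field-theoretic Euler–Lagrange equation, varying with respect to \psi^{*}:
\partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^{*})}\right)
 - \frac{\partial\mathcal{L}}{\partial\psi^{*}} = 0
\quad\Rightarrow\quad
\partial_\mu\partial^{\mu}\psi + \left(\frac{mc}{\hbar}\right)^{2}\psi = 0

% Written out in time and space derivatives:
\frac{1}{c^{2}}\frac{\partial^{2}\psi}{\partial t^{2}} - \nabla^{2}\psi
 + \left(\frac{mc}{\hbar}\right)^{2}\psi = 0

% Multiplying by -\hbar^{2}c^{2} recovers the operator form of the KG equation:
\left(i\hbar\,\frac{\partial}{\partial t}\right)^{2}\psi
 = c^{2}\left(-i\hbar\nabla\right)^{2}\psi + (mc^{2})^{2}\psi
```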
In non-relativistic QM, the angular momentum operator is formed from the classical pseudovector definition L = r × p. In RQM, the position and momentum operators are inserted directly where they appear in the orbital relativistic angular momentum tensor defined from the four-dimensional position and momentum of the particle, equivalently a bivector in the exterior algebra formalism:[39][d]
which are six components altogether: three are the non-relativistic 3-orbital angular momenta; M12 = L3, M23 = L1, M31 = L2, and the other three M01, M02, M03 are boosts of the centre of mass of the rotating object. An additional relativistic-quantum term has to be added for particles with spin. For a particle of rest mass m, the total angular momentum tensor is:
In 1926, the Thomas precession was discovered: relativistic corrections to the spin of elementary particles with application in the spin–orbit interaction of atoms and rotation of macroscopic objects.[42][43] In 1939 Wigner derived the Thomas precession.
so the non-relativistic spin interaction Hamiltonian becomes:[44]
where the first term is already the non-relativistic magnetic moment interaction, and the second term the relativistic correction of order (v/c)², but this disagrees with experimental atomic spectra by a factor of 1⁄2. It was pointed out by L. Thomas that there is a second relativistic effect: An electric field component perpendicular to the electron velocity causes an additional acceleration of the electron perpendicular to its instantaneous velocity, so the electron moves in a curved path. The electron moves in a rotating frame of reference, and this additional precession of the electron is called the Thomas precession. It can be shown[45] that the net result of this effect is that the spin–orbit interaction is reduced by half, as if the magnetic field experienced by the electron has only one-half the value, and the relativistic correction in the Hamiltonian is:
In the case of RQM, the factor of 1⁄2 is predicted by the Dirac equation.[44]
The events which led to and established RQM, and the continuation beyond into quantum electrodynamics (QED), are summarized below [see, for example, R. Resnick and R. Eisberg (1985),[46] and P.W. Atkins (1974)[47]]. More than half a century of experimental and theoretical research, from the 1890s through to the 1950s, in the then new and mysterious quantum theory revealed that a number of phenomena cannot be explained by QM alone. SR, found at the turn of the 20th century, was found to be a necessary component, leading to unification: RQM. Theoretical predictions and experiments mainly focused on the newly found atomic physics, nuclear physics, and particle physics, by considering spectroscopy, diffraction and scattering of particles, and the electrons and nuclei within atoms and molecules. Numerous results are attributed to the effects of spin.
Relativistic description of particles in quantum phenomena
In 1935, Einstein, Rosen and Podolsky published a paper[50] concerning quantum entanglement of particles, questioning quantum nonlocality and the apparent violation of causality upheld in SR: particles can appear to interact instantaneously at arbitrary distances. This was a misconception since information is not and cannot be transferred in the entangled states; rather the information transmission is in the process of measurement by two observers (one observer has to send a signal to the other, which cannot exceed c). QM does not violate SR.[51][52] In 1959, Bohm and Aharonov published a paper[53] on the Aharonov–Bohm effect, questioning the status of electromagnetic potentials in QM. The EM field tensor and EM 4-potential formulations are both applicable in SR, but in QM the potentials enter the Hamiltonian (see above) and influence the motion of charged particles even in regions where the fields are zero. In 1964, Bell's theorem was published in a paper on the EPR paradox,[54] showing that the predictions of QM cannot be reproduced by local hidden-variable theories.
In 1947, the Lamb shift was discovered: a small difference in the 2S1⁄2 and 2P1⁄2 levels of hydrogen, due to the interaction between the electron and vacuum. Lamb and Retherford experimentally measured stimulated radio-frequency transitions between the 2S1⁄2 and 2P1⁄2 hydrogen levels by microwave radiation.[55] An explanation of the Lamb shift was presented by Bethe. Papers on the effect were published in the early 1950s.[56]
^Other common notations include ms and sz etc., but this would clutter expressions with unnecessary subscripts. The subscripts σ labeling spin values are not to be confused for tensor indices nor the Pauli matrices.
^This spinor notation is not necessarily standard; the literature usually uses other symbols, but in the context of spin 1/2, this informal identification is commonly made.
^Again this notation is not necessarily standard; the more advanced literature usually uses other symbols, but here we show informally the correspondence of energy, helicity, and spin states.
^Some authors, including Penrose, use Latin letters in this definition, even though it is conventional to use Greek indices for vectors and tensors in spacetime.
^Schweber, Silvan S. (1994). QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga. Princeton, N.J: Princeton University Press. p. 573. ISBN 978-0-691-21328-6.
^Masakatsu, K. (2012). "Superradiance Problem of Bosons and Fermions for Rotating Black Holes in Bargmann–Wigner Formulation". arXiv:1208.0644 [gr-qc].
^Lorcé, Cédric (2009). "Electromagnetic Properties for Arbitrary Spin Particles: Part 1 − Electromagnetic Current and Multipole Decomposition". arXiv:0901.4199 [hep-ph].
^Corben, H.C. (1993). "Factors of 2 in magnetic moments, spin–orbit coupling, and Thomas precession". Am. J. Phys. 61 (6): 551. Bibcode:1993AmJPh..61..551C. doi:10.1119/1.17207.