Chapter 6: The Electromagnetic Field and Gauge Redundancy
In the previous chapters we met the electromagnetic field as one of the free quantum fields of nature. We now return to it with a sharper question:
What is the correct field variable for the photon?
For the Dirac field, the answer was relatively direct. The field \(\psi(x)\) transforms as a spinor, obeys the Dirac equation, and its quantized excitations are electrons and positrons. For electromagnetism the situation is subtler. The familiar electric and magnetic fields,
\[ \mathbf{E}(x), \qquad \mathbf{B}(x), \]
are physically measurable, but the most useful relativistic field variable is not \(\mathbf{E}\) or \(\mathbf{B}\) directly. It is the four-potential
\[ A_\mu(x). \]
The price of using \(A_\mu\) is that it contains more mathematical information than the physical electromagnetic field. Different potentials can describe the same \(\mathbf{E}\) and \(\mathbf{B}\). This surplus description is called gauge redundancy.
A redundancy is not an ordinary physical symmetry. A physical symmetry maps one possible physical state to another possible physical state. A gauge redundancy maps one mathematical description of a state to another mathematical description of the same state. This distinction is central to QED.
The purpose of this chapter is to understand why the photon is naturally described with a redundant field, how gauge fixing removes the redundancy in calculations, and why physical observables must be gauge invariant.
Throughout we use natural units,
\[ \hbar=c=1, \]
and the mostly-minus metric convention,
\[ \eta_{\mu\nu}=\mathrm{diag}(1,-1,-1,-1). \]
Repeated Lorentz indices are summed.
6.1 From electric and magnetic fields to the four-potential
Classical electromagnetism can be written in terms of the electric and magnetic fields \(\mathbf{E}\) and \(\mathbf{B}\). In vacuum, Maxwell’s equations are
\[ \nabla\cdot\mathbf{E}=0, \qquad \nabla\times\mathbf{B}-\frac{\partial \mathbf{E}}{\partial t}=0, \]
\[ \nabla\cdot\mathbf{B}=0, \qquad \nabla\times\mathbf{E}+\frac{\partial \mathbf{B}}{\partial t}=0. \]
With a charge density \(\rho\) and current density \(\mathbf{j}\), the first two equations become
\[ \nabla\cdot\mathbf{E}=\rho, \qquad \nabla\times\mathbf{B}-\frac{\partial \mathbf{E}}{\partial t}=\mathbf{j}. \]
These equations are the experimental and historical foundation of classical electrodynamics. Their covariant formulation in terms of field strengths and four-potentials is standard in modern treatments (Jackson 1999).
The two homogeneous equations,
\[ \nabla\cdot\mathbf{B}=0, \qquad \nabla\times\mathbf{E}+\frac{\partial \mathbf{B}}{\partial t}=0, \]
are automatically solved if we introduce a scalar potential \(\Phi\) and a vector potential \(\mathbf{A}\) such that
\[ \mathbf{B}=\nabla\times\mathbf{A}, \]
\[ \mathbf{E}=-\nabla \Phi-\frac{\partial \mathbf{A}}{\partial t}. \]
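The claim that these potentials solve the homogeneous equations automatically can be checked symbolically. The following is a minimal sketch, assuming SymPy is available; the helper names `grad`, `div`, and `curl` and the component labels are our own illustrative choices.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)
coords = (x, y, z)

# Arbitrary smooth potentials (illustrative names, not from the text).
Phi = sp.Function("Phi")(t, x, y, z)
A = [sp.Function(f"A{i}")(t, x, y, z) for i in "xyz"]

def grad(f):
    return [sp.diff(f, c) for c in coords]

def div(v):
    return sum(sp.diff(v[i], coords[i]) for i in range(3))

def curl(v):
    return [
        sp.diff(v[2], y) - sp.diff(v[1], z),
        sp.diff(v[0], z) - sp.diff(v[2], x),
        sp.diff(v[1], x) - sp.diff(v[0], y),
    ]

# B = curl A,  E = -grad Phi - dA/dt
B = curl(A)
E = [-g - sp.diff(a, t) for g, a in zip(grad(Phi), A)]

# Left-hand sides of the two homogeneous Maxwell equations.
div_B = sp.simplify(div(B))
faraday = [sp.simplify(c + sp.diff(b, t)) for c, b in zip(curl(E), B)]
```

Both `div_B` and every entry of `faraday` reduce to zero for arbitrary \(\Phi\) and \(\mathbf{A}\), because the only input is that mixed partial derivatives commute.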
In relativistic notation, \(\Phi\) and \(\mathbf{A}\) are assembled into a four-vector potential,
\[ A^\mu=(\Phi,\mathbf{A}). \]
Equivalently, one defines the electromagnetic field-strength tensor
\[ F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu. \]
This antisymmetric tensor contains both \(\mathbf{E}\) and \(\mathbf{B}\). With our metric signature and \(A^\mu=(\Phi,\mathbf{A})\), the components are
\[ F^{0i}=-E^i, \qquad F^{ij}=-\epsilon^{ijk}B^k. \]
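Sign conventions for \(F^{\mu\nu}\) are easy to get wrong, so it can help to verify them mechanically. The sketch below assumes SymPy and uses our own variable names. It builds \(F_{\mu\nu}\) from \(A^\mu=(\Phi,\mathbf{A})\), raises both indices with \(\eta=\mathrm{diag}(1,-1,-1,-1)\), and compares with \(\mathbf{E}\) and \(\mathbf{B}\).

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)
X = (t, x, y, z)
eta = sp.diag(1, -1, -1, -1)

Phi = sp.Function("Phi")(*X)
Avec = [sp.Function(n)(*X) for n in ("A1", "A2", "A3")]

# A^mu = (Phi, A)  =>  A_mu = (Phi, -A) with the mostly-minus metric.
A_dn = [Phi] + [-a for a in Avec]

# F_{mu nu} = d_mu A_nu - d_nu A_mu
F_dn = sp.Matrix([[sp.diff(A_dn[n], X[m]) - sp.diff(A_dn[m], X[n])
                   for n in range(4)] for m in range(4)])
F_up = eta * F_dn * eta   # raise both indices; eta is its own inverse

E = [-sp.diff(Phi, X[i + 1]) - sp.diff(Avec[i], t) for i in range(3)]
B = [
    sp.diff(Avec[2], y) - sp.diff(Avec[1], z),
    sp.diff(Avec[0], z) - sp.diff(Avec[2], x),
    sp.diff(Avec[1], x) - sp.diff(Avec[0], y),
]

# F^{0i} + E^i and F^{ij} + eps^{ijk} B^k should all vanish.
checks_E = [sp.simplify(F_up[0, i + 1] + E[i]) for i in range(3)]
checks_B = [
    sp.simplify(F_up[i + 1, j + 1]
                + sum(sp.LeviCivita(i, j, k) * B[k] for k in range(3)))
    for i in range(3) for j in range(3)
]
```

All entries of `checks_E` and `checks_B` are zero, which confirms the signs quoted above for this particular convention. Other textbooks may differ by an overall sign in \(F_{\mu\nu}\).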
The two inhomogeneous Maxwell equations combine into
\[ \partial_\mu F^{\mu\nu}=j^\nu, \]
where
\[ j^\mu=(\rho,\mathbf{j}) \]
is the electromagnetic four-current. The two homogeneous Maxwell equations become the compact identity
\[ \partial_\lambda F_{\mu\nu} +\partial_\mu F_{\nu\lambda} +\partial_\nu F_{\lambda\mu}=0. \]
This identity is often called the Bianchi identity. It follows automatically from writing \(F_{\mu\nu}\) as derivatives of \(A_\mu\), because partial derivatives commute.
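The identity can be confirmed for every index combination at once. This is a small sketch, assuming SymPy; the component functions `A0` to `A3` stand for an arbitrary \(A_\mu\) and are illustrative only.

```python
import sympy as sp
from itertools import product

t, x, y, z = sp.symbols("t x y z", real=True)
X = (t, x, y, z)
A = [sp.Function(f"A{m}")(*X) for m in range(4)]   # lower-index components A_mu

def F(m, n):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu."""
    return sp.diff(A[n], X[m]) - sp.diff(A[m], X[n])

# Cyclic sum d_l F_{mn} + d_m F_{nl} + d_n F_{lm} for all 64 index triples.
bianchi = {
    (l, m, n): sp.simplify(
        sp.diff(F(m, n), X[l]) + sp.diff(F(n, l), X[m]) + sp.diff(F(l, m), X[n])
    )
    for l, m, n in product(range(4), repeat=3)
}
```

Every value in `bianchi` is zero without using any equation of motion, which is the sense in which the homogeneous Maxwell equations become an identity once \(A_\mu\) is introduced.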
The classical electromagnetic action coupled to an external conserved current is
\[ S[A]=\int d^4x\, \left( -\frac14 F_{\mu\nu}F^{\mu\nu} -j^\mu A_\mu \right). \]
Varying \(A_\mu\) gives
\[ \partial_\mu F^{\mu\nu}=j^\nu. \]
This is Maxwell’s equation in covariant form.
6.2 Gauge transformations: same fields, different potentials
The four-potential \(A_\mu\) is not unique. If \(\alpha(x)\) is any sufficiently smooth scalar function, then
\[ A_\mu(x)\longrightarrow A'_\mu(x)=A_\mu(x)+\partial_\mu \alpha(x) \]
leaves the field strength unchanged:
\[ F'_{\mu\nu} = \partial_\mu A'_\nu-\partial_\nu A'_\mu = \partial_\mu(A_\nu+\partial_\nu\alpha) -\partial_\nu(A_\mu+\partial_\mu\alpha). \]
Expanding,
\[ F'_{\mu\nu} = \partial_\mu A_\nu-\partial_\nu A_\mu +\partial_\mu\partial_\nu\alpha -\partial_\nu\partial_\mu\alpha. \]
Since ordinary partial derivatives commute,
\[ \partial_\mu\partial_\nu\alpha = \partial_\nu\partial_\mu\alpha, \]
we obtain
\[ F'_{\mu\nu}=F_{\mu\nu}. \]
Thus the transformation
\[ A_\mu\to A_\mu+\partial_\mu\alpha \]
does not change \(\mathbf{E}\), \(\mathbf{B}\), or any local quantity constructed only from \(F_{\mu\nu}\).
This is a gauge transformation.
Example: a pure gauge potential
Start from
\[ A_\mu=0. \]
Choose
\[ \alpha(x)=\lambda t, \]
where \(\lambda\) is a constant. Then
\[ A'_0=\partial_0\alpha=\lambda, \qquad A'_i=\partial_i\alpha=0. \]
So \(A'_\mu=(\lambda,0,0,0)\). But
\[ F'_{\mu\nu}=0. \]
The potential is nonzero, but the electric and magnetic fields vanish. This potential describes no electromagnetic field. It is a pure gauge configuration.
The phrase “pure gauge” means that the potential is entirely of the form
\[ A_\mu=\partial_\mu\alpha \]
for some function \(\alpha\), so its field strength is zero.
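The invariance of \(F_{\mu\nu}\) and the pure gauge example above can both be checked symbolically. The sketch below assumes SymPy; `field_strength` is a helper name of our own, and the potential and gauge function are left completely generic.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)
X = (t, x, y, z)
A = [sp.Function(f"A{m}")(*X) for m in range(4)]   # generic A_mu
alpha = sp.Function("alpha")(*X)                   # generic gauge function

def field_strength(pot):
    """Matrix of F_{mu nu} = d_mu A_nu - d_nu A_mu for lower-index components."""
    return sp.Matrix([[sp.diff(pot[n], X[m]) - sp.diff(pot[m], X[n])
                       for n in range(4)] for m in range(4)])

# Gauge-transformed potential A'_mu = A_mu + d_mu alpha.
A_prime = [A[m] + sp.diff(alpha, X[m]) for m in range(4)]
delta_F = (field_strength(A_prime) - field_strength(A)).applyfunc(sp.simplify)

# Pure gauge example: start from A_mu = 0 and take alpha = lambda * t.
lam = sp.symbols("lambda", real=True)
pure_gauge = [sp.diff(lam * t, X[m]) for m in range(4)]
F_pure = field_strength(pure_gauge)
```

Here `delta_F` is the zero matrix for any \(\alpha\), `pure_gauge` reproduces \(A'_\mu=(\lambda,0,0,0)\), and `F_pure` vanishes identically.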
6.3 Gauge redundancy is not an ordinary physical symmetry
It is tempting to say that gauge transformations are symmetries. This is true in one sense but misleading in another.
A transformation is a symmetry of the action if it leaves the action unchanged, possibly up to a boundary term. Gauge transformations do that. The electromagnetic kinetic term is invariant because \(F_{\mu\nu}\) is invariant:
\[ -\frac14 F_{\mu\nu}F^{\mu\nu} \longrightarrow -\frac14 F_{\mu\nu}F^{\mu\nu}. \]
The coupling to a current changes by
\[ -j^\mu A_\mu \longrightarrow -j^\mu(A_\mu+\partial_\mu\alpha) = -j^\mu A_\mu -j^\mu\partial_\mu\alpha. \]
The change in the action is
\[ \Delta S = -\int d^4x\, j^\mu \partial_\mu\alpha. \]
Integrating by parts,
\[ \Delta S = \int d^4x\, \alpha\,\partial_\mu j^\mu \]
up to a boundary term. Therefore the action is gauge invariant if
\[ \partial_\mu j^\mu=0. \]
This is the local conservation of electric charge.
So gauge invariance of the coupling to \(A_\mu\) requires current conservation. Conversely, in QED the local gauge structure will enforce the conserved electromagnetic current associated with charged matter.
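The role of current conservation can be illustrated numerically. The following sketch is a toy model of our own, not part of the formal argument: a \(1+1\)-dimensional periodic lattice with unit spacing and central differences, so that integration by parts becomes an exact summation by parts with no boundary term. It assumes NumPy, and all names are illustrative.

```python
import numpy as np

N = 32                                # lattice points per direction (assumed)
rng = np.random.default_rng(0)

def d(f, axis):
    """Periodic central difference with unit lattice spacing."""
    return 0.5 * (np.roll(f, -1, axis) - np.roll(f, 1, axis))

alpha = rng.normal(size=(N, N))       # arbitrary gauge function alpha(t, x)
phi = rng.normal(size=(N, N))         # auxiliary field used to build a conserved current

# Conserved current: j^0 = d_x phi, j^1 = -d_t phi, so d_t j^0 + d_x j^1 = 0.
j_cons = (d(phi, 1), -d(phi, 0))
# A current that is not conserved in general: j^mu = (d_t alpha, d_x alpha).
j_gen = (d(alpha, 0), d(alpha, 1))

def delta_S(j):
    """Lattice version of Delta S = -sum j^mu d_mu alpha."""
    return -np.sum(j[0] * d(alpha, 0) + j[1] * d(alpha, 1))

def alpha_div_j(j):
    """Lattice version of sum alpha d_mu j^mu, the form after summing by parts."""
    return np.sum(alpha * (d(j[0], 0) + d(j[1], 1)))
```

For `j_cons` the change `delta_S` vanishes up to rounding error, while for `j_gen` it equals `alpha_div_j(j_gen)` and is strictly negative. The two expressions for \(\Delta S\) agree in both cases, and only the conserved current leaves the action unchanged.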
But the deeper point remains: two potentials related by
\[ A_\mu\to A_\mu+\partial_\mu\alpha \]
do not represent two physically distinct electromagnetic states. They represent the same state. Gauge redundancy is therefore a redundancy in our coordinates on the space of physical configurations.
An analogy is useful. Suppose a point in the plane is described by polar coordinates \((r,\theta)\). The pairs
\[ (r,\theta) \quad\text{and}\quad (r,\theta+2\pi) \]
describe the same point. The change \(\theta\to\theta+2\pi\) is not a new physical point; it is a different coordinate description. Gauge transformations are a field-theoretic version of this idea, but with infinitely many possible functions \(\alpha(x)\).
There is one important caveat. Gauge transformations that do not vanish suitably at boundaries can sometimes act nontrivially on physical states, especially in discussions of total charge, infrared physics, and asymptotic symmetries. In this chapter we focus on the standard local redundancy in ordinary perturbative QED, where gauge functions are taken to have appropriate boundary behavior.
6.4 Why the photon needs a redundant description
A classical four-vector field \(A_\mu\) has four components:
\[ A_0,A_1,A_2,A_3. \]
But a physical photon has only two polarization states. For a photon moving in the \(z\)-direction, these can be chosen as two transverse polarizations, for example along the \(x\)- and \(y\)-directions.
This mismatch is the central reason gauge redundancy appears.
A massive spin-1 particle has three physical polarization states. A massless spin-1 particle has only two helicity states, usually labeled
\[ h=+1,\qquad h=-1. \]
This statement follows from the representation theory of the Poincaré group: massless particles are classified by helicity rather than by the three spin projections available in the massive rest frame, because a massless particle has no rest frame (Weinberg 1995).
The photon is massless and spin \(1\). Thus it should have two physical polarizations, not four.
If we insist on using a Lorentz four-vector \(A_\mu\), we have introduced extra components. Gauge redundancy is the mechanism that removes the unphysical ones while preserving locality and Lorentz covariance. This is one of the structural reasons that covariant descriptions of massless spin-1 particles naturally involve gauge invariance (Weinberg 1995; Peskin and Schroeder 1995).
Massive comparison: Proca field
It is useful to compare the photon with a massive vector field. A massive vector field \(V_\mu\) can be described by the Proca Lagrangian,
\[ \mathcal{L}_{\text{Proca}} = -\frac14 V_{\mu\nu}V^{\mu\nu} +\frac12 m^2 V_\mu V^\mu, \]
where
\[ V_{\mu\nu}=\partial_\mu V_\nu-\partial_\nu V_\mu. \]
The mass term
\[ \frac12 m^2 V_\mu V^\mu \]
is not invariant under
\[ V_\mu\to V_\mu+\partial_\mu\alpha. \]
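So the Proca field has no gauge redundancy. Taking the divergence of its equation of motion, \(\partial_\mu V^{\mu\nu}+m^2V^\nu=0\), gives \(m^2\partial_\nu V^\nu=0\), because \(V^{\mu\nu}\) is antisymmetric. For \(m\neq 0\) this is the constraint \(\partial_\nu V^\nu=0\), which removes one component and leaves three physical polarizations, as appropriate for a massive spin-1 particle.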
For the photon, a mass term
\[ \frac12 m^2 A_\mu A^\mu \]
would break gauge invariance and introduce an additional physical polarization. Ordinary QED instead uses the massless gauge-invariant Lagrangian
\[ \mathcal{L}_{\text{EM}} = -\frac14 F_{\mu\nu}F^{\mu\nu}. \]
6.5 Constraints: not every component is dynamical
Gauge redundancy is closely related to constraints.
A dynamical variable is one whose time evolution is determined by an equation containing its time derivative in the usual way. A constraint is an equation that restricts the allowed fields at a given time but does not itself determine a new independent time evolution.
In electromagnetism, Gauss’s law,
\[ \nabla\cdot\mathbf{E}=\rho, \]
is a constraint. It relates the electric field at a time to the charge density at that same time. It does not contain a second time derivative of the potential.
This is visible in the canonical formulation. The momentum conjugate to \(A_0\) vanishes:
\[ \pi^0=0. \]
That is, the Lagrangian contains no independent \(\dot A_0\) term. Therefore \(A_0\) is not a propagating degree of freedom. It enforces Gauss’s law.
Dirac developed the general Hamiltonian theory of constrained systems, including the distinction between constraints that generate gauge transformations and constraints that remove physical degrees of freedom (Dirac 1964). In modern language, electromagnetism has first-class constraints, meaning constraints associated with gauge redundancy; a systematic treatment is given in standard accounts of constrained Hamiltonian systems (Henneaux and Teitelboim 1992).
The physical content can be summarized as follows:
- \(A_\mu\) has four components.
- The equation for \(A_0\) is a constraint rather than a dynamical equation.
- Gauge transformations identify potentials that describe the same physical field.
- Only two independent radiative degrees of freedom remain.
These two degrees of freedom are the two photon polarizations.
6.6 Gauge fixing: choosing one representative
Because gauge-related potentials represent the same physics, calculations with \(A_\mu\) contain redundancy. To calculate efficiently, we often impose an additional condition on \(A_\mu\). This is called gauge fixing.
A gauge condition chooses one representative, or at least a smaller set of representatives, from each class of gauge-equivalent potentials.
A gauge orbit is the set of all potentials related by gauge transformations:
\[ [A_\mu] = \{A_\mu+\partial_\mu\alpha\;|\;\alpha \text{ allowed}\}. \]
Gauge fixing is like choosing a coordinate slice through these orbits.
Lorenz gauge
The Lorenz gauge condition is
\[ \partial_\mu A^\mu=0. \]
It is often mistakenly called “Lorentz gauge,” but it is named after Ludvig Lorenz. The condition is Lorentz covariant, meaning that its form is preserved under Lorentz transformations.
Can every potential be transformed into Lorenz gauge? Starting from \(A_\mu\), perform
\[ A_\mu\to A'_\mu=A_\mu+\partial_\mu\alpha. \]
Then
\[ \partial_\mu A'^\mu = \partial_\mu A^\mu+\Box\alpha, \]
where
\[ \Box=\partial_\mu\partial^\mu = \frac{\partial^2}{\partial t^2}-\nabla^2 \]
is the d’Alembertian. To impose Lorenz gauge, choose \(\alpha\) satisfying
\[ \Box\alpha=-\partial_\mu A^\mu. \]
Thus, at least locally and with suitable boundary conditions, one can reach Lorenz gauge.
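For a single Fourier mode the equation for \(\alpha\) can be solved algebraically. The sketch below is illustrative and rests on assumptions of our own: a potential \(A_\mu=\epsilon_\mu e^{-ik\cdot x}\) with \(k^2\neq 0\), which makes \(\Box\) invertible on this mode, and a gauge function \(\alpha=a\,e^{-ik\cdot x}\). It assumes NumPy, and the numerical values are arbitrary.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski product of two lower-index vectors (no complex conjugation)."""
    return a @ eta @ b

k = np.array([2.0, 0.3, -0.5, 1.0])            # k_mu with k^2 != 0 (assumed)
eps = np.array([1.0, 0.2 + 0.1j, 0.0, -0.4])   # generic eps_mu with k.eps != 0

# On this mode: d_mu A^mu = -i (k.eps) e^{-ik.x}  and  Box alpha = -k^2 a e^{-ik.x}.
# Solving Box alpha = -d_mu A^mu gives a = -i (k.eps) / k^2.
k2 = dot(k, k)
a = -1j * dot(k, eps) / k2

# A'_mu = A_mu + d_mu alpha, with d_mu -> -i k_mu acting on the mode.
eps_new = eps - 1j * a * k
```

The original amplitude has `dot(k, eps)` different from zero, while `dot(k, eps_new)` vanishes to rounding error, so the transformed mode satisfies the Lorenz condition. For \(k^2=0\) this construction fails, which is one way to see that lightlike modes require separate treatment.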
However, Lorenz gauge does not completely remove the redundancy. If \(A_\mu\) already satisfies
\[ \partial_\mu A^\mu=0, \]
then the transformed potential also satisfies it if
\[ \Box\alpha=0. \]
This leftover freedom is called residual gauge freedom.
Example: residual freedom in Lorenz gauge
Suppose \(A_\mu\) is in Lorenz gauge. Let
\[ \alpha(x)=\alpha_0 e^{-ik\cdot x} \]
with
\[ k^2=0. \]
Then
\[ \Box\alpha=0. \]
Therefore
\[ A_\mu\to A_\mu+\partial_\mu\alpha \]
preserves Lorenz gauge. Even after imposing \(\partial_\mu A^\mu=0\), there is still gauge redundancy associated with wave-like gauge functions.
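The condition \(\Box\alpha=0\) for a lightlike wave vector can be verified directly. This is a short sketch, assuming SymPy and the specific choice \(k^\mu=(\omega,0,0,\omega)\), which is our own illustration; an off-shell wave vector is included for contrast.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)
w = sp.symbols("omega", positive=True)
a0 = sp.symbols("alpha_0")

def box(f):
    """d'Alembertian with signature (+,-,-,-)."""
    return sp.diff(f, t, 2) - sp.diff(f, x, 2) - sp.diff(f, y, 2) - sp.diff(f, z, 2)

# Lightlike k^mu = (omega, 0, 0, omega), so k.x = omega * (t - z).
alpha = a0 * sp.exp(-sp.I * w * (t - z))
box_alpha = sp.simplify(box(alpha))

# Contrast: k^mu = (omega, 0, 0, 0) has k^2 = omega^2 != 0.
alpha_off = a0 * sp.exp(-sp.I * w * t)
box_alpha_off = sp.simplify(box(alpha_off))
```

Here `box_alpha` is zero, so this \(\alpha\) is an allowed residual transformation, whereas `box_alpha_off` equals \(-\omega^2\alpha\) and would take the potential out of Lorenz gauge.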
Coulomb gauge
The Coulomb gauge condition is
\[ \nabla\cdot\mathbf{A}=0. \]
This condition is not manifestly Lorentz covariant, because it separates space and time. But it has a clear physical advantage: it makes the transverse photon degrees of freedom explicit.
Any vector field \(\mathbf{A}\) can be decomposed into transverse and longitudinal parts:
\[ \mathbf{A}=\mathbf{A}_T+\mathbf{A}_L, \]
where
\[ \nabla\cdot\mathbf{A}_T=0, \qquad \nabla\times\mathbf{A}_L=0. \]
The longitudinal part can be written as a gradient,
\[ \mathbf{A}_L=\nabla\chi. \]
A gauge transformation changes
\[ \mathbf{A}\to \mathbf{A}+\nabla\alpha. \]
Thus the choice \(\alpha=-\chi\) removes the longitudinal part of \(\mathbf{A}\), leaving only \(\mathbf{A}_T\).
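In Fourier space the decomposition is a simple projection along the wave vector. The sketch below treats one spatial mode \(\mathbf{A}(\mathbf{x})=\tilde{\mathbf{A}}\,e^{i\mathbf{k}\cdot\mathbf{x}}\), so that \(\nabla\to i\mathbf{k}\). It assumes NumPy, and the numerical amplitudes are arbitrary illustrative values.

```python
import numpy as np

kvec = np.array([0.5, -1.0, 2.0])              # spatial wave vector (illustrative)
A_mode = np.array([1.0 + 0.5j, -0.3, 0.7j])    # Fourier amplitude of A (illustrative)

k_hat = kvec / np.linalg.norm(kvec)
A_L = k_hat * (k_hat @ A_mode)     # longitudinal part, parallel to k
A_T = A_mode - A_L                 # transverse part, orthogonal to k

# Gauge function alpha = alpha_mode * exp(i k.x), so grad alpha -> i k alpha_mode.
# Requiring i k alpha_mode = -A_L gives alpha_mode = i (k.A) / |k|^2.
alpha_mode = 1j * (kvec @ A_mode) / (kvec @ kvec)
A_gauged = A_mode + 1j * kvec * alpha_mode
```

The transverse part satisfies \(\mathbf{k}\cdot\mathbf{A}_T=0\), the longitudinal part satisfies \(\mathbf{k}\times\mathbf{A}_L=0\), and `A_gauged` coincides with `A_T`: the gauge transformation has removed exactly the longitudinal component.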
In Coulomb gauge, \(A_0\) is not a propagating photon field. Instead, it is determined by Gauss’s law. For a static charge distribution,
\[ -\nabla^2 A_0=\rho, \]
with solution
\[ A_0(\mathbf{x}) = \int d^3y\,\frac{\rho(\mathbf{y})}{4\pi|\mathbf{x}-\mathbf{y}|}. \]
This is the Coulomb potential. The propagating electromagnetic waves are contained in the transverse vector potential \(\mathbf{A}_T\).
Coulomb gauge is physically transparent but less convenient for manifestly relativistic perturbation theory. Covariant gauges are usually preferred for Feynman-diagram calculations.
6.7 Covariant gauges and the gauge-fixing parameter
In covariant quantization, one often adds a gauge-fixing term to the electromagnetic Lagrangian:
\[ \mathcal{L}_{\text{gf}} = -\frac{1}{2\xi}(\partial_\mu A^\mu)^2. \]
The parameter \(\xi\) labels a family of gauges. The full free gauge-fixed Lagrangian is
\[ \mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu} -\frac{1}{2\xi}(\partial_\mu A^\mu)^2. \]
The free equation of motion becomes
\[ \Box A^\nu - \left(1-\frac1\xi\right) \partial^\nu(\partial_\mu A^\mu) = 0. \]
Special choices include:
\[ \xi=1 \quad\text{Feynman gauge}, \]
\[ \xi=0 \quad\text{Landau gauge}. \]
In momentum space, the corresponding photon propagator is
\[ D_{\mu\nu}(k) = \frac{-i}{k^2+i\epsilon} \left[ \eta_{\mu\nu} -(1-\xi)\frac{k_\mu k_\nu}{k^2+i\epsilon} \right]. \]
For \(\xi=1\), this simplifies to
\[ D_{\mu\nu}(k) = \frac{-i\eta_{\mu\nu}}{k^2+i\epsilon}. \]
This is why Feynman gauge is often algebraically convenient. Covariant gauge fixing and the associated propagators are standard tools in perturbative QED (Peskin and Schroeder 1995).
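One way to make the origin of this propagator concrete is to check numerically that it inverts the momentum-space operator read off from the gauge-fixed equation of motion. The sketch below assumes NumPy, drops the \(i\epsilon\) by working at \(k^2\neq 0\), and adopts the common normalization \(K^{\mu\nu}D_{\nu\rho}=i\,\delta^\mu{}_\rho\). The function names and test momentum are our own.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # eta_{mu nu}, numerically equal to eta^{mu nu}

def propagator(k_up, xi):
    """D_{mu nu}(k) in a general covariant gauge, i*epsilon dropped (k^2 != 0 assumed)."""
    k_dn = eta @ k_up
    k2 = k_up @ k_dn
    return (-1j / k2) * (eta - (1.0 - xi) * np.outer(k_dn, k_dn) / k2)

def kinetic_operator(k_up, xi):
    """K^{mu nu} = -k^2 eta^{mu nu} + (1 - 1/xi) k^mu k^nu, from d_mu -> -i k_mu."""
    k2 = k_up @ eta @ k_up
    return -k2 * eta + (1.0 - 1.0 / xi) * np.outer(k_up, k_up)

k = np.array([1.3, 0.2, -0.7, 0.4])      # off-shell test momentum (illustrative)
k2 = k @ eta @ k

# K^{mu nu} D_{nu rho} for several values of the gauge parameter.
products = {xi: kinetic_operator(k, xi) @ propagator(k, xi) for xi in (0.5, 1.0, 3.0)}
feynman = propagator(k, 1.0)
```

Each matrix in `products` equals \(i\) times the identity, and `feynman` reduces to \(-i\eta_{\mu\nu}/k^2\). Landau gauge, \(\xi=0\), is omitted because the operator contains \(1/\xi\); it is reached as a limit of the propagator rather than by inverting \(K\).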
At this stage, the gauge parameter \(\xi\) may look dangerous. If the propagator depends on \(\xi\), will predictions depend on an arbitrary choice?
They must not. Physical observables cannot depend on gauge fixing. The cancellation of gauge-dependent terms is enforced by current conservation and, in full QED, by Ward identities. We will study those identities in Chapter 11.
Example: conserved currents remove gauge-dependent exchange terms
Consider two conserved currents exchanging a photon. The gauge-dependent part of the propagator is proportional to
\[ k_\mu k_\nu. \]
The amplitude contains a factor of the form
\[ j_1^\mu(k)D_{\mu\nu}(k)j_2^\nu(-k). \]
The gauge-dependent contribution is proportional to
\[ j_1^\mu k_\mu\, k_\nu j_2^\nu. \]
If the currents are conserved, then in momentum space
\[ k_\mu j^\mu(k)=0. \]
Therefore the gauge-dependent part vanishes. This is the simplest preview of why gauge choices do not affect physical scattering amplitudes.
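The cancellation can be seen numerically. In the following sketch, which assumes NumPy, the currents are plain four-component arrays that we project so that \(k_\mu j^\mu=0\); this is a simplification of our own that ignores the distinction between \(j(k)\) and \(j(-k)\). The exchange momentum is off shell and the \(i\epsilon\) is dropped.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def propagator(k_up, xi):
    """D_{mu nu}(k) in a general covariant gauge, i*epsilon dropped (k^2 != 0 assumed)."""
    k_dn = eta @ k_up
    k2 = k_up @ k_dn
    return (-1j / k2) * (eta - (1.0 - xi) * np.outer(k_dn, k_dn) / k2)

def conserved(v_up, k_up):
    """Project v^mu onto the part satisfying k_mu j^mu = 0 (requires k^2 != 0)."""
    k_dn = eta @ k_up
    return v_up - k_up * (k_dn @ v_up) / (k_up @ k_dn)

k = np.array([0.4, 0.0, 0.0, 1.5])                   # spacelike exchange momentum (illustrative)
j1 = conserved(np.array([1.0, 0.3, -0.2, 0.5]), k)
j2 = conserved(np.array([0.7, -0.1, 0.4, 0.2]), k)
j_bad = np.array([1.0, 0.3, -0.2, 0.5])              # not conserved: k.j != 0

def exchange(ja, jb, xi):
    """j_a^mu D_{mu nu} j_b^nu for currents given with upper indices."""
    return ja @ propagator(k, xi) @ jb

amps = [exchange(j1, j2, xi) for xi in (0.0, 1.0, 3.0)]
amps_bad = [exchange(j_bad, j_bad, xi) for xi in (0.0, 1.0)]
```

The three entries of `amps` coincide, so the exchange between conserved currents does not depend on \(\xi\). The entries of `amps_bad` differ, which shows that the gauge parameter would leak into the result if the current were not conserved.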
6.8 Plane waves and physical polarization vectors
A free electromagnetic wave can be written as a plane wave,
\[ A_\mu(x)=\epsilon_\mu(k)e^{-ik\cdot x}, \]
where \(k^\mu\) is the wave four-vector and \(\epsilon_\mu(k)\) is the polarization vector.
For a massless photon,
\[ k^2=0. \]
The Lorenz gauge condition gives
\[ \partial_\mu A^\mu=0 \quad\Longrightarrow\quad k_\mu \epsilon^\mu(k)=0. \]
This is a transversality condition in four-dimensional form.
But there is still gauge freedom. A gauge transformation with
\[ \alpha(x)=c\,e^{-ik\cdot x} \]
shifts the polarization vector by a multiple of \(k_\mu\):
\[ \epsilon_\mu(k)\to \epsilon_\mu(k)-ic\,k_\mu. \]
The unimportant factor \(-ic\) depends on convention; the physical statement is
\[ \epsilon_\mu(k)\sim \epsilon_\mu(k)+\lambda k_\mu. \]
Polarization vectors differing by a multiple of \(k_\mu\) describe the same physical photon.
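That the shift by \(k_\mu\) is physically invisible can be checked on the momentum-space field strength, \(F_{\mu\nu}\propto k_\mu\epsilon_\nu-k_\nu\epsilon_\mu\). The sketch below assumes NumPy; the lightlike momentum, the polarization, and the complex shift `lam` are illustrative values of our own.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

omega = 2.0
k_up = np.array([omega, 0.0, 0.0, omega])    # lightlike: k^2 = 0
k_dn = eta @ k_up
eps_dn = np.array([0.0, -1.0, 0.0, 0.0])     # lower-index form of eps^mu = (0, 1, 0, 0)
lam = 0.8 - 0.3j                             # arbitrary complex shift (illustrative)
eps_shifted = eps_dn + lam * k_dn

def field_strength(k, eps):
    """k_mu eps_nu - k_nu eps_mu, i.e. F_{mu nu} up to the common factor -i e^{-ik.x}."""
    return np.outer(k, eps) - np.outer(eps, k)

lorenz_before = k_up @ eps_dn        # k^mu eps_mu
lorenz_after = k_up @ eps_shifted
F_before = field_strength(k_dn, eps_dn)
F_after = field_strength(k_dn, eps_shifted)
```

Both Lorenz contractions vanish, the second because \(k^2=0\), and `F_after` equals `F_before` because the added term \(k_\mu k_\nu-k_\nu k_\mu\) is identically zero.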
Example: photon moving in the \(z\)-direction
Let
\[ k^\mu=(\omega,0,0,\omega). \]
Then
\[ k^2=\omega^2-\omega^2=0. \]
A convenient pair of transverse linear polarization vectors is
\[ \epsilon_x^\mu=(0,1,0,0), \]
\[ \epsilon_y^\mu=(0,0,1,0). \]
Both satisfy
\[ k_\mu\epsilon^\mu=0. \]
They correspond to electric fields oscillating in the \(x\)- and \(y\)-directions.
One can also form circular polarization vectors,
\[ \epsilon_+^\mu = \frac{1}{\sqrt2}(0,1,i,0), \]
\[ \epsilon_-^\mu = \frac{1}{\sqrt2}(0,1,-i,0). \]
These are helicity eigenstates. For a photon, helicity is the projection of angular momentum along the direction of motion. The two physical photon states have helicities
\[ +1 \quad\text{and}\quad -1. \]
The apparent time-like and longitudinal polarizations are not independent physical photon states. They are removed by the constraint and gauge equivalence.
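The properties of these polarization vectors can be verified numerically. The sketch below assumes NumPy and one common phase convention, in which a rotation by \(\theta\) about the direction of motion multiplies a helicity-\(h\) state by \(e^{-ih\theta}\). The helper names are our own.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
omega = 1.0
k = np.array([omega, 0.0, 0.0, omega])          # k^mu for motion along z

eps_x = np.array([0.0, 1.0, 0.0, 0.0], dtype=complex)
eps_y = np.array([0.0, 0.0, 1.0, 0.0], dtype=complex)
eps_plus = (eps_x + 1j * eps_y) / np.sqrt(2)
eps_minus = (eps_x - 1j * eps_y) / np.sqrt(2)

def mdot(a, b):
    """Minkowski product a^mu eta_{mu nu} b^nu without complex conjugation."""
    return a @ eta @ b

def rotate_z(theta):
    """Lorentz matrix for a spatial rotation by theta about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[1:3, 1:3] = [[c, -s], [s, c]]
    return R

theta = 0.37
# np.vdot conjugates its first argument, so these extract the rotation phases.
phase_plus = np.vdot(eps_plus, rotate_z(theta) @ eps_plus)
phase_minus = np.vdot(eps_minus, rotate_z(theta) @ eps_minus)
norm_plus = mdot(eps_plus.conj(), eps_plus)
```

All four vectors satisfy \(k_\mu\epsilon^\mu=0\), the normalization is \(\epsilon^*\cdot\epsilon=-1\) in the mostly-minus metric, and the rotation phases are \(e^{-i\theta}\) for \(\epsilon_+\) and \(e^{+i\theta}\) for \(\epsilon_-\), consistent with helicities \(+1\) and \(-1\) in the stated convention.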
6.9 The problem of covariant quantization
There is a tension:
- Lorentz covariance suggests using all four components of \(A_\mu\).
- Physical photon states require only two polarizations.
- Quantizing all four components naively introduces unphysical states.
In a covariant quantization, one allows intermediate states associated with time-like and longitudinal modes. These modes are not physical external photons. The physical subspace is selected by a condition that removes the unphysical combinations.
One traditional method is the Gupta-Bleuler formalism, where the Lorenz condition is imposed not as a strict operator equation on \(A_\mu\), but as a condition on physical states using the positive-frequency part of \(\partial_\mu A^\mu\) (Gupta 1950; Bleuler 1950). Schematically,
\[ (\partial_\mu A^\mu)^{(+)}|\text{phys}\rangle=0. \]
The reason for this weaker condition is that imposing
\[ \partial_\mu A^\mu=0 \]
as an operator identity is incompatible with the canonical commutation relations in covariant quantization.
In modern path-integral language, covariant gauge fixing is handled by adding gauge-fixing terms and, for non-Abelian gauge theories, Faddeev-Popov ghosts. In QED the ghosts decouple because the gauge group is Abelian, but that topic belongs to Chapter 13.
For now, the conceptual lesson is enough: covariant calculations may contain unphysical components internally, but physical states and physical observables contain only the two transverse photon polarizations.
6.10 Gauge-invariant observables
A gauge-invariant observable is a quantity unchanged by gauge transformations. Since gauge-related potentials describe the same physical situation, only gauge-invariant quantities can represent direct physical observables.
The simplest examples are the electromagnetic fields themselves:
\[ \mathbf{E}, \qquad \mathbf{B}. \]
Equivalently, the field-strength tensor
\[ F_{\mu\nu} \]
is gauge invariant.
Other local gauge-invariant quantities include
[ F_{\mu