Introduction
Quantum information begins with a simple question:
What can we learn from a quantum system?
In ordinary classical life, measurement often feels passive. If a coin is on the table, we look at it and discover whether it is heads or tails. The coin already has a definite face upward, and our observation merely reveals it. Quantum theory is different. A quantum system is not usually described as carrying a list of pre-existing answers to all possible measurements. Instead, a quantum state is a mathematical object that tells us the probabilities of different outcomes when we choose a measurement. This probability-centered view is part of the standard formalism of quantum mechanics and quantum information theory (Nielsen and Chuang, 2010; Watrous, 2018).
This book is about one of the central mathematical facts behind modern quantum measurement:
Naimark’s dilation theorem says, roughly, that every generalized quantum measurement can be realized as an ordinary projective measurement on a larger Hilbert space.
That sentence contains several important words: generalized measurement, projective measurement, larger Hilbert space, and realized. The goal of this book is to unpack all of them carefully, from first principles, until the theorem becomes not only believable but provable.
We will work mainly in finite-dimensional Hilbert spaces. This is the setting most common in introductory quantum information: qubits, finite quantum registers, finite sets of measurement outcomes, and matrices. The finite-dimensional version already contains the essential idea of the theorem and is mathematically accessible with undergraduate linear algebra. Later, in Chapter 22, we briefly look toward the infinite-dimensional and measure-theoretic setting, where the full theorem belongs historically and technically.
Why measurement needs more than projections
In the simplest textbook model of quantum measurement, measurement outcomes are represented by orthogonal projections. An orthogonal projection, which we will simply call a projection, is a linear operator \(P\) satisfying
\[ P^2 = P, \qquad P^\ast = P. \]
The equation \(P^2=P\) means that applying the projection twice is the same as applying it once. The equation \(P^\ast=P\) means that the projection is self-adjoint, which in finite-dimensional complex vector spaces corresponds to being equal to its conjugate transpose.
Geometrically, a projection represents asking whether a vector lies in a certain subspace. For example, in the two-dimensional qubit space \(\mathbb{C}^2\), the matrix
\[ P_0 = \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} \]
projects onto the line spanned by
\[ |0\rangle = \begin{pmatrix} 1\\ 0 \end{pmatrix}. \]
If a qubit is in the state \(|0\rangle\), the outcome corresponding to \(P_0\) occurs with probability \(1\). If it is in the state
\[ |1\rangle = \begin{pmatrix} 0\\ 1 \end{pmatrix}, \]
the outcome corresponding to \(P_0\) occurs with probability \(0\).
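The two defining equations and the two probabilities above can be checked numerically. The following is a minimal sketch using NumPy; the helper names `is_projection` and `outcome_probability` are illustrative choices made for this example and do not come from any library.

```python
import numpy as np

# Projection onto the line spanned by |0> in the qubit space C^2.
P0 = np.array([[1, 0], [0, 0]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

def is_projection(P, tol=1e-12):
    """Check the two defining equations P^2 = P and P^* = P."""
    idempotent = np.allclose(P @ P, P, atol=tol)
    self_adjoint = np.allclose(P.conj().T, P, atol=tol)
    return idempotent and self_adjoint

def outcome_probability(P, psi):
    """Probability <psi|P|psi> of the outcome attached to P for a normalized psi."""
    return float(np.real(np.vdot(psi, P @ psi)))

projection_ok = is_projection(P0)
prob_on_ket0 = outcome_probability(P0, ket0)
prob_on_ket1 = outcome_probability(P0, ket1)
print(projection_ok, prob_on_ket0, prob_on_ket1)
```

The same two functions can be applied to any square matrix and normalized vector, which makes them a convenient test for the other operators that appear in this chapter.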
Projective measurements are beautiful and important. They are connected to the spectral theorem, one of the central results of linear algebra and functional analysis. In quantum mechanics, the traditional association of observables with self-adjoint operators leads naturally to projection-valued measurements through spectral decompositions (Reed and Simon, 1980; Nielsen and Chuang, 2010).
But projective measurements are not enough for quantum information.
Here is a first reason. Suppose we want to distinguish between two quantum states that are not orthogonal. For example, consider the two qubit states
\[ |0\rangle = \begin{pmatrix} 1\\ 0 \end{pmatrix}, \qquad |+\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1\\ 1 \end{pmatrix}. \]
These states are different, but they are not orthogonal because
\[ \langle 0|+\rangle = \frac{1}{\sqrt{2}} \neq 0. \]
Quantum theory does not allow us to distinguish nonorthogonal states perfectly in a single shot. However, we can ask for the best possible measurement strategy: perhaps one that guesses correctly with the highest possible probability, or perhaps one that sometimes says “I do not know” but never makes a wrong identification. These tasks naturally lead to measurements more general than projective measurements on the original system alone (Helstrom, 1976; Chefles, 2000).
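To make the never-wrong strategy concrete, here is a sketch of a three-outcome measurement for \(|0\rangle\) and \(|+\rangle\). It uses a standard construction from the state-discrimination literature: weight the projections onto the vectors orthogonal to each candidate state, and assign whatever is left over to the “I do not know” outcome. The variable names and the weight `c` are choices made for this example, and the three operators satisfy the POVM conditions defined just below.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_minus = (ket0 - ket1) / np.sqrt(2)

def projector(v):
    """Rank-one projection |v><v| for a normalized vector v."""
    return np.outer(v, v.conj())

overlap = abs(np.vdot(ket0, ket_plus))   # |<0|+>| = 1/sqrt(2)
c = 1.0 / (1.0 + overlap)                # largest weight that keeps E_unknown >= 0

# |-> is orthogonal to |+>, so this outcome rules out |+> and identifies |0>.
E_is_zero = c * projector(ket_minus)
# |1> is orthogonal to |0>, so this outcome rules out |0> and identifies |+>.
E_is_plus = c * projector(ket1)
# The remaining outcome is the inconclusive answer "I do not know".
E_unknown = np.eye(2) - E_is_zero - E_is_plus

def prob(E, psi):
    """Outcome probability <psi|E|psi> for a normalized vector psi."""
    return float(np.real(np.vdot(psi, E @ psi)))

for name, psi in [("|0>", ket0), ("|+>", ket_plus)]:
    print(name, [round(prob(E, psi), 4) for E in (E_is_zero, E_is_plus, E_unknown)])
```

Each state is identified correctly with probability \(1-1/\sqrt{2}\approx 0.29\) and is never mistaken for the other. No two-outcome projective measurement on the qubit can behave this way, because the measurement needs three outcomes on a two-dimensional space.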
A second reason is experimental. Real detectors are not perfect. They may be noisy, inefficient, or coarse-grained. A detector may fail to click. Two distinct microscopic outcomes may be recorded as the same macroscopic outcome. A measurement device may interact with an environment before producing a classical result. The mathematical framework of positive operator-valued measures, or POVMs, captures these possibilities in a clean way and is standard in quantum information and quantum measurement theory (Busch, Lahti, and Mittelstaedt, 1996; Heinosaari and Ziman, 2012).
A POVM is a collection of positive operators
\[ E_1,E_2,\dots,E_m \]
on a Hilbert space such that
\[ E_i \geq 0 \quad \text{for each } i, \qquad \sum_{i=1}^m E_i = I. \]
The symbol \(E_i \geq 0\) means that \(E_i\) is positive semidefinite: for every vector \(|\psi\rangle\),
\[ \langle \psi|E_i|\psi\rangle \geq 0. \]
The equation \(\sum_i E_i=I\) is the normalization condition. It guarantees that the total probability of all possible outcomes is \(1\).
If the system is in a pure state \(|\psi\rangle\), the probability of outcome \(i\) is
\[ p(i)=\langle \psi|E_i|\psi\rangle. \]
If the system is in a mixed state represented by a density matrix \(\rho\), the probability is
\[ p(i)=\operatorname{Tr}(\rho E_i). \]
This is the Born rule for POVMs. We will study it carefully later. For now, the key point is simple: a POVM keeps exactly what probability theory needs. Each probability is nonnegative, and the probabilities add to \(1\).
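The pure-state formula is a special case of the trace formula. As a brief check, take an orthonormal basis \(\{|k\rangle\}\) of the Hilbert space (notation introduced only for this calculation) and recall that a pure state has density matrix \(\rho=|\psi\rangle\langle\psi|\). Then
\[ \operatorname{Tr}(\rho E_i) = \sum_k \langle k|\psi\rangle\langle \psi|E_i|k\rangle = \langle \psi|E_i\Big(\sum_k |k\rangle\langle k|\Big)|\psi\rangle = \langle \psi|E_i|\psi\rangle, \]
where the last step uses \(\sum_k |k\rangle\langle k| = I\).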
A first example of a POVM
Let us look at the simplest kind of non-projective POVM. Consider a qubit and define
\[ E_0 = \begin{pmatrix} 0.8 & 0\\ 0 & 0.2 \end{pmatrix}, \qquad E_1 = \begin{pmatrix} 0.2 & 0\\ 0 & 0.8 \end{pmatrix}. \]
Both matrices are positive semidefinite because their eigenvalues are nonnegative. Also,
\[ E_0+E_1= \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} =I. \]
So \(\{E_0,E_1\}\) is a POVM.
But \(E_0\) is not a projection, because
\[ E_0^2 = \begin{pmatrix} 0.64 & 0\\ 0 & 0.04 \end{pmatrix} \neq \begin{pmatrix} 0.8 & 0\\ 0 & 0.2 \end{pmatrix} =E_0. \]
This POVM can be interpreted as an unsharp measurement of the computational basis. If the system is \(|0\rangle\), outcome \(0\) occurs with probability \(0.8\). If the system is \(|1\rangle\), outcome \(0\) still occurs with probability \(0.2\). The measurement gives information, but not perfectly.
This kind of example is mathematically simple, but it already shows why POVMs are useful. They describe measurements that are neither completely random nor perfectly sharp.
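The claims made about this example can be verified in a few lines. The sketch below checks the POVM conditions, confirms that \(E_0\) is not a projection, and evaluates the Born rule on three states; the function names `is_povm` and `povm_probabilities` are illustrative, and the maximally mixed state is included only as an extra test case.

```python
import numpy as np

E0 = np.diag([0.8, 0.2])
E1 = np.diag([0.2, 0.8])

def is_povm(elements, tol=1e-12):
    """Check Hermiticity, positive semidefiniteness, and normalization."""
    dim = elements[0].shape[0]
    hermitian = all(np.allclose(E, E.conj().T, atol=tol) for E in elements)
    positive = all(np.linalg.eigvalsh(E).min() >= -tol for E in elements)
    normalized = np.allclose(sum(elements), np.eye(dim), atol=tol)
    return bool(hermitian and positive and normalized)

def povm_probabilities(rho, elements):
    """Born rule p(i) = Tr(rho E_i) for a density matrix rho."""
    return [float(np.real(np.trace(rho @ E))) for E in elements]

rho_zero = np.diag([1.0, 0.0])    # the pure state |0><0|
rho_one = np.diag([0.0, 1.0])     # the pure state |1><1|
rho_mixed = np.eye(2) / 2         # the maximally mixed state

valid = is_povm([E0, E1])
sharp = bool(np.allclose(E0 @ E0, E0))   # False, since E0 is not a projection
p_zero = povm_probabilities(rho_zero, [E0, E1])
p_one = povm_probabilities(rho_one, [E0, E1])
p_mixed = povm_probabilities(rho_mixed, [E0, E1])
print(valid, sharp, p_zero, p_one, p_mixed)
```

On the maximally mixed state both outcomes are equally likely, as one would expect from a measurement that treats \(|0\rangle\) and \(|1\rangle\) symmetrically.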
The puzzle Naimark dilation solves
POVMs are more general than projective measurements. At first, this may feel like we have introduced a new kind of measurement law. Are POVMs fundamentally different from projective measurements? Or are they projective measurements in disguise?
Naimark dilation answers this question.
The answer is:
A POVM on a system can be represented as a projective measurement on a larger system, followed by forgetting the extra degrees of freedom.
The phrase larger system means that we enlarge the Hilbert space. If the original system has Hilbert space \(\mathcal{H}\), we build a bigger Hilbert space \(\mathcal{K}\) and embed \(\mathcal{H}\) into \(\mathcal{K}\) by an isometry.
An isometry is a linear map
\[ V:\mathcal{H}\to\mathcal{K} \]
that preserves inner products:
\[ \langle V\phi,V\psi\rangle_{\mathcal{K}} = \langle \phi,\psi\rangle_{\mathcal{H}}. \]
In finite dimensions, this is equivalent to
\[ V^\ast V=I_{\mathcal{H}}. \]
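On the larger space \(\mathcal{K}\) we choose an ordinary projective measurement \(\{P_i\}\): a family of projections with \(P_iP_j=0\) for \(i\neq j\) and \(\sum_i P_i=I_{\mathcal{K}}\). The POVM elements on the original space are then recovered by the formula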
\[ E_i = V^\ast P_i V. \]
This formula is the heart of the finite-dimensional Naimark theorem.
It says that each generalized measurement operator \(E_i\) is a compression of a projection \(P_i\). Compression means: first embed the small space into the large space using \(V\), then apply the larger-space operator \(P_i\), then return to the original space using \(V^\ast\).
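The easy direction of this relationship can be checked directly: every compression of a projective measurement is a POVM. Assume, as is part of the definition of a projective measurement, that \(\sum_i P_i = I_{\mathcal{K}}\). Using \(P_i = P_i^\ast P_i\) and \(V^\ast V = I_{\mathcal{H}}\), we get
\[ \langle \psi|V^\ast P_i V|\psi\rangle = \langle P_i V\psi, P_i V\psi\rangle_{\mathcal{K}} = \|P_i V\psi\|^2 \geq 0, \qquad \sum_i V^\ast P_i V = V^\ast\Big(\sum_i P_i\Big)V = V^\ast V = I_{\mathcal{H}}. \]
So the operators \(V^\ast P_i V\) are positive semidefinite and sum to the identity. Naimark’s theorem supplies the harder converse: every POVM arises in this way.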
So the theorem has a beautiful interpretation:
Generalized measurements are ordinary projective measurements viewed from inside a smaller Hilbert space.
This idea is not only mathematically elegant. It is also physically natural. In the laboratory, a system is rarely measured in complete isolation. A detector has internal degrees of freedom. An apparatus interacts with the system. An environment may become correlated with the outcome. Quantum information theory often models this by adding an ancilla, meaning an auxiliary quantum system introduced to help implement a process. The system and ancilla evolve together, and then a projective measurement is performed on part or all of the enlarged system. This operational viewpoint is standard in quantum information treatments of measurement and quantum operations (Nielsen and Chuang, 2010; Watrous, 2018).
What “from first principles” means here
This book does not assume that you already know Naimark dilation, POVMs, or operator theory. It does assume that you are willing to learn the necessary linear algebra carefully.
We begin with the objects that make quantum information possible:
- complex vector spaces,
- inner products,
- orthonormal bases,
- linear maps and matrices,
- adjoints,
- eigenvalues,
- positive semidefinite operators,
- projections,
- tensor products.
These are not decorative tools. Each one has a job.
For example, the inner product
\[ \langle \phi,\psi\rangle \]
measures overlap between vectors. In quantum theory, this overlap controls probabilities. If two normalized vectors are orthogonal, their inner product is zero, and they can be perfectly distinguished by a suitable projective measurement. If they are not orthogonal, perfect single-shot discrimination is impossible. This fact motivates generalized measurement strategies.
A positive semidefinite operator \(E\) is exactly the kind of operator that can produce nonnegative numbers of the form
\[ \langle \psi|E|\psi\rangle. \]
That is why POVM elements must be positive. The normalization condition
\[ \sum_i E_i=I \]
is exactly what makes the probabilities add to one:
\[ \sum_i \langle \psi|E_i|\psi\rangle = \left\langle \psi\middle|\sum_i E_i\middle|\psi\right\rangle = \langle \psi|I|\psi\rangle = \langle \psi|\psi\rangle = 1 \]
when \(|\psi\rangle\) is normalized.
An isometry is exactly the kind of map that can embed a quantum state into a larger space without changing its length or inner products. This matters because probabilities depend on inner products. If the embedding distorted inner products, it would not faithfully represent the original system.
A projection-valued measurement on the larger space is exactly the “ordinary” sharp measurement we already understand. Naimark’s theorem says that by combining an isometric embedding with such a sharp measurement, we obtain every finite-outcome POVM.
The main mathematical journey
The central proof of this book will appear in Chapter 11. The proof is short once the right tools are ready, but it is easy to miss its meaning if one rushes.
The finite-dimensional proof begins with a POVM
\[ E_1,\dots,E_m \]
on a Hilbert space \(\mathcal{H}\). Because each \(E_i\) is positive semidefinite, it has a unique positive semidefinite square root:
\[ \sqrt{E_i}. \]
This means
\[ \sqrt{E_i}\sqrt{E_i}=E_i. \]
Using these square roots, we define an isometry into a larger direct-sum space:
\[ V|\psi\rangle = \sqrt{E_1}|\psi\rangle \oplus \sqrt{E_2}|\psi\rangle \oplus \cdots \oplus \sqrt{E_m}|\psi\rangle. \]
The symbol \(\oplus\) means that the vectors are placed into separate components of a larger space. If \(\mathcal{H}\) is the original space, then the enlarged space is
\[ \mathcal{K} = \mathcal{H}\oplus\mathcal{H}\oplus\cdots\oplus\mathcal{H} \]
with \(m\) copies of \(\mathcal{H}\).
Then we define \(P_i\) to be the projection onto the \(i\)-th component of this direct sum. With these definitions, one proves
\[ E_i = V^\ast P_i V. \]
That is Naimark dilation in its finite-outcome finite-dimensional form.
The proof is not magic. It is the result of placing the square-root pieces of the POVM into different compartments of a larger Hilbert space. The projective measurement asks, “Which compartment are we in?” The answer has exactly the same probabilities as the original POVM.
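The construction just described can be sketched in code. The example below builds \(V\) by stacking the square roots, builds each \(P_i\) as a block projection, and checks \(V^\ast V=I\) and \(E_i=V^\ast P_iV\) for the unsharp qubit POVM from earlier in this chapter. The function names `psd_sqrt` and `naimark_dilation` are hypothetical, and computing the square root through an eigendecomposition is just one convenient option.

```python
import numpy as np

def psd_sqrt(E):
    """Positive semidefinite square root via the spectral decomposition of E."""
    eigenvalues, eigenvectors = np.linalg.eigh(E)
    eigenvalues = np.clip(eigenvalues, 0.0, None)   # remove tiny negative round-off
    return (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.conj().T

def naimark_dilation(povm):
    """Return the isometry V and the block projections P_i for a finite POVM."""
    m, d = len(povm), povm[0].shape[0]
    # Stacking sqrt(E_1), ..., sqrt(E_m) vertically realizes the direct sum.
    V = np.vstack([psd_sqrt(E) for E in povm])
    projections = []
    for i in range(m):
        P = np.zeros((m * d, m * d))
        P[i * d:(i + 1) * d, i * d:(i + 1) * d] = np.eye(d)   # i-th compartment
        projections.append(P)
    return V, projections

# The unsharp qubit POVM with diagonal entries (0.8, 0.2) and (0.2, 0.8).
povm = [np.diag([0.8, 0.2]), np.diag([0.2, 0.8])]
V, projections = naimark_dilation(povm)

is_isometry = bool(np.allclose(V.conj().T @ V, np.eye(2)))
recovers_povm = all(
    np.allclose(V.conj().T @ P @ V, E) for P, E in zip(projections, povm)
)
print(V.shape, is_isometry, recovers_povm)
```

Here \(V\) is a \(4\times 2\) matrix, so the two-outcome qubit POVM is realized inside a four-dimensional space. This dilation is generally not the smallest possible one; it is simply the one that the direct-sum construction produces.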
Why this theorem matters in quantum information
Naimark dilation is important because it connects three views of measurement.
The first is the probability view: a POVM is a rule for producing outcome probabilities from quantum states.
The second is the geometric view: a POVM can be represented by projections in a larger Hilbert space.
The third is the physical implementation view: a generalized measurement can be implemented by coupling the system to an ancilla and then performing a projective measurement.
These three views appear throughout quantum information.
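The physical implementation view can be illustrated numerically. The sketch below assumes an ancilla of dimension \(m\) prepared in its first basis state and placed as the first tensor factor. It builds a unitary \(U\) with \(U(|0\rangle\otimes|\psi\rangle)=\sum_i |i\rangle\otimes\sqrt{E_i}|\psi\rangle\), then measures the ancilla in its standard basis. Completing the isometry to a unitary with a QR factorization is only one convenient choice, and the function name `ancilla_unitary` is hypothetical.

```python
import numpy as np

def psd_sqrt(E):
    """Positive semidefinite square root via the spectral decomposition of E."""
    eigenvalues, eigenvectors = np.linalg.eigh(E)
    roots = np.sqrt(np.clip(eigenvalues, 0.0, None))
    return (eigenvectors * roots) @ eigenvectors.conj().T

def ancilla_unitary(povm):
    """Unitary on (ancilla C^m) tensor (system C^d) whose first d columns stack sqrt(E_i)."""
    m, d = len(povm), povm[0].shape[0]
    V = np.vstack([psd_sqrt(E) for E in povm]).astype(complex)   # shape (m*d, d)
    # The complete QR factorization gives an orthonormal basis of the big space;
    # its last m*d - d columns span the orthogonal complement of the range of V.
    Q, _ = np.linalg.qr(V, mode="complete")
    return np.hstack([V, Q[:, d:]])

povm = [np.diag([0.8, 0.2]), np.diag([0.2, 0.8])]
U = ancilla_unitary(povm)

psi = np.array([0.6, 0.8j])                     # an arbitrary normalized qubit state
ancilla_zero = np.array([1, 0], dtype=complex)
joint = U @ np.kron(ancilla_zero, psi)          # ancilla and system evolve together

# Projective measurement of the ancilla: outcome i keeps block i of the joint vector.
d = 2
ancilla_probs = [
    float(np.linalg.norm(joint[i * d:(i + 1) * d]) ** 2) for i in range(len(povm))
]
povm_probs = [float(np.real(np.vdot(psi, E @ psi))) for E in povm]
print(ancilla_probs, povm_probs)
```

The two lists of probabilities agree, so a sharp measurement of the ancilla alone reproduces the statistics of the unsharp POVM on the system.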
In state discrimination, POVMs describe optimal strategies for extracting classical information from quantum states. The Helstrom measurement minimizes the average error probability for distinguishing two quantum states with known prior probabilities, and more general POVM methods are central to the theory of quantum detection and estimation (Helstrom, 1976; Holevo, 2011).
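As a small illustration, the following sketch computes the minimum-error measurement for the states \(|0\rangle\) and \(|+\rangle\), assuming equal prior probabilities (an assumption made only for this example). It uses the standard Helstrom recipe: diagonalize \(\Gamma=p_0\rho_0-p_1\rho_1\) and guess \(\rho_0\) on the positive eigenspace. The variable names are illustrative.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

rho0 = np.outer(ket0, ket0.conj())
rho1 = np.outer(ket_plus, ket_plus.conj())
p0, p1 = 0.5, 0.5                        # assumed equal prior probabilities

# Helstrom recipe: split the space into the positive and non-positive
# eigenspaces of Gamma = p0*rho0 - p1*rho1.
gamma = p0 * rho0 - p1 * rho1
eigenvalues, eigenvectors = np.linalg.eigh(gamma)
positive = eigenvectors[:, eigenvalues > 0]
Pi0 = positive @ positive.conj().T       # outcome "guess rho0"
Pi1 = np.eye(2) - Pi0                    # outcome "guess rho1"

success = float(np.real(p0 * np.trace(rho0 @ Pi0) + p1 * np.trace(rho1 @ Pi1)))
# Closed form: (1 + trace norm of Gamma) / 2.
helstrom_bound = float(0.5 * (1 + np.abs(eigenvalues).sum()))
print(round(success, 4), round(helstrom_bound, 4))
```

The success probability is \(\tfrac12\big(1+\sqrt{1/2}\big)\approx 0.85\), so even the best measurement errs about 15 percent of the time. For two pure states the optimal minimum-error measurement happens to be projective; genuinely non-projective POVMs become necessary for other tasks, such as strategies that must never make a wrong identification.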
In quantum communication, a sender may encode classical messages into quantum states, and a receiver chooses a measurement to decode the message. The most informative measurement need not be a projective measurement on the original signal space. POVMs are therefore essential in the study of accessible information and decoding measurements (Holevo, 2011; Watrous, 2018).
In quantum cryptography, realistic measurement devices are modeled using generalized measurements, especially when detectors are imperfect or when security proofs must account for all possible measurements an adversary might perform. Because POVMs describe the most general measurement statistics that quantum theory allows, a security argument that holds for every POVM covers every measurement an adversary could make; this generality also underlies more advanced adversarially robust and device-independent formulations (Nielsen and Chuang, 2010; Watrous, 2018).
In quantum tomography, informationally complete POVMs allow one to reconstruct an unknown quantum state from measurement statistics. A POVM is informationally complete when its outcome probabilities contain enough information to determine the density matrix. Such measurements are a central part of experimental quantum-state reconstruction (Paris and Řeháček, 2004; Heinosaari and Ziman, 2012).
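Informational completeness can be tested with linear algebra: a POVM on a \(d\)-dimensional space is informationally complete exactly when its elements span the \(d^2\)-dimensional space of operators. The sketch below checks this for a six-outcome qubit POVM built from the Pauli eigenprojections, then reconstructs a test state from exact probabilities by least squares. The POVM, the test state, and the variable names are all choices made for this example; with finitely many experimental samples the probabilities would only be estimates.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Six-outcome POVM: the +1 and -1 eigenprojections of X, Y, Z, each weighted by 1/3.
povm = [(I2 + sign * pauli) / 6 for pauli in (X, Y, Z) for sign in (+1, -1)]

# Row i is conj(vec(E_i)), so that (A @ vec(rho))[i] = Tr(rho E_i).
A = np.array([E.conj().ravel() for E in povm])
informationally_complete = bool(np.linalg.matrix_rank(A) == 4)   # d^2 = 4 for a qubit

# Compute exact outcome probabilities for a test state, then invert them.
rho = 0.5 * (I2 + 0.3 * X - 0.2 * Y + 0.5 * Z)
probs = np.array([np.trace(rho @ E) for E in povm])
rho_vec, *_ = np.linalg.lstsq(A, probs, rcond=None)
rho_reconstructed = rho_vec.reshape(2, 2)
print(informationally_complete, np.allclose(rho_reconstructed, rho))
```

A POVM with fewer than \(d^2\) outcomes can never pass this test, so no two-outcome qubit measurement is informationally complete.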
Naimark dilation does not solve all these problems by itself. Rather, it gives a structural principle behind them:
Whenever a POVM appears, we may think of it as a projective measurement on a suitably enlarged space.
That principle helps us design measurements, compare measurements, prove mathematical results, and understand physical implementations.
How the chapters fit together
The first part of the book builds the mathematical language. Chapters 1 through 4 explain why quantum measurement is subtle and prepare the linear algebra and Hilbert-space tools needed for the theorem.
Chapters 5 through 7 introduce projective measurements and POVMs. We will see why projections are natural, why they are limited, and why positivity plus normalization gives exactly the right probability structure for generalized measurements.
Chapters 8 and 9 introduce ancillas, embeddings, and dilations. These chapters prepare the central idea that a complicated object on a smaller space can be understood as a simpler object on a larger space.
Chapters 10 through 13 form the mathematical core. They state Naimark’s theorem, prove it in finite dimensions, explain the construction, and discuss minimality and uniqueness.
Chapters 14 through 16 connect Naimark dilation to broader quantum operation theory, including Kraus operators, instruments, unitary implementation, and Stinespring dilation. Stinespring’s theorem is another major dilation result, showing how completely positive maps can be represented using a larger Hilbert space and a simpler operation (Stinespring, 1955; Paulsen, 2002).
Chapters 17 through 21 apply the theory. We work through concrete POVMs, state discrimination, communication, cryptography, and tomography.
Finally, Chapters 22 through 24 broaden and consolidate the picture. We briefly look beyond finite dimensions, review common pitfalls, and complete a capstone construction where the reader builds and interprets a Naimark dilation step by step.
The guiding idea
The guiding idea of this book is simple:
To understand generalized quantum measurement, enlarge the space until the measurement becomes projective.
This is a recurring pattern in mathematics. Sometimes an object looks complicated because we are viewing it in too small a space. By embedding it into a larger setting, the object becomes simpler, more symmetric, or easier to analyze.
A shadow on a wall may look like a strange two-dimensional shape. But if we know it is the shadow of a three-dimensional object, we can explain it more naturally. Naimark dilation says something similar about POVMs. A POVM may look like a generalized measurement on the original Hilbert space, but it can be understood as the shadow of a projective measurement in a larger Hilbert space.
The rest of this book is devoted to making that statement precise, proving it carefully, and showing why it matters.
References
Busch, P., Lahti, P. J., and Mittelstaedt, P. (1996). The Quantum Theory of Measurement. Second revised edition. Springer.
Chefles, A. (2000). “Quantum state discrimination.” Contemporary Physics, 41(6), 401–424.
Heinosaari, T., and Ziman, M. (2012). The Mathematical Language of Quantum Theory: From Uncertainty to Entanglement. Cambridge University Press.
Helstrom, C. W. (1976). Quantum Detection and Estimation Theory. Academic Press.
Holevo, A. S. (2011). Probabilistic and Statistical Aspects of Quantum Theory. Second English edition. Edizioni della Normale.
Nielsen, M. A., and Chuang, I. L. (2010). Quantum Computation and Quantum Information. 10th Anniversary edition. Cambridge University Press.
Paris, M. G. A., and Řeháček, J., eds. (2004). Quantum State Estimation. Springer.
Paulsen, V. (2002). Completely Bounded Maps and Operator Algebras. Cambridge University Press.
Reed, M., and Simon, B. (1980). Methods of Modern Mathematical Physics I: Functional Analysis. Revised and enlarged edition. Academic Press.
Stinespring, W. F. (1955). “Positive functions on C*-algebras.” Proceedings of the American Mathematical Society, 6(2), 211–216.
Watrous, J. (2018). The Theory of Quantum Information. Cambridge University Press.