Accessible Information Bound
Formal statement
Let
\[ \mathcal E=\{p_x,\rho_x\}_{x\in\mathcal X} \]
be a finite quantum ensemble. A classical label \(X\) is sampled with probability \(p_x\), and the receiver is given the quantum state \(\rho_x\) on a finite-dimensional Hilbert space \(\mathcal H_B\). The receiver chooses a POVM
\[ \mathsf M=\{M_y\}_{y\in\mathcal Y}, \qquad M_y\ge0, \qquad \sum_yM_y=I. \]
The measurement produces the conditional distribution
\[ p(y|x)=\operatorname{Tr}(M_y\rho_x), \]
and therefore the classical joint distribution
\[ p(x,y)=p_x\operatorname{Tr}(M_y\rho_x). \]
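The mutual information obtained by this measurement is the classical mutual information of \(p(x,y)\),
\[ I_{\mathsf M}(X;Y) = \sum_{x,y}p(x,y)\log\frac{p(x,y)}{p_x\,p(y)}, \qquad p(y)=\sum_xp(x,y). \]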
The accessible information of the ensemble is
\[ I_{\mathrm{acc}}(\mathcal E) = \sup_{\mathsf M} I_{\mathsf M}(X;Y), \]
where the supremum is over all POVMs on \(\mathcal H_B\). For a finite ensemble on a \(d\)-dimensional space, the supremum is attained by a POVM with at most \(d^2\) outcomes (Davies 1978), so it suffices to consider finite-outcome POVMs.
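As a minimal numerical sketch of the definitions above, the joint distribution and the mutual information for one fixed POVM can be computed with NumPy. The function name `measurement_mutual_information` and the example ensemble are illustrative choices, not part of the original text, and base-2 logarithms are assumed so that the result is in bits.

```python
import numpy as np

def measurement_mutual_information(probs, states, povm):
    """Return I(X;Y) in bits for p(x,y) = p_x Tr(M_y rho_x)."""
    # Joint distribution: rows are labels x, columns are outcomes y.
    joint = np.array([[p * np.trace(M @ rho).real for M in povm]
                      for p, rho in zip(probs, states)])
    px = joint.sum(axis=1, keepdims=True)   # marginal of X, shape (|X|, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, |Y|)
    product = px @ py                       # table of p(x) p(y)
    mask = joint > 1e-15                    # convention: 0 log 0 = 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))

# Illustrative ensemble: |0> and |+> with equal priors,
# measured in the computational basis (not an optimal measurement).
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(ket0, ket0), np.outer(ketp, ketp)]
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

info = measurement_mutual_information([0.5, 0.5], states, povm)
print(f"I(X;Y) = {info:.4f} bits")
```

The accessible information is the supremum of this quantity over all valid choices of `povm`; the sketch evaluates only one candidate.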
Let
\[ \bar\rho=\sum_xp_x\rho_x \]
be the average state, and define the Holevo quantity
\[ \chi(\mathcal E) = S(\bar\rho)-\sum_xp_xS(\rho_x), \]
where
\[ S(\rho)=-\operatorname{Tr}(\rho\log\rho) \]
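is the von Neumann entropy. Logarithms are taken to base 2 throughout, so entropies and mutual informations are measured in bits. The fundamental accessible-information bound is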
\[ I_{\mathrm{acc}}(\mathcal E) \le \chi(\mathcal E). \]
There are also immediate structural bounds:
\[ I_{\mathrm{acc}}(\mathcal E)\le H(X), \]
because no measurement can reveal more information than the entropy of the classical label, and
\[ I_{\mathrm{acc}}(\mathcal E)\le \log d, \]
when all \(\rho_x\) live in a \(d\)-dimensional Hilbert space, because
\[ \chi(\mathcal E)\le S(\bar\rho)\le \log d. \]
Thus the accessible information is limited by the label entropy, by the Hilbert-space dimension, and, through \(\chi\), by the geometry of the ensemble. Holevo’s 1973 theorem gives the central upper bound on the amount of classical information obtainable from quantum states by measurements.
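The Holevo quantity and the two structural bounds can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms; the helper names `von_neumann_entropy` and `holevo_chi` and the example ensemble are hypothetical choices made for illustration.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) in bits, using the convention 0 log 0 = 0."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def holevo_chi(probs, states):
    """chi = S(average state) - sum_x p_x S(rho_x), in bits."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(probs, states))

# Hypothetical qubit ensemble: one mixed state and one pure state.
probs = [0.3, 0.7]
states = [np.diag([0.9, 0.1]), np.full((2, 2), 0.5)]  # second state is |+><+|

chi = holevo_chi(probs, states)
label_entropy = float(-sum(p * np.log2(p) for p in probs))  # H(X)
log_dim = float(np.log2(states[0].shape[0]))                # log d

print(f"chi = {chi:.4f}, H(X) = {label_entropy:.4f}, log d = {log_dim:.4f}")
```

Since \(I_{\mathrm{acc}}\le\chi\), the printed value of `chi` upper-bounds the information any measurement can extract from this ensemble, and it is itself bounded by both `label_entropy` and `log_dim`.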
What the theorem means
The phrase “classical information stored in a quantum ensemble” is potentially misleading. A quantum state may require many real parameters to describe, but a receiver does not directly read those parameters. The receiver chooses a measurement, obtains a classical outcome, and then tries to infer the label \(x\). Accessible information is therefore not the amount of information in the density matrix as a mathematical object. It is the largest classical mutual information that can actually be extracted by a physical measurement.
The ensemble may contain classical-quantum correlation before measurement. But measurement is a bottleneck. The measurement converts a quantum system \(B\) into a classical register \(Y\). Once this conversion is made, only the classical mutual information \(I(X;Y)\) remains accessible to the receiver.
The Holevo quantity \(\chi(\mathcal E)\) is the mutual information between the classical label and the quantum system before measurement. The accessible information is the largest mutual information after measurement. The bound says:
\[ \text{measurement cannot extract more classical correlation than was present quantumly.} \]
This is a data-processing statement.
Classical-quantum state formulation
Represent the ensemble by the classical-quantum state
\[ \omega_{XB} = \sum_xp_x|x\rangle\langle x|_X\otimes\rho_x. \]
The label register \(X\) is classical, and the system \(B\) is quantum. The marginal state on \(B\) is
\[ \omega_B=\bar\rho. \]
The marginal state on \(X\) is
\[ \omega_X=\sum_xp_x|x\rangle\langle x|. \]
Because \(\omega_{XB}\) is block diagonal in \(X\), its entropy is
\[ S(\omega_{XB}) = H(X)+\sum_xp_xS(\rho_x). \]
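The block-diagonal step can be spelled out as a short derivation. Writing \(\lambda_{x,i}\) for the eigenvalues of \(\rho_x\) (notation introduced here only for illustration), the eigenvalues of \(\omega_{XB}\) are the products \(p_x\lambda_{x,i}\), and \(\sum_i\lambda_{x,i}=1\) for every \(x\), so
\[ \begin{aligned} S(\omega_{XB}) &= -\sum_{x,i}p_x\lambda_{x,i}\log(p_x\lambda_{x,i})\\ &= -\sum_xp_x\log p_x\sum_i\lambda_{x,i} - \sum_xp_x\sum_i\lambda_{x,i}\log\lambda_{x,i}\\ &= H(X)+\sum_xp_xS(\rho_x). \end{aligned} \]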
The quantum mutual information is
\[ I(X;B)_\omega = S(\omega_X)+S(\omega_B)-S(\omega_{XB}). \]
Substituting the expressions above gives
\[ \begin{aligned} I(X;B)_\omega &= H(X)+S(\bar\rho) - \left(H(X)+\sum_xp_xS(\rho_x)\right)\\ &= S(\bar\rho)-\sum_xp_xS(\rho_x)\\ &=\chi(\mathcal E). \end{aligned} \]
Thus the Holevo quantity is exactly
\[ \chi(\mathcal E)=I(X;B)_\omega. \]
This is the clean conceptual identity behind the accessible-information bound.
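The identity \(\chi(\mathcal E)=I(X;B)_\omega\) can be verified numerically by building the classical-quantum state explicitly. This is a sketch under stated assumptions: the ensemble is a hypothetical example, the helper `entropy_bits` is an illustrative name, and logarithms are base 2.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits (0 log 0 = 0)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

# Hypothetical qubit ensemble.
probs = [0.3, 0.7]
states = [np.diag([0.9, 0.1]), np.full((2, 2), 0.5)]
n = len(probs)

# omega_XB = sum_x p_x |x><x| (tensor) rho_x, block diagonal in the label register.
omega_XB = sum(p * np.kron(np.diag(np.eye(n)[x]), rho)
               for x, (p, rho) in enumerate(zip(probs, states)))
omega_X = np.diag(probs)                                  # classical marginal
omega_B = sum(p * rho for p, rho in zip(probs, states))   # average state

# Quantum mutual information S(X) + S(B) - S(XB).
mutual_info = entropy_bits(omega_X) + entropy_bits(omega_B) - entropy_bits(omega_XB)

# Holevo quantity computed directly from its definition.
chi = entropy_bits(omega_B) - sum(p * entropy_bits(r) for p, r in zip(probs, states))

print(f"I(X;B) = {mutual_info:.6f}, chi = {chi:.6f}")
```

The two printed numbers agree up to floating-point error, which is the numerical counterpart of the substitution carried out above.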
Proof of the Holevo accessible-information bound
Fix a POVM
\[ \mathsf M=\{M_y\}_y. \]
This measurement defines a quantum-to-classical channel
\[ \mathcal M(\tau) = \sum_y\operatorname{Tr}(M_y\tau)|y\rangle\langle y|. \]
Applying this channel to system \(B\), and the identity channel \(\mathrm{id}_X\) to the label register, gives the classical-classical state
\[ \omega_{XY} = (\mathrm{id}_X\otimes\mathcal M)(\omega_{XB}). \]
Explicitly,
\[ \omega_{XY} = \sum_{x,y}p_x\operatorname{Tr}(M_y\rho_x) |x\rangle\langle x|\otimes |y\rangle\langle y|. \]
This is precisely the classical joint distribution produced by measuring the ensemble. Its mutual information is
\[ I(X;Y)_{\omega}=I_{\mathsf M}(X;Y). \]
Now use data processing for mutual information. Since \(Y\) is obtained from \(B\) by a local channel,
\[ I(X;Y)_\omega \le I(X;B)_\omega. \]
But
\[ I(X;B)_\omega=\chi(\mathcal E). \]
Therefore, for every POVM,
\[ I_{\mathsf M}(X;Y) \le \chi(\mathcal E). \]
Taking the supremum over all POVMs gives
\[ I_{\mathrm{acc}}(\mathcal E) \le \chi(\mathcal E). \]
This proves the accessible-information bound.
The same proof can also be written using quantum relative entropy. Since
\[ I(A;B)_\rho = D(\rho_{AB}\|\rho_A\otimes\rho_B), \]
and relative entropy is monotone under quantum channels, applying the measurement channel \(B\to Y\) gives
\[ D(\omega_{XY}\|\omega_X\otimes\omega_Y) \le D(\omega_{XB}\|\omega_X\otimes\omega_B). \]
The left-hand side is \(I(X;Y)\), and the right-hand side is \(\chi(\mathcal E)\).
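The inequality proved above can be probed numerically by sampling POVMs and comparing each mutual information with \(\chi\). The sketch below is only a sanity check, not a proof or an optimizer: the sampling scheme in `random_povm` (normalizing random positive semidefinite matrices) is one arbitrary choice, and all function names are hypothetical.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits (0 log 0 = 0)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def holevo_chi(probs, states):
    avg = sum(p * rho for p, rho in zip(probs, states))
    return entropy_bits(avg) - sum(p * entropy_bits(r) for p, r in zip(probs, states))

def mutual_information(probs, states, povm):
    """I(X;Y) in bits for p(x,y) = p_x Tr(M_y rho_x)."""
    joint = np.array([[p * np.trace(M @ rho).real for M in povm]
                      for p, rho in zip(probs, states)])
    product = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    mask = joint > 1e-15
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))

def random_povm(dim, outcomes, rng):
    """Sample a POVM as M_y = T^{-1/2} A_y T^{-1/2} with A_y >= 0 and T = sum_y A_y."""
    mats = []
    for _ in range(outcomes):
        g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
        mats.append(g @ g.conj().T)            # random positive semidefinite matrix
    evals, evecs = np.linalg.eigh(sum(mats))
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T
    return [inv_sqrt @ m @ inv_sqrt for m in mats]

rng = np.random.default_rng(0)
probs = [0.5, 0.5]
states = [np.diag([1.0, 0.0]), np.full((2, 2), 0.5)]   # |0><0| and |+><+|

chi = holevo_chi(probs, states)
best = max(mutual_information(probs, states, random_povm(2, 4, rng))
           for _ in range(200))
print(f"best sampled I(X;Y) = {best:.4f} <= chi = {chi:.4f}")
```

Every sampled measurement stays below `chi`, as data processing requires. Random sampling gives only a lower estimate of the accessible information, so `best` should not be read as \(I_{\mathrm{acc}}\).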
Why measurement structure matters
The accessible information is not simply a function of the list of labels. It depends on what measurements can extract from the corresponding states. The same label distribution can be encoded in perfectly distinguishable states, partially distinguishable states, or identical states.
If the states are orthogonal, there is a measurement that reveals the label perfectly. If the states are identical, no measurement reveals anything. If the states are distinct but nonorthogonal, no measurement identifies the label with certainty, so every POVM trades off errors between the candidate labels. This compromise is the measurement-structure bottleneck.
For a fixed POVM \(\mathsf M\), the ensemble becomes an ordinary classical channel
\[ x\mapsto y \]
with transition probabilities
\[ p(y|x)=\operatorname{Tr}(M_y\rho_x). \]
The information extracted by that POVM is only
\[ I_{\mathsf M}(X;Y). \]
The accessible information optimizes this quantity over measurements, but even the best measurement may not turn all quantum correlation into classical correlation. The strict inequality
\[ I_{\mathrm{acc}}(\mathcal E) < \chi(\mathcal E) \]
can occur, as the nonorthogonal examples below show.
This is why the Holevo quantity should not automatically be called “the information that Bob can read out from one copy.” It is the pre-measurement classical-quantum mutual information. The accessible information is the post-measurement classical-classical mutual information.
Example: orthogonal pure-state ensemble
Suppose
\[ \rho_x=|x\rangle\langle x| \]
for an orthonormal family \(\{|x\rangle\}\). Then Bob can measure in this basis and learn \(x\) exactly. Therefore
\[ I_{\mathrm{acc}}(\mathcal E)=H(X). \]
The average state is
\[ \bar\rho=\sum_xp_x|x\rangle\langle x|, \]
so
\[ S(\bar\rho)=H(X). \]
Each signal state is pure, so
\[ S(\rho_x)=0. \]
Hence
\[ \chi(\mathcal E)=H(X). \]
Thus
\[ I_{\mathrm{acc}}(\mathcal E)=\chi(\mathcal E)=H(X). \]
This is the case where the ensemble is effectively classical. The measurement structure has no obstruction because the states are perfectly distinguishable.
Example: identical states
Suppose
\[ \rho_x=\rho \]
for every \(x\). Then
\[ p(y|x)=\operatorname{Tr}(M_y\rho) \]
is independent of \(x\). Therefore \(X\) and \(Y\) are independent for every measurement, so
\[ I_{\mathrm{acc}}(\mathcal E)=0. \]
The Holevo quantity also vanishes:
\[ \bar\rho=\rho, \]
so
\[ \chi(\mathcal E) = S(\rho)-\sum_xp_xS(\rho) = 0. \]
This is the opposite extreme. There may be many labels, but the quantum system contains no measurement-accessible information about which label was chosen.
Example: commuting mixed states
Suppose all \(\rho_x\) commute. Then there is a common eigenbasis \(\{|z\rangle\}\) such that
\[ \rho_x=\sum_z p(z|x)|z\rangle\langle z|. \]
Measuring in the common eigenbasis produces the classical channel
\[ p(z|x). \]
The average state is
\[ \bar\rho=\sum_zp(z)|z\rangle\langle z|, \qquad p(z)=\sum_xp_xp(z|x). \]
Then
\[ S(\bar\rho)=H(Z) \]
and
\[ \sum_xp_xS(\rho_x)=\sum_xp_xH(Z|X=x)=H(Z|X). \]
Therefore
\[ \chi(\mathcal E)=H(Z)-H(Z|X)=I(X;Z). \]
The common-eigenbasis measurement achieves this value, so
\[ I_{\mathrm{acc}}(\mathcal E)=\chi(\mathcal E). \]
This example shows that the Holevo bound becomes tight when the ensemble is classical in a single basis.
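A small numerical instance makes the commuting case concrete. In this sketch the label distribution `p_x` and the conditional distributions `p_z_given_x` are hypothetical numbers chosen for illustration, and `shannon_bits` is an illustrative helper name.

```python
import numpy as np

def shannon_bits(dist):
    """Shannon entropy in bits of a probability vector (0 log 0 = 0)."""
    dist = np.asarray(dist, dtype=float)
    dist = dist[dist > 1e-15]
    return float(-np.sum(dist * np.log2(dist)))

# Hypothetical label distribution and conditional distributions p(z|x).
p_x = np.array([0.5, 0.5])
p_z_given_x = np.array([[0.8, 0.2, 0.0],
                        [0.1, 0.3, 0.6]])

# Commuting states are diagonal in the common eigenbasis.
states = [np.diag(row) for row in p_z_given_x]
avg_state = sum(p * rho for p, rho in zip(p_x, states))

# Holevo quantity from the eigenvalues of the density matrices.
chi = shannon_bits(np.linalg.eigvalsh(avg_state)) - sum(
    p * shannon_bits(np.linalg.eigvalsh(rho)) for p, rho in zip(p_x, states))

# Classical mutual information I(X;Z) of the common-eigenbasis measurement.
p_z = p_x @ p_z_given_x
mutual_info = shannon_bits(p_z) - sum(
    p * shannon_bits(row) for p, row in zip(p_x, p_z_given_x))

print(f"chi = {chi:.6f}, I(X;Z) = {mutual_info:.6f}")
```

The two values coincide, so for this ensemble the eigenbasis measurement already saturates the Holevo bound.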
Example: two nonorthogonal pure states
Let Alice choose
\[ |0\rangle \qquad\text{or}\qquad |+\rangle=\frac{|0\rangle+|1\rangle}{\sqrt2} \]
with equal probability. The label \(X\) contains one classical bit before encoding. Since the states are not orthogonal, Bob cannot read that bit perfectly from one copy.
The average state is
\[ \bar\rho = \frac12|0\rangle\langle0| + \frac12|+\rangle\langle+|. \]
Both signal states are pure, so
\[ \chi=S(\bar\rho). \]
For two equally likely pure states with overlap
\[ c=|\langle\psi|\phi\rangle|, \]
the eigenvalues of the average state are
\[ \frac{1+c}{2}, \qquad \frac{1-c}{2}. \]
Here
\[ c=\frac1{\sqrt2}. \]
Therefore, writing \(h_2(q)=-q\log_2q-(1-q)\log_2(1-q)\) for the binary entropy function,
\[ \chi = h_2\!\left(\frac{1+1/\sqrt2}{2}\right) \approx0.6009 \text{ bits}. \]
The accessible information is smaller than one bit because no measurement can perfectly distinguish \(|0\rangle\) from \(|+\rangle\). For two equally likely pure states, the measurement that maximizes the mutual information is the Helstrom (minimum-error) measurement, and the resulting classical channel is a binary symmetric channel with error probability
\[ P_e = \frac12\left(1-\sqrt{1-c^2}\right). \]
With \(c=1/\sqrt2\),
\[ P_e=\frac12\left(1-\frac1{\sqrt2}\right)\approx0.1464. \]
The mutual information obtained is
\[ I_{\mathrm{acc}} = 1-h_2(P_e) \approx0.3991 \text{ bits}. \]
Thus
\[ I_{\mathrm{acc}}<\chi<1. \]
This example is useful because it separates three quantities: the original label has one bit, the ensemble has Holevo information about \(0.6009\) bits, and the best one-copy measurement extracts about \(0.3991\) bits.
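The three numbers in this example can be reproduced with a short NumPy sketch. It builds the Helstrom measurement from the eigenvectors of \(\tfrac12\rho_0-\tfrac12\rho_1\) and computes the mutual information directly from the joint distribution rather than from the closed-form expression. Variable names are illustrative.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
rho0, rho1 = np.outer(ket0, ket0), np.outer(ketp, ketp)

# Holevo quantity: both states are pure, so chi = S(average state).
evals = np.linalg.eigvalsh(0.5 * rho0 + 0.5 * rho1)
chi = float(-np.sum(evals * np.log2(evals)))

# Helstrom measurement: project onto the positive eigenvector of
# gamma = p_0 rho_0 - p_1 rho_1 (outcome 0 means "guess rho_0").
gamma = 0.5 * rho0 - 0.5 * rho1
w, v = np.linalg.eigh(gamma)
vec = v[:, np.argmax(w)]
proj_plus = np.outer(vec, vec)
povm = [proj_plus, np.eye(2) - proj_plus]

# Joint distribution p(x,y) = p_x Tr(M_y rho_x) with equal priors.
joint = np.array([[0.5 * np.trace(M @ rho) for M in povm] for rho in (rho0, rho1)])
p_err = float(joint[0, 1] + joint[1, 0])

# Mutual information in bits, computed directly from the joint distribution.
product = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
info = float(np.sum(joint * np.log2(joint / product)))

print(f"chi = {chi:.4f}, P_e = {p_err:.4f}, I = {info:.4f}")
```

The output matches the values quoted above: roughly 0.6009 bits for `chi`, 0.1464 for `p_err`, and 0.3991 bits for `info`.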
Example: trine states and strict Holevo gap
Consider the three equally likely qubit trine states
\[ |\psi_0\rangle, \quad |\psi_1\rangle, \quad |\psi_2\rangle, \]
whose Bloch vectors lie in one plane separated by angles of \(120^\circ\). Their average state is
\[ \bar\rho=\frac I2. \]
Since the signal states are pure,
\[ \chi=S(I/2)=1 \]
bit.
However, one copy of the trine ensemble does not allow Bob to extract one full bit of classical mutual information. The optimal accessible information is
\[ I_{\mathrm{acc}}=\log_2 3-1\approx0.585 \]
bits, achieved by a suitable three-outcome measurement. This is a standard example showing that the Holevo bound can be strict for nonorthogonal ensembles.
The trine example gives a clear geometric lesson. The ensemble average is maximally mixed, so the Holevo quantity is one bit. But the individual states overlap. The measurement cannot perfectly sort the three possible labels, and the accessible information is strictly smaller.
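The trine numbers can also be checked numerically. The sketch below assumes one particular parametrization of the trine (Bloch vectors in the x-z plane) and uses the three-outcome POVM whose elements are \(\tfrac23\) times the projectors onto the states orthogonal to the trine states, so that outcome \(y\) rules out label \(y\). This is one measurement that attains the value stated above; the code does not prove optimality.

```python
import numpy as np

# Trine states with Bloch angles 0, 120, 240 degrees in the x-z plane.
angles = [2 * np.pi * k / 3 for k in range(3)]
kets = [np.array([np.cos(t / 2), np.sin(t / 2)]) for t in angles]
perp = [np.array([-np.sin(t / 2), np.cos(t / 2)]) for t in angles]  # orthogonal partners

states = [np.outer(k, k) for k in kets]
avg = sum(states) / 3

# POVM elements (2/3)|psi_y^perp><psi_y^perp|; they sum to the identity.
povm = [(2 / 3) * np.outer(k, k) for k in perp]

# Holevo quantity: pure states, so chi = S(average state).
evals = np.linalg.eigvalsh(avg)
chi = float(-np.sum(evals * np.log2(evals)))

# Joint distribution p(x,y) = (1/3) Tr(M_y rho_x) and its mutual information.
joint = np.array([[np.trace(M @ rho) / 3 for M in povm] for rho in states])
product = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
mask = joint > 1e-15                                  # diagonal entries vanish
info = float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))

print(f"chi = {chi:.4f} bits, I(X;Y) = {info:.4f} bits")
```

Each outcome excludes one label and leaves the other two equally likely, which gives \(I(X;Y)=\log_2 3-1\approx0.585\) bits against \(\chi=1\) bit.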
Dimension bound: why \(n\) qubits alone cannot reveal more than \(n\) bits
If all states \(\rho_x\) live in a \(d\)-dimensional Hilbert space, then
\[ \chi(\mathcal E) = S(\bar\rho)-\sum_xp_xS(\rho_x) \le S(\bar\rho) \le \log d. \]
Thus
\[ I_{\mathrm{acc}}(\mathcal E) \le \log d. \]
For \(n\) qubits,
\[ d=2^n, \]
so
\[ I_{\mathrm{acc}}(\mathcal E) \le n. \]
This is the precise theorem behind the informal statement that \(n\) qubits alone cannot communicate more than \(n\) classical bits. The word “alone” matters. If sender and receiver share entanglement, as in superdense coding, the relevant physical resource is not merely the transmitted qubits. Prior entanglement changes the communication scenario.
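A quick numerical illustration of the dimension bound: encode eight equally likely labels, which carry three bits of label entropy, into a single qubit. The specific states (eight pure states equally spaced on a great circle of the Bloch sphere) are a hypothetical choice made for this sketch.

```python
import numpy as np

n_labels = 8
angles = 2 * np.pi * np.arange(n_labels) / n_labels
kets = [np.array([np.cos(t / 2), np.sin(t / 2)]) for t in angles]

# Average state of the uniform ensemble of pure states.
avg = sum(np.outer(k, k) for k in kets) / n_labels

# Pure states have zero entropy, so chi = S(average state).
evals = np.linalg.eigvalsh(avg)
evals = evals[evals > 1e-12]
chi = float(-np.sum(evals * np.log2(evals)))

label_entropy = float(np.log2(n_labels))   # H(X) = 3 bits for uniform labels
log_dim = float(np.log2(2))                # one qubit, so log d = 1 bit

print(f"H(X) = {label_entropy:.1f}, chi = {chi:.4f}, log d = {log_dim:.1f}")
```

Although the label has three bits of entropy, `chi` equals one bit here, so no measurement on the single qubit can extract more than one bit about the label.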
Lower bounds and why Holevo is not the whole story
The Holevo quantity is an upper bound, not a formula for accessible information. Computing \(I_{\mathrm{acc}}\) exactly is generally hard because it requires optimization over POVMs and a nonlinear mutual-information objective.
There are also lower bounds. Jozsa, Robb, and Wootters introduced the subentropy \(Q\) and proved that, for an ensemble of pure states with average state \(\bar\rho\), the accessible information is at least \(Q(\bar\rho)\). Their result is dual to the Holevo upper bound: for pure-state ensembles, \(Q(\bar\rho)\le I_{\mathrm{acc}}(\mathcal E)\le S(\bar\rho)\). Accessible information therefore sits between measurement-achievable lower bounds and Holevo-type upper bounds.
The main message is that \(\chi\) is not automatically equal to what a one-shot measurement extracts. It is an upper bound on what can be extracted, and the most important one.
Relation to channel coding
In classical-quantum channel coding, an input letter \(x\) produces a quantum state \(\rho_x\). A codeword produces a longer product state, and the receiver may perform a collective measurement on the whole block. The Holevo-Schumacher-Westmoreland theorem says that the classical capacity of a classical-quantum channel \(x\mapsto\rho_x\) equals the Holevo quantity maximized over input distributions, \(\max_{p}\chi(\{p_x,\rho_x\})\).
This does not contradict the possible strict inequality
\[ I_{\mathrm{acc}}(\mathcal E)<\chi(\mathcal E) \]
for a single-copy ensemble. Block coding changes the problem. The receiver no longer measures each signal independently. Instead, the receiver performs a collective measurement over many channel outputs. Collective measurements can extract correlations that are not accessible through naive single-copy measurements.
Thus there are two levels:
\[ \text{single ensemble: } I_{\mathrm{acc}}(\mathcal E)\le\chi(\mathcal E), \]
and
\[ \text{asymptotic coding: optimized Holevo information governs achievable communication rates.} \]
The accessible-information bound is the one-shot measurement bottleneck; channel coding studies how this bottleneck behaves when many structured uses are combined.
Relation to data locking
Accessible information can be surprisingly sensitive to available side information. In quantum data locking, a small classical key can dramatically change how much classical information a measurement can extract from a quantum system. The Holevo quantity still bounds accessible information in every fixed scenario, but the gap between encoded information and accessible information can be large when the measurement basis is effectively hidden.
This phenomenon reinforces the main lesson: quantum information may contain correlations that are not directly accessible as classical mutual information without the right measurement structure or side information.
How to use the theorem
Given an ensemble \(\mathcal E=\{p_x,\rho_x\}\), first compute the average state
\[ \bar\rho=\sum_xp_x\rho_x. \]
Then compute
\[ \chi(\mathcal E)=S(\bar\rho)-\sum_xp_xS(\rho_x). \]
Immediately, for every POVM,
\[ I(X;Y)\le\chi(\mathcal E). \]
Therefore
\[ I_{\mathrm{acc}}(\mathcal E)\le\chi(\mathcal E). \]
If the states commute, measure in the common eigenbasis; the bound is achievable. If the states are orthogonal pure states, the bound is also achievable and equals \(H(X)\). If the states are nonorthogonal, the bound may be strict, so do not assume that \(\chi\) is itself the accessible information.
For proof work, the theorem is often used like this: instead of analyzing every possible POVM, one upper-bounds all measurements at once by the entropy expression \(\chi\). That is why the Holevo bound is so powerful.
Common mistakes
A common mistake is to say that the Holevo quantity is always the amount of classical information Bob can extract. The correct statement is
\[ I_{\mathrm{acc}}\le\chi. \]
Equality holds in important cases, such as commuting ensembles, but not in general.
A second mistake is to ignore the measurement. Classical information is not obtained from a quantum state until a measurement is performed. Different POVMs produce different classical channels \(x\mapsto y\), and therefore different mutual informations.
A third mistake is to confuse label entropy with accessible information. The label may have entropy \(H(X)\), but the quantum encoding may not allow all of that entropy to be recovered.
A fourth mistake is to forget the dimension bound. A finite-dimensional quantum system cannot reveal arbitrarily many classical bits in one use, even if its state vector is described by continuous parameters.
A fifth mistake is to overlook collective measurements. Single-copy accessible information can be smaller than Holevo information, while block coding with collective measurements can approach Holevo-governed rates in communication problems.
Final mental image
The accessible information of an ensemble is the amount of classical correlation that survives the best possible quantum measurement:
\[ I_{\mathrm{acc}}(\mathcal E) = \sup_{\mathsf M}I(X;Y). \]
The Holevo quantity is the classical-quantum mutual information before measurement:
\[ \chi(\mathcal E)=I(X;B). \]
A measurement is a channel from \(B\) to a classical register \(Y\), and data processing gives
\[ I(X;Y)\le I(X;B). \]
Therefore
\[ I_{\mathrm{acc}}(\mathcal E) \le \chi(\mathcal E). \]
In one sentence:
\[ \text{accessible information is what measurements can extract; Holevo information is what bounds extraction.} \]
This is why the accessible-information bound is central in quantum information theory. It separates three ideas that must not be confused: the classical label originally chosen, the quantum correlation carried by the ensemble, and the classical information actually obtainable after measurement.
References
Holevo, Alexander S. “Bounds for the Quantity of Information Transmitted by a Quantum Communication Channel.” Problems of Information Transmission 9, no. 3 (1973): 177–183.
Davies, E. B. “Information and Quantum Measurement.” IEEE Transactions on Information Theory 24, no. 5 (1978): 596–599.
Jozsa, Richard, Daniel Robb, and William K. Wootters. “Lower Bound for Accessible Information in Quantum Mechanics.” Physical Review A 49, no. 2 (1994): 668–677.
Fuchs, Christopher A., and Carlton M. Caves. “Ensemble-Dependent Bounds for Accessible Information in Quantum Mechanics.” Physical Review Letters 73, no. 23 (1994): 3047–3050.
Hausladen, Paul, Richard Jozsa, Benjamin Schumacher, Michael Westmoreland, and William K. Wootters. “Classical Information Capacity of a Quantum Channel.” Physical Review A 54, no. 3 (1996): 1869–1876.
Holevo, Alexander S. Probabilistic and Statistical Aspects of Quantum Theory. North-Holland, 1982; Springer reprint, 2011.
Nielsen, Michael A., and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 10th anniversary edition, 2010.
Watrous, John. The Theory of Quantum Information. Cambridge University Press, 2018.
Wilde, Mark M. Quantum Information Theory. Cambridge University Press, 2nd edition, 2017.