Home Research Blog CV Tutorials Certificates

Chapter 1: Basics of Quantum Computing

This is the first part of a basic introduction to Quantum Machine Learning, based on the IBM Qiskit 2021 Summer School course. An accompanying Jupyter notebook with the Qiskit code used to generate the images in this chapter and instructions on how to install Qiskit is available here.

Note, that this chapter and the accompanying Jupyter notebook are currently a work in progress.

The source YouTube playlist of the course is freely available and worth checking out if you prefer learning in a lecture format.

Another recommended resource which I've consulted when writing this, is the seminal textbook 'Quantum Computation and Quantum Information' (10th Anniversary Edition) by Michael Nielsen and Isaac Chuang.

This chapter introduces basic concepts of quantum computing: quantum bits or 'qubits', the fundamental postulates of quantum mechanics, and further representations of quantum states such as density operators and the Bloch sphere.

Quantum Bits

The fundamental promise of quantum computing is leveraging quantum effects to improve classical computation, and this can be reflected by generalising the fundamental concept of a 'bit' into a quantum bit, or a 'qubit'.

While bits are units of information that are in one of two levels, '0' or '1', quantum bits can be in a quantum superposition of both of these states.

As explained below, such a superposition means that simplistically $n$ qubits can be in a superposition of $2^{n}$ bit string, granting hopes to use such an exponential state space to speed up computation.

To represent such a quantum superposition, we need specialised mathematical notation.

One way of doing this are statevectors $\ket{\psi} = \alpha\ket{0} + \beta\ket{1}$ in the Dirac notation, where $|\alpha|^2+|\beta|^2=1$ and the '0' state is represented by $\ket{0}=\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and the '1' state is represented by $\ket{1}=\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

The matrix equivalent to this is $\ket{\psi} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$

Measurement wise, this means that the probability of measuring the qubit $\ket{\psi}$ in the '0' or $\ket{0}=\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ state is $\alpha^2$, and the probability of measuring qubit $\ket{\psi}$ in '0' or $\ket{1}=\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ is conversely $\beta^2$.

Expanding on the Dirac, or 'bra-ket', notation what were shown above were 'ket' vectors in the form $\ket{\psi}$.

Accompanying 'ket' vectors are 'bra' vectors in the form $\bra{\psi}$, which are related to 'ket' vectors as $\bra{\psi}=\ket{\psi}^{\dagger}$

This $\dagger$ operator is matrix-wise a combination of a transpose (flipping matrix values over its diagonal) and complex conjugate (flipping the sign of the imaginary component $i=\sqrt{-1}$).

To clarify this, let's break this down with an example of converting a column 'ket' vector into a row 'bra' vector and then how this works with a 2D matrix:

Starting with the 'ket' vector superposition $\ket{\psi_1}=\sqrt{2}\begin{bmatrix} 1 \\ i \end{bmatrix}$: \begin{align} \ket{\psi_1}&=\sqrt{2}\begin{bmatrix} 1 \\ i \end{bmatrix} \\ \ket{\psi_1}^{*}=\sqrt{2}\begin{bmatrix} 1^{*} \\ i^{*} \end{bmatrix}&=\sqrt{2}\begin{bmatrix} 1 \\ -i \end{bmatrix} \\ \bra{\psi_1} = (\ket{\psi_1}^{*})^{T} &= \sqrt{2}\begin{bmatrix} 1 & -i \end{bmatrix} \end{align} The complex conjugate flips the sign of the imaginary component, but not the real component, leading to $\ket{\psi_1}^{*}=\sqrt{2}\begin{bmatrix} 1^{*} \\ i^{*} \end{bmatrix}=\begin{bmatrix} 1 \\ -i \end{bmatrix}$. Then, the transpose flips the coordinates from a column vector to a row vector as $\bra{\psi_1} = (\ket{\psi_1}^{*})^{T} = \sqrt{2}\begin{bmatrix} 1 & -i \end{bmatrix}.$

For a 2D matrix $A = \begin{bmatrix} 1 & i \\ -1 & -i \end{bmatrix}$ this is instead \begin{align} A &= \begin{bmatrix} 1 & i \\ -1 & -i \end{bmatrix} \\ A^{*} &= \begin{bmatrix} 1 & -i \\ -1 & i \end{bmatrix} \\ A^{\dagger} = (A^{*})^{T} &= \begin{bmatrix} 1 & -1 \\ -i & i \end{bmatrix} \end{align} In this 2D matrix example, the transpose can be seen as a 'flip' around the diagonal. More specifically, the coordinates of the matrices flip so that $A^{*}_{12}=-i$ and $A^{*}_{21}=1$ swap values. As diagonal values all have the same row and column positions, they can only swap with themselves and so stay the same.

This complex conjugate transpose is also referred to as the Hermitian transpose.

Of note are 'Hermitian' matrices, which stay invariant to such complex conjugation, so $B^{\dagger}=B$.

A special property of Dirac notation are inner products between row 'bra' and column 'ket' vectors as shown here: \begin{align} \braket{\psi|\phi} &=\begin{bmatrix}\alpha^{*} & \beta^{*} \end{bmatrix} \begin{bmatrix} \gamma \\ \delta \end{bmatrix}\\ &= \alpha^{*}\gamma + \beta^{*}\delta \end{align}

Another way of performing the above formulation is to stick to using Dirac notation as shown below: \begin{align} \braket{\psi|\psi}&= (\alpha^{*}\bra{0} + \beta^{*}\bra{1})(\gamma \ket{0} + \delta \ket{1})\\ &= \alpha^{*}\gamma \braket{0|0} + \alpha^{*}\delta\braket{0|1} + \beta^{*}\gamma\braket{1|0} + \beta^{*}\delta\braket{1|1} \\ &=\alpha^{*}\gamma \cdot 1 + \alpha^{*}\delta\cdot 0 + \beta^{*}\gamma\cdot 0 + \beta^{*}\delta\cdot 1\\ &=\alpha^{*}\gamma + \beta^{*}\delta \end{align}

Note that same result as the matrix formulation is enforeced by $\braket{0|1}=\braket{1|0}=0$ and $\braket{0|0}=\braket{1|1}=1$. Such a property of $\ket{0}$ and $\ket{1}$ is referred to as orthonormality, that is such unit statevectors are normal or 'perpendicular' to each other in mathematical space and so their inner product is zero, but if they perform an inner product with themselves the result is 'unity' or 1.

The fact that inner products can occur between state vectors means that they are situated in a mathematical space allowing such, which is referred to as an inner product space. Similarly, that the lengths of such state vectors can be complex valued, a combination of real and imaginary (multiple of $i=\sqrt{-1}$), making this space furthermore a complex inner product space. Such a space is colloquially referred to as 'Hilbert' space.

Fundamental Postulates of Quantum Mechanics

Before diving further into the maths of quantum computing, it's worth briefly covering the postulates, or fundamentals prerequisites, of the theory of quantum mechanics itself, which enables quantum computing as a discipline. After learning of basic units of quantum computation, quantum bits, we will see how exactly these correspond to quantum computing:

  1. Isolated physical quantum systems can be described by a mathematical state space, and the specific state of the system is described by a state vector. The state vector is often symbolised as $\psi$, and is also called the wave function.
  2. A closed physical system evolves over time, with an evolution from $\psi_1$ at time $t=t_{1}$ to $\psi_2$ at time $t=t_{2}$, with the change determined only by the time difference $t_{2}-t_{1}$.
  3. Measurement of such a quantum system 'collapses' the system into a classical state with classical information which is the result of the measurement.
  4. A quantum system can be a composite of multiple smaller quantum systems. Both the composite and individual systems can be described by wave functions due to the first postulate

Now, let's look at how these postulates relate to qubits for quantum computing.

State space

Postulate 1: Isolated physical systems can be described by a state space, which is a mathematical complex vector (that is, vector lengths have both real and imaginary components) inner product space (an inner product is possible). The specific state of the system is described by a state vector, a unit vector within state space.

In the case of qubits, we can take the wave function $\ket{\psi}=\alpha\ket{0}+\beta\ket{1}$.

The condition of being a unit vector, that the 'length' of the state vector in state space is given by the inner product with itself, means that \begin{align} \braket{\psi|\psi}&= 1 \\ \alpha^{*}\alpha \braket{0|0} + \alpha^{*}\beta\braket{0|1} + \beta^{*}\alpha\braket{1|0} + \beta^{*}\beta\braket{1|1} &= 1\\ |\alpha|^{2}\cdot 1 + \alpha^{*}\beta \cdot 0 + \beta^{*}\alpha \cdot 0 + |\beta|^{2} \cdot 1 &= 1\\ \alpha^{2} + \beta^{2} = 1 \end{align}

Note that $\braket{0|1}=\braket{1|0}=0$ and $\braket{0|0}=\braket{1|1}=1$ because the computational basis state ($\ket{0}$ and $\ket{1}$) is orthonormal.

The computational basis states quite clearly correspond to the classical bits, $\ket{0}$ as 0 and $\ket{1}$ as 1. The difference from classical bits is that a superposition of states can exist, such as $\ket{\psi}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$.

Evolution

Postulate 2: The evolution of closed physical systems over time can be described by a unitary transformation $U$ which depends on the difference in time from the initial state at time $t_1$ to the evolved state $t_2$.

Practically in the case of single qubits, evolution can be described as a single unitary qubit gate U, where evolved state $\ket{\psi'}$ is $$\ket{\psi'}=U\ket{\psi}$$

By the definition of unitary matrices $U^{\dagger}=U^{-1}$, hence the Hermitian conjugate of $U$ acts as its inverse operation.

\begin{align} \ket{\psi'}&=U\ket{\psi}\\ \ket{\psi}&=U^{\dagger}\ket{\psi'}\\ \ket{\psi}&=U^{\dagger}U\ket{\psi} \end{align}

This invertibility alludes to the reversible nature of quantum mechanics and quantum computing, in contrast to the irreversible event of measurement as described by the next postulate.

To better understand the role of time, let's reformulate the 2nd postulate.

Postulate 2': The time evolution of a closed systems is described the Schrödinger equation $$i\hbar \frac{\mathrm{d}}{\mathrm{d}t}=H\ket{\psi}$$

Here, $\hbar$, Planck's constant, is a physical constant and $H$ is a fixed Hermitian operator referred to as the 'Hamiltonian' of the system.

The Hamiltonian can also be represented as a 'spectral decomposition', or a diagonal matrix with values composed of their eigenvalues. For a given matrix $A$ eigenvectors are vectors which when acted on return themselves multiplied by a scalar coefficient, an eigenvalue. In other words with the relation $A\ket{a}=a\ket{a}$, $\ket{a}$ is an eigenvector of $A$, and $a$ is a corresponding eigenvalue. For the Hamiltonian, such eigenvalues are the possible energies of the system described by the Hamiltonian: \begin{align} H &= \begin{bmatrix} E_0 & 0 & 0 & \dots \\ 0 & E_1 & 0 & \dots \\ 0 & 0 & E_2 & \dots \\ \vdots & \vdots & \vdots & \ddots \\ \end{bmatrix}\\ &= \sum_{i} E_{i} \ket{E_i} \bra{E_i} \end{align}

For the Hamiltonian possible energy states of the system are represented by the orthonormal eigenvectors $\ket{E}$ and their corresponding energies are represented by the eigenvalues $E$. The lowest energy state $\ket{E_0}$ is referred to as the ground state and has the ground state energy $E_0$.

From the Schrödinger equation, the unitary operation can be expressed as an exponential: $$\ket{\psi(t_2)}= e^{-i H \frac{t_{2}-t_{1}}{\hbar}} \ket{\psi(t_1)} = U(t_1, t_2)\ket{\psi(t_1)}$$

Where the unitary operator $U(t_1, t_2)= e^{-i H \frac{t_{2}-t_{1}}{\hbar}}$

Lastly, there's some nuance to the practical application of the 2nd postulate on state evolution, with approximations taken for the 'closed system'.

In reality an ideal closed system is infeasible since all systems within the universe interact, such as through gravity. Furthermore, the application of unitary operators can be seen as an external operation on a closed system, such as a gate applied to a single qubit, thus externally interrupting the evolution of the qubit over time.

Thankfully, a good enough approximation can be achieved for most systems for this postulate to apply. For example with an atom-laser system, where a laser is focussed on an atom, the behaviour of the atom's state can be approximated as the Hamiltonian of atom with additional time varying terms taking into account the laser intensity (which varies over time depending on whether it is running or not). This idea of an approximation of a closed system but with a time dependent Hamiltonian, with parameters under external experimental control, gives us a system which still evolves over time according to Schrödinger's equation.

Measurement

Postulate 3: Quantum measurements can be described by a collection of measurement operators {$M_{m}$}, where $m$ refers to measurement outcomes, which act on the state space of the system being measured.

When acting on state $\ket{\psi}$ the probability of measuring outcome $m$ is $$p(m) = \bra{\psi} M_{m}^{\dagger}M_{m}\ket{\psi}$$

The state after this measurement is $$\frac{M_{m}\ket{\psi}}{\bra{\psi}M_{m}^{\dagger}M_{m}\ket{\psi}}$$

Such measurement operators obey the following sum relation, called the completeness theorem $$\sum_{m} M^{\dagger}M = I$$

$I$ in the completeness theorem is a commonly used matrix called the 'identity' matrix, a square matrix with only '1' components along the diagonal and 0. In practice, while an identity matrix of dimension $d\times d$ is $I_{d}$, such a subscript is usually dropped as the dimensions tend to be obvious from the context. An example of the identity matrix for dimensions of $4\times4$ is shown here: $$I = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The identity matrix has the useful property of $I_{d} A=A$ for matrix $A$ with $d$ rows, for example $I \ket{\psi}= \ket{\psi}$.

The aforementioned property of the identity matrix means that the completeness relation represent a sum of probability of possible outcomes to 1 for any state $\ket{\psi}$: \begin{align} \sum_m p(m) &= \sum_m \bra{\psi} M^{\dagger} M \ket{\psi} \\ &= \bra{\psi} (\sum_m M^{\dagger}M ) \ket{\psi} \\ &= \bra{\psi} I \ket{\psi} \\ &= \braket{\psi|\psi} \\ & = 1 \end{align}

To get a better grasp for the 3rd postulate, we'll explore the simple case of measurement in the computational basis {$\ket{0}$, $\ket{1}$}.

There are two outcomes, $\ket{0}$ and $\ket{1}$, and corresponding measurement operators, $M_0=\ket{0}\bra{0} + \ket{1}\bra{1}$. The completeness theorem is obeyed by these measurement operators $M^{\dagger}_{0}+M^{\dagger}_{1}=M_{0}+M_{1}=1$, because the measurement operators are Hermitian ($M^{\dagger}_{0}=M_0$ and $M^{\dagger}_{1}=M_1$) and idempotent (when acting on themselves yield themselves $M_{0}M_{0}=M_{0}$ and $M_{1}M_{1}=M_{1}$).

We can rederive measurement probabilities of $\ket{\psi}$, $|\alpha|^{2}$ for $\ket{0}$ and $|\beta|^{2}$ for $\ket{1}$, we can derive the state immediately after measurement. For example for $\ket{0}$:

\begin{align} p(0) &= \bra{\psi} M^{\dagger}_{0} M_{0} \ket{\psi}\\ &= \bra{\psi} M_{0} M_{0} \ket{\psi} \\ &= \bra{\psi} M_{0} \ket{\psi} \\ &= (\alpha^{*} \bra{0} + \beta^{*} \bra{1}) (\ket{0}\bra{0}) (\alpha \ket{0} + \beta \ket{1}) \\ &= (\alpha^{*} \braket{0|0} + \beta^{*} \braket{1|0} )(\alpha \braket{0|0} + \beta \braket{0|1}) \\ &= (\alpha^{*} \cdot 1 + \beta^{*} \cdot 0)(\alpha \cdot 1 + \beta \cdot 0)\\ &= \alpha^{*}\alpha \\ &= |\alpha|^{2} \end{align}

A similar result can be derived for $p(1)=|\beta|^{2}$. In the above calculation, the properties of the measurement operators being Hermitian and idempotent where used again, as well as the orthonormal nature of the computational basis state ($\braket{0|0}\braket{1|1}=1$ and $\braket{0|1}=\braket{1|0}=0$).

The corresponding state after measurement, which after the measurement of $\ket{0}$ is $$\frac{M_0 \ket{0}}{\sqrt{\bra{\psi} M^{\dagger}_{0} M_{0} \ket{\psi}}}= \frac{\alpha \ket{0}}{|\alpha|}$$

Where $\frac{\alpha}{|\alpha|}$ can be dropped to get $\ket{0}$. Similarly, the same derivation can be done for the post-measurement state of $\ket{1}$ just being $\ket{1}$.

Projective Measurements

Now that the general description of the measurement postulate has been explored, it's worth looking at the commonly used special case, Projective Measurements.

Projective measurement operators, $P_m$, beyond the completeness theorem $\sum_{m} M^{\dagger}_{m}M_{m} = I$, derive from the general measurement postulate when two further restrictions are placed on the measurement operators so that $M_m$ is Hermitian ($M^{\dagger}_{m}=M_{m}$) and follows the relation $M_{m} M_{n}= \delta_{nm} M_{m}$ (where the Dirac delta $\delta_{nm}$=1 only if n is m, and is otherwise 0). This means that the probability of measuring state m is $$p(m) = \bra{\psi}P^{\dagger}_{m} P_{m} \ket{\psi} = \bra{\psi}P_{m} \ket{\psi}$$

Projective measurement operators $P_{m}$ hence function so that measurement probability of outcome $m$ is $\bra{\psi} P_{m} \ket{\psi}$ and the state post-measurement is $$\frac{P_m}{\sqrt{p(m)}}\ket{\psi}$$

These projective measurement operators form an observable by spectral decomposition with eigenvalues $m$

$$M=\sum_m m P_m$$

So $P_m$ projects states onto the eigenspace of $M$ with a resulting eigenvalue $m$.

One advantage of using projective measurements is that it has a very simple definition for the average value of projective measurements, or expectation value of the observable $M$, $\bra{\psi} M \ket{\psi}$, as shown with the following derivation: \begin{align} E[M] &= \sum_m m p(m) \\ &= \sum_m m \bra{\psi} P_{m} \ket{\psi} \\ &= \bra{\psi} (\sum_{m} m P_{m}) \ket{\psi} \\ &= \bra{\psi} M \ket{\psi} \end{align}

While projective measurements are a simpler way to deal with measurements, the general measurement postulate was described first because it is more comprehensive and faces fewer restrictions on measurement operators.

Projective measurement operators also have a properly called repeatability, where repeating the same measurement $P_m$ on a state $\ket{\psi_m}$, resulting from an initial measurement of $P_m$ on $\ket{\psi}$, just keeps returning $\ket{\psi_m}$ with probability of $\bra{\psi_m} P_m \ket{\psi_m}=1$. This repeatable property is not seen in some measurements in quantum mechanics, such as the measurement of a photon where such measurement destroys the photon, and hence projective measurements cannot be performed in these cases.

Tensor Product

Postulate 4: A quantum system can be a composite of multiple smaller quantum systems, denoted by a tensor product $\otimes$. So a system $\ket{\psi_{AB}}$ composed of $\ket{\psi_A}$ and $\ket{\psi_B}$ is $\ket{\psi_A}\otimes\ket{\psi_B}$.

A tensor product is a way of composing a larger vector space from smaller vector spaces, and mathematically is a form of operation between matrices as shown in the following examples.

Starting with the tensor product of two wave vectors $\ket{\psi}=\alpha \ket{0}+\beta \ket{1}$ and $\ket{\phi} = a \ket{0} + b \ket{1}$, it is \begin{align} \ket{\psi}\otimes\ket{\phi} &= \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \otimes \begin{bmatrix} a \\ b \end{bmatrix} \\ &= \begin{bmatrix} \alpha \otimes \ket{\phi} \\ \beta \otimes \ket{\phi} \end{bmatrix}\\ &= \begin{bmatrix} \alpha a \\ \alpha b \\ \beta a\\ \beta b \end{bmatrix} \end{align}

In the case of $\ket{0}\otimes\ket{1}$ this is \begin{align} \ket{0}\otimes\ket{1} &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \otimes \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\ &= \begin{bmatrix} 1 \otimes \ket{1} \\ 0 \otimes \ket{1} \end{bmatrix}\\ &= \begin{bmatrix} 0 \\ 1 \\ 0\\ 0 \end{bmatrix}\\ &= \ket{01} \end{align}

As shown in the last line often the tensor product symbol is dropped, with its usage implied by multiple symbols within a composite state vector.

The tensor product obeys the following mathematical properties:

$$l(\ket{g}\otimes\ket{h})=(l\ket{g})\otimes\ket{h}=\ket{g}\otimes(\ket{h})$$ $$(\ket{g_1}+\ket{g_2})\otimes\ket{h} = \ket{g_1}\otimes\ket{h} + \ket{g_2}\otimes\ket{h}$$ $$\ket{g}\otimes(\ket{h_1}+\ket{h_2}) = \ket{g}\otimes\ket{h_1} + \ket{g}\otimes\ket{h_2}$$

Where $l$ is a scalar.

Tensor products also apply to matrices of arbitrary sizes, such as the following two square matrices $A$ and $B$: \begin{align} A\otimes B &= \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \otimes \begin{bmatrix} a & b \\ c & d \end{bmatrix} \\ &= \begin{bmatrix} \alpha \otimes B & \beta \otimes B\\ \gamma \otimes B & \delta \otimes B \end{bmatrix}\\ &= \begin{bmatrix} \alpha a & \alpha b & \beta a & \beta b \\ \alpha c & \alpha d & \beta c & \beta d \\ \gamma a & \gamma b & \delta a & \delta b \\ \gamma c & \gamma d & \delta c & \delta d \end{bmatrix} \end{align}

For a tensor product of matrices of sizes $n\times m$ and $p \times q$ the size of the resulting matrix will be $n \cdot p \times m \cdot q$.

Tensor products allow operations to be applied selectively to component wave vectors, for example $A\otimes B$ when applied to $\ket{a}\otimes\ket{b}$ leads to $(A\ket{a})\otimes(B\ket{b})$, which is useful for finer control of multi-qubit systems.

Now that we've gone over the 4 postulates of quantum mechanics, we have the theoretical foundation to understand quantum states in quantum computing. But the Dirac formulation is not the only way to represent qubit states, as shown in the next section.

Further Representations of Qubits

Density Matrices

Another way of representing quantum systems, as an alternative to the Dirac formulation with 'bra-ket' state vectors, is to use density matrices.

A density matrix, also known as a density operator, has a spectral decomposition into vectors $$\rho = \sum_i p_i \ket{\psi_i}\bra{\psi}$$

Where $p_i$ is the probability to measure a pure state of $\ket{\psi_i}$, such as $\ket{0}$ or $\ket{1}$.

Density operators are positive operators, meaning that $\bra{\psi} \rho \ket{\psi} \geq 0$ for all $\ket{\psi}$, which can be seen from their formulation and the fact that $p_i \geq 0$.

Impure density matrices are those composed of multiple pure states.

Whether a density matrix is pure or not can be checked by performing a mathematical operation, called the trace, on the square of the density matrix, $\rho \rho = \rho^{2}$.

A trace is the sum of the diagonal components of a matrix, as demonstrated for matrix A: \begin{align} tr(A) &= tr\begin{bmatrix} a & b \\ c & d \end{bmatrix}\\ &= a + b \end{align}

Now dealing with the trace of the square of a density matrix with two orthogonal basis states with probabilities $p_1$ and $p_2$: \begin{align} \rho &= \begin{bmatrix} p_1 & 0 \\ 0 & p_2 \end{bmatrix} \\ \rho^{2} &= \begin{bmatrix} p_1 & 0 \\ 0 & p_2 \end{bmatrix} \begin{bmatrix} p_1 & 0 \\ 0 & p_2 \end{bmatrix} \\ &= \begin{bmatrix} p_1\cdot p_1 + 0 \cdot 0 & p_1 \cdot 0 + 0 \cdot p_2 \\ 0\cdot p_1 + p_2\cdot 0 & 0\cdot 0 + p_2 \cdot p_2 \end{bmatrix} \\ &= \begin{bmatrix} p_{1}^{2} & 0 \\ 0 & p_{2}^{2} \end{bmatrix} \\ tr(\rho^{2}) &= p_{1}^{2} + p_{2}^{2} \end{align}

$tr(\rho^2)=1$ occurs only when one basis state probability is unity, that is either $p_1=1$ or $p_2=1$.

This is because if both $p_1\lt 1$ and $p_2\lt 1$, then both terms reduce with the square so $p_{1}^{2}\lt p_1$ and $p_{2}^{2}\lt p_2$, hence $p_{1}^{2}+p_{2}^{2} \lt p_1 + p_2 = 1$. This turns the condition $tr(\rho^2)=1$ into a test for whether $\rho$ describes a pure state, with $tr(\rho^2)\lt 1$ indicating an impure state.

Let's look at some examples to get a clearer idea regarding density matrices.

For pure states $\rho_1 = \ket{0}\bra{0}$ and $\rho_2 =\ket{1}\bra{1}$ \begin{align} \rho_{1}^{2} &= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \\ \rho_{1}^{2} &= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\\ tr(\rho_{1}^{2}) &= 1 + 0 \\ tr(\rho_{1}^{2}) &= 1 \\ \rho_{2}^{2} &= \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \\ \rho_{2}^{2} &= \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\\ tr(\rho_{2}^{2}) &= 0 + 1 \\ tr(\rho_{2}^{2}) &= 1 \\ \end{align}

Now, let's look at an impure state $\rho_3$ which is composed of $\rho_1$ and $\rho_2$, each with a probability weighting of 0.5, so $\rho_3 = 0.5 \rho_1 + 0.5 \rho_2$ \begin{align} \rho_3 &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\\ \rho_{3}^{2} &= (\frac{1}{2})^{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\ \rho_{3}^{2} &= \frac{1}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\\ tr(\rho_{3}^{2}) &= \frac{1}{4}(1 + 1) \\ tr(\rho_{3}^{2}) &= \frac{1}{2} \\ tr(\rho_{3}^{2}) &\lt 1 \end{align}

Lastly let's look at a pure state $\rho_4$ formed by the state $\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$ \begin{align} \rho_4 &= (\frac{1}{\sqrt{2}})^{2} (\ket{0}+\ket{1})(\bra{0}+\bra{1})\\ \rho_4 &= \frac{1}{2}(\ket{0}\bra{0} + \ket{0}\bra{1} + \ket{1}\bra{0} + \ket{1}\bra{1}) \\ \rho_4 &= \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \\ \rho_{4}^{2} &= (\frac{1}{2})^{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \\ \rho_{4}^{2} &= \frac{1}{4} \begin{bmatrix} 1+1 & 1+1 \\ 1+1 & 1+1 \end{bmatrix}\\ \rho_{4}^{2} &= \frac{1}{4} \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}\\ \rho_{4}^{2} &= \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\\ tr(\rho_{4}^{2}) &= \frac{1}{2} (1 + 1)\\ tr(\rho_{4}^{2}) &= 1 \end{align}

Postulate of Quantum Mechanics: Density Matrix version

Due to the equivalence of the density matrix approach and the Dirac formalism, the postulates of quantum mechanics can be reformulated within the density matrix approach, as briefly recounted here.

Postulate 1: An isolated system has an associated complex vector inner product space, and its state can be described an density operator $\rho$ which acts on this state space. The density operator $\rho$ is positive and has a trace of 1. If the system is in state $\rho_i$ with probability of $p_i$, then a weighted sum of possible states can be done as follows to achieve the total $\rho$

$$\rho=\sum_i p_i \rho_i$$

Postulate 2: The evolution of a closed state over time is described by a unitary operator $U$, which is dependent on the difference in time from the initial to the evolved state. This unitary operator is applied to the density operator $\rho$ to evolve into state $\rho'$ as noted below

$$\rho' = U \rho U^{\dagger}$$

Postulate 3: Quantum measurements can be described by a collection of measurement operators {$M_{m}$}, where $m$ refers to measurement outcomes, which act on the state space of the system being measured.

If the state is $\rho$ before measurement, then the probability of measurement outcome $m$ being measured is $$p(m) = tr(M^{\dagger}_{m}M_{m} \rho)$$

The state after this measurement is $$\frac{M_{m} \rho M^{\dagger}_{m}}{ tr(M^{\dagger}_{m}M_{m} \rho)}$$

Such measurement operators obey the completeness theorem $$\sum_{m} M^{\dagger}M = I$$

Postulate 4: A quantum system can be a composite of multiple smaller quantum systems, denoted by a tensor product '$\otimes$'. So a state $\rho_{AB}$ with two component states of $\rho_A$ and $\rho_B$ is $\rho_{A}\otimes\rho_{B}$.

An advantage of the density matrix approach is that component systems can be analysed with an operation called partial trace. For a composite system $AB$ composed of systems $A$ and $B$, the partial trace over system $B$ recovers the density matrix of $A$ $$\rho_A = tr_{B}(\rho_{AB})$$

This partial trace that acts over system $B$, can be carried out as follows \begin{align} tr_{B}(\rho_{AB}) &= tr_{B}(\rho_{A} \otimes \rho_{B}) \\ &= \rho_{A} tr(\rho_B) \\ &= \rho_{A} \cdot 1 \\ &= \rho_A \end{align}

Above, the property $tr(\rho_B)=1$ was used.

Let's look at an example of an interesting two-qubit pure state called a 'Bell state', $\ket{\Psi_{00}}=\frac{1}{\sqrt{2}} (\ket{00}+\ket{11})$

\begin{align} \rho &= \frac{1}{2}((\ket{00}+\ket{11}))(\bra{00}+\bra{11})\\ &= \frac{1}{2}(\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11}) \end{align}

A useful trick in the following trace calculations is $tr(\ket{\psi_1}\bra{\psi_2})=\braket{\psi_1|\psi_2}$, brought about by trace being the addition of diagonal terms.

Tracing over the second qubit results in the following density matrix $\rho_1$ for the first qubit \begin{align} \rho_1 &= tr_{2}(\rho) \\ &= \frac{1}{2}(tr_{2}(\ket{00}\bra{00})+tr_{2}(\ket{00}\bra{11})+tr_{2}(\ket{11}\bra{00})+tr_{2}(\ket{11}\bra{11}))\\ &= \frac{1}{2}(\ket{0}\bra{0}\braket{0|0}+\ket{0}\bra{1}\braket{0|1}+\ket{1}\bra{0}\braket{1|0}+\ket{1}\bra{1}\braket{11}) \\ &= \frac{1}{2}(\ket{0}\bra{0}\cdot 1+\ket{0}\bra{1}\cdot 0+\ket{1}\bra{0}\cdot 0+\ket{1}\bra{1}\cdot 1) \\ &= \frac{1}{2}(\ket{0}\bra{0}+\ket{1}\bra{1}) \\ &= \frac{1}{2} I \end{align}

$tr(\ket{\psi_1}\bra{\psi_2})=\braket{\psi_1|\psi_2}$ was used on the second qubit state, allowing the orthonormal relations $\braket{0|0}=\braket{1|1}=1$ and $\braket{0|1}=\braket{1|0}=0$ to be applied.

Interestingly enough, while the Bell state is pure, the first qubit isn't. First let's show that the Bell state is pure. Note, for the following calculation the orthonormal relations $\braket{00|00}=\braket{11|11}=1$ and $\braket{00|11}=\braket{11|00}=0$ will be used implicitly to avoid having too many terms. \begin{align} \rho &= \frac{1}{2}(\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11})\\ \rho^{2} &= \frac{1}{4}(\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11})\\&\cdot (\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11}) \\ \rho^{2} &= \frac{1}{4} (\ket{00}(\bra{00} + \bra{11} + 0 + 0)\\ & + \ket{00}( 0 + 0 +\bra{00}+\bra{11})\\ & + \ket{11}(\bra{00} + \bra{11} + 0 + 0)\\ & + \ket{11}( 0 + 0 + \bra{00} + \bra{11})) \\ \rho^{2} &= \frac{1}{4} ( 2\ket{00}(\bra{00} + \bra{11}) + 2\ket{11}(\bra{00}+\bra{11}))\\ \rho^{2} &= \frac{1}{2} (\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11})\\ \rho^{2} &= \rho\\ tr(\rho^{2}) &= \frac{1}{2}tr(\ket{00}\bra{00}+\ket{00}\bra{11}+\ket{11}\bra{00}+\ket{11}\bra{11}) \\ tr(\rho^{2}) &= \frac{1}{2}(1 + 1) \\ tr(\rho^{2}) &= 1 \\ \end{align} The trace operation is the addition of diagonal terms in a matrix and hence is equivalent to $\braket{\psi|\psi}$, where orthonormal relations mean that $\braket{00|11}$ and $\braket{11|00}$ do not contribute to the trace.

Now, let's look at the trace of first qubit state $\rho_{1}$, making use of the idempotent property of the identity matrix that $II=I$ \begin{align} \rho_{1} &= \frac{1}{2} I\\ \rho_{1}^{2} &= \frac{1}{4} I\\ tr(\rho_{1}^{2}) &= \frac{1}{4} tr\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\\ tr(\rho_{1}^{2}) &= \frac{1}{4}(1+1)\\ tr(\rho_{1}^{2}) &= \frac{1}{2} \\ tr(\rho_{1}^{2}) &\lt 1 \end{align}

This impure first qubit state (which holds true for the second state because similarly $\rho_2=\frac{1}{2} I $) indicates uncertainty about the state of the qubit. We need the full Bell state, which is pure, to be fully certain about the state. This uncertainty about component systems, while being certain about the composite state, is a hallmark of entanglement.

Bloch Sphere

The Bloch sphere is another representation of individual qubits which I would recommend to become familiar with.

The Bloch sphere represents superpositions of the $\ket{0}$ and $\ket{1}$ states on the surface of a 3D sphere with a radius of 1. This shape is constrained by the fact that the total probability of states is $p_0+p_1=|\alpha|^{2}+|\beta|^{2}=1$ for $\ket{\psi}=\alpha\ket{0}+\beta\ket{1}$

Specifically, a general qubit state is $$\ket{\psi} = e^{i \gamma}(\cos(\frac{\theta}{2})\ket{0}+e^{i\phi} \sin(\frac{\theta}{2})\ket{1})$$

The $e^{i \gamma}$ factor, also known as the 'global phase factor', does not show up in measurement statistics, and hence can be ignored to give $$\ket{\psi} = \cos(\frac{\theta}{2})\ket{0}+e^{i\phi} \sin(\frac{\theta}{2})\ket{1}$$

For example, with $\theta=\frac{\pi}{4}$ and $\phi=\frac{\pi}{4}$, the Bloch sphere is

Annotated Bloch sphere

In the computational basis state $\ket{0}$, $\theta=0$ so the Bloch vector is vertically upwards along the positive 'z' axis

Bloch sphere in computational basis '0' state

For the orthogonal state in the computational basis $\ket{1}$, $\theta=\pi$ so the Bloch vector is vertically downwards along the negative 'z' axis

Bloch sphere in computational basis '1' state

Two other orthonormal basis sets are the $(\ket{+}, \ket{-})$ basis states and the $(\ket{+i}, \ket{-i})$ basis states. In these states $\theta=\frac{\pi}{2}$, which leads to $\cos(\frac{\pi}{4})=\sin(\frac{\pi}{4})=\frac{1}{\sqrt{2}}$, hence these basis states are 'equatorial' and along the 'xy' plane bisecting the Bloch sphere.

First, starting with the $\ket{+}$ and $\ket{-}$ states, these have the 'relative' phase $\phi$ of respectively $0$ and $\pi$, which when substituted into $e^{i\phi}$ leads to $e^{0}=1$ and $e^{i\pi}=-1$ respectively, hence \begin{align} \ket{+}&=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1}) \\ &=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ 1\end{bmatrix} \\ \ket{-}&=\frac{1}{\sqrt{2}}(\ket{0}-\ket{1}) \\ &=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ -1\end{bmatrix} \\ \end{align}

On the Bloch sphere, $\ket{+}$ is along the positive 'x' axis

Bloch sphere in Hadamard basis '+' state

And the orthonormal basis state $\ket{-}$ points to the opposite side of the Bloch sphere, along the negative 'x' axis

Bloch sphere in Hadamard basis '-' state

Secondly, the $\ket{+i}$ and $\ket{-i}$ states have the relative phase $\phi$ of respectively $\frac{\pi}{2}$ and $\frac{3\pi}{2}$, leading to $e^{i\frac{\pi}{2}}=i$ and $e^{i\frac{3\pi}{2}}=-i$, so

\begin{align} \ket{+i}&=\frac{1}{\sqrt{2}}(\ket{0}+i\ket{1}) \\ &=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ i\end{bmatrix} \\ \ket{-i}&=\frac{1}{\sqrt{2}}(\ket{0}-i\ket{1}) \\ &=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ -i\end{bmatrix} \\ \end{align}

The Bloch sphere representation of the $\ket{+i}$ state is along the positive 'y' axis

Bloch sphere in circular basis '+i' state

The orthonormal $\ket{-i}$ state is in the opposite direction, along the negative 'y' axis

Bloch sphere in circular basis '-i' state

Quantum Circuits

This section is currently a work in progress.

Simple Demonstrations of Quantum Effects

This section is currently a work in progress

Further Comprehension

To internalise the concepts here, it helps to write out derivations with pen-and-paper.

The source YouTube playlist of the course is freely available and definitely worth checking out if you prefer learning in a lecture format. Specifically, I based this chapter on lectures 1.1 and 1.2.

Another recommended resource which I've consulted when writing this, is the seminal textbook 'Quantum Computation and Quantum Information' (10th Anniversary Edition) by Michael Nielsen and Isaac Chuang. While this is unfortunately not freely available, it is worth going through this or another textbook (or a similar resource) and solve questions with pen-and-paper to build a more rigorous understanding after reviewing this course. In the case of 'Quantum Computation and Quantum Information', I would recommend going through chapter 1, which an overview of quantum computing, and chapter 2 which introduces the maths of quantum mechanics.