\documentclass[]{article}
\begin{document}
\title{Quantum Computing and Shor's Algorithm}
\date{Original Spring 1999, Revised 2015}
\author{Matthew Hayward}
\maketitle
\pagebreak
\tableofcontents
\pagebreak
\section{Preface}
This paper is intended to be a beginner's introduction to the field of
quantum computing. When I began to study this topic at the University of
Illinois under the supervision of Dr.\ Roy Campbell in 1999, there was
a lack of introductory material on the subject. This paper will
hopefully serve as an introduction to the rudiments of quantum
computing and the specifics of Shor's algorithm for factoring large
numbers.
In order to get the most out of this paper, readers should be familiar
with the following topics:
\begin{itemize}
\item Binary representation of numbers
\item Complex numbers
\item Vector mathematics
\end{itemize}
That being said, there are appendices covering the bare essentials of
those topics that are needed to fully comprehend this paper.
The field of quantum computing has its own vocabulary; most of the
novel terms used in this paper are described in the glossary at the
end.
\section{Introduction}
Astounding advances have been made in the manufacture of computers
between 1950 and 1999. The number of atoms needed to represent a bit
in memory has been decreasing exponentially since 1950. Likewise the
number of transistors per chip, clock speed, and energy dissipated per
logical operation have all followed their own improving exponential
trends. Despite these fantastic advances, the manner in which all
computers function has been essentially identical: combinations of a
few simple logical operations performed some number of times per unit
time. This rate of improvement in transistor size cannot be
sustained much longer; at the rate of improvement observed as of 1999,
in the year 2020 one bit of information will require only one atom to
represent it. The problem is that at that level of miniaturization
the behavior of the components of a computer will become dominated by
the principles of quantum physics. (Williams, Clearwater)
With the size of components in classical computers shrinking to the
point where their behavior may soon be dominated more by quantum
physics than by classical physics, researchers have begun
investigating the potential of these quantum behaviors for
computation. Surprisingly, it seems that a computer whose components
are able to function in a quantum manner can be more efficient than
any classical computer.
It is the physical limitations of the classical computer, and the
possibility that a quantum computer could perform certain useful tasks
more rapidly than any classical computer, that drive the study of
quantum computing.
\section{The Classical Computer}
\subsection{Turing Machines}
One of the people most directly responsible for the current concept of
computing machines is Alan Turing. Other early giants in the field of
Computer Science include Kurt G\"{o}del, Emil Post, and Alonzo Church.
Turing and others proposed mathematical models for computing which
allowed for the study of algorithms in the absence of any particular
computer hardware. These abstractions have proved invaluable in the
field of computer science, allowing for general results about
algorithms and their run time complexity to be obtained without having
to consider a variety of potential computation engines on which those
algorithms would run. Turing's model is called a Turing machine. The
design of the Turing machine is the following: The machine consists of
an infinite tape divided into cells, each cell can contain a 1, a 0,
or a blank. There is also a head which can move about the tape, read
a cell, write to a cell, change its internal state, or end
computation. It is the internal state of this head that is the
program which decides what action the head will take next. (Steane)
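As an illustration only (this sketch is not from the paper, and the names
are my own), the model just described translates almost directly into a
small data structure: a tape, a head position, an internal state, and a
transition table that decides what the head writes, where it moves, and
which state it enters next.
\begin{verbatim}
//A minimal sketch of the Turing machine model: a sparse infinite tape,
//a head position, an internal state, and a transition table.
#include <map>
#include <utility>

enum Symbol { BLANK, ZERO, ONE };
enum Move { LEFT, RIGHT, STAY };

struct Action {
  Symbol write;
  Move move;
  int next_state;  //A negative state halts the machine.
};

struct TuringMachine {
  std::map<long, Symbol> tape;  //Sparse representation of the infinite tape.
  long head = 0;                //Cell currently under the head.
  int state = 0;                //Internal state of the head.
  std::map<std::pair<int, Symbol>, Action> program;

  //Performs one step; returns false once the machine has halted.
  bool Step() {
    if (state < 0) return false;
    Symbol read = tape.count(head) ? tape[head] : BLANK;
    Action a = program[std::make_pair(state, read)];
    tape[head] = a.write;
    head += (a.move == LEFT ? -1 : a.move == RIGHT ? 1 : 0);
    state = a.next_state;
    return state >= 0;
  }
};
\end{verbatim}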
As simple as this model appears, it encompasses all the functionality
of today's most modern supercomputer. That is to say, given the time
and memory, there is not a single computation that can be performed on
a modern supercomputer that can not be performed on a Turing machine.
The immense value of the Turing machine is that when examining what
sort of functions can be computed by a computer, one need only examine
a Turing machine, and not one of the millions of potential computing
devices to determine if a computation is possible. If a Turing
machine can perform the computation, then it is computable. If a
Turing machine can not, then the function can not be computed by any
classical computer. (Shor)
\subsection{Church-Turing Thesis}
This bold claim, that any computer is essentially equivalent to a
Turing machine grew out of contemporaneous work by Alonzo Church and
Alan Turing, and is variously referred to as Church's Thesis, the
Church-Turing Thesis, the Turing-Church thesis, the Church-Turing
conjecture, and Turing's thesis. There are several equivalent
formulations of the thesis stated at different times and to different
degrees of rigor, but the central idea is that:
\begin{quote}
Every effectively calculable function can be produced by a
Turing machine.
\end{quote}
The claim here is that anything which one would like to compute, which
one reasonably could compute (for instance given pen and paper, and
enough time), are precisely those things which a Turing machine can
compute.
The Church-Turing thesis is perhaps best understood as a
\textbf{definition} of the types of functions that are calculable in
the real world - not as a theorem to be proven. As evidence for the
suitability of this as a definition, multiple (indeed every one
considered so far) distinct models of computation have been shown to
be equivalent to the Turing model with regard to what functions they
can compute.
Beyond the mere ability of a computing device to calculate a function
is the consideration of how long it would take to do so. This notion
is generalized by the time complexity of a given algorithm, which
considers the number of steps or operations an algorithm would have
to perform relative to the length of the input in order to compute the
function.
There are various extensions of the Church-Turing thesis that address
the relative efficiency of different models of computation, such as
the \textbf{Complexity-Theoretic Church-Turing Thesis}, which states that:
\begin{quote}
A Turing machine can efficiently simulate any realistic model
of computation.
\end{quote}
Here \emph{efficiently} means a Turing machine can simulate any
realistic model of computation with at most polynomial time expansion
of the computation time or other resources (such as memory). This
hypothesis may be viewed either as a definition of what a realistic
model of computation is, or as a statement of fact (which may turn out
to be false) about realistic models of computation. There are no
known models of computation which are thought to be realizable that
violate this thesis. The discovery of a model which did violate this
thesis would likely be an astonishing breakthrough in the fields of
computational complexity, computer science, and/or physics.
These theses give us insight into the power of computing machines.
If a computer theoretically can compute all the functions a Turing
machine can compute (given enough memory and time) then it is as
powerful as a Turing machine.
\subsection{Running Time and Complexity Classes}
Turing produced the Turing machine in an effort to answer the question
``Is there a mechanical procedure by which the truth or falsity of any
mathematical conjecture can be decided?'' Turing reasoned that if
there was such a method, and if it were truly mechanical in nature and
requiring no creative effort, then it could be performed by a Turing
machine. The proof offered by Turing that there is no such procedure
is the same as the proof of Turing's Halting problem, which states
that no Turing machine $H$ can in general be given any arbitrary
Turing machine $X$ and its input ($I_x$) as its own ($H$'s) input, and
determine whether $X$ will run to completion on $I_x$. The
demonstration by Turing of a problem which can not be solved by a
Turing machine suggests the division of functions into two groups:
those that can be computed by a Turing machine and those that can not.
By the Church-Turing thesis we can generally say that functions that
are not Turing computable are simply ``not computable.'' It should be
stressed at this point that this classification of a function that is
not Turing computable as not computable in a general sense is due to
the \textbf{definitional} aspect of the Church-Turing thesis. It is a
logical possibility that some future computing machine will indeed be
able to compute some function that a Turing machine can not - however
no such machine is known, and the development of such a machine would
be a large breakthrough. Arguments can be made that no such machine
would be physically possible - however this is not a settled
question.
Determining if a function is computable at all is interesting, but this
distinction is not fine enough to determine if a function can
realistically be solved by a physical computer in a reasonable amount
of time. If a computation will take billions of years to compute, it
may as well be uncomputable from the perspective of the pragmatist. An
algorithm can be characterized by the number of operations and amount
of memory it requires to compute an answer given an input of size $n$.
These characterizations of the algorithm determine what is called the
algorithm's \emph{running time} or \emph{time complexity}.
Specifically, the time complexity of an algorithm to compute a
function is determined by looking at how the number of operations of
the algorithm scale with the size of the input of the function it
computes. We'll take the size of the input to be the number of bits
needed to represent that number in binary.
An algorithm which computes its output for an input of size $n$ using
resources (computational steps, memory, etc.) that are bounded above by
a polynomial function of $n$ is said to be of \emph{polynomial
time}. For example, an algorithm which took $n^{4} + 7n^{2} + 1001$
operations to compute its answer for an input of size $n$ bits would
be considered a polynomial time algorithm.
Problems which can be solved in polynomial time or less are generally
deemed to be \emph{tractable}. An algorithm which solves a problem in
polynomial time may be referred to as \emph{efficient}. Problems which
require more than polynomial time are usually considered to be
\emph{intractable}, for example an algorithm which would require $2^{n}$
operations for an input size of $n$ would be considered intractable,
as the number of operations grows exponentially with the input size,
not polynomially. Generally tractable problems are considered to be
those which may practically be solved in the real world. (Shor)
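To get a feel for the difference between these growth rates, compare a
polynomial time algorithm requiring $n^{3}$ operations with an
exponential time algorithm requiring $2^{n}$ operations, on an input of
$n = 100$ bits:
\[n^{3} = 100^{3} = 10^{6}, \qquad 2^{n} = 2^{100} \approx 1.3 \times 10^{30}\]
The first workload is trivial for any modern computer, while the second
would take on the order of $10^{13}$ years at a billion operations per
second.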
In complexity theory, problems are often restated in terms of a
\emph{decision problem} - this means that the function of interest
takes in its input and produces a yes or no answer.
Computer Scientists have grouped problems into a variety of complexity
classes, below are some of the more well known.
\begin{description}
\item[P]: The class of decision problems which can be answered in
polynomial time.
\item[NP]: Nondeterministic polynomial time, the class of decision
problems where a candidate for an answer can be verified as a
correct answer or not in polynomial time.
\item[NP-complete]: The ``hardest'' problems in NP, this set has the
interesting property that if any NP-complete problem can be solved
in polynomial time, then P and NP are in fact the same. Whether P
equals NP is an outstanding problem in computer science and complexity theory.
\end{description}
(Cormen, Leiserson, Rivest)
Given these distinctions, to determine whether a function may be
\emph{efficiently} computed by a classical computer there are two
questions that must be answered. First, can a Turing machine compute
the function? If so, the function is called computable. Second,
the time complexity of the algorithm to be used must be considered.
In general if an algorithm requires polynomial time or less to compute
it is considered to be tractable, and if not it is considered to be
intractable.
Accordingly, interest in quantum computing is twofold. First it is of
interest if a quantum computer can compute functions which are
uncomputable on a classical computer, and second it is of interest
whether an algorithm which is intractable on a Turing machine may be
tractable on a quantum computer.
\section{The Quantum Computer}
\subsection{Quantum Physics}
When considering the possible efficiency of a computer which makes use
of the results of quantum physics, it is helpful to know a little of
quantum physics itself. Quantum physics arose from the failure of
classical physics to offer correct predictions on the behavior of
photons and other elementary particles. Quantum physics has since
been under intense scrutiny, as some of its predictions seem very
strange indeed. Nevertheless experiments continue to verify the same
strange behavior that leads skeptics to challenge the veracity of
quantum physics.
\subsection{The Classical Bit}
To understand the ways in which a quantum computer is different from a
classical computer you must first understand the rudiments of the
classical computer. The most fundamental building block of a
classical computer is the bit. A bit is capable of storing one piece
of information, it can have a value of either 0 or 1. Any amount of
information can be encoded into a list of bits. In a classical
computer a bit is typically stored in a silicon chip, or on a metal
hard drive platter, or on a magnetic tape. About $10^{10}$ atoms were
used to represent one bit of information in 1999. The smallest
conceivable storage for a bit involves a single elementary particle of
some sort. For example, for any particle with a spin-1/2
characteristic (such as a proton, neutron, or electron), its spin
on measurement will be either $+1/2$ or $-1/2$. We can
thus encode a bit using a single particle by mapping 1 to be spin
$+1/2$ and 0 to be $-1/2$, assuming we can measure and manipulate the
spin of such a particle. If we were to try to use this spin-1/2
particle as a classical bit, one that is always in the 0 or 1 state,
we would fail. We would be trying to apply classical physics on a
scale where it simply is not applicable. This single spin-1/2
particle will instead act in a quantum manner. (Williams, Clearwater)
This spin-1/2 particle which behaves in a quantum manner could be the
fundamental building block of a Quantum computer. We could call it a
qubit, to denote that it is analogous in some ways to a bit in a
classical computer. Just as a memory register in a classical computer
is an array of bits, a quantum memory register is composed of several
spin-1/2 particles, or qubits. There is no particular need for the
spin-1/2 particle; equally well we could use a hydrogen atom, and
designate its electron being measured in the ground state to be the 0
state, and it being in the first excited state to be the 1 state.
There are a multitude of possible physical qubit representations that
could work. For simplicity I will discuss only the spin-1/2 particle
from here on, but analogous arguments could be made for many other
physical systems.
\subsection{State Vectors and Dirac Notation}
We wish to know exactly how the behavior of the spin-1/2 particle, our
qubit, differs from that of a classical bit. Recall that a
classical bit can store either a 1 or a 0, and when measured the value
observed will always be the value stored. Quantum physics states that
when we measure the spin-1/2 particle's state we will determine that it
is in either the spin $+1/2$ state or the spin $-1/2$ state. In this manner our
qubit is no different from a classical bit, for it can be measured to
be in the $+1/2$, or 1, state, or the $-1/2$, or 0, state. The
differences between the qubit and the bit come from what sort of
information a qubit can store when it is not being measured.
According to quantum physics we may describe the state of this
spin-1/2 particle by a state vector in a Hilbert Space.
A Hilbert Space is a linear, complex vector space.
In a linear vector space you may add vectors that lie within the
space, or multiply them by numbers, and the resulting vector will
still lie within that space. For example you are probably familiar with the
$x$, $y$, $z$ coordinate system where the $x$, $y$, and $z$ axes are
mutually perpendicular real number lines, which coincide at the point
$x = 0$, $y = 0$, $z = 0$ - this is an example of a linear vector
space.
In a complex vector space the lengths of the vectors within the space
are described with complex numbers. Complex numbers are numbers which
take the form $a + i*b$, where $a$ and $b$ are real numbers, and $i$
is defined to be the square root of negative one.
(Williams, Clearwater)
In the Hilbert Space for our state vector, which describes the state
of our spin-1/2 particle, these perpendicular axes will correspond to
each possible state that the system can be measured in. So our
Hilbert Space for a single qubit will have two perpendicular axes, one
corresponding to the spin-1/2 particle being in the $+1/2$ state, and
the other to the particle being in the $-1/2$ state. These states
in which the vector can be measured are referred to as
\emph{eigenstates}. The vector which exists somewhere in this space
and which represents the state of our spin-1/2 particle is called the
\emph{state vector}. The projection of the state vector onto one of
the axes shows the contribution of that axis's eigenstate to the whole
state. This means that in general, the state of the spin-1/2 particle
can be any combination of the base states. In this manner a qubit is
totally unlike a bit, for a bit can exist in only the 0 or 1 state,
but the qubit can exist, in principle, in any combination of the 0 and
1 state, and is only constrained to be in the 0 or 1 state when we
measure the state.
Now I will introduce the standard notation for state vectors in
quantum physics. The state vector is written in the following way,
and is called a \emph{ket vector}: $|\psi \rangle$, where $\psi$ stands
for the list of numbers which contain information about the projection of the
state vector onto its base states. The term ket and this notation
come from the physicist Paul Dirac who wanted a concise shorthand for
writing formulas that occur in Quantum physics. These formulas
frequently took the form of the product of a row vector with a column
vector. Thus he referred to row vectors as \emph{bra vectors}
represented as $\langle y|$. The product of a bra and a ket vector
would be written $\langle y|x \rangle$, and would be referred to as a
\emph{bracket}. (Williams, Clearwater)
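For concreteness, in the two dimensional case a ket can be written as a
column of complex weighting factors, the corresponding bra as the row of
their complex conjugates, and the bracket as the ordinary product of the
two:
\[|x \rangle = \left( \begin{array}{c} x_{0} \\ x_{1} \end{array} \right), \qquad
\langle y| = \left( \begin{array}{cc} \overline{y_{0}} & \overline{y_{1}} \end{array} \right), \qquad
\langle y|x \rangle = \overline{y_{0}}\,x_{0} + \overline{y_{1}}\,x_{1}\]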
There is nothing special about this vector notation; you may think of
any state vector being written as a letter with a line over it, or as
a bold letter as vectors are normally denoted. If you do further
reading into this area you will almost assuredly come across the bra
and ket notation, which is why it is presented.
\subsection{Superposition and Eigenstates}
Earlier I said that the projection of the state vector onto one of the
perpendicular axes of its Hilbert Space shows the contribution of that
axis's eigenstate to the whole state. You may wonder what is meant by
the \emph{whole state}. You would think (and rightly so, according to
classical physics) that our spin-1/2 particle could only exist
entirely in one of the possible $+1/2$ and $-1/2$ states, and
accordingly that its state vector could only exist lying completely
along one of its coordinate axes. If the particle's axes are called
$x$ and $y$, with the $x$ coordinate denoting the contribution of the
$-1/2$, or 0, state, and the $y$ coordinate denoting the contribution
of the $+1/2$, or 1, state, then the state vector should only ever be
$(1,0)$ or $(0,1)$.
That seems perfectly reasonable, but it simply is not correct.
According to quantum physics a quantum system can exist in a mix of
all of its allowed states simultaneously. This is the Principle of
Superposition, and it is key to the power of the quantum computer.
While the physics of superposition is not simple at all,
mathematically it is not difficult to characterize this kind of
behavior.
\subsection{The Quantum Qubit}
Back to our qubit, our spin-1/2 particle. We now know that while
it can only be measured to have a spin of $+1/2$ or $-1/2$, it may in
general be in a superposition of these states when we are not
measuring it. We can describe its state in the following manner.
Let $x_{1}$ be the eigenstate corresponding to the spin $+1/2$ state,
and let $x_{0}$ be the eigenstate corresponding to the spin $-1/2$
state. Let $X$ be the total state of our state vector, and let
$w_{1}$ and $w_{0}$ be the complex numbers that weight the
contribution of the base states to our total state, then in general:
\[|X \rangle = w_{0} * |x_{0} \rangle + w_{1} * |x_{1} \rangle \equiv (w_{0},w_{1})\]
At this point it should be remembered that $w_{0}$ and $w_{1}$, the
weighting factors of the base states are complex numbers, and that
when the state of $X$ is measured, we are guaranteed to find it to be
in either the state:
\[0 * |x_{0} \rangle + w_{1} * |x_{1} \rangle \equiv (0,w_{1})\] or the state
\[w_{0} * |x_{0} \rangle + 0 * |x_{1} \rangle \equiv (w_{0},0)\]
This is analogous to a system you may be more familiar with, a vector
with real weighting factors in a two dimensional plane. Let the
base states for this two dimensional plane be the unit vectors $x$,
and $y$. In this case we know that the state of any vector $V$ can be
described in the following manner:
\[V = x_{0} * x + y_{0} * y \equiv (x_{0},y_{0})\]
Our state vector is a unit vector in a Hilbert space, which is similar
to vector spaces you may be more familiar with, but it differs in that
the lengths of the vectors are complex numbers. It is not necessary
from a physics perspective for the state vector to be a unit vector
(by which I mean it has a length of 1), but it makes for easier
calculations further on, so I will assume from here on out that the
state vector has length 1. This assumption does not invalidate any
claims about the behavior of the state vector. To see how to convert
a state vector of any length to length 1 see appendix A.
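As a purely illustrative sketch (using the standard library's
std::complex rather than the Complex class defined in the appendix),
converting a two amplitude state vector to length 1 amounts to dividing
each amplitude by the vector's length:
\begin{verbatim}
//A minimal sketch: scale a qubit state (w0, w1) so that
//|w0|^2 + |w1|^2 = 1.
#include <cmath>
#include <complex>
#include <iostream>

int main() {
  //Hypothetical unnormalized amplitudes for the 0 and 1 eigenstates.
  std::complex<double> w0(3.0, 0.0), w1(0.0, 4.0);
  //Length of the state vector: sqrt(|w0|^2 + |w1|^2), which is 5 here.
  double length = std::sqrt(std::norm(w0) + std::norm(w1));
  w0 /= length;
  w1 /= length;
  //Prints 1: the state vector now has length 1.
  std::cout << std::norm(w0) + std::norm(w1) << std::endl;
  return 0;
}
\end{verbatim}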
With all this in mind we have fully defined the basic building block
of our quantum computer, the qubit. It is fundamentally different
from our classical bit in that it can exist in any superposition of
the 0 and 1 states when it is not being measured. (Barenco, Ekert,
Sanpera, Machiavello)
\subsection{The Quantum Memory Register}
Up till now we have considered a two state quantum system,
specifically our spin-1/2 particle. However a quantum system is by no
means constrained to be a two state system. Much of the above
discussion for a 2 state quantum system is applicable to a general $n$
state quantum system.
In an $n$ state system our Hilbert Space has $n$ perpendicular axes,
or eigenstates, which represent the possible states that the system
can be measured in. As with the two state system, when we measure our
$n$ state quantum system, we will always find it to be in exactly one
of the $n$ states, and not a superposition of the $n$ states. The
system is still allowed to exist in any superposition of the $n$
states while it is not being measured.
Mathematically, if a two state quantum system with coordinate axes
$x_{0}$, $x_{1}$ can be fully described by:
\[|X \rangle = w_{0} * |x_{0} \rangle + w_{1} * |x_{1} \rangle \equiv (w_{0},w_{1})\]
Then an $n$ state quantum system with coordinate axes $x_{0},
x_{1},\ldots, x_{n-1}$ can be fully described by:
\[|X \rangle = \sum_{k = 0}^{n-1} w_{k} * |x_{k} \rangle \]
In general a quantum system with $n$ base states can be represented by
the $n$ complex numbers $w_{0}$ to $w_{n-1}$. When this is done the
state may be written as:
\[ |X \rangle = \left( \begin{array}{c}
w_{0} \\
w_{1} \\
\vdots \\
w_{n-1}
\end{array} \right) \]
Where it is understood that $w_{k}$ refers to the complex weighting
factor for the $k$'th eigenstate.
Using this information we can construct a quantum memory register out
of the qubits described in the previous section. We may store any
number in our quantum memory register as long as we have enough
qubits, just as we may store any number in a classical register as
long as we have enough classical bits to represent it. The
state of a quantum register with $n$ base states is given by the formula
above. Note that in general a quantum register composed of $n$ qubits
requires $2^{n}$ complex numbers to completely describe its state: an
$n$ qubit register can be measured to be in one of $2^{n}$ states, and
each state requires one complex number to represent the projection of
the total state onto that state. In contrast a classical register
composed of $n$ bits requires only $n$ integers to fully describe its
state.
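To make the bookkeeping concrete, here is a small sketch (in the spirit
of the simulator in the appendix, though not taken from it) of the
storage a classical simulation of an $n$ qubit register needs:
\begin{verbatim}
//A minimal sketch: a classical simulation of an n qubit register must
//track 2^n complex amplitudes, one per base state.
#include <complex>
#include <vector>

std::vector< std::complex<double> > MakeRegister(unsigned n_qubits) {
  //1ULL << n_qubits is 2^n_qubits base states.
  std::vector< std::complex<double> > amplitudes(1ULL << n_qubits);
  amplitudes[0] = 1.0;  //Start in the base state |00...0>.
  return amplitudes;
}
\end{verbatim}
Already at 40 qubits this is roughly $10^{12}$ amplitudes, which is why
Feynman's observation about the cost of simulating quantum systems
classically (discussed below) carries such weight.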
This means that one can store an exponential amount of information in
a quantum register relative to the size of that register. Here we see
some of the first hints that a quantum computer could be exponentially
more efficient than a classical computer in some respects. Recall
that from our discussion of complexity that problems which can be
solved in polynomial time are generally thought of as being tractable,
and that problems which can be solved in exponential time are thought
of as intractable. If a quantum computer can be is exponentially more
efficient than a classical computer for some algorithms, then some
problems currently intractable may become tractable! This is a large
part of the motivation for the study of quantum computing.
\subsection{Probability Interpretation}
Now that we know how to represent our state vector as a superposition
of states, and we know that we can only measure the state vector to be
in one of the base states, it would seem that there would be some sort
of discrepancy. We must determine what happens when we measure the
state vector. We know from quantum physics that given an initial
condition the state vector will evolve in time in accordance with
Schr\"{o}dinger's equation:
\[i\hbar\frac{\partial |X(t) \rangle }{\partial t} = H(t)|X(t) \rangle\]
Where $i$ is the square root of negative one, $\hbar$ is the reduced
Planck constant, $1.0545*10^{-34}$ Js, and $H$ is the Hamiltonian
operator, which is determined by the physical characteristics of the
system being evolved.
In our notation this expression is:
\[i\hbar\frac{\partial w_{i}(t)}{\partial t} = \sum_{j}H_{ij}(t) * w_{j}(t)\]
This evolution would appear to be continuous, but these equations only
apply to a quantum mechanical system evolving in isolation from the
environment. The only way to observe the state of the state vector is
to in some way cause the quantum mechanical system to interact with
the environment. When the state vector is observed it makes a sudden
discontinuous jump to one of the eigenstates, but this is not a
violation of Schr\"{o}dinger's equation. When the quantum mechanical
system interacts with the outside environment and is measured, the
state vector is said to have collapsed. (Williams, Clearwater)
Now understanding that a state vector will collapse when it interacts
with the external environment, we still need to know in what manner
this collapse happens. To perform any sort of useful calculation we
must be able to say something about which base state a quantum
mechanical system will collapse into. The probability that the state
vector will collapse into the $j$'th eigenstate, is given by
$|w_{j}|^{2}$ which is defined to be $a_{j}^{2} + b_{j}^{2}$ if $w_{j}
= a_{j} + i*b_{j}$, where $w_{j}$ is the complex projection of the
state vector onto the $j$'th eigenstate. In general the chance of
choosing any given state is
\[Prob(j) = \frac{|w_{j}|^{2}}{\sum_{k = 0}^{ n-1} |w_{k}|^{2}}\] but
as mentioned earlier we will insist on having the state vector of
length one, and in this case the probability expression simplifies to
$Prob(j) = |w_{j}|^{2}$.
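A classical simulation can mimic this measurement rule directly; the
sketch below (illustrative only, though it mirrors the idea behind the
DecMeasure routine in the appendix) picks eigenstate $j$ with
probability $|w_{j}|^{2} / \sum_{k}|w_{k}|^{2}$:
\begin{verbatim}
//A minimal sketch: collapse a simulated state vector w[0..n-1] to a
//single eigenstate, chosen with probability |w[j]|^2 / sum_k |w[k]|^2.
#include <complex>
#include <cstdlib>
#include <vector>

std::size_t Measure(const std::vector< std::complex<double> > &w) {
  double total = 0;
  for (std::size_t k = 0; k < w.size(); ++k) total += std::norm(w[k]);
  //Pick a random point in [0, total) and find the state it falls within.
  double r = total * std::rand() / (RAND_MAX + 1.0);
  double running = 0;
  for (std::size_t j = 0; j < w.size(); ++j) {
    running += std::norm(w[j]);
    if (r < running) return j;
  }
  return w.size() - 1;  //Guard against floating point round off.
}
\end{verbatim}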
So now we know how to construct an $n$ state quantum system, which can
be placed in an arbitrary superposition of states. We also know how
to measure the resultant superposition and get a base state with a
certain probability. This is all that we need to understand about our
quantum memory register to be able to simulate its behavior.
\subsection{Quantum Parallelism}
Given that our quantum memory register differs from a classical one in
that it can store a superposition of the base states of the register,
one might wonder what this implies as to the efficiency of quantum
computing. The study of quantum computing is relatively new; most
give credit to Richard Feynman for being the first to suggest that
there were tasks that a quantum computer could perform exponentially
better than a classical computer. Feynman observed that a classical
computer could not simulate a quantum mechanical system without
suffering from exponential slowdown. At the same time he hinted that
perhaps by using a device whose behavior was inherently quantum in
nature one could simulate such a system without this exponential
slowdown. (Feynman)
Several quantum algorithms rely on something called quantum
parallelism. Quantum parallelism arises from the ability of a quantum
memory register to exist in a superposition of base states. Since a quantum
memory register can exist in a superposition of states, each component
of this superposition may be thought of as a single argument to a
function. A function performed on the register in a superposition of
states is thus performed on each of the components of the
superposition, even though the function is only applied one time. Since the
number of possible states is $2^{n}$ where $n$ is the number of qubits
in the quantum register, you can perform in one operation on a quantum
computer what would take an exponential number of operations on a
classical computer. This is fantastic, but the more superposed states
that exist in your register, the smaller the probability that you will
measure any particular one becomes.
As an example suppose that you are using a quantum computer to
calculate the function $\mathcal{F}(x) = 2*x \bmod 7$, for $x$
integers between 0 and 7 inclusive. You could prepare a quantum
register that was in an equally weighted superposition of the states
0-7. Then you could perform the $2*x \bmod 7$ operation once, and the
register would contain the equally weighted superposition of the
states 0,2,4,6,1,3,5,0, these being the outputs of the function $2*x
\bmod 7$ for the inputs 0 - 7. When measuring the quantum register you
would have a $2/8$ chance of measuring 0, and a $1/8$ chance of
measuring any of the other outputs. It would seem that this sort of
parallelism is not useful, as the more we benefit from parallelism the
less likely we are to measure the value of the function for a particular
input. Some clever algorithms have been devised, most notably by Peter
Shor and L. K. Grover, which succeed in using quantum parallelism on a
function where they are interested in some property of all the inputs,
not just a particular one.
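The bookkeeping in this example is easy to check with a short classical
simulation (illustrative only; on a real quantum computer the function
would be applied once to the whole superposition, whereas the loop below
pays exactly the exponential cost Feynman pointed out):
\begin{verbatim}
//A minimal sketch: start with an equal superposition of the inputs 0-7,
//apply f(x) = 2*x mod 7 to every component, and print the probability
//of measuring each output value.
#include <cstdio>
#include <map>

int main() {
  const int kInputs = 8;
  std::map<int, double> output_probability;
  for (int x = 0; x < kInputs; ++x) {
    //Each input has amplitude 1/sqrt(8), hence probability 1/8.
    output_probability[(2 * x) % 7] += 1.0 / kInputs;
  }
  std::map<int, double>::const_iterator it;
  for (it = output_probability.begin();
       it != output_probability.end(); ++it) {
    //Prints probability 0.250 for output 0 and 0.125 for outputs 1-6.
    std::printf("output %d: probability %.3f\n", it->first, it->second);
  }
  return 0;
}
\end{verbatim}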
\subsection{The Efficiency of the Quantum Computer}
Given the possible efficiency of quantum parallelism much work has
been done to show formally with mathematical proofs how quantum
computers differ from classical ones in the complexity classes and
running times of various algorithms. Here is a short list of some of
the landmarks in the study of the efficiency of quantum computers.
In 1980 Paul Benioff offered a classical Turing machine which used
quantum mechanics in its workings, thus showing that theoretically a
quantum computer was at least as powerful as a classical computer.
(Benioff)
In 1982 Richard Feynman showed that a classical Turing Machine (and
hence any classical computer if the Complexity-Theoretic Church-Turing
Thesis holds) could not simulate a quantum mechanical system without
suffering exponential slowdown. (Feynman)
David Deutsch and Richard Jozsa showed in a paper in 1992 that there
was an algorithm that could be run in poly-log time on a quantum
computer, but required linear time on a deterministic Turing machine.
This may have been the first example of a quantum computer being shown
to be exponentially faster than a deterministic Turing
machine. However, the problem could also be solved in poly-log time in
a probabilistic Turing machine, a Turing machine which is capable of
making a random choice. (Deutsch, Jozsa)
Andr\'{e} Berthiaume and Gilles Brassard proved in a 1992 paper that
$P \subset QP$, where $P$ is a complexity class as mentioned earlier and
$QP$ (also denoted as $EQP$) corresponds to problems which can be
solved in worst case polynomial time by a quantum computer, so there
are indeed problems which can be solved in polynomial time on a
quantum computer that can not be solved in polynomial time with a
deterministic Turing machine. (Berthiaume, Brassard)
\section{Quantum Algorithms}
\subsection{Introduction to Shor's Algorithm}
By the early nineties it was known that a quantum computer could be
more efficient than any classical computer for certain tasks, if the
Complexity-Theoretic Church-Turing thesis holds. Nonetheless
investigations into these areas were largely driven by academic
curiosity. There was not much economic motive for people to spend
lots of money or time trying to build a quantum computer.
That changed in 1994 when Peter Shor, a scientist working for Bell
Labs, devised a polynomial time algorithm for factoring large numbers
on a quantum computer. This discovery drew great attention to the
field of quantum computing.
\subsection{Motivation for Shor's Algorithm}
The algorithm was viewed as important because the difficulty of
factoring large numbers is relied upon for many cryptography systems.
If an efficient method of factoring large numbers were implemented,
most of the encryption schemes built on that difficulty would be next
to worthless for protecting their data from prying eyes. While it has
not been proven that factoring large numbers can not be achieved on a
classical computer in polynomial time, as of 2015 the fastest
publicly available algorithm for factoring large numbers runs in
$O(\exp((\frac{64}{9})^{1/3} b^{1/3} (\log b)^{2/3}))$ operations,
where $b$ is the number of bits used to represent the number; this
runtime exceeds polynomial time. In contrast Shor's algorithm runs in
$O((\log n)^{2} * \log \log n)$ time on a quantum computer, where $n$
is the number to be factored, and then must perform $O(\log n)$ steps
of post processing on a classical computer.
Overall this time is polynomial in the size of the input. This discovery propelled the
study of quantum computing forward, as such an algorithm is much
sought after. (Shor)
\subsection{Overview of Shor's Algorithm}
Shor's algorithm hinges on a result from number theory. This result
is:
\begin{quote}
The function $\mathcal{F}(a) = x^{a} \bmod n$ is a periodic
function, where $x$ is an integer coprime to $n$. In the context
of Shor's algorithm $n$ will be the number we wish to
factor. When two numbers are coprime it means that their
greatest common divisor is 1.
\end{quote}
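As a small concrete example of this periodicity, take $n = 15$ and
$x = 7$ (which is coprime to 15); then
\[7^{0} \bmod 15 = 1, \quad 7^{1} \bmod 15 = 7, \quad 7^{2} \bmod 15 = 4, \quad
7^{3} \bmod 15 = 13, \quad 7^{4} \bmod 15 = 1, \ldots\]
so the values repeat with period $r = 4$.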
Calculating this function for an exponential number of $a$'s would
take exponential time on a classical computer. Shor's algorithm
utilizes quantum parallelism to perform the exponential number of
operations in one step.
The reason why this function is of utility in factoring large numbers
is this:
\begin{quote}
Since $\mathcal{F}(a)$ is a periodic function, it has some period
$r$. We know that $x^{0} \bmod n = 1$, so $x^{r} \bmod n = 1$, and
$x^{2r} \bmod n = 1$ and so on since the function is periodic.
\end{quote}
Given this information, and assuming that $r$ is an even number, we
can perform the following algebraic manipulation:
\[x^{r} \equiv 1 \bmod n\]
\[(x^{r/2})^{2} = x^{r} \equiv 1 \bmod n\]
\[(x^{r/2})^{2} - 1 \equiv 0 \bmod n\]
\[(x^{r/2} - 1)(x^{r/2} + 1) \equiv 0 \bmod n\]
We can see that the product $(x^{r/2} - 1)(x^{r/2} + 1)$ is an integer
multiple of $n$, the number to be factored. So long as $x^{r/2}$ is
not congruent to $\pm 1 \bmod n$, then at least one of $(x^{r/2} - 1)$ or $(x^{r/2} +
1)$ must have a nontrivial factor in common with $n$. So by computing
$\gcd(x^{r/2} - 1, n)$ and $\gcd(x^{r/2} + 1, n)$, we will obtain a
factor of $n$, where $\gcd$ is the greatest common divisor
function.
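Continuing the $n = 15$, $x = 7$ example, the period found above is
$r = 4$, so $x^{r/2} \bmod n = 7^{2} \bmod 15 = 4$, and $\gcd(3, 15) = 3$
and $\gcd(5, 15) = 5$ are indeed factors of 15. The classical post
processing is simple enough to sketch directly (the period $r$ is
supplied by hand here; finding it is the job of the quantum part of the
algorithm):
\begin{verbatim}
//A minimal sketch of the classical post processing: given n, a coprime
//x, and the period r of x^a mod n, recover factors of n from
//gcd(x^(r/2) - 1, n) and gcd(x^(r/2) + 1, n).
#include <iostream>

//Computes (base^exp) mod m by repeated squaring.
unsigned long long ModPow(unsigned long long base, unsigned long long exp,
                          unsigned long long m) {
  unsigned long long result = 1;
  base %= m;
  while (exp > 0) {
    if (exp & 1) result = (result * base) % m;
    base = (base * base) % m;
    exp >>= 1;
  }
  return result;
}

//Euclid's algorithm for the greatest common divisor.
unsigned long long Gcd(unsigned long long a, unsigned long long b) {
  while (b != 0) {
    unsigned long long t = a % b;
    a = b;
    b = t;
  }
  return a;
}

int main() {
  const unsigned long long n = 15, x = 7, r = 4;  //7^a mod 15 has period 4.
  unsigned long long y = ModPow(x, r / 2, n);     //y = 4 here.
  std::cout << Gcd(y - 1, n) << " " << Gcd(y + 1, n) << std::endl;  //Prints 3 5.
  return 0;
}
\end{verbatim}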
Here is a brief overview of Shor's algorithm, which is explained in
detail in the next section. Shor's algorithm tries to find $r$, the
period of $x^{a} \bmod n$, where $n$ is the number to be factored and
$x$ is an integer coprime to $n$. To do this Shor's algorithm creates
a quantum memory register with two parts. In the first part the
algorithm places a superposition of the integers which are to be $a$'s
in the $x^{a} \bmod n$ function. We will choose our $a$'s to be the
integers 0 through $q - 1$, where $q$ is the power of two such that
$n^{2} \leq q < 2n^{2}$.
\subsection{complex.C}
\begin{verbatim}
#include "complex.h"
#include <cmath>
//Complex constructor, initializes to 0 + i0.
Complex::Complex(): real( 0 ), imaginary( 0 ) {}
//Argument constructor.
Complex::Complex( double r, double i ): real( r ), imaginary( i ) {}
//Complex destructor.
Complex::~Complex() {}
//Overloaded = operator.
Complex & Complex::operator=(const Complex &c) {
if (&c != this) {
real = c.Real();
imaginary = c.Imaginary();
}
return *this;
}
//Overloaded + operator.
Complex & Complex::operator+(const Complex &c) {
real += c.Real();
imaginary += c.Imaginary();
return *this;
}
//Overloaded * operator.
Complex & Complex::operator*(const Complex &c) {
//Save the original real part; both products below need it.
double old_real = real;
real = old_real * c.Real() - imaginary * c.Imaginary();
imaginary = old_real * c.Imaginary() + imaginary * c.Real();
return *this;
}
//Overloaded == operator. Small error tolerances.
bool Complex::operator==(const Complex &c) const {
//This is to take care of round off errors.
if (fabs(c.Real() - real) > pow(10,-14)) {
return false;
}
if (fabs(c.Imaginary()- imaginary) > pow(10,-14)) {
return false;
}
return true;
}
//Sets private data members.
void Complex::Set(double new_real, double new_imaginary) {
real = new_real;
imaginary = new_imaginary;
return;
}
//Returns the real part of the complex number.
double Complex::Real() const {
return real;
}
//Returns the imaginary part of the complex number.
double Complex::Imaginary() const {
return imaginary;
}
\end{verbatim}
\subsection{qureg.h}
\begin{verbatim}
#include "complex.h"
#ifndef QUREG_H
#define QUREG_H
using namespace std;
class QuReg {
public:
//Constructor. Size is the size in bits of our register.
//In our implementation of Shor's algorithm we will need size bits
//to represent our value for "q" which is a number we have chosen
//with small prime factors which is between 2n^2 and 3n^2 inclusive
//where n is the number we are trying to factor. We envision the
//description of our register of size "S" as 2^S complex
//numbers, representing the probability amplitudes of finding the
//register in one of our 2^S base states. Thus we use an array of
//size 2^S of Complex numbers. Thus if the size of our register is 3
//bits, array[7] is the probability amplitude of the state |1,1,1>,
//and array[7] * Complex Conjugate(array[7]) = probability of
//choosing that state. We use normalized state vectors throughout
//the simulation, thus the sum over all states of the amplitude times
//its complex conjugate is 1.
QuReg(unsigned long long int size);
QuReg(); //Default Constructor
QuReg(const QuReg &); //Copy constructor
virtual ~QuReg(); //Default destructor.
//Measures our quantum register, and returns the decimal
//interpretation of the bit string measured.
unsigned long long int DecMeasure();
//Dumps all the information about the quantum register. This
//function would not be possible for an actual quantum register, it
//is only there for debugging. When verbose != 0 we return every
//value, when verbose = 0 we return only probability amplitudes
//which differ from 0.
void Dump(int verbose) const;
//Sets state of the qubits using the arrays of complex amplitudes.
void SetState(Complex *new_state);
//Sets the state to an equal superposition of all possible states
//between 0 and number inclusive.
void SetAverage(unsigned long long int number);
//Normalize the state amplitudes.
void Norm();
//Get the probability amplitude of a given state. This is used in the
//discrete Fourier transformation. In a real quantum computer such
//an operation would not be possible, on the flip side, it would
//also not be necessary as you could simply build a DFT gate, and
//run your superposition through it to get the right answer.
Complex GetProb(unsigned long long int state) const;
//Return the size of the register.
int Size() const;
private:
unsigned long long int reg_size;
Complex *State;
};
#endif
\end{verbatim}
\subsection{qureg.C}
\begin{verbatim}
#include