A Detailed Look at the Fundamentals of Quantum Machine Learning

Nethma Peiris

September 20 2024

Table of Contents

  • Introduction
  • What is Quantum Computing?
  • Connecting Quantum Computing with Machine Learning
  • Easiest Way to Understand QML: Probability
  • Foundations of Quantum Computing
  • Training a Quantum Machine Learning Model
  • Optimizing Quantum Machine Learning Models: Loss Functions and Techniques
  • Conclusion

Introduction

Imagine flipping a coin. It’s either heads or tails, right? But while the coin is still spinning in the air, it is, loosely speaking, neither heads nor tails. This is a useful, if imperfect, picture of superposition, the concept at the core of quantum computing. Unlike classical computing, which relies on the binary states 0 and 1, quantum computing uses the principles of quantum mechanics to work with combinations of states, which can offer computational advantages for certain problems.

This article delves into the world of Quantum Machine Learning (QML), where the power of quantum computing intersects with the dynamic field of machine learning. We’ll begin by exploring foundational concepts such as qubits, quantum gates, and quantum measurements. From there, we’ll dive into the intricacies of training QML models, examining the role of loss functions and the application of advanced metrics like Kullback–Leibler Divergence. By the end, you'll have a comprehensive understanding of the key components that drive Quantum Machine Learning.

What is Quantum Computing?

There is a lot of exciting news around quantum computing, and most people understand it as being faster or capable of replacing the current uses of computing done by supercomputers. However, quantum computing is not merely a faster or a bigger version of traditional computing. It introduces a completely different approach to computing with the potential to solve complex problems that are currently beyond the capabilities of today’s computers.

Let's imagine an open-world simulation game where players explore an uncharted island. In the beginning, they can only explore on foot, slowly uncovering small portions of the island. As they progress, they invent tools like wheels, and eventually, they develop high-speed vehicles. These advancements allow them to explore the island much faster and reach areas that were previously inaccessible. This journey mirrors the evolution of traditional computing. Early computers were like walking on foot—slow and limited in capability. As technology advanced and data became increasingly available, computers became more powerful, allowing us to explore new areas and solve increasingly complex problems. However, even with the most advanced vehicles, some areas of the island in the game remain unreachable, representing the limitations in our current computing capabilities and resources to achieve certain tasks.

Suppose those unreachable parts in the game world are distant islands. Here, boats come into play. Boats don't necessarily travel faster than cars, but they open up entirely new possibilities by allowing players to explore areas that cars can't reach—like discovering new islands.

Quantum computing is like introducing boats into this simulation. While traditional computers (cars) are great for land-based tasks, quantum computers (boats) are designed to perform entirely different tasks, opening up new realms of possibilities. Just as boats allow players to explore islands that were previously unreachable, quantum computing may let us tackle problems that are impractical for classical computing, expanding the boundaries of what we can achieve.

In general terms, quantum computing is the field of computer science that harnesses the unique principles of quantum mechanics to solve problems that even the most powerful classical computers struggle to handle.

Connecting Quantum Computing with Machine Learning

The rise of quantum computing is significant, and machine learning is already prevalent across many applications. The two fields increasingly inform each other. In classical machine learning (the kind we know today), neural networks are trained to achieve specific outcomes. Similarly, in quantum computing, quantum circuits are tuned to produce the required results, paralleling the training process of classical machine learning. This relationship between quantum machine learning and classical machine learning shows up in several ways.

1. Enhanced Computational Capabilities: Quantum computers perform computational tasks in a different way, which may allow some machine learning algorithms to run more efficiently.

2. New Learning Models: Quantum machine learning introduces unique concepts like superposition and entanglement, which can be used to create new models that are fundamentally different from classical models. This could potentially lead to breakthroughs in how we approach learning tasks.

3. Data Privacy and Security: Quantum machine learning may draw on the principles of quantum cryptography to help protect the security and privacy of sensitive data during the training process.

In classical machine learning, two primary approaches dominate: supervised learning and unsupervised learning, with other branches like reinforcement learning and semi-supervised learning also playing significant roles. Let's focus on supervised and unsupervised learning. Discriminative models, also known as conditional models, are primarily used in supervised machine learning. Therefore, the majority of supervised machine learning tasks can be viewed as discriminative learning. On the other hand, generative models learn the underlying probability distribution of the data, making them suitable for unsupervised learning.

For instance, in discriminative learning, we might train a model using pictures of dogs so that, given a new image, the model can predict whether or not it contains a dog. On the other hand, in generative learning, we train a model to generate new images of dogs based on the available dataset.

Classical computers work more naturally with supervised learning, whereas quantum computers are more suited for unsupervised learning. Zapata AI mentions that classical computers excel in tasks where specific input-output relationships are defined, mirroring supervised learning. In contrast, quantum computers are adept at generating truly random numbers—something classical computers struggle with—and can perform tasks without relying on specific inputs, akin to generative learning models. In both cases, parameter tuning is necessary.

Easiest Way to Understand QML: Probability

The easiest way to understand quantum machine learning is through the concept of probability. Consider a basic probability problem involving a coin flip. Imagine we have a biased or unbiased coin, and after flipping it ten times, we observe six heads and four tails. Now, suppose we have two models that can simulate these ten coin flips: one model predicts a 60% probability of getting heads and 40% of getting tails, while the second model predicts 10% heads and 90% tails. The first model is more likely to have generated the results we observed, making it a better fit for our data. This is a basic representation of generative machine learning models.
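The comparison between the two models can be checked numerically. The following is a minimal sketch, assuming the ten flips are independent so that the binomial formula applies; the function name `likelihood` is chosen here purely for illustration.

```python
from math import comb

def likelihood(p_heads: float, heads: int, tails: int) -> float:
    """Binomial probability of seeing `heads` heads and `tails` tails."""
    n = heads + tails
    return comb(n, heads) * p_heads**heads * (1 - p_heads)**tails

heads, tails = 6, 4                      # the ten observed flips
model_1 = likelihood(0.6, heads, tails)  # model predicting 60% heads
model_2 = likelihood(0.1, heads, tails)  # model predicting 10% heads

print(f"model 1: {model_1:.4f}")  # model 1: 0.2508
print(f"model 2: {model_2:.6f}")  # model 2: 0.000138
```

The first model assigns the observed data a probability of roughly 25%, while the second assigns it about 0.01%, which is why the first is the better fit.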

Frame 40412.png

In generative machine learning, we provide general data—like the outcomes of a coin flip—to a model. The model learns the patterns within this data and then generates new data that closely mirrors what we would expect if we flipped the coin a few more times. These newly generated data points are much more likely to resemble the input data.

This article will discuss how quantum computers perform similar tasks in the context of quantum machine learning.

Foundations of Quantum Computing

Classical Bit vs Quantum Bit (Qubit)

In computing and digital communication, a bit is the most basic unit of information. Classical bits behave like a light switch. They have two distinct logical states: on-and-off states which are usually represented as 1 and 0 respectively. A bit can only be in one of these states, either on or off, at any given time. In classical computers, data is stored by manipulating these bits.

In quantum computing, a quantum bit, or qubit, is the basic unit of quantum information. Qubits behave more like a slider. The special feature of a qubit is that it can be in state 0, in state 1, or in a weighted combination of both at once; this is called a superposition state. The possible states of a single qubit can be visualized as the surface of a sphere, known as the Bloch sphere. On this sphere, the north pole is denoted |0⟩ (ket notation) and corresponds to the off state 0, while the south pole is denoted |1⟩ and corresponds to the on state 1. All other points on the surface of the sphere correspond to superpositions of |0⟩ and |1⟩.

Frame 40413.png

Unlike a classical bit, which has two possible states (on and off), a qubit has infinitely many possible states due to superposition. However, this does not mean a single qubit stores an infinite amount of information. When a qubit is measured, it yields only a single classical bit, 0 or 1, so the amount of information we can read out is limited.
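To make the Bloch sphere picture concrete, a point on the sphere can be turned into two amplitudes using the standard parametrization cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩, where θ is the angle down from the north pole and φ is the angle around the vertical axis. The sketch below is only an illustration; the helper names `bloch_state` and `probabilities` are not from any library.

```python
import cmath
import math

def bloch_state(theta: float, phi: float = 0.0):
    """Amplitudes (a0, a1) of the qubit at polar angle theta and azimuth phi.

    theta = 0 is the north pole |0>, theta = pi is the south pole |1>.
    """
    a0 = complex(math.cos(theta / 2))
    a1 = cmath.exp(1j * phi) * math.sin(theta / 2)
    return a0, a1

def probabilities(state):
    """Probabilities of reading 0 and 1 when the qubit is measured."""
    a0, a1 = state
    return abs(a0) ** 2, abs(a1) ** 2

north = probabilities(bloch_state(0.0))            # always reads 0
equator = probabilities(bloch_state(math.pi / 2))  # 50-50
south = probabilities(bloch_state(math.pi))        # always reads 1
print(north, equator, south)
```

However many points the sphere has, each measurement still returns a single 0 or 1; the angles only set the odds.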

Quantum Measurement and Its Implications

Accurate measurements remove uncertainty from our daily lives. Whether you're checking the temperature outside or seeing if a new bookcase will fit in your living room, measurement tools are vital. Instruments like thermometers, tape measures, scales, radar guns, and GPS devices interact with the physical world to provide precise data that guide our decisions and actions.

In the quantum world, the purpose of measurement is still to observe and understand an object's properties. However, what makes quantum measurements unique is that the values of quantum properties are, in general, not determined until the measurement is made. This doesn't mean we're entirely clueless before measuring; we can predict the possible outcomes and their probabilities. But we cannot know which specific outcome will occur until the measurement is actually taken.

For instance, when we measure a qubit that is in a superposition, one of two things happens: the qubit jumps all the way to the north pole (state |0⟩) or all the way to the south pole (state |1⟩). The closer the qubit is to a pole, the more likely it is to end up there. Once this happens, all information about where the qubit was on the sphere before the measurement is lost.

Frame 40414.png

Therefore, when a qubit is at the equator of the Bloch sphere, there is a 50-50 probability that it will go to either the north or the south pole upon measurement. This process is called quantum measurement, or observation. The outcome is certain only when the qubit sits exactly at the north or south pole; at any other position on the sphere, the outcome is inherently uncertain.

The connection between quantum measurement and generative machine learning lies in the probabilistic nature of qubits. Each qubit can be thought of as a probabilistic model. For instance, to model the outcome of a biased coin flip, we can prepare the qubit in a quantum state that encodes the desired probability distribution. By applying quantum operations to this qubit, we can manipulate its state to represent various probabilistic scenarios. This capability allows quantum computers to perform generative tasks by modeling probability distributions and sampling from them.
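As a rough illustration of a qubit acting as a biased coin, the sketch below picks the polar angle on the Bloch sphere that gives a 60% chance of landing on the north pole and then simulates repeated measurements with a classical random number generator. This is a classical simulation under the assumption that the probability of |0⟩ is cos²(θ/2); the helper names are hypothetical.

```python
import math
import random

def angle_for_heads(p_heads: float) -> float:
    """Polar angle at which the probability of |0> (heads) equals p_heads."""
    return 2 * math.acos(math.sqrt(p_heads))

def measure(theta: float, rng: random.Random) -> int:
    """Simulate one measurement: 0 (north pole) or 1 (south pole)."""
    p0 = math.cos(theta / 2) ** 2
    return 0 if rng.random() < p0 else 1

rng = random.Random(0)          # fixed seed so the run is repeatable
theta = angle_for_heads(0.6)    # qubit tilted toward the north pole
shots = [measure(theta, rng) for _ in range(10_000)]
heads_fraction = shots.count(0) / len(shots)
print(f"fraction of heads: {heads_fraction:.3f}")
```

Each individual shot is unpredictable, but the fraction of heads settles close to 0.6 as the number of shots grows.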

Exploring Different Quantum Measurement Bases

As we may have realized, measurements in quantum computing can seem counter-intuitive. When we observe a qubit, we change its state, causing it to “collapse” into a definite state. If we instead assume that the qubit's properties are already determined and that observing them changes nothing, the measurement results appear contradictory. This can be illustrated through the “glove box experiment”.

In the glove box experiment, there is a glove that could either be red or blue and can take the shape of either a right hand or a left hand. When we place the glove in a box, we don’t know its color or shape. If we want to determine the color and shape of the glove, we can do so in two ways.

The first way is to look at the color of the glove through a hole in the box. Let’s assume it appears to be red, but we still don’t know its shape. The second way is to close our eyes and use our hand to feel the shape of the glove. During this observation, it turns out to be a right-hand glove. We cannot conduct these two observations simultaneously. If we then check the glove's color again using the first method, it may now appear blue. This discrepancy arises because we assumed the glove's properties were deterministic and that observing them wouldn’t change anything. If instead the glove is in a superposition of shape and color, observing one property changes the other, similar to the behavior of a qubit.

Qubits can be measured in different bases. One common basis is the one corresponding to the north and south poles of the Bloch sphere. Similarly, we can measure a qubit with respect to any pair of antipodal points on the sphere. The probability of a particular outcome depends on how close the qubit is to the corresponding antipodal point: the closer it is, the higher the probability of collapsing to that point. Measuring in different bases reveals different properties of the qubit.

Combining this idea with the glove experiment, we can let the north pole of the Bloch sphere represent the red region and the south pole represent the blue region in Figure 6. The |+⟩ and |−⟩ states could represent the glove's shape, that is, whether it is a right or left hand. Just as we can measure different properties of the glove using different methods, we can measure various properties of qubits using different measurement bases. However, these measurements cannot be taken simultaneously.

Frame 40415.png

For instance, suppose we measure the color of the glove. The qubit ends up either at the north pole, representing red, or at the south pole, representing blue. Either pole is equally far from the two antipodal points that represent shape, so the qubit is now in an unbiased position with respect to shape. Therefore, when we then observe the shape of the glove, it is equally likely to be left or right.

In quantum computing, there are three primary bases for measuring qubits:

1. Z basis: Corresponds to the states |0⟩ and |1⟩ (north and south poles).

2. X basis: Corresponds to the states |+⟩ and |−⟩ (equator points related to shape in the analogy).

3. Y basis: Corresponds to the states |+i⟩ and |−i⟩, which are related to complex numbers.

For the purpose of this discussion, we'll focus on the general concepts of measurement without delving into the complexities of the Y basis.
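The glove analogy can be checked with a few lines of arithmetic. In the sketch below, a state is a pair of amplitudes for |0⟩ and |1⟩, and the probability of each outcome in a basis is the squared overlap with the corresponding basis state. Only the Z and X bases are included, and the mapping of "red" to |0⟩ is simply the assumption made in the analogy above.

```python
import math

S = 1 / math.sqrt(2)

# Each basis is a pair of states, written as (amplitude of |0>, amplitude of |1>).
BASES = {
    "Z": ((1, 0), (0, 1)),    # |0>, |1>  (color in the analogy)
    "X": ((S, S), (S, -S)),   # |+>, |->  (shape in the analogy)
}

def outcome_probs(state, basis):
    """Probabilities of the two outcomes when `state` is measured in `basis`."""
    probs = []
    for b in BASES[basis]:
        # Inner product <b|state>: conjugate the basis amplitudes.
        overlap = sum(complex(bi).conjugate() * si for bi, si in zip(b, state))
        probs.append(abs(overlap) ** 2)
    return tuple(probs)

red_glove = (1, 0)                       # qubit at the north pole
color = outcome_probs(red_glove, "Z")    # certain: red
shape = outcome_probs(red_glove, "X")    # 50-50: left or right
print(color, shape)
```

A qubit that gives a definite answer for color gives a completely random answer for shape, which is the sense in which the two measurements cannot both be certain.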

Frame 40416.png

Introduction to Quantum Gates

Any qubit in any state acts as a simple generative machine learning model. For a particular outcome, once we determine what state our qubit should be in, we need a method to achieve that state. This is where quantum gates come into play.

The concepts of gates in both classical and quantum computers function similarly. In classical computers, there are bits representing 1 and 0. To manipulate these bits, we use gates such as AND, OR, NOT, XOR, etc. In quantum computers, there are qubits instead of classical bits. Like the gates in classical computers, quantum gates are used to bring a qubit into a particular state. This article will discuss two fundamental kinds of gates: rotation gates and entanglement gates.

Rotation Gates: Manipulating Qubit States

The task of the rotation gate is to rotate the qubit. For example, if a qubit is in state 0 and we want to rotate it to a particular position, we use a rotation gate. To do this, we need the correct set of rotation gates. There are three fundamental rotation gates, each corresponding to one of the fundamental axes: X, Y, and Z. Along any of these axes, we can rotate by an angle θ. These gates are called RX(θ), RY(θ), and RZ(θ) respectively.

Frame 40417 (1).png

For example, suppose we use a rotation gate to simulate the coin flip. The qubit starts in the |0⟩ state (the north pole of the Bloch sphere), and after we send it through the rotation gate RY(π/2), it reaches a perfect 50-50 superposition of |0⟩ and |1⟩. It is then measured in the standard (Z) basis. Since the qubit is in a 50-50 superposition, it can collapse to either |0⟩ or |1⟩. For this particular measurement, let’s assume it goes to |1⟩ (the south pole of the Bloch sphere). If we run this many times, we get roughly 50% heads (north pole observations) and 50% tails (south pole observations). Therefore, this rotation gate acts as a model of a fair coin flip. Likewise, by choosing a different rotation angle, we can model a biased coin flip with the same type of gate.
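The effect of the RY gate can be verified by multiplying its 2×2 matrix with the starting state. The sketch below uses the common convention RY(θ) = [[cos(θ/2), −sin(θ/2)], [sin(θ/2), cos(θ/2)]]; the helper names `ry` and `apply` are illustrative only.

```python
import math

def ry(theta: float):
    """Matrix of a rotation by theta about the Y axis of the Bloch sphere."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))

def apply(gate, state):
    """Multiply a 2x2 gate with a state (amplitude of |0>, amplitude of |1>)."""
    (a, b), (c, d) = gate
    x, y = state
    return (a * x + b * y, c * x + d * y)

zero = (1.0, 0.0)                                   # north pole, |0>

fair = apply(ry(math.pi / 2), zero)                 # rotate to the equator
p_heads, p_tails = abs(fair[0]) ** 2, abs(fair[1]) ** 2

biased_angle = 2 * math.acos(math.sqrt(0.6))        # angle for a 60/40 coin
biased = apply(ry(biased_angle), zero)
p_heads_biased = abs(biased[0]) ** 2

print(round(p_heads, 3), round(p_tails, 3), round(p_heads_biased, 3))
```

The single angle θ is the only parameter of this tiny model, and it fully determines the bias of the simulated coin.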

Frame 40418.png

For more complex scenarios, such as modeling two unbiased coin flips, where each of the four possible outcomes occurs 25% of the time, we can use two qubits whose measurements are independent of each other.
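For independent qubits, the probability of a joint outcome is just the product of the individual probabilities. A small sketch, with a hypothetical helper `joint_distribution`, shows the 25% figure for two fair coins:

```python
from itertools import product

def joint_distribution(p_heads_each):
    """Distribution over bit strings for independent qubits.

    p_heads_each[i] is the probability that qubit i is measured as 0 (heads).
    """
    dist = {}
    for bits in product("01", repeat=len(p_heads_each)):
        p = 1.0
        for bit, p0 in zip(bits, p_heads_each):
            p *= p0 if bit == "0" else 1 - p0
        dist["".join(bits)] = p
    return dist

two_fair = joint_distribution([0.5, 0.5])
print(two_fair)  # {'00': 0.25, '01': 0.25, '10': 0.25, '11': 0.25}

two_biased = joint_distribution([0.6, 0.6])   # both coins favor heads
print(two_biased)
```

Any distribution produced this way factorizes into per-qubit probabilities, so independent rotations alone can only model coins that do not influence each other.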

Entanglement Gates: Correlating Qubit Measurements

When dealing with two coins that give biased results (for example, both coins are more likely to give heads), the situation can still be represented using two independent qubits: we simply rotate each qubit so that it is more likely to be measured as heads. The harder problem arises when we consider two coins that each give 50% heads and 50% tails, but where the mixed outcomes (one head and one tail) have 0% probability. If we use two independent coins, no matter how many times we try, we cannot achieve such a result. Therefore, the only conclusion we can draw is that the two coins are somehow correlated. This is where the idea of entanglement comes in.

Quantum entanglement means that the measurement results of two qubits become correlated once the qubits have been entangled. Even when we separate the entangled qubits and move them far apart, the measurement result of one qubit remains correlated with the measurement result of the other. This phenomenon is what Einstein referred to as "spooky action at a distance." Note that these correlations cannot be used to send information faster than the speed of light. For the purposes of this article, the general idea of entanglement is sufficient. What matters for machine learning is that there are different levels of entanglement.

For example, two unbiased, independent coin flips are not entangled at all. On the other hand, two coins that only ever land heads-heads or tails-tails are in a maximally entangled state, known as a Bell state. Between the unentangled and maximally entangled extremes, there is a family of gates controlled by an angle. At an angle of zero, the gate leaves the qubits unentangled. As we increase the angle, the correlation between the two qubits gets stronger. This type of gate is called the Mølmer–Sørensen gate. For the specific entangled coin example discussed above, we can use the YY(θ) gate.
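The way the angle controls the correlation can be seen by applying a YY(θ) gate to two qubits that both start in |0⟩. The sketch below assumes the convention YY(θ) = exp(−i·θ/2·Y⊗Y) = cos(θ/2)·I − i·sin(θ/2)·Y⊗Y, which is one common definition; other sources may scale the angle differently.

```python
import math

Y = ((0, -1j), (1j, 0))   # Pauli-Y matrix

def yy_gate(theta: float):
    """4x4 matrix cos(theta/2) I - i sin(theta/2) (Y tensor Y)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    gate = [[0j] * 4 for _ in range(4)]
    for row in range(4):
        for col in range(4):
            # Entry of Y tensor Y: product of the entries for each qubit.
            yy = Y[row >> 1][col >> 1] * Y[row & 1][col & 1]
            identity = 1 if row == col else 0
            gate[row][col] = c * identity - 1j * s * yy
    return gate

def distribution(theta: float):
    """Probabilities of 00, 01, 10, 11 after YY(theta) acts on |00>."""
    gate = yy_gate(theta)
    state = [gate[row][0] for row in range(4)]   # first column = image of |00>
    return [abs(a) ** 2 for a in state]

for theta in (0.0, math.pi / 4, math.pi / 2):
    print(round(theta, 3), [round(p, 3) for p in distribution(theta)])
```

At θ = 0 the result is always 00. At θ = π/2 the outcomes 00 and 11 each occur with probability 1/2, while 01 and 10 never occur, which is the perfectly correlated pair of coins described above.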

Frame 40419.png

Quantum Circuit Topologies

After scientists learned how to rotate and align qubits as desired, they realized that building an arbitrary target state is still not an easy task. This can be illustrated by a coin flip example with four different coins, where each coin individually yields 50% heads and 50% tails, but some of the 16 possible combinations of results have zero probability. This represents a highly entangled situation, as mentioned above.

This scenario can be modeled using a quantum circuit. The implementation of these particular problems follows a standard approach. Instead of building circuits from scratch for every situation, scientists fix the layout of the circuit and focus on tuning the angles of its gates. This is similar to most machine learning approaches, where we use a fixed architecture and find the optimal parameters that yield the best solutions.

There are three standard approaches, also known as circuit topologies, discussed in this article:

1. Star Topology: In this topology, all Y-Y gates connect the first qubit to all the others.

Frame 40420.png

2. Line Topology: In this topology, each qubit is connected to the next one in a line.

Frame 40421.png

3. All Topology: This is the most complex topology, where each pair of qubits is connected.

Frame 40422.png
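One simple way to think about the three topologies is as lists of qubit pairs, where each pair receives one entangling gate. The sketch below numbers the qubits from 0 and uses function names made up for this illustration.

```python
from itertools import combinations

def star(n: int):
    """The first qubit is paired with every other qubit."""
    return [(0, k) for k in range(1, n)]

def line(n: int):
    """Each qubit is paired with the next one."""
    return [(k, k + 1) for k in range(n - 1)]

def all_pairs(n: int):
    """Every pair of qubits is connected."""
    return list(combinations(range(n), 2))

print(star(4))       # [(0, 1), (0, 2), (0, 3)]
print(line(4))       # [(0, 1), (1, 2), (2, 3)]
print(all_pairs(4))  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

For four qubits, the star and line topologies each need three entangling angles, while the all topology needs six, which is one reason it is the most expensive to train.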

Just like in regular neural networks, we can stack several layers of quantum circuits to build models. Such circuits are known as Quantum Circuit Born Machines (QCBMs), introduced by Marcello Benedetti, Delfina Garcia-Pintos, Oscar Perdomo, Vicente Leyton-Ortega, Yunseong Nam, and Alejandro Perdomo-Ortiz.

Training a Quantum Machine Learning Model

When building a model for the coin flip problem, the goal is to measure the quantum circuit multiple times to obtain a specific output distribution, which can be represented as a histogram.

Frame 40423.png

To achieve these desired outputs, we need to adjust the angles of the gates so that the measured distribution matches the target distribution. This process is the quantum counterpart of model training in machine learning.

To illustrate this concept, we use a new dataset. Although the dataset has changed, the underlying idea remains similar to the four-coin flip example. Instead of using zeros and ones, we use a "bars and stripes" dataset. This dataset is commonly used in generative models and consists of images inside a given rectangle that represent various colorings of vertical bars or horizontal stripes.

Each element in the bars and stripes dataset can be encoded using bit strings. For example, one color might be represented as 1 and the other as 0. The 16 possible ways to color a 2x2 grid of boxes correspond to all possible bit strings of length four. This representation is analogous to the four-coin flip scenario.
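To see which of the 16 bit strings actually belong to the dataset, we can enumerate them. The sketch below assumes the grid is read row by row and that a valid image has either every row in a single color (stripes) or every column in a single color (bars). Giving every valid pattern the same probability is also an assumption made here for illustration.

```python
from itertools import product

def is_bars_or_stripes(bits: str, rows: int = 2, cols: int = 2) -> bool:
    """True if the row-major bit string has uniform rows or uniform columns."""
    grid = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
    stripes = all(len(set(row)) == 1 for row in grid)         # each row one color
    bars = all(len({grid[r][c] for r in range(rows)}) == 1    # each column one color
               for c in range(cols))
    return bars or stripes

all_strings = ["".join(b) for b in product("01", repeat=4)]
valid = [s for s in all_strings if is_bars_or_stripes(s)]
print(valid)  # ['0000', '0011', '0101', '1010', '1100', '1111']

# Target distribution: uniform over valid patterns, zero elsewhere.
target = {s: (1 / len(valid) if s in valid else 0.0) for s in all_strings}
```

Only 6 of the 16 strings are valid, so the target histogram has six equal bars and ten empty bins.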

Frame 40424.png

The figure above shows the bars and stripes dataset alongside its corresponding probability distribution. The model training aims to mimic this distribution using Quantum Circuit Born Machines (QCBMs).

Quantum machine learning models resemble neural networks. Just as neural network training involves adjusting parameters to achieve the optimal solution, the training of QCBMs involves finding the optimal angles and entanglements to match the desired distribution.

The model training process will be explained using the star topology:

1. Initial State: All qubits start in the zeroth state. The topology includes layers of Y rotations, Z rotations, and entangling gates.

Frame 40425.png

2. Y Rotations: The qubits are first rotated in the Y direction.

Frame 40426.png

3. Z Rotations: Next, the qubits are rotated in the Z direction.

Frame 40427.png

4. Entanglement: The first two qubits are entangled in a specific direction.

Frame 40428.png

5. Further Entanglement: The first qubit is then entangled with the third qubit.

Frame 40429.png

6. Additional Entanglement: Finally, the first qubit is entangled with the fourth qubit.

Frame 40430.png

After entanglement, the qubits are measured. The measurement collapses each qubit to either the north pole or the south pole. For this example, assume that after measurement, the qubits end up in the following positions: up (head), down (tail), down (tail), and up (head). In this run, the circuit has transformed the state from 0000 to 0110.

Frame 40431.png

If the circuit is run again, the outcome may differ. The training process involves iterating to get the output histogram closer to the expected distribution.

Training begins with random angles, such as θ1 to θ11. The measured histogram will initially look like this:

Frame 40432.png

To achieve the desired output histogram, the training process adjusts the angles to minimize the difference between the measured and expected histograms. This process involves finding the optimal angles that produce the closest possible match to the target histogram.
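The star-topology circuit described above can be simulated classically for four qubits. The sketch below is a small state-vector simulation, not the authors' implementation. It assumes one RY and one RZ angle per qubit plus one YY angle per entangling gate, which gives 4 + 4 + 3 = 11 angles, and it assumes the convention YY(θ) = exp(−i·θ/2·Y⊗Y). All function names are illustrative.

```python
import cmath
import math
import random

N = 4  # number of qubits; qubit 0 is the leftmost bit of each bit string

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of the state vector."""
    shift = N - 1 - q
    new = list(state)
    for i in range(len(state)):
        if not (i >> shift) & 1:            # index where qubit q is 0
            j = i | (1 << shift)            # partner index where qubit q is 1
            a, b = state[i], state[j]
            new[i] = gate[0][0] * a + gate[0][1] * b
            new[j] = gate[1][0] * a + gate[1][1] * b
    return new

def ry(t):
    c, s = math.cos(t / 2), math.sin(t / 2)
    return ((c, -s), (s, c))

def rz(t):
    return ((cmath.exp(-0.5j * t), 0), (0, cmath.exp(0.5j * t)))

def apply_yy(state, t, q1, q2):
    """Apply exp(-i t/2 Y(x)Y) = cos(t/2) I - i sin(t/2) Y(x)Y to qubits q1, q2."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    s1, s2 = N - 1 - q1, N - 1 - q2
    new = []
    for i, amp in enumerate(state):
        j = i ^ (1 << s1) ^ (1 << s2)       # Y(x)Y flips both bits
        same = ((i >> s1) & 1) == ((i >> s2) & 1)
        sign = -1 if same else 1            # matrix element of Y(x)Y for row i
        new.append(c * amp - 1j * s * sign * state[j])
    return new

def circuit_probs(thetas):
    """Distribution over 4-bit strings produced by the 11-angle star circuit."""
    state = [0j] * 2 ** N
    state[0] = 1 + 0j                                     # start in |0000>
    for q in range(N):
        state = apply_1q(state, ry(thetas[q]), q)         # angles 1-4
    for q in range(N):
        state = apply_1q(state, rz(thetas[N + q]), q)     # angles 5-8
    for k, q in enumerate(range(1, N)):
        state = apply_yy(state, thetas[2 * N + k], 0, q)  # angles 9-11
    return {format(i, "04b"): abs(a) ** 2 for i, a in enumerate(state)}

rng = random.Random(1)
thetas = [rng.uniform(0, 2 * math.pi) for _ in range(11)]   # random start
probs = circuit_probs(thetas)

# Each run of the circuit yields one bit string; many runs give a histogram.
shots = rng.choices(list(probs), weights=list(probs.values()), k=1000)
histogram = {bits: shots.count(bits) for bits in probs}
print(histogram)
```

With random angles the histogram generally looks nothing like the target. Training consists of changing the 11 entries of `thetas` until it does.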

A key challenge in training is determining the most suitable architecture. Similar to general machine learning, starting with a simple architecture and then refining it can be effective. A more complex model increases the likelihood of finding a suitable set of angles but also extends the training time. Therefore, finding the simplest model that produces suitable results is crucial and requires testing various architectures.

The ultimate challenge is finding the optimal set of angles. This is where optimization techniques come into play, such as Data-Driven Quantum Circuit Learning (DDQCL). Initially, a random histogram is obtained through the quantum circuit. The iterative process starts by comparing the obtained histogram with the expected histogram. The goal is to adjust the qubit angles to minimize the discrepancy between the measured and target results.

Frame 40433.png

Several loss functions are used for this comparison, including KL Divergence, log-likelihood, and Maximum Mean Discrepancy. To update the angles, either gradient descent or gradient-free methods can be employed. Gradient-free methods such as CMA-ES and Particle Swarm Optimization are often used.
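The compare-and-adjust loop can be illustrated on the smallest possible model: a single qubit with one RY angle, trained to reproduce a 60/40 coin. The sketch below uses exact probabilities instead of sampled histograms and plain gradient descent with a finite-difference gradient. The learning rate, step size, and starting angle are arbitrary choices for this example rather than recommended settings.

```python
import math

TARGET = (0.6, 0.4)   # target histogram: 60% heads, 40% tails

def model(theta: float):
    """Output distribution of RY(theta) applied to |0>, measured in the Z basis."""
    p0 = math.cos(theta / 2) ** 2
    return (p0, 1 - p0)

def kl(p, q, eps: float = 1e-12):
    """KL divergence D(p || q); eps guards against log(0)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def loss(theta: float) -> float:
    return kl(TARGET, model(theta))

theta = 0.3             # arbitrary starting angle
lr, h = 0.5, 1e-5       # learning rate and finite-difference step
for step in range(200):
    grad = (loss(theta + h) - loss(theta - h)) / (2 * h)   # central difference
    theta -= lr * grad                                     # move downhill

learned = model(theta)
print(f"theta = {theta:.4f}, heads = {learned[0]:.4f}, loss = {loss(theta):.2e}")
```

The same loop applies to a circuit with many angles; the gradient then has one entry per angle, and gradient-free optimizers replace the update rule while leaving the loss unchanged.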

Optimizing Quantum Machine Learning Models: Loss Functions and Techniques

In quantum machine learning, optimizing the parameters of a quantum circuit involves evaluating how well the circuit's output matches the desired target distribution. Several loss functions are used for this purpose, each measuring the difference between two distributions in different ways.

KL Divergence (Kullback–Leibler Divergence)

KL Divergence is a widely used metric to compare two probability distributions. It quantifies how one probability distribution diverges from a second, reference probability distribution. In the context of quantum machine learning, KL Divergence is used to measure how closely the histogram obtained from a quantum circuit matches the expected histogram.

KL Divergence is particularly useful because it returns smaller values when the histograms are similar and larger values when they are dissimilar. This allows us to quantify the difference between the observed and target distributions effectively.

For example, if we have a target distribution and a distribution obtained from our quantum circuit, KL Divergence will help us understand how well our circuit is performing by giving us a numerical value that reflects the degree of discrepancy between these distributions.

In practice, optimizing a quantum circuit involves minimizing the KL Divergence between the obtained histogram and the target histogram. This process helps fine-tune the parameters of the quantum circuit to achieve the desired outcomes.
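For discrete histograms, the KL Divergence of a model distribution Q from a target distribution P is the sum over all outcomes x of P(x)·log(P(x)/Q(x)). A minimal sketch is shown below; clipping small model probabilities to `eps` is a practical assumption added here so that outcomes the model never produced do not cause a division by zero.

```python
import math

def kl_divergence(target: dict, model: dict, eps: float = 1e-9) -> float:
    """D_KL(target || model) for two distributions given as dictionaries."""
    total = 0.0
    for outcome, p in target.items():
        if p > 0:                                   # terms with p = 0 contribute 0
            q = max(model.get(outcome, 0.0), eps)   # avoid log of zero
            total += p * math.log(p / q)
    return total

target = {"H": 0.6, "T": 0.4}
close = {"H": 0.55, "T": 0.45}
far = {"H": 0.1, "T": 0.9}

print(kl_divergence(target, target))  # 0.0
print(kl_divergence(target, close))   # small, about 0.005
print(kl_divergence(target, far))     # much larger, about 0.75
```

Note that KL Divergence is not symmetric: swapping the target and the model generally gives a different value, so the order of the arguments matters when it is used as a loss.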

Conclusion

Quantum Machine Learning (QML) represents a transformative fusion of quantum computing and traditional machine learning techniques. By leveraging the principles of quantum mechanics, such as superposition and entanglement, QML offers novel ways to model and solve complex problems that classical approaches may struggle with.

Through the use of quantum circuits, gates, and advanced techniques like quantum entanglement and rotation gates, we can create powerful generative models capable of mimicking complex data distributions. The training of these models, akin to optimizing neural networks in classical machine learning, involves adjusting quantum circuit parameters to align with desired outcomes. Loss functions like KL Divergence play a crucial role in this optimization process, helping us measure and minimize discrepancies between the model’s output and the target distribution.

As the field of quantum computing continues to evolve, the integration of QML holds immense potential for breakthroughs in various domains, from drug discovery to optimization problems. The ability to harness quantum mechanics for machine learning tasks opens up new avenues for research and application, promising advancements that could reshape industries and expand our understanding of complex systems.

While QML is still an emerging field with challenges to overcome, its advancements signify a promising frontier where quantum computing and machine learning intersect. As researchers and practitioners continue to refine these techniques and explore their possibilities, we move closer to realizing the full potential of quantum-enhanced machine learning.
