Little Data With a Big Solution Space — the Sweet Spot for Quantum Computing

  1. In a nutshell
  2. Quantum computer as a coprocessor
  3. Centrality of quantum parallelism
  4. Quantum advantage is the whole point
  5. Quantum supremacy
  6. Combinatorial explosion
  7. The essential goal: exponential speedup
  8. Big solution space
  9. Little data — limited input data, limited results
  10. Little computation — quantum computations must be simple
  11. No, large computations are not appropriate for quantum computing
  12. Big data
  13. 3 Vs of big data
  14. Big data data models
  15. Structured big data is not amenable to quantum computing
  16. Big data graphs are not amenable to quantum computing either
  17. Big data images are not amenable to quantum computing
  18. Big data text is not amenable to quantum computing
  19. Semi-structured big data is not amenable to quantum computing
  20. Big data media is not amenable to quantum computing
  21. 3 Vs of big data are not amenable to quantum computing
  22. Caveat — no good examples, yet
  23. Variational methods as little data
  24. Hybrid quantum/classical computing
  25. Shor’s algorithm exemplifies little data with big solution space
  26. Quantum computers have no I/O or database access
  27. Big data with a quantum computer
  28. Hadamard transform to achieve parallelism
  29. The basic model for quantum parallel computation
  30. The challenge: extracting results
  31. Extracting values from a large solution space
  32. Using interference to select values from a large solution space
  33. Quantum phase estimation (QPE) to extract a value
  34. Quantum Fourier transform (QFT) to extract a value
  35. How many qubits does a quantum computer need?
  36. Solution space size
  37. Shot count or circuit repetitions
  38. Circuit repetitions could dramatically reduce any quantum advantage
  39. Pay attention to net or effective quantum advantage
  40. Don’t forget wall clock time
  41. Classical sampling, Monte Carlo simulation
  42. Approximate solutions
  43. Input data and parameters for a quantum circuit
  44. Generally, at least twenty qubits will be needed for significant quantum advantage
  45. Generally, over fifty qubits will yield a dramatic quantum advantage — and even quantum supremacy
  46. But what about super-exponential problems?
  47. Will quantum error correction (QEC) be required?
  48. Where are we today?
  49. Will quantum computing ever be able to process big data?
  50. Caveat for D-Wave Systems
  51. What about IBM’s new quantum roadmap for one million qubits?
  52. Future advanced quantum computer architectures
  53. Universal quantum computers
  54. Conclusions

In a nutshell

At its heart, quantum computing is all about quantum parallelism, which is the evaluation of a relatively simple computation over a very big solution space.

  1. Little data in.
  2. Little computation in a big solution space. Quantum parallelism.
  3. Little data out.
  4. Repeat enough times to get a statistically meaningful result.

Quantum computer as a coprocessor

Quantum computing is not expected to fully replace classical computing any time soon. In fact, there won’t ever be any applications which are 100% pure quantum. Rather, applications will be a combination of classical and quantum code, referred to as hybrid quantum/classical computing.

Centrality of quantum parallelism

The essence of quantum parallelism is to perform a single computation once, but with an n-qubit register initialized to a superposition of all 2^n possible binary values of a parameter of the computation. It takes some cleverness and tricks to extract a single result from that parallel computation, but that’s the essence of quantum computing.

Quantum advantage is the whole point

The overall goal of quantum computing is quantum advantage — that a quantum computer will complete a task in far less time than a classical computer. Not just 10% to 20% faster, not just 40% to 50% faster, not just 200% or 500% faster, not just 10 or 50 or 100 times faster, but many thousands of times faster, millions of times faster, billions of times faster, or even more.

Quantum supremacy

Taken to the extreme, some tasks take so long that they simply can’t be practically implemented — and completed in our lifetimes — on a classical computer, a very large cluster of classical servers, or even a classical supercomputer. If such a task can be implemented on a quantum computer, this is not only a quantum advantage but is also called quantum supremacy.

Combinatorial explosion

Generally, problems are harder — take more time — to solve as they get bigger. The question is at what rate the time to complete a task grows as the amount of input data grows. As long as that rate of growth is fairly low, you can probably solve the problem on a classical computer. But if the rate of growth is high, the problem gets harder and harder to solve classically, to the point where eventually it takes too long to be practical or tolerable.

The essential goal: exponential speedup

The quality that is needed to give a quantum computer a quantum advantage and to tackle problems with combinatorial explosions is called exponential speedup. This is the essential goal of quantum computing.

Big solution space

A combinatorial explosion of data produces what is called a big solution space. A solution space is simply the full range of possible combinations to be considered to solve a problem.

Little data — limited input data, limited results

Even with the power of quantum parallelism it is not possible to process a large amount of input data or to produce a large amount of output data. This is why this paper refers to little data.

  1. Small amount of input data. And parameters.
  2. Relatively simple calculation applied in parallel to a very big number of combinations of the limited amount of input data. This is the big solution space.
  3. Extraction of a small amount of result data.

Little computation — quantum computations must be simple

Although a quantum computer could evaluate a computation over all values in a big solution space in parallel, the actual computation itself must be relatively simple.

No, large computations are not appropriate for quantum computing

As already indicated, quantum computing is focused on little computations evaluated over a big solution space. Large computations consisting of large amounts of code and requiring significant memory or mass storage or I/O or database access are not going to be appropriate — or even possible — on a quantum computer alone.

Big data

Big data is a vague and ambiguous term, but for the purposes of discussion here, it refers to a large number of rows or records of data — millions or billions of rows or records, or gigabytes (billions of bytes) or terabytes (trillions of bytes) of data, or even larger.

3 Vs of big data

Just for reference, big data is commonly characterized with three terms which each begin with a V — the 3 Vs of big data:

  1. Volume. Lots of data.
  2. Velocity. At a high rate.
  3. Variety. In a lot of different formats.

Big data data models

Not all big data is created equal. Big data comes in a variety of forms:

  1. Rows. Large number of linear rows of structured data.
  2. Graphs. With large numbers of nodes and edges.
  3. Images. Large numbers of large images, possibly with deep color.
  4. Text. Large amounts of unstructured text. Or semi-structured text.
  5. Semi-structured data. A combination of any of the above. Such as structured documents or XML files.
  6. Media. Audio, video, etc. Large sizes, in complex data formats.

Structured big data is not amenable to quantum computing

As a coprocessor, a quantum computer is well-suited to performing a single computation — perhaps in parallel across a large number of discrete values, but still only a single computation — while structured data involves a number of distinct data values, each requiring its own computation.

Big data graphs are not amenable to quantum computing either

Graph databases are a special category of big data, where the data is not organized as a linear sequence of rows, but as a mathematical graph with nodes and edges. A lot of specialized classical code can be optimized to traverse and process graphs, but none of that is relevant to quantum computing.

Big data images are not amenable to quantum computing

Although very tiny images — a few dozen pixels, or very small thumbnails — can be processed on a quantum computer, large images such as a typical photograph (megapixels) with significant color depth are well beyond the limited capabilities of both current quantum computers and those projected over the next decade.

Big data text is not amenable to quantum computing

Large quantities of text (unstructured data), such as log files, news, documents, books, etc., are not readily processed by current or projected quantum computers.

Semi-structured big data is not amenable to quantum computing

Since neither large quantities of structured data nor raw text are amenable to quantum computing, neither is semi-structured big data, such as large structured documents or large XML files.

Big data media is not amenable to quantum computing

Digital media, such as audio (music, podcasts) and video (YouTube videos, TV shows, feature-length movies), is completely inappropriate for processing on a quantum computer, being well beyond the limited capabilities of both current quantum computers and those projected over the next decade.

3 Vs of big data are not amenable to quantum computing

None of the three Vs of big data are amenable to quantum computing:

  1. Volume. No quantum advantage for big data.
  2. Velocity. Again no quantum advantage for big data.
  3. Variety. Not appropriate for quantum computing.

Caveat — no good examples, yet

Although this paper suggests that little data with a big solution space has real potential for quantum computing, unfortunately there are not yet any good or great examples of solutions which achieve this effect. The potential is there, but reality hasn’t yet caught up with the promise — and may not for quite a few years.

Variational methods as little data

Some problems, such as those in quantum computational chemistry, are simply too difficult for even the best of current quantum computers. For such problems, a compromise, hybrid approach is taken using what are called variational methods — an iterative approach in which classical code helps guide the iterations of the quantum logic (a minimal sketch follows the steps below):

  1. Pick an initial trial value. Possibly at random or guided by some a priori domain knowledge about the problem. This is called an ansatz.
  2. Perform one iteration of the quantum calculation, producing an intermediate result.
  3. Use classical code to examine the intermediate quantum result and optimize it to guide it in a presumably more fruitful path — or decide whether it is close enough to a final solution to the problem. This optimization yields a revised ansatz.
  4. Rinse and repeat. Repeat steps 2 and 3 until either the classical code decides the quantum result is close enough to a final solution, or if so many iterations have been attempted without achieving an acceptable solution that continuing to search for a solution is fruitless — and the computation fails.
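
Here is a minimal, purely illustrative Python sketch of this loop. The “quantum” evaluation is replaced by a noisy classical stand-in function — an assumption for illustration only, not a real quantum API.

    import random

    # Stand-in for one iteration of the quantum calculation; the noise
    # mimics the probabilistic character of real quantum results.
    def run_quantum_circuit(theta):
        return (theta - 1.3) ** 2 + random.gauss(0, 0.001)

    theta = random.uniform(-2, 2)   # step 1: initial trial value (ansatz)
    best = run_quantum_circuit(theta)
    for _ in range(200):            # step 4: rinse and repeat
        candidate = theta + random.choice((-0.1, 0.1))
        cost = run_quantum_circuit(candidate)  # step 2: one quantum iteration
        if cost < best:             # step 3: classical code examines the result
            theta, best = candidate, cost      # revised ansatz
        if best < 0.01:             # close enough to a final solution
            break

    print(f"theta = {theta:.2f}, cost = {best:.4f}")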

Hybrid quantum/classical computing

The proper term for what was just described is hybrid quantum/classical computing, where parts of the application are classical code while parts are quantum code.

Shor’s algorithm exemplifies little data with big solution space

I have my own reservations about the practicality of Shor’s algorithm for factoring large semiprime numbers (the product of two prime numbers) — useful for cracking public-key encryption — but it is the ultimate example of the concept of little data with a big solution space.

Quantum computers have no I/O or database access

It may seem shocking, but since a quantum computer operates as a simple coprocessor, it has no ability to perform I/O or to access files or a database, or to access any resources over a network.

Big data with a quantum computer

Since a quantum computer has no I/O, file or database access, or network access, it can only process a fairly small amount of data within a single quantum circuit. That doesn’t suggest a big opportunity for solving big data problems. Technically, one could in fact use a quantum computer to attack a big data problem, but since only a small amount of data can be processed at each step, there is little opportunity to achieve a true quantum advantage with exponential speedup. And if there is no substantial quantum advantage over a classical computer, there’s no reason to use a quantum computer. In short, a quantum computer is limited to:

  1. A very small amount of data for each computation.
  2. A relatively small number of those computations.

Hadamard transform to achieve parallelism

Without going into the gory details, the Hadamard transform in the form of the Hadamard gate (or H gate) is the key tool for achieving quantum parallelism.
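
As a rough illustration (a numpy simulation, not a real quantum device), applying an H gate to each of n qubits turns the all-zeros state into an equal superposition of all 2^n basis states:

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard gate

    n = 3
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)        # H applied to each of the n qubits

    state = np.zeros(2 ** n)
    state[0] = 1.0                 # start in |000>
    state = Hn @ state
    print(state)                   # eight equal amplitudes of 1/sqrt(8)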

The basic model for quantum parallel computation

The basic model for achieving exponential speedup is to evaluate a single computation in the context of a parameter which ranges over a big number of values. In particular, with n qubits, a total of 2^n values, in the form of 2^n quantum states, can be evaluated.

  1. Initialize an n-qubit quantum register, using a Hadamard transform (an H gate on each qubit), to a superposition of all 2^n discrete values of the parameter to be varied.
  2. Perform the desired computation on the n-qubit register. Any input values must be encoded in the gate structure of the quantum circuit.
  3. Transform the 2^n computation results (quantum states) to an n-qubit result register, such as using quantum phase estimation (QPE), a quantum Fourier transform (QFT), or some other method which exploits quantum interference to map the 2^n computed values (quantum states) to a single n-qubit value to be extracted from the solution space.
  4. Measure the n-qubit result register, producing an n-bit classical result.
  5. Perform classical processing on the n-bit classical measured result.
  6. Repeat the above to develop an expectation value (distribution) for the results. The shot count input parameter specifies how many circuit repetitions to perform. Exactly how to select the actual result from the distribution will depend on the application.
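
To make the recipe concrete, here is a small numpy simulation of the pattern, using the Bernstein–Vazirani algorithm as the “simple computation”: a hidden bit string s (the little data) is encoded as a phase oracle in step 2, and a second Hadamard transform provides the interference in step 3, concentrating all the probability on the answer.

    import numpy as np

    n = 4                      # qubits in the parallel register
    s = 0b1011                 # the little data: a hidden bit string to recover
    N = 2 ** n

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)

    state = np.zeros(N)
    state[0] = 1.0
    state = Hn @ state         # step 1: superposition of all 2^n parameter values
    for x in range(N):         # step 2: oracle flips the phase by (-1)^(s . x)
        if bin(s & x).count("1") % 2:
            state[x] = -state[x]
    state = Hn @ state         # step 3: interference maps all weight onto |s>
    probs = np.abs(state) ** 2 # step 4: measurement statistics
    print(f"measured |{np.argmax(probs):0{n}b}> with probability {probs.max():.3f}")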

The challenge: extracting results

The great challenge when evaluating a computation across a big solution space is exactly how to extract results. There is no easy answer for all situations. The answer is simply that you have to be clever. That is the focus of countless academic papers on quantum computing.

Extracting values from a large solution space

Three of the main approaches to extracting a result value from a large solution space are:

  1. Interference
  2. Quantum phase estimation (QPE)
  3. Quantum Fourier transform (QFT)

Using interference to select values from a large solution space

This paper won’t go into the details, but interference is one of the primary clever tricks used to select values from a large solution space. As with classical wave interference, two waves can reinforce or cancel each other — with cancellation causing a quantum state to be ignored or discounted, and reinforcement causing a quantum state to be selected or weighed more heavily.
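
The arithmetic behind this is simple: amplitudes for paths leading to the same outcome add before the probability is taken, so equal amplitudes reinforce and opposite amplitudes cancel. A trivial sketch:

    # Two computational paths of amplitude 0.5 each feed the same outcome.
    a = 0.5
    print(f"constructive: |a + a|^2 = {abs(a + a) ** 2}")   # 1.0 -- selected
    print(f"destructive:  |a - a|^2 = {abs(a - a) ** 2}")   # 0.0 -- cancelled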

Quantum phase estimation (QPE) to extract a value

One specialized method to exploit quantum interference to extract a value from the large solution space of a quantum computation is quantum phase estimation (QPE). Details are beyond the scope of this paper.

Quantum Fourier transform (QFT) to extract a value

Another specialized method to exploit quantum interference to extract a value from the large solution space of a quantum computation is the quantum Fourier transform (QFT). Details are beyond the scope of this paper.

How many qubits does a quantum computer need?

When we say that n qubits support 2^n quantum states or 2^n discrete values, that is not to say that the quantum computer needs only n qubits. These n qubits are the ones that participate in a parallel computation, but the quantum circuit may need more qubits, even many more qubits for its full computation.

Solution space size

Some examples for solution space size based on the number of qubits used to represent a range of values:

  1. 4 qubits = 2⁴ = 16 quantum states (or discrete values).
  2. 6 qubits = 2⁶ = 64 quantum states.
  3. 8 qubits = 2⁸ = 256 quantum states.
  4. 10 qubits = 2¹⁰ = 1024 = 1K quantum states.
  5. 12 qubits = 2¹² = 4096 = 4K quantum states.
  6. 16 qubits = 2¹⁶ = 64K quantum states.
  7. 18 qubits = 2¹⁸ = 256K quantum states.
  8. 20 qubits = 2²⁰ = 1M (one million) quantum states.
  9. 24 qubits = 2²⁴ = 16M (sixteen million) quantum states.
  10. 28 qubits = 2²⁸ = 256M (256 million) quantum states.
  11. 30 qubits = 2³⁰ = 1G (one billion) quantum states.
  12. 40 qubits = 2⁴⁰ = 1T (one trillion) quantum states.
  13. 50 qubits = 2⁵⁰ = 1Q (one quadrillion or one million billion) quantum states.
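
These sizes follow directly from the 2^n formula, as a few lines of Python confirm:

    for n in (4, 6, 8, 10, 12, 16, 18, 20, 24, 28, 30, 40, 50):
        print(f"{n:>2} qubits = 2^{n} = {2 ** n:,} quantum states")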

Shot count or circuit repetitions

Even with the quantum parallelism of n qubits evaluating a calculation over 2^n discrete values, that apparent advantage must be discounted by the number of circuit repetitions (also called the shot count) needed to cope with the probabilistic nature of quantum computing.

Circuit repetitions could dramatically reduce any quantum advantage

If you’re really lucky, circuit repetitions could be merely dozens or low hundreds. But that’s probably for only a relatively small number of qubits.

  1. Shot count = 10. Very unrealistic, but to establish a baseline or best case. Reduces 1,000,000 to 1 advantage to 100,000 to 1.
  2. Shot count = 100. Also unrealistic. Reduces 1,000,000 to 1 advantage to 10,000 to 1.
  3. Shot count = 1,000. Possible, but not guaranteed. Reduces 1,000,000 to 1 advantage to 1,000 to 1.
  4. Shot count = 2,000. Possible, but not guaranteed. Reduces 1,000,000 to 1 advantage to 500 to 1.
  5. Shot count = 5,000. Reduces 1,000,000 to 1 advantage to 200 to 1.
  6. Shot count = 10,000. Reduces 1,000,000 to 1 advantage to 100 to 1.
  7. Shot count = 50,000. Reduces 1,000,000 to 1 advantage to 20 to 1.
  8. Shot count = 100,000. Reduces 1,000,000 to 1 advantage to 10 to 1.
  9. Shot count = 1,000,000. Reduces 1,000,000 to 1 advantage to 1 to 1 — no advantage at all.
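
The arithmetic in this list is just the raw parallelism divided by the shot count (treating 2^20, which is 1,048,576, as one million):

    def net_advantage(n_qubits: int, shots: int) -> float:
        """Raw 2^n parallelism discounted by the number of circuit repetitions."""
        return 2 ** n_qubits / shots

    for shots in (10, 100, 1_000, 2_000, 5_000, 10_000, 50_000, 100_000, 1_000_000):
        print(f"shots = {shots:>9,}: net advantage {net_advantage(20, shots):>11,.1f} to 1")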

Pay attention to net or effective quantum advantage

Just to reemphasize the point of the preceding section, the raw quantum advantage of a quantum algorithm may be exponential — 2^n — but that raw advantage must be discounted by the number of circuit repetitions needed to achieve a statistically meaningful result.

Don’t forget wall clock time

Many computations complete in such a tiny fraction of a second that nobody cares how long they really take. But the combination of a high shot count and practical limits on how many circuit repetitions can be performed in a second (e.g., low thousands, not millions) could cause a quantum computation to take multiple seconds, tens of seconds, or even minutes or hours. The Google quantum supremacy experiment took over three minutes — and it wasn’t attempting to do anything practical.
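
A rough back-of-the-envelope, assuming a repetition rate of 2,000 shots per second (an assumed figure, consistent with the “low thousands” just mentioned):

    shots = 100_000            # circuit repetitions needed
    reps_per_second = 2_000    # assumed repetition rate ("low thousands")
    print(f"{shots / reps_per_second:.0f} seconds of execution time")  # 50 seconds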

Classical sampling, Monte Carlo simulation

Even when the solution space is large, classical computing is not automatically out of the running. A variety of clever techniques can be used, such as sampling. Monte Carlo simulation is a popular sampling technique. For many problems a few hundred or a few thousand samples may be sufficient to gain a decent approximation of the overall solution space.
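
As a classic illustration of how few samples a decent approximation can need, here is a Monte Carlo estimate of pi from a few thousand random points:

    import random

    samples = 5_000
    inside = sum(
        1 for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    print(f"pi is approximately {4 * inside / samples:.3f} after {samples:,} samples")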

Approximate solutions

Sampling is one approach to achieving an approximate solution. Some problems really do require precise solutions, but for many problems approximate solutions are quite acceptable.

Input data and parameters for a quantum circuit

Since there is no way for a quantum circuit to read input data, any input data — our little data — as well as any input parameters must be encoded in the gate structure of the quantum circuit.
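
In simulation terms, hard-coding an input value k into the circuit amounts to placing an X gate on each qubit corresponding to a 1 bit of k, preparing the basis state |k>. A minimal numpy sketch:

    import numpy as np

    def encode_input(k: int, n: int) -> np.ndarray:
        """Prepare |k> on n qubits: an X gate (|1> instead of |0>) per set bit."""
        zero, one = np.array([1.0, 0.0]), np.array([0.0, 1.0])
        state = np.array([1.0])
        for bit in range(n - 1, -1, -1):   # most-significant qubit first
            state = np.kron(state, one if (k >> bit) & 1 else zero)
        return state

    print(np.argmax(encode_input(0b1010, 4)))   # 10, i.e. the basis state |1010>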

Generally, at least twenty qubits will be needed for significant quantum advantage

A computation involving twenty qubits would involve 2²⁰ or roughly one million quantum states. A one-million to one advantage would generally be considered a significant advantage that would not be easy to overcome using even very clever techniques on even a large cluster of classical computers. Factoring in shot count, here is roughly how many qubits a single parallel computation needs to preserve a net one million to one advantage (see the sketch after the lists below):

  1. Shot count = 10. 24 qubits.
  2. Shot count = 100. 27 qubits.
  3. Shot count = 1,000. 30 qubits.
  4. Shot count = 2,500. 32 qubits.
  5. Shot count = 5,000. 33 qubits.
  6. Shot count = 10,000. 34 qubits.
  7. Shot count = 25,000. 35 qubits.
  8. Shot count = 50,000. 36 qubits.
  9. Shot count = 100,000. 37 qubits.
  10. Shot count = 250,000. 38 qubits.
  11. Shot count = 1,000,000. 40 qubits.
  12. Shot count = 10,000,000. 44 qubits.
Conversely, assuming a shot count of 1,000 (1K), a computation using fewer than twenty qubits retains little or no net advantage:

  1. 19 qubits = 512K quantum states. Net quantum advantage 512.
  2. 18 qubits = 256K quantum states. Net advantage 256.
  3. 17 qubits = 128K quantum states. Net advantage 128.
  4. 16 qubits = 64K quantum states. Net advantage 64.
  5. 15 qubits = 32K quantum states. Net advantage 32.
  6. 14 qubits = 16K quantum states. Net advantage 16.
  7. 13 qubits = 8K quantum states. Net advantage 8.
  8. 12 qubits = 4K quantum states. Net advantage 4.
  9. 11 qubits = 2K quantum states. Net advantage 2.
  10. 10 qubits = 1K quantum states. Net advantage 1 — no net advantage.
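
The first list above follows from a simple formula: the smallest n such that 2^n divided by the shot count still reaches one million to one, i.e., n = ceiling(log2(1,000,000 × shots)). A few lines of Python reproduce the table:

    import math

    def qubits_needed(shots: int, target_advantage: float = 1e6) -> int:
        """Smallest n with 2^n / shots >= target_advantage."""
        return math.ceil(math.log2(target_advantage * shots))

    for shots in (10, 100, 1_000, 2_500, 5_000, 10_000, 25_000,
                  50_000, 100_000, 250_000, 1_000_000, 10_000_000):
        print(f"shots = {shots:>10,}: {qubits_needed(shots)} qubits")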

Generally, over fifty qubits will yield a dramatic quantum advantage — and even quantum supremacy

Over fifty qubits in a single parallel computation is essentially a slam-dunk for quantum advantage — and even quantum supremacy as well.

But what about super-exponential problems?

A problem or algorithm with super-exponential complexity is one whose algorithmic complexity is worse than exponential. For example, the traveling salesman problem has factorial complexity, which grows even faster than exponential complexity.
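
A quick comparison shows how much faster factorial growth is:

    import math

    for n in (10, 20, 30):
        print(f"n = {n}: 2^n = {2 ** n:,}  vs  n! = {math.factorial(n):,}")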

Will quantum error correction (QEC) be required?

The model of computation discussed in this paper is independent of whether the physical quantum computer supports quantum error correction (QEC) or not.

Where are we today?

These days, we’re ecstatic when somebody can simply figure out how to solve a problem at all on a quantum computer, even if the solution is very restricted and doesn’t provide any quantum advantage at all.

Will quantum computing ever be able to process big data?

Never say never, but at present there isn’t even a speculative theoretical prospect for a quantum computer to be able to deliver an exponential speedup for big data. As noted, relatively small computations can be performed on a relatively small amount of input data. Any exponential speedup comes when the computation has a large solution space.

Caveat for D-Wave Systems

D-Wave Systems has a specialized quantum computer which comes with a hardwired application algorithm focused on a fairly narrow niche of optimization problems. It handles that niche fairly well, but it doesn’t have the gate-level flexibility or generality of other quantum computers.

What about IBM’s new quantum roadmap for one million qubits?

Just as I was about to finalize this paper, IBM released its roadmap to one million qubits:

  1. One million qubits — or even 5, 10, 20, 50, or 100 million qubits — just won’t cut it for truly big data. And even one million qubits is a whole decade away, or more.
  2. It’s not clear if one million qubits is physical qubits or logical qubits. Either way is still not enough for big data. I suspect it is one million physical qubits, which at 16 to 25 physical qubits per logical qubit would have a capacity of 40,000 to 62,500 logical qubits, which is certainly not appropriate for big data.
  3. The 1,121-qubit Condor system planned for late 2023 still has a relatively limited capacity for data.
  4. Still no prospect of I/O, file or database access, or network connectivity to access large amounts of data from within a quantum circuit.
  5. Still no prospect for rich data and media types.
  6. Still no prospect for structured data.
  7. Still no prospect for complex processing, especially for large amounts of data. Still focused on the little data with a big solution space model detailed in this paper. The roadmap does refer to “feedback and feed-forward system capabilities, where we’ll be able to control qubits based on classical conditions while the quantum circuit runs”, but that is too vague and sounds too limited to suggest true complex processing, let alone for large amounts of data.

Future advanced quantum computer architectures

The discussion in this paper is limited to existing notions of the design and architecture of quantum computers as known and practiced today and likely in the near future. Those notions may evolve dramatically in the coming years and decades, so quantum computers 5, 10, 15, and 20 years from now may operate in a radically different manner than they do today. Future architectures may also integrate more deeply with related quantum technologies:

  1. Quantum communication.
  2. Quantum networking.
  3. Quantum measurement.
  4. Quantum sensing.

Universal quantum computers

Someday we may see the advent of advanced universal quantum computers which truly and fully integrate the capabilities of both classical and quantum computers, possibly opening the door to quantum computing in the context of big data.

Conclusions

The key factors to consider:

  1. A quantum computation using anything less than 20 qubits will not deliver a substantial quantum advantage — like one million to one.
  2. Shot count significantly reduces the quantum advantage. If a shot count of 25,000 is needed, you would need a computation using at least 35 qubits for a single parallel computation before you would see a net quantum advantage of one million to one.
  3. Shot count may not scale well for some algorithms so that the net quantum advantage might be relatively small — 4 to 64 — even when a significant number of qubits are used.
  4. Some algorithms may always take longer than on a classical computer if the shot count is too high.
  5. A net quantum advantage of less than 256 may not be very useful since it is relatively easy to run the equivalent classical algorithm in parallel on that many classical machines.
  6. Sure, you may be able to demonstrate a quantum advantage with under 20 qubits, but it won’t be very impressive for computations using fewer than about 32 to 40 qubits.
  7. Although 50 qubits or so (maybe 53–60) is nominally enough to get to quantum supremacy, shot count requirements may add another 10 to 20 qubits to achieve quantum supremacy. 60–80 is a safer bet. And this is the number of qubits used in a single parallel computation, not total qubits on the machine or used by the overall algorithm.
  8. There may be applications which are very sensitive to wall clock time, so that even a very modest quantum advantage could be significant. But the potential of running the classical algorithm in parallel on dozens or hundreds of classical machines should be taken into account as well before deriving the true net quantum advantage.
  9. Even with IBM’s ten-year roadmap, we’re still limited to relatively small amounts of input data.
  10. The discussion here is based on current conceptions of quantum computer architectures — all bets are off for more advanced notions of quantum computer architectures, including some form of universal quantum computer which may in fact be perfect for super-big data.
