Shots and Circuit Repetitions: Developing the Expectation Value for Results from a Quantum Computer

Unlike classical computers, which are known for their predictable determinism, quantum computers are inherently probabilistic, a quality they inherit from the quantum mechanical physics underlying the implementation of their qubits, and they are made even less predictable by the variety of noise, errors, and environmental interference inherent in the technology of NISQ devices. The solution is to run the same quantum circuit many times and see which results occur most commonly. The average result over a number of repetitions of a quantum circuit (sometimes called shots) is roughly what physicists call the expectation value, and is the closest that a quantum computer can come to the deterministic results characteristic of a classical computer. The focus of this informal paper is to explore and raise awareness of the issues rather than to offer robust and definitive answers and recommendations.

Caveat: This informal paper explores expectation value, probability, and statistics to the extent needed by a typical quantum algorithm or application developer, and not necessarily in the depth or with the technical precision that would satisfy a physicist or mathematician.

Sections of this informal paper:

  1. Motivation
  2. The problem
  3. Two forms of uncertainty in a quantum computer
  4. Statistics to the rescue
  5. Statistical aggregation to develop the expectation value
  6. How does a classical computer avoid the inherent probabilistic nature of quantum mechanics?
  7. Apparent determinism = probability + statistics
  8. Stochastic processes
  9. Quantum computing seeks to exploit the inherent uncertainty of quantum mechanics
  10. Everything is fine, until you measure qubits
  11. Applying statistics to measurement
  12. Don’t forget the noise and errors
  13. Environmental interference
  14. Noise vs. errors vs. environmental interference
  15. Measurement errors
  16. Each qubit has its own noise and error profile
  17. Shot count, shots, or circuit repetitions
  18. How many shots or circuit repetitions are needed?
  19. You need to know what results to expect
  20. Expectation value
  21. Simplified concept of expectation value
  22. Raw quantum results
  23. Bit string cases
  24. Quantum registers
  25. Cases of series of results
  26. Possibility of garbage results
  27. What to do about outliers
  28. Actual vs. requested shot count
  29. What to do when there is no clear dominant result
  30. Just take the average of the results — or the most common result
  31. Manually examine all results
  32. What fraction of shots should be dedicated to probabilistic nature vs. noise?
  33. Testing to characterize effects of noise, errors, and environmental interference on results
  34. Four types of dynamic shot count
  35. Dynamic shot count based on parameters
  36. Automatic dynamic shot count?
  37. Won’t quantum error correction solve the problem?
  38. Wall clock time
  39. Context of a quantum simulator
  40. Design for portability
  41. Is 10 ever a reasonable shot count?
  42. IBM Qiskit default shots is 1024
  43. Is 100 shots reasonable?
  44. What is the maximum shots for a single circuit?
  45. No clear rules, methodology, or guidance for shot count
  46. What’s a good design goal for shot count when designing new algorithms?
  47. Rules of thumb for shot count
  48. Formulas for shot count
  49. Intuitions for shot count
  50. Need to balance shot count and wall clock time
  51. How many digits of precision are reasonable?
  52. How does precision affect shot count?
  53. Is it still too soon to worry too much about shot counts?
  54. One exception: true random number generation — only a single shot required
  55. Don’t confuse circuit repetitions with the iterations or layers of variational methods
  56. Don’t confuse circuit repetitions with experiments
  57. Iterations of variational algorithms
  58. Application experiments
  59. Watch out for total circuit executions
  60. Don’t get too used to the nuances and glitches of the quantum machine you used yesterday
  61. How does quantum volume impact shot count?
  62. What is the impact of shots on quantum advantage?
  63. What about quantum supremacy?
  64. Scaling shot count for Shor’s algorithm
  65. Advice to authors of papers on quantum algorithms
  66. Recommendations
  67. Conclusions
  68. What’s next?

Motivation

  1. Formal — or even informal — treatment of the question of how to determine how many shots (circuit repetitions) are appropriate for a particular quantum circuit.
  2. Any consistency in focus on the importance of shot count.
  3. How shot count should scale as the input data scales.
  4. How exactly specific changes in error rates and coherence times should impact shot count.
  5. Limitations of increasing shot count — at what point can you simply not get a statistically valid result from a quantum circuit due either to the nature of the circuit itself (probability) or the quantum hardware being too noisy.
  6. How to analyze a circuit to determine an optimum shot count.
  7. Formal formulas for computing shot count.
  8. Even informal rules of thumb for estimating shot count.
  9. Robust guidance for what the algorithm or application designer should do, in a methodical manner rather than by haphazard trial and error, when the vendor default for shot count (or 100 or 1,000) fails to deliver satisfactory results.
  10. Could shot count be computed or estimated automatically based on the circuit plus some statement of required precision and any wall clock time limit? In other words, based on parameters which the application developer is likely to know and understand.
  11. How exactly an application developer who needs a deterministic value should treat the probabilistic results from a quantum calculation. What statistical techniques can or should be used to derive an expectation value?
  12. How to tune shot count based on desired precision of expectation value.
  13. How extreme shot count requirements might impact quantum advantage in real-world scenarios.

This informal paper won’t provide definitive answers to any of these and other questions and issues, but will at least lay out the questions and issues to lay the groundwork for others to try to begin addressing them.

Without meaningful answers, the nascent quantum computing sector will have great difficulty in advancing very far from today’s laboratory curiosity phase to production-scale real-world applications which really do deliver on the promise of quantum advantage and quantum supremacy.

The problem

And that’s just for a super-trivial calculation. For more complex computations the uncertainty in the outcome grows dramatically.

Why is this? Why is the outcome of a quantum computation so uncertain?

Two forms of uncertainty in a quantum computer

  1. The inherent probabilistic nature of quantum computers and indeed of quantum mechanics itself.
  2. Noise and errors due to our inability to construct an ideal, perfect quantum computer at this time. This includes noise and errors in the physical components, as well as environmental interference from outside the quantum computer that we are currently unable to filter out well enough to prevent the sensitive components from being affected. It also includes decoherence: the quantum state of a qubit will decay over time.

Sure, over time we will be able to construct less-faulty and more ideal hardware, but even if all forms of noise and errors were absolutely eliminated, we are still stuck with the inherently probabilistic nature of quantum computers and quantum mechanics itself.

Statistics to the rescue

In much the same way that a flipped coin may sometimes come up heads and sometimes tails, and we can flip the coin a significant number of times to get the average fraction of the time it comes up heads, we can do the same with quantum computing: rerun the same quantum circuit many times and see what result we get on average, or most often.

Statistical aggregation to develop the expectation value

By looking at a single data point we cannot tell much about it.

But by collecting a significant number of data points, all of which are supposedly a measurement of the same observable phenomenon, now we can apply statistical methods, such as average (or mean), variance, frequency distribution, buckets, etc. to get a better view of where the true value — the expectation value — lies.
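
By way of illustration, here is a minimal Python sketch of such aggregation, using made-up measurement values (a hypothetical 3-qubit register read out as integers 0 to 7):

    from collections import Counter
    from statistics import mean, variance

    # Hypothetical measured values from repeated runs of the same circuit,
    # e.g. a 3-qubit register read out as integers 0-7.
    results = [5, 5, 4, 5, 6, 5, 5, 3, 5, 4, 5, 5, 6, 5, 5, 5]

    print("mean (simple expectation value):", mean(results))
    print("variance:", variance(results))
    print("frequency distribution:", Counter(results).most_common())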

How does a classical computer avoid the inherent probabilistic nature of quantum mechanics?

Again, statistics to the rescue. We’re not trying to build classical computers with discrete atoms, discrete electrons, or discrete spin states. Even the tiniest classical binary 0 or 1 bit is still represented with many thousands of atoms or electrons. The quantum effects are definitely there in each atom, each electron, and each spin state, but we’re using and measuring the average of many thousands of them. This is statistical aggregation at work.

When we measure a classical bit, there is always a threshold current, so that a binary 0 bit doesn’t have to be perfectly zero, and a binary 1 bit doesn’t have to be perfectly one. If the threshold current is exceeded, which is well above the level of zero, then a binary 1 value is presumed, otherwise a binary 0 value is presumed.

Sure, classical computers do sometimes get errors, but too rarely to be noticeable these days. In the early days of classical computing (1940’s and 1950’s, and even into the 1960’s) it was not uncommon for computers to fail fairly often. Some of the earliest machines could only run for a few minutes, at best. We’ve come a long way since then.

Apparent determinism = probability + statistics

In some cases, you actually may be able to achieve actual determinism using this approach, but much of the time it may be more of an illusion as occasional errors, noise, and some unexpected change in environmental interference might produce non-deterministic results that your classical application code hadn’t seen before. Of course, you should endeavor to design your shot count and statistical aggregation process to smooth over such potential glitches, but the process is not as deterministic as one might wish it to be, yet.

Stochastic processes

A quantum computer in general, or a set of qubits, or even a single qubit is a stochastic system or a stochastic process.

I will continue to refer to the probabilistic nature of a quantum computer, but the implication is that the reference is to the concept of a stochastic process.

That may not make this paper any more clear, but it does more firmly connect it to more formal mathematics and physics.

Quantum computing seeks to exploit the inherent uncertainty of quantum mechanics

All of this uncertainty is not actually a problem at all while a quantum computation is in progress. Quantum state is much more sophisticated than the simple binary 0 or 1 of a classical bit. A qubit can be in a superposition of both a binary 0 and a binary 1, with a different probability for each. Further, quantum state is based on complex numbers, which support a phase (the angle of the complex number in the complex plane), and this phase can take on a range of values at the same time that the overall probability of 0 and 1 takes on a range of values. It can indeed get… complex, in contrast to the binary simplicity of classical computing. But it is this complexity that enables the raw power of quantum computing.

Entanglement between the quantum states of otherwise isolated qubits provides another resource for exploitation in quantum computations.

As long as the quantum computation is in progress, the various portions of the quantum state of each qubit are fairly stable — except when noise and errors creep in, but that’s separate from the probabilistic nature of quantum state itself.

As long as the quantum computation is isolated in the world of qubits and quantum mechanics, everything is fine. Not deterministic in a classical sense, but coherent and sensible, at least at the intellectual level.

Everything is fine, until you measure qubits

The raw quantum state of a qubit utilizes probability amplitudes, which are complex numbers. If you take the squared magnitudes of the probability amplitudes for 0 and 1 you get actual probabilities, which by definition must add up to 1.0. That's pretty solid math, but probability doesn't give you certainty. The probability for binary 1 may be low, like 0.25, or high, like 0.75, or exactly 0.50 as for a flipped coin, but those are still just probabilities. Unlike a classical computer, measurement of a qubit is not based on a threshold for 0 vs. 1, but is completely at the mercy of the strictly probabilistic nature of quantum mechanics. Indeed, a qubit with a probability of 0.25 for binary 1 will more commonly measure as a binary 0, but without any guarantee. Similarly, a qubit with a probability of 0.75 for binary 1 will more commonly measure as a binary 1, but sometimes it won't.
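
A toy Python calculation, with made-up amplitudes, shows how squared magnitudes become measurement probabilities:

    import math

    amp0 = complex(0.5, 0.0)              # amplitude for |0>
    amp1 = complex(0.0, math.sqrt(0.75))  # amplitude for |1>, with a phase

    p0 = abs(amp0) ** 2                   # 0.25
    p1 = abs(amp1) ** 2                   # 0.75
    assert math.isclose(p0 + p1, 1.0)     # probabilities must sum to 1.0
    print(p0, p1)  # but each single measurement is still just a 0 or a 1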

Applying statistics to measurement

Actually, it would only be truly random if the probability for the quantum state being 1 were exactly 0.50. Otherwise, the result will be biased — like a biased coin or loaded dice.

So we can take advantage of that bias.

Just rerun your quantum circuit a number of times and examine the distribution of the results. How many times? That’s an open question, and the real overall topic and purpose of this paper. More on that later.

If we rerun the quantum circuit which has a probability for binary 1 of 0.25, we will likely find that 25% of the time we get a binary 1 bit value and 75% (100% minus 25%) of the time we get a binary 0 bit value.

Similarly if we rerun the quantum circuit which has a probability for binary 1 of 0.75, we will likely find that 75% of the time we get a binary 1 bit value and 25% (100% minus 75%) of the time we get a binary 0 bit value.

A big catch is that even if the probability is 75%, the actual experimental results have a good chance of not being exactly 75%. 77%, 73%, 75.001%, or 74.999% for the actual results are all reasonable to expect if the underlying probability is 75%.
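
A quick simulation, using a plain coin-flip model rather than real quantum hardware, illustrates how much the observed fraction wanders around the true 75%:

    import random

    p1 = 0.75      # underlying probability of measuring a 1
    shots = 1000

    for trial in range(5):
        ones = sum(random.random() < p1 for _ in range(shots))
        print(f"trial {trial}: measured 1 in {100 * ones / shots:.1f}% of shots")
    # Typical output hovers around 75% but is rarely exactly 75%.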

Don’t forget the noise and errors

Our 0.25 probability might act like 0.30 or 0.20, or even as 0.75 or 0.70 or 0.80. Literally, anything is possible.

In fact, noise and errors can turn a 0.00 into a 1.00 or turn a 1.00 into a 0.00 — or anything in between.

Environmental interference

I won’t explore environmental interference in depth here, but it includes electromagnetic radiation (radio, TV, and cell phone signals), magnetic interference, electrical power fluctuations, thermal fluctuations, and simple physical vibrations which can affect component connections and operation.

The designers of quantum hardware do indeed endeavor to shield their hardware from environmental interference, and they succeed to a significant degree, but the quantum hardware is so sensitive that even the little remaining environmental interference can have a significant impact on the execution of quantum circuits.

Again, the designs of quantum hardware will undoubtedly improve in the coming years, but perfect hardware and perfect operating environments will remain out of reach, forever, although at some stage over the coming years I really expect that it will eventually become “good enough”, even if not perfect.

Unfortunately, environmental interference is highly variable and unpredictable, so it is no easy task to somehow compensate for it in the design of either quantum hardware or the algorithms which run on quantum hardware.

About the only tool that quantum algorithm and application designers have to work with is to execute their quantum circuits more times and hope that the environmental interference gets washed out from a statistical perspective — statistical aggregation.

Noise vs. errors vs. environmental interference

This paper won’t delve deeply into the distinction between noise, errors, environmental interference, but simply superficially distinguish them.

  1. Noise differs from environmental interference in that the source of the noise is within the quantum computer itself — from electrical, electronic, and mechanical components, cabling, wiring, circuit connections, and intermittent interactions of any of the preceding, while environmental interference by definition has a source external to the quantum computer (see the preceding section.)
  2. Errors can be caused by noise and environmental interference, but errors are generally caused by the flawed execution of operations, such as quantum logic gates and measurement operations (so-called measurement errors), as well as decoherence. An operation on one qubit can also have an unintended effect on a nearby qubit — so-called spectator errors or crosstalk. Microwave signals and laser operations simply aren’t 100% foolproof, yet.
  3. Even if there were no environmental interference, no noise within the quantum computer, and all operations were executed flawlessly, there is still the possibility of errors as a result of decoherence, where the state of an idle qubit gradually decays from a 0 to a 1 or from a 1 to a 0.

But no matter which it is, the solution is circuit repetitions — let statistical aggregation smooth over any variability due to noise, errors, or environmental interference.

Measurement errors

Why does even simple measurement cause errors? A measurement operation is really no different than any other qubit operation (quantum logic gate.) As such, it is susceptible to errors, related to the use of microwave or laser pulses, depending on the hardware technology used to implement the quantum computer.

How big might the measurement error be? It's completely hardware dependent, so it will vary between machines. The documentation for your quantum computer might specify a precise value or range, but you may need to run your own tests to find out for yourself. It may also vary with the specific calibration of your machine, which can drift even hour by hour. At least at one point, IBM was recalibrating their machines twice a day.

Measurement error might be 1%, but could be less or more than that as well. And it can vary for each qubit of a machine.

Each qubit has its own noise and error profile

You may discover from your own testing which qubits are more problematic and endeavor to avoid using them.

Sometimes the system calibration testing can detect and compensate for such differences, but sometimes not, and qubits can drift from any such calibration as the hours pass by.

Shot count, shots, or circuit repetitions

The hope — intention — is that noise, errors, environmental interference, and to some degree the probabilistic nature of quantum computation will get washed out from a statistical perspective.

Note: The terms shots, shot count, and circuit repetitions will be used and treated interchangeably in this paper.

How many shots or circuit repetitions are needed?

But that only accounts for the inherent probability of even an ideal, perfect quantum computer.

Now you also have to account for noise and errors.

Every model of quantum computer will have its own unique tendency for noise and errors.

Part of the calibration process for a quantum computer is to calculate and minimize errors and the effects of noise and environmental interference.

In the end, the vendor will need to supply you with estimates for the coherence time and fidelity for qubits and gate operations on qubits.

To some extent it may be trial and error, and in many cases it definitely will be, with a repetition of two steps (sketched in code after the list):

  1. Increase the number of shots or circuit repetitions until you get statistically accurate results that you are comfortable with.
  2. Reduce the number of shots or circuit repetitions until the total wall clock run time is acceptable for your needs while still producing statistically accurate results that you are comfortable with. It’s a balancing act and tradeoff between the two competing interests.
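
Here is a rough sketch of that two-step loop. The run_circuit and stable_enough functions are hypothetical stand-ins for real circuit execution and for whatever acceptance test fits your application:

    import random

    def run_circuit(shots, p1=0.75):
        # Toy stand-in for real circuit execution: observed fraction of 1s.
        return sum(random.random() < p1 for _ in range(shots)) / shots

    def stable_enough(shots, tolerance=0.01, trials=5):
        # Crude stability test: do repeated runs at this shot count agree?
        estimates = [run_circuit(shots) for _ in range(trials)]
        return max(estimates) - min(estimates) <= tolerance

    shots = 100
    while not stable_enough(shots):       # step 1: increase until comfortable
        shots *= 2
    while shots > 100 and stable_enough(shots // 2):
        shots //= 2                       # step 2: back off to save wall clock time
    print("settled on", shots, "shots")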

More of an art or craft than science, but that’s where we are.

You need to know what results to expect

Generally, at least for the current stage of quantum computing, relatively small problems are being solved, such that a classical solution can generally be found.

In some cases a classical solution can be found using a so-called analytic solution. Basically evaluating a formula to get an exact result.

In other cases a Monte Carlo simulation can quickly arrive at a fairly decent approximation of an exact result, to as many digits of precision as you might need for the quantum solution.

In other cases, such as simulation of physics and chemistry, an empirical measurement from a physical experiment may provide the expected result. Such as so-called chemical accuracy.

In the near term, people remain focused on addressing problems where the solution is known, just to confirm that quantum computers can even solve realistic problems at all, but this raises the daunting specter of what people will do once we transition into the regime of exponential problems where we do not know the expected result and cannot derive it analytically, classically, or even via experimental trial and error.

Expectation value

Physicists in quantum mechanics use the term expectation value as well, but in a somewhat more complex manner. This informal paper will not delve into the details, but at least some of the details can be found here:

  1. https://en.wikipedia.org/wiki/Expectation_value_(quantum_mechanics)
  2. https://en.wikipedia.org/wiki/Expected_value
  3. https://mathworld.wolfram.com/ExpectationValue.html

There is a discussion of expectation values related specifically to quantum computing in this paper:

Rather than merely adding up the measurement values, physicists will multiply each value by its probability and then not bother to divide the sum at all. If you had n numbers, each would have a probability of 1/n, so multiplying each number by 1/n before adding them is effectively the same as adding up the raw numbers and dividing by n.

The reason physicists do the calculation this way is that in quantum mechanics the probabilities are not necessarily equal.
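
A small worked example in Python, with made-up counts, shows that the two calculations agree:

    import math

    # Counts from a hypothetical 2-qubit circuit, values read as integers 0-3.
    counts = {0: 130, 1: 270, 2: 524, 3: 100}
    shots = sum(counts.values())  # 1024

    # "Add them up and divide" view:
    average = sum(value * count for value, count in counts.items()) / shots

    # Physicist's view: weight each value by its estimated probability, then sum.
    expectation = sum(value * (count / shots) for value, count in counts.items())

    assert math.isclose(average, expectation)
    print(expectation)  # 1.580078125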

This sounds relatively simple, but then it gets complicated since a lot of measurements are not continuous real numbers. For example:

  1. A single qubit whose value will be a binary 0 or 1. An average of 0.5 is not terribly useful when measuring qubits.
  2. A physical phenomenon such as energy levels of electrons, which are discrete levels rather than a continuous range of real numbers.
  3. A range of integers. A real, non-integer average is not very useful.
  4. A collection of bit strings — one bit per qubit, one bit string for each computational basis state. Each value is a pattern. A numerical average is not useful.

For an interesting perspective on the difficulties people have with the quantum mechanical notion of expectation value, this paper discusses a survey of physics students — upper-level undergraduate and graduate students:

Simplified concept of expectation value

  1. What are the expectations about what a single value (a single measurement) looks like? Such as its range of possible values and the theoretical distribution of those values, possibly even a probability for each value. That's in line with the quantum mechanical notion of expectation value.
  2. How do we want to think about a series of measurements? What information do we want to extract from the series?

Sometimes a simple arithmetic average really is what we want, but in quantum computing there are so many additional possibilities, of which the simple arithmetic average is but one.

For example, in the case of bit strings, we may want to see the frequency distribution for each of the bit string values.

Or we may just care about which bit string occurred most frequently.

Or maybe we want the top few bit strings by frequency.

Or maybe we want some particular statistical measure, such as mean (average), median, variance, standard deviation, etc.

Maybe we want only the maximum. Or, if the distribution is bimodal, we’d like the two maximums (peaks.) Or maybe the distribution is multimodal and we want all of the peaks.

Or maybe the maximum is an outlier, a fluke, and the result of errors. Of course that’s why we execute many repetitions of the quantum circuit, to use statistics to filter out the flukes.

Even for the simple case of a single qubit, we might want to examine the frequency of 0 and 1 values. Or we might want to examine the actual values of the series in the order they occurred.

It is also possible that the qubits represent more than one result value. Maybe the quantum circuit has two or three registers and maybe some individual qubits, so there might be multiple expectation values for a single quantum circuit.

In any case, it can get complicated, so that a single, simple arithmetic average is not necessarily sufficient for the needs of a quantum application.

We’ll leave the purist quantum mechanical conception of the expectation value of an observable behind, and focus on the needs of quantum algorithm and application developers.

Raw quantum results

Each qubit has a complex quantum state for the duration of the quantum computation, but measurement of a qubit causes its complex quantum state to collapse into a single binary bit — a binary 0 or a binary 1.

A qubit might be in an equal superposition of a 0 and 1 like a flipped coin spinning in midair during the quantum computation, but measurement is like snatching the spinning coin and slamming it flat — it is then forced into a 0 or 1 (tail or head.) The outcome may have been equally uncertain during the computation (coin flip), but measurement (snatching the coin and slamming it down) removed all uncertainty and all possibility of alternative outcomes.

Generally a quantum computer will have some number of qubits, not all of which are necessarily needed for the output or result. Even if not all of the qubits used in a quantum computation are needed for the final results, it can sometimes be enlightening to examine their final (flattened) state, especially if you get results which are unexpected.

There are several interesting cases for the raw quantum results of a single execution of a quantum circuit:

  1. A single qubit. A binary bit 0 or 1, or a boolean value false or true.
  2. A collection of single qubits considered as a group, a bit string, but each still a discrete binary value, potentially entangled, such as a so-called register, interpreted as a raw bit string rather than as a single binary integer. For example, the two qubits of a Bell state, or three or more qubits in a GHZ or W state.
  3. A collection of qubits which represent the binary digits of an integer. Two bits for 0–3, three bits for 0–7, four bits for 0–15, 10 bits for 0–1023, and so on.
  4. A collection of qubits which represent the fractional binary digits of a binary point number, from 0 to 0.999… (up to but not including 1.0.) Two bits for 0, 0.25, 0.50, and 0.75. Three bits for 0 to 0.875. Four bits for 0 to 0.9375. And so on.
  5. Some combination of any number of the above — zero or more individual qubits, zero or more bit strings, zero or more k-bit integers, and zero or more k-bit binary-point fractions. Where each number may have its own value of k.

A single k-qubit binary integer is the common case, especially for applications which utilize the quantum computer as a math coprocessor.
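
For example, interpreting a measured bit string as a k-bit integer or as a k-bit binary fraction is a one-liner each in Python:

    bits = "1101"                       # a measured 4-qubit bit string
    k = len(bits)

    as_integer = int(bits, 2)           # 13, the bits as a k-bit binary integer
    as_fraction = int(bits, 2) / 2**k   # 0.8125, the bits as a k-bit binary fraction

    print(as_integer, as_fraction)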

Bit string cases

  1. All bits independent, isolated, uncorrelated.
  2. 2-qubit Bell states — correlation or dependency between the quantum state of the two qubits.
  3. k-qubit GHZ state — all 0 or all 1, in theory.
  4. k-qubit W state — exactly one bit is 1, rest are 0, in theory.
  5. k-bit integer.
  6. k-bit binary fraction.

Quantum registers

A quantum algorithm will operate on the individual qubits, and each qubit is indeed measured independently, but the software interface will enable the application to see those measured qubit values as if they were the bits of a classical register.
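
As a sketch, assuming the legacy Qiskit execute/Aer interface that was current when this paper was written (newer releases use different entry points), a two-qubit register comes back as counts of bit strings:

    from qiskit import QuantumCircuit, execute, Aer

    qc = QuantumCircuit(2, 2)    # a 2-qubit register
    qc.h(0)
    qc.cx(0, 1)                  # Bell state: the two qubits are entangled
    qc.measure([0, 1], [0, 1])   # each qubit is measured independently

    backend = Aer.get_backend('qasm_simulator')
    counts = execute(qc, backend, shots=1024).result().get_counts()
    print(counts)  # e.g. {'00': 510, '11': 514}: bit strings, not single bits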

Cases of series of results

Although you can consider the raw qubit values as the column values, it is generally more helpful to reduce the cases of integers and binary-point fractions to actual integers and binary fractions, as well as bit strings for related bits.

Given that each row has a collection of simple one-bit binary values, arbitrary k-bit integers and binary fractions, and bit strings, we have the following cases to consider across all rows but for each column independently:

  1. The raw values across all rows. Let the application sort them out.
  2. Exclude duplicates from raw values.
  3. The simple integer arithmetic average of a column across all rows. Simply add the column across all rows and divide by the number of rows.
  4. Exclude outliers from the simple integer arithmetic average. Specify count or threshold for outliers, smallest and largest values. Alternatively, outliers could be based on low-frequency integer values.
  5. Maximum value. What is the largest value which occurs across all rows for a given column.
  6. Most frequent value. Which binary value occurs most frequently across all rows for that column.
  7. Top k values. May be identical.
  8. Top k most common values. Ignoring duplicates.
  9. Cutoff threshold. How many values (rows) have a value in the column above a specified binary value.
  10. Cutoff threshold average. Average of all of the values above the cutoff threshold.
  11. Bit string distribution. Frequency distribution for each unique bit string value from 0…0 to 1…1 for the number of qubits in the bit string. Generally only useful for relatively short bit strings since there will be 2^k entries, which is one million even for k of 20. Although, just because there are potentially 2^k entries, that doesn’t mean that every value is captured — the values may be relatively sparse. But even then, if your shot count is high and there is lots of noise and errors, many spurious values could be generated.
  12. Raw distribution. Counts for each unique value.
  13. Bucket distribution. Divide the k-bit values into some number of buckets and return a count or percentage of the row values which fall into each bucket.
  14. Maximum bucket. Simply the bounds of the bucket (or bucket number) which contains the most values. Which raw value occurred the most in that bucket. Should have the option of raw count of values in a bucket or a percentage, fraction, or probability based on the shot count.
  15. Bimodal distribution. Return the central value for each of the two peaks of a bimodal distribution, and possibly the number of values occurring within a range about the peaks.
  16. Multiple peaks. Return a list of all peaks, with a minimum distance between peaks. Optionally order the peaks by the number of rows at the peak value or within some minimum distance from the peak value.
  17. Period of the peaks. Measure the distance between peaks. Or distances plural.
  18. Random. No apparent true, single value or pattern.
  19. Garbage. Not truly random, but no discernible value or pattern.

Note that frequency distributions could be:

  1. Raw counts.
  2. Fractions.

For the case of simply one-bit binary values (ala coin flips):

  1. Were there more 1’s than 0’s — true or false.
  2. What percentage of the time was the result a 1? How close to 100%? Well above 50% or close to 50%?
  3. How random did the sequence appear? Longest run of 1’s or 0’s. Average percentage of time with runs of 1’s or 0’s longer than some threshold, say 3 or 4. Other measures of randomness.
  4. Raw binary values. Such as simply generating random bit values.
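
A minimal sketch of the first few measures, on a made-up series of one-bit results:

    from itertools import groupby

    series = [1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1]  # hypothetical per-shot results

    fraction_ones = sum(series) / len(series)
    longest_run = max(len(list(group)) for _, group in groupby(series))

    print("more 1s than 0s:", fraction_ones > 0.5)
    print("percentage of 1s:", 100 * fraction_ones)
    print("longest run of identical bits:", longest_run)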

Possibility of garbage results

It is possible that a significantly larger shot count might enable some pattern to appear out of the apparent garbage, but it may be that the specified quantum circuit simply cannot be executed reliably on the quantum hardware being used. That's an intuitive judgment call which will have to be made by the designer or developer.

What to do about outliers

  1. Accept them as is. If the shot count is high enough, they may simply get washed away in the statistical aggregation.
  2. Ignore them. Based on some numeric threshold, values which are too far outside an expected range could simply be dropped.
  3. Ignore the fringe values — highs and lows. Regardless of any expected range, simply drop a percentage of the highest and lowest values.

Algorithm and application designers are well-advised to carefully examine a fair number of test cases to see what types of outliers actually show up in realistic test cases. It could be surprising.
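
A sketch of the third option, trimming a fixed fraction of the highest and lowest values (the fraction and the data here are made up):

    def trim_fringe(values, fraction=0.05):
        # Drop the given fraction of the highest and the lowest values.
        values = sorted(values)
        k = int(len(values) * fraction)
        return values[k:len(values) - k] if k else values

    results = [22, 23, 23, 24, 24, 24, 25, 25, 97, 3]  # two suspicious extremes
    trimmed = trim_fringe(results, 0.10)
    print(sum(trimmed) / len(trimmed))  # average without the fringe values: 23.75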

Actual vs. requested shot count

When doing math with the results of executing a quantum circuit, the actual shot count must be used, not the shot count requested by the application.

This most likely will not be a real issue for most algorithms and applications, but in extreme cases it might be, so be warned.

Also, since it varies by backend, you could have an algorithm or application which runs fine on one or more backends, but then you may get different results on some other backends.

What to do when there is no clear dominant result

In some cases this really will mean that you need more circuit repetitions to average out the noise and errors.

In other cases it may mean that the desired quantum computation simply isn’t practical on the particular quantum hardware. That’s a tough judgment call to make.

Just take the average of the results — or the most common result

Given the probabilistic nature of quantum computation, frequently you must simply accept the nature of the beast and compute the average. Alternatively, simply take the most common result.

Manually examine all results

  1. There may be unexpected surprises.
  2. There may be some strange biases.
  3. There may be some anomalous results which should be discarded as outliers.
  4. There may be bimodal or multimodal results rather than a single peak — or vice versa.
  5. Your preconceptions and intuitions about what the results should look like may be a little bit off, skewed somewhat, or just plain completely wrong.

Don’t get me wrong — I’m not saying that you must always manually examine all quantum results, but just enough to gain a solid feel that you have a decent handle on what the data really looks like.

Note that there is a special risk once we get to problems and solutions in the quantum supremacy regime where we no longer have classical or analytical solutions to guide us as to the accuracy of results. And manual examination may be impractical, especially for nontrivial shot counts.

What fraction of shots should be dedicated to probabilistic nature vs. noise?

You may have to resort to trial and error testing.

But, if you run your quantum algorithm and application on a quantum simulator with noise and errors set to zero, that should give you a reasonably solid measure of how many shots are required to compensate for the probabilistic nature of quantum computing alone.

And then turn up the noise and errors to a level approximating some real (or hypothetical) quantum computer to see how the shot count has to be raised to then compensate for both the probabilistic nature of quantum computing and noise, errors, and environmental interference.
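
A toy model, using a simple readout bit flip rather than a real simulator noise model, can illustrate the comparison:

    import random

    def estimate(p1, shots, flip_prob=0.0):
        # One simulated run: observed fraction of 1s, with optional readout flips.
        ones = 0
        for _ in range(shots):
            bit = random.random() < p1
            if random.random() < flip_prob:  # crude noise: the readout bit flips
                bit = not bit
            ones += bit
        return ones / shots

    for shots in (100, 1000, 10000):
        ideal = [estimate(0.25, shots) for _ in range(5)]
        noisy = [estimate(0.25, shots, flip_prob=0.05) for _ in range(5)]
        print(shots, "ideal:", min(ideal), "to", max(ideal),
              "| noisy:", min(noisy), "to", max(noisy))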

Testing to characterize effects of noise, errors, and environmental interference on results

  1. Are there any obvious patterns?
  2. Are there test cases which just give really bad results?
  3. Are there test cases which surprisingly give much better than expected results?
  4. Are all of your assumptions borne out?
  5. Do you see things that you really didn’t expect?
  6. How do the test results affect how you conceptualize the expectation value for the quantum computation?

And the underlying probabilistic nature of the quantum computation itself will impact these test results as well. Unfortunately probability, noise, errors, and environmental interference cannot usually be cleanly separated.

Ultimately the results of these tests will provide feedback into how you compute or estimate shot count, and possibly how the application handles the results.

Four types of dynamic shot count

There are four different reasons that the shot count may vary and have to be dynamic for a particular algorithm:

  1. Dependence on particular input data and parameter values. Particularly where the quantum circuit itself is dynamically generated based on the input data and parameters.
  2. Portability — to significantly or even somewhat different hardware.
  3. Calibration settings for the particular machine at the particular moment. Calibration can drift as well.
  4. Individual qubits or particular combinations of qubits can have distinct noise and error rates and coherence times.

Dynamic shot count based on parameters

Just as an example, in one paper, they did a “grid search” based on various parameter values to determine optimal shot count:

Based on different circuit algorithms, the range of shot counts examined in that paper included:

  • 5,000, 25,000, 125,000, and 625,000
  • 10,000, 100,000, 1,000,000, and 10,000,000

That’s quite a range — 5,000 to 10,000,000.

In that particular case, the goal was to optimize or minimize wall clock time to achieve a desired precision of the quantum result.

They didn’t end up using 10,000,000 repetitions, but they did test it, rejecting it since 1,000,000 repetitions gave an acceptable result.

Automatic dynamic shot count?

This might be implemented partially in the machine firmware or the lower levels of system software which control the overall execution of quantum circuits. One goal would be to minimize network latency for the overall circuit execution once circuit repetitions are factored in.

There would probably need to be some minimum number of repetitions to establish an initial range.

And possibly some provision to detect and discard outliers.

There could also be the opposite — detect that the results are too scattered, variable, and seemingly random to be worth continuing. Signaling to the application that the computation is failing. Again with some specified minimum shots to establish a baseline range.

In either case, the application should be informed of the actual number of shots attempted.
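
A sketch of what such adaptive logic might look like, using a standard-error stopping rule on a toy circuit model; all names and thresholds here are hypothetical:

    import random

    def adaptive_shots(run_once, batch=100, min_shots=500,
                       max_shots=100_000, tolerance=0.005):
        # Add batches of shots until the standard error of the estimated
        # probability drops below tolerance, or give up at max_shots.
        ones = shots = 0
        while shots < max_shots:
            ones += sum(run_once() for _ in range(batch))
            shots += batch
            p = ones / shots
            stderr = (p * (1 - p) / shots) ** 0.5
            if shots >= min_shots and stderr < tolerance:
                break  # min_shots establishes a baseline before stopping early
        return p, shots  # report the ACTUAL shot count back to the application

    estimate, used = adaptive_shots(lambda: random.random() < 0.25)
    print(f"estimate {estimate:.3f} after {used} shots")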

Won’t quantum error correction solve the problem?

But even with QEC, that only addresses half of the problem — even with an ideal, non-noisy, error-free, non-NISQ quantum computer, we still have to cope with the probabilistic nature of quantum computing, that 2 + 2 is not always 4.

Wall clock time

This may force you to reduce shots or circuit repetitions to fit within a tighter wall clock time frame. This can certainly impact accuracy of results as well, but at a minimum it may force you to be more conservative or do more experiments to arrive at a smaller number of shots or circuit repetitions.

Incidentally, the Google quantum supremacy experiment took 200 seconds of wall clock time to execute a randomly generated quantum circuit one million times. They were able to execute the circuit 5,000 times per second. It’s not clear if they limited themselves to 3 minutes (and 20 seconds) of wall clock time, or needed or wanted one million repetitions to get a statistically meaningful result.

Context of a quantum simulator

In theory, one could configure the simulator to have any desired degree of noise, errors, coherence time, or even environmental interference.

While a production quantum algorithm would seek to have minimum noise and errors, maximum coherence time, and minimum environmental interference, it would be enlightening to see how the quantum results are affected by various degrees of noise and errors.

In particular, one could perform experiments to see two things:

  1. Shot count needed to achieve a desired precision of results based on a modeled degree of noise and errors.
  2. Degree of modeled noise and errors required to achieve the desired precision of results for a particular shot count.

Two possible uses of this data:

  1. Predict what improvements would be needed in real hardware to achieve the desired precision and shot counts.
  2. Calculate or adjust shot count based on measured level of noise and errors on a particular machine. Real machines are periodically recalibrated and the calibration can vary over time.

It could be enlightening to see what shot count is required with absolutely zero noise and errors. This is the shots needed to account for the inherently probabilistic nature of quantum computing.

It could also be enlightening to see what maximal degree of noise and errors can be tolerated and still get acceptable results in an acceptable amount of time.

When designing quantum algorithms it would always be nice to know how close to the edge you are — where the “cliff” is, beyond which you can never get good results, and near which decent results begin to deteriorate either abruptly or slowly. You can do such testing with a real machine as well, but it would also be good to test against a range of potential machine profiles to enable rapid, easy, and comprehensive portability.

Ultimately, every algorithm designer should endeavor to determine the minimum and maximum requirements for reasonable operation of their algorithm on real machines. Maximum meaning the point of diminishing returns, where even significant improvements in hardware yield rather insignificant improvements in precision of results.

Design for portability

Is 10 ever a reasonable shot count?

It might be enlightening to run an algorithm for all shot counts from 1 to 100, just to see how badly the algorithm performs. Some algorithms might perform quite well for very small shot counts. It may be that only reasonably sophisticated and complex quantum algorithms need more than 10 shots to deliver production-quality results. It could be that most experimental projects don't need much more than mediocre results.

IBM Qiskit default shots is 1024

Is 100 shots reasonable?

I’ve seen quite a few Qiskit examples with shots=100, although it’s not clear whether that’s a testament to fewer shots being required or that the default of 1024 is beyond what is needed.

What is the maximum shots for a single circuit?

As of the time of this writing (July 4, 2020), the IBM Qiskit backend configuration is able to return max_shots, indicating the maximum number of times you can execute a single quantum circuit at one time — for one call to the execute function.

What is the value of max_shots for Qiskit? Unknown to me at this time — each configured backend will have its own setting of this configuration parameter. And, it could change over time.

In short, consult your quantum backend provider for this level of technical detail.

No clear rules, methodology, or guidance for shot count

I haven’t seen any vendor documentation on this.

I haven’t seen any in any formal papers.

I haven’t seen any citations to any paper that surveys or proposes this.

I did run across one paper which indicated that they did a grid search on multiple parameters, including shot count, to find the combination which came closest to the exact analytical solution in the least amount of time. But this approach won't be appropriate for problems where an exact analytic solution is not known.

Another paper simply said “we chose to perform Nshots = 100 shots per experiment”, with no indication of why or how they made that choice. They did indeed verify that it was a reasonable choice, but that leaves the methodology of picking a number open, other than experimentation or trial and error.

Looking at some examples from IBM Qiskit, I see shots=100 and shots=1000 a lot, and occasionally shots=10000, shots=100000, and even one shots=1000000 (one million!) — but never with any explanation for why or how that number of shots was chosen.

What’s a good design goal for shot count when designing new algorithms?

On the one hand they should strive to keep the shot count as small as possible.

On the other hand they should attempt to get the maximum precision possible, regardless of how big that makes the shot count. Or at least the maximum precision that they expect that they might need.

It would probably be a good idea to give the user of an algorithm the ability to tune precision so that shot count and wall clock time can be controlled to at least some degree.

It also depends on the motivations of the designer — are they focused on speed or quality? Is wall clock time a big deal or not? How much precision is really needed or acceptable?

Rules of thumb for shot count

Formulas for shot count

Intuitions for shot count

Trial and error — and careful testing — is about the best we can do at this stage.

Need to balance shot count and wall clock time

You have to understand your application requirements to do that balancing.

How many digits of precision are reasonable?

But in the world of quantum computing, precision is still a very scarce commodity. Getting even just a few digits of precision is commonly seen as a major victory, and a nightmare to actually implement. Part of the problem is noise and errors due to the tentative quality of current NISQ hardware. The probabilistic nature of quantum computing factors in as well. And there is a snowball effect: attempts to deal with limited hardware (qubits) impact precision, and the more extensive circuits needed to compensate for limited hardware cause more noise and errors to accumulate.

Generally, the current state of affairs is that you get what you can get. You can exert superhuman efforts to get more precision, but your efforts can backfire as well.

The current state of affairs requires you to have advance knowledge of the general ballpark of the correct answer and then you can optimize your algorithm around that narrowed range of possibilities.

Generally, you need to have classical code which dynamically generates a narrowed quantum circuit tailored to the specific input data and parameters.

The quaint classical notion of even single precision numbers and math with a fairly wide range of exponents and a fair number of digits of precision (7 decimal digits) is mere fantasy in the realm of quantum computing, today or any time in the near or foreseeable future.

How does precision affect shot count?

Generally, more digits of precision will require more circuit repetitions.

Whether each additional digit is a linear increase in required shots or more of a polynomial or even an exponential increase will depend on the particular algorithm (circuit) as well as how bad the noise and errors are for the particular quantum computer you are running on.
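
As a point of reference from ordinary sampling statistics (not from any quantum computing source): estimating a probability p to within a margin epsilon requires on the order of p(1 - p) / epsilon^2 shots, so each additional decimal digit of precision divides epsilon by 10 and multiplies the shot count by roughly 100:

    def shots_for_precision(epsilon, p=0.5):
        # Textbook binomial sampling: shots for standard error below epsilon.
        return round(p * (1 - p) / epsilon ** 2)

    for digits in (1, 2, 3, 4):
        print(digits, "digits:", shots_for_precision(10 ** -digits), "shots")
    # 1 digit: 25 shots, 2 digits: 2,500, 3 digits: 250,000, 4 digits: 25,000,000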

Despite all of the hype about quantum computers, it is best to keep your number of digits of precision reasonably modest. Six or seven decimal digits may be kind of an upper bound. Don’t expect 10 or 20 decimal digits of precision. Again, some niche classes of algorithms may be much more generous, but for the general case, even six decimal digits is pushing it, and for current NISQ machines you may be lucky to get even two or three decimal digits of precision.

Generally, quantum computing is resolved at the binary level, the classical bit, so decimal digits must be converted to binary bits. As a rough rule, you can multiply decimal digits by three and add a little extra, or divide binary bits by three and subtract a little. For example, two decimal digits (99) require seven bits (2 * 3 + 1), three decimal digits (999) require ten bits (3 * 3 + 1), and six decimal digits (999,999) require twenty bits (6 * 3 + 2).

How much decimal precision could you theoretically represent with 32 qubits? That's equivalent to 32 binary bits. A 32-bit binary integer on a classical computer can represent values up to a little over four billion. One billion minus one is 999,999,999, which is nine decimal digits, so nine decimal digits (32 / 3, less a little) is what you can fully represent in a 32-bit or 32-qubit number.

How many bits or qubits do you need to fully represent a ten decimal digit number? 32 bits gets you about four billion, 33 bits about eight billion, and 34 bits about seventeen billion. So 34 bits (10 * 3 + 4) is enough to fully represent 9,999,999,999, the largest ten decimal digit number.
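
The exact version of this rule of thumb uses log base 2 of 10 (about 3.32) rather than 3; a short sketch:

    import math

    def bits_for_decimal_digits(d):
        # Exact version of "multiply by three and add a little extra".
        return math.ceil(d * math.log2(10))

    def decimal_digits_for_bits(b):
        # Exact version of "divide by three and subtract a little".
        return math.floor(b * math.log10(2))

    print(bits_for_decimal_digits(2))   # 7
    print(bits_for_decimal_digits(6))   # 20
    print(bits_for_decimal_digits(10))  # 34
    print(decimal_digits_for_bits(32))  # 9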

Then, any math or formulas for calculating or estimating shot count will be based on the number of bits of precision.

Is it still too soon to worry too much about shot counts?

I would say probably so, for many or most cases, but not for more sophisticated algorithms of some interesting degree of complexity. But even there, this may be a rather small fraction of all current usage.

One exception: true random number generation — only a single shot required

Classical computers cannot generate true random numbers without special hardware. Rather, most applications on classical computers make do with pseudo-random numbers.

But since quantum computers are inherently probabilistic by nature, they can trivially generate a random number using a single quantum logic gate for each random bit desired — the Hadamard gate, which puts a qubit in an exactly equal superposition of 0 and 1. A simple measurement of that qubit will then measure a binary 0 or a binary 1 bit with equal probability.

Granted, there is a very slight chance that the actual probability will be slightly more or less than 50%, since no physical gate is perfect, but that is unlikely to even be detectable.

Also, there is a probability of a measurement error, but even that is unlikely to be detectable since a random result is expected anyway.

One can generate an n-bit random number either by executing a single-qubit quantum circuit n times or by executing an n-qubit quantum circuit once.
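
A sketch using the legacy Qiskit execute/Aer interface, generating an n-bit random number from a single shot of an n-qubit circuit:

    from qiskit import QuantumCircuit, execute, Aer

    n = 8                    # bits of randomness wanted
    qc = QuantumCircuit(n, n)
    for q in range(n):
        qc.h(q)              # equal superposition of 0 and 1 on each qubit
    qc.measure(range(n), range(n))

    backend = Aer.get_backend('qasm_simulator')
    counts = execute(qc, backend, shots=1).result().get_counts()
    print(list(counts)[0])   # one n-bit random string from a single shot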

Don’t confuse circuit repetitions with the iterations or layers of variational methods

Don’t confuse circuit repetitions with experiments

Iterations of variational algorithms

The iterations of a hybrid quantum/classical variational algorithm are comparable to the concept of an experiment — multiple iterations within each experiment. So discussion here related to experiments applies to iterations of variational algorithms as well.

Application experiments

  • Number of application experiments.
  • Number of iterations within each experiment.
  • Shot count for each quantum circuit invocation within each iteration.

Watch out for total circuit executions

And if an application involves a variational algorithm with iterations, you have three counts to multiply to get total quantum circuit executions — application experiment count, iteration count within each experiment, and circuit repetition count (shot count) for each invocation of the quantum circuit.

Be careful to look at both shot count and number of experiments since it is the product of the two (as well as iterations for variational algorithms) which will drive total wall clock time. One paper used a fixed shot count of 100, but their number of experiments varied from 10 to 100,000,000, so their total number of circuit executions was upwards of 1,000 to 100,000 to 1,000,000 to 10,000,000 to 1,000,000,000 (one billion circuit executions!). Granted, these were simulations rather than running on real quantum hardware, but still…
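
The arithmetic is trivial but worth making explicit, with made-up numbers:

    experiments = 50
    iterations = 200   # e.g. optimizer steps in a variational algorithm
    shots = 1024

    total = experiments * iterations * shots
    print(f"{total:,} total circuit executions")  # 10,240,000 for these numbers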

Don’t get too used to the nuances and glitches of the quantum machine you used yesterday

Hardware, firmware, and software are constantly being updated. There is always something new and better coming. The problem comes when new and better causes subtle or not so subtle changes in those nuances and glitches that your algorithms and code became dependent on.

Generally it may not be a big deal to revise your algorithms and code to adapt to the changes in nuances and glitches, but sometimes it is in fact a big deal. It may require significant redesign, reimplementation, tuning, and of course testing. And maybe that comes at a time when you’re too busy on other tasks. Or maybe it comes when you’re on vacation or out sick and nobody else is available who knows your algorithms and code the way you do.

Sometimes new is not strictly better. The new machine or software may be smaller and cheaper, such as when minicomputers, microprocessors, personal computers, and smartphones came out, or even simply a cheaper mainframe, any of which can suddenly change the nuances and glitches your code depended on, or introduce new nuances, glitches, and possibly limits which you now must figure out how to adapt to.

You may have carefully tuned shot counts to work great for the old hardware, but now you must carefully re-tune them again.

Maybe the old shot counts still work fine.

Maybe you simply want to take advantage of new opportunities to trim the shot counts dramatically.

And maybe there are some subtle differences which for some reason require somewhat higher shot counts. For example, maybe the new machine or some new configuration of an old machine is much faster but with a higher error rate.

In any case, change frequently comes at some cost, even if the change reduces one cost but increases some other cost.

How does quantum volume impact shot count?

But how do we quantify the impact on shot count given a delta in quantum volume? Great question, but the answer is… unknown, for now. As usual, you’ll have to run your own tests and see for yourself based on your own algorithm and your own application.

Looking forward, it would be great if some variation of quantum volume — characterizing the noise and errors, and possibly even the environmental interference of the quantum computer — could eventually be used as a parameter in calculating shot count.

What is the impact of shots on quantum advantage?

Technically, this is currently a moot point since no practical quantum algorithms currently exist which demonstrate true and dramatic quantum advantage for practical real-world problems.

But as quantum hardware improves and algorithm sophistication improves, eventually we will get to the stage where a very large number of shots could swamp and eliminate any potential quantum advantage. But it would depend on the nature of the problem and how much precision is really needed.

Conceivably, theoretically, it could be a problem, but from a practical perspective it’s not even on our long-range radar today.

That said, there may be various dead spots where the combination of number of qubits and number of shots is more than a little suboptimal. That may be something that incremental improvement in hardware quality can improve, or not.

Eventually, once quantum hardware improves enough, the required shot count may in fact fall significantly, so that true and dramatic quantum advantage finally becomes within reach.

But keep in mind that shot count is not only about noise and errors, but also about the inherent probabilistic nature of even an ideal perfect quantum computer, so that even as noise and errors are reduced, the probabilistic aspects of the quantum circuit remain and keep the shot count from falling to 1, and may in fact keep it high enough that quantum advantage may still remain out of reach for some quantum algorithms even as it becomes in reach for other quantum algorithms.

What about quantum supremacy?

There may well be math which enables us to blindly calculate the number of shots needed, but we’d have no way to check it! Except with another quantum algorithm, which then itself cannot be checked except by another quantum algorithm, and so on, ad infinitum. Oh dear!

Ah, but for some classes of problems it is actually relatively easy to check the results even if the results are hard or even impossible to compute classically, such as factoring very large numbers — a resulting factor can be quickly checked classically. But if we're doing a massive optimization, there's no way to efficiently classically validate that there is no more optimal solution than the one produced by the quantum algorithm. Maybe the tradeoff is that as long as the quantum optimization is “good enough”, then that will be good enough. Still, that feels a little unsatisfying from a theoretical and purist perspective.

Scaling shot count for Shor’s algorithm

It may well be that the shot count needs to be astronomically large for large input numbers, even on an ideal, perfect, noise-free, error-free quantum computer.

I have serious reservations as to whether Shor’s algorithm would work at all on a NISQ machine for anything more than super-trivial small numbers.

And if it can be made to work on a NISQ device, it may require such a large shot count that its net quantum advantage over classical methods might be minimal at best — and that would be for relatively small input numbers, at best.

Using Shor’s algorithm for all but relatively small numbers (which can be factored on classical machines fairly easily) might require post-NISQ machines, and maybe even require quantum error correction (QEC).

Again, the primary issue here is the dearth of any formal analysis of what shot count would need to be for any realistic implementation of Shor’s algorithm.

Again, the primary issue is not peculiar to Shor’s algorithm, but to complex quantum algorithms in general.

Advice to authors of papers on quantum algorithms

  1. Always be explicit about what shot count was needed or recommended.
  2. Be clear about the technical justification behind any shot count.
  3. Provide a solid mathematical formula whenever possible.
  4. Offer a crisp rule of thumb whenever possible.
  5. Preferably give a treatment for how shot count would scale as input grows.
  6. At least speculate what might need to happen for shot count as input grows.
  7. Consider and separate discussion of shot count for an ideal quantum simulator, an at least partially noisy quantum simulator, a current NISQ device, and speculation about better NISQ devices which may be available within 1–2 years.
  8. How shot count might impact quantum advantage. Especially when a very large shot count is needed.

Recommendations

  1. Plan to run experiments to see how sensitive your results are to variations in circuit repetitions.
  2. Reflect on how shot count will, might, or could scale as input size or output precision grows.
  3. If you are authoring a formal or academic paper, please say something explicit about shot counts, how you determined them, how you validated them experimentally, and how they might scale with input size — see the preceding section.

Conclusions

That said, it’s not too early to begin preparation and research for future stages of the quantum computing sector when shot counts do have a much more dramatic impact on the results of quantum algorithms and applications.

What’s next?

  1. Development of formulas and methods for explicitly calculating shots and circuit repetitions for particular quantum circuits.
  2. Development of less flawed and less noisy qubits. Less noise and fewer errors mean fewer shots are needed to compensate for them.
  3. Development of quantum computers better shielded from environmental interference.
  4. Improvements in coherence time of qubits. And how that impacts required shot count.
  5. Development of quantum algorithm techniques that can reduce circuit length and hence cumulative errors.
  6. Development of quantum algorithm techniques which can mitigate noise and errors within the quantum circuit, reducing the need for extra shots.

I don’t expect much to change over the next two years. Most quantum circuits will remain relatively shallow, such that 100 or 1024 shots will probably remain sufficient for the vast majority of quantum circuits other than advanced research projects.

The big open question is what hardware or algorithm breakthroughs will open the floodgates enough to where shot count becomes a significant issue for most quantum algorithm designers and quantum application developers.
