What Is Algorithmic Complexity (or Computational Complexity) and Big-O Notation?
Algorithmic complexity is a rough measure of the computing resources an algorithm will require to handle input data of various sizes or values, particularly in terms of time and space (memory or storage).
The goal is to derive a mathematical formula which expresses how a change in the size of the input data will impact the overall performance of the algorithm. Actually, it’s more of a rough rule of thumb than a technically precise mathematical formula.
Computational complexity is essentially a synonym for algorithmic complexity.
Time complexity is also essentially a synonym for algorithmic complexity.
Big-O notation is a mathematical representation of algorithmic complexity as a function of the size of the input to an algorithm.
This informal paper will explore some of the various aspects of algorithmic complexity, and summarize Big-O notation. A broad, general audience is intended. A deep background in computer science and mathematics is not required. My intention was to provide background to support quantum computing, but most of the material here applies equally to classical computing.
Algorithmic complexity can also refer to:
- The visual complexity of the source code for an algorithm, including length, nesting of logic, obscurity of intentions for logic and formulas, and lack of comments or documentation. The degree to which the true, mathematical algorithmic complexity may be hidden or obscured due to the difficulty of reading and comprehending the source code.
- The degree to which the resource requirements of an algorithm are not known or unclear due to dependence on external functions, external data such as databases, or invocations of networked services whose algorithmic complexity and resource requirements may not be known with any degree of certainty or may be highly variable.
- The degree of uncertainty for an algorithm due to latency (response time) for external services which may be delayed by arbitrary amounts of time due to load on external systems which are hosting those services.
Many algorithms run so quickly and efficiently that their algorithmic complexity is negligible and of no real concern, but quite a few algorithms can be very problematic, taking too long to run or possibly running out of memory.
Oh, and were you wondering why good software developers are paid a lot of money? In short, they are paid roughly in proportion to the algorithmic complexity that they must master and deal with on a daily basis — the less complexity involved, the less skill required.
Iteration and looping
Computers are generally fast, but there are factors which can make even the fastest computers seem slow.
Generally, there are three factors which can cause a computer algorithm to seem slow:
- Processing a large amount of input data.
- Performing complex computations with many operations.
- Generating a large amount of output data.
Generally, the main factor causing a computer algorithm to be slow is complex processing which requires iteration and looping.
Even if the total size of the algorithm source code (lines of code) is relatively modest, the total number of operations performed by the algorithm can be very large due to performing many iterations, executing loops many times.
Worse still, loops and iterations can be nested, to any depth, so that the total number of operations to be executed becomes very large, very quickly.
Much of the analysis needed to determine algorithmic complexity revolves around the number of times the loops in an algorithm must be executed.
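To make this concrete, here is a minimal Python sketch (not from the paper; the function name is hypothetical) in which a nested loop turns a modest amount of code into a quadratic number of operations:

```python
def count_duplicate_pairs(items):
    """Count pairs of equal items using two nested loops.

    For n items, the inner comparison runs on the order of n * n times,
    so the running time grows quadratically: O(n^2).
    """
    count = 0
    for i in range(len(items)):             # outer loop: n iterations
        for j in range(i + 1, len(items)):  # inner loop: up to n iterations
            if items[i] == items[j]:
                count += 1
    return count

# Doubling the input size roughly quadruples the work.
print(count_duplicate_pairs([1, 2, 3, 2, 1]))  # 2
```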
Functions and function calls
Another factor which can cause a computer algorithm to be slow due to complex processing is the concept of a function and function calls. (See a subsequent section for a related concept: function recursion.)
Functions and function calls are a very powerful intellectual technique for expressing the complexity of a computation in a more succinct manner.
There are two uses for functions:
- To use a common code sequence in more than one place in an algorithm without having to literally repeat the full code sequence every time it is used.
- To partition the complexity of an algorithm in a modular manner so that each modular component is less cluttered and easier to follow, and easier to manage and maintain.
For the former, a function will be invoked multiple times, so the net amount of code is reduced.
For the latter, the function may or may not be invoked more than once. If not invoked more than once, the net amount of code is not reduced, but the code is easier to read and easier to manage and maintain.
Either way, the net impact on algorithmic complexity is as if each function call were replaced with a copy of the code of the invoked function.
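As a rough illustrative sketch (hypothetical functions, not from the paper), a helper function called inside a loop contributes its full cost on every call, exactly as if its body had been copied in place:

```python
def contains(items, target):
    """Linear scan of m items: O(m)."""
    for x in items:
        if x == target:
            return True
    return False

def common_elements(a, b):
    """For each of the n items in a, call contains() on the m items of b.

    Counting each call as if its body were inlined gives roughly
    n * m operations: O(n * m), or O(n^2) when m is about n.
    """
    result = []
    for x in a:             # n iterations
        if contains(b, x):  # each call costs O(m)
            result.append(x)
    return result

print(common_elements([1, 2, 3], [2, 3, 4]))  # [2, 3]
```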
Subroutines
A subroutine is essentially a synonym for a function.
There may be nuances of difference in some contexts, but for the purpose of analyzing algorithmic complexity — and this paper — they can be treated as equivalent.
Function recursion
A function can invoke itself, either directly or indirectly through one or more other functions. This self-invocation is known as function recursion.
Obviously, self-invocation cannot continue forever, or the function would run forever and never stop, which is not a good thing. There has to be logic in the function which makes the self-invocation conditional so that eventually the recursion stops.
In any case, from the perspective of analyzing computational complexity, each call to the function, even from itself, must be replaced with the code of the function.
The final self-invocation will be replaced with the code of the function except that the conditional self-invocation will not be present.
The depth of recursion, the number of levels of self-invocation, may vary from relatively shallow to extremely deep.
The depth will likely depend in some way on the input size. Whether the depth of recursion is linear, sub-linear, polynomial, or exponential will depend on the algorithm. That’s the major task for analyzing algorithmic complexity.
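As a hedged sketch of how recursion depth can vary (hypothetical examples, not taken from the paper), compare a recursion whose depth grows linearly with the input value to one whose depth grows only logarithmically with the input size:

```python
def factorial(n):
    """Recursion depth grows linearly with n: O(n) calls."""
    if n <= 1:              # the condition that eventually stops the recursion
        return 1
    return n * factorial(n - 1)

def binary_search(items, target, lo=0, hi=None):
    """The sorted input range is halved on every call, so the
    recursion depth grows logarithmically: O(log n) calls."""
    if hi is None:
        hi = len(items)
    if lo >= hi:
        return -1           # not found; the recursion stops
    mid = (lo + hi) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search(items, target, mid + 1, hi)
    return binary_search(items, target, lo, mid)

print(factorial(5))                       # 120
print(binary_search([1, 3, 5, 7, 9], 7))  # index 3
```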
Quantum parallelism
Quantum algorithms are a whole different category compared to classical computing. Instead of the common use of iteration and loops, quantum parallelism is used to execute a computation over the full range of values simultaneously, as if it were a single computation. This provides a quantum advantage, and commonly yields an exponential speedup over comparable classical methods.
Hybrid quantum/classical algorithms
Despite the tremendous advantages of quantum parallelism, the needs of particular problems coupled with the limitations of current quantum computing hardware may force the algorithm designer to resort to a mix of classical and quantum computing approaches, resulting in what is called a hybrid quantum/classical algorithm.
The net effect is that only a fraction of the potential quantum parallelism can effectively be utilized, forcing the designer to fall back on classical iteration and looping for at least some fraction of the algorithm.
This makes it more difficult to analyze algorithmic complexity, but mostly it amounts to analyzing the combined hybrid algorithm as if it were two distinct algorithms.
Secondary storage and external services
Two other factors impacting the performance of algorithms are:
- If all data cannot fit or be kept in memory, it may be necessary to access secondary storage such as spinning disks, which operate much more slowly than memory.
- Accessing external services, such as across a network or the Internet, can dramatically impact performance.
These factors can indeed dramatically impact performance, but they tend to act as a constant factor which is then multiplied by the algorithmic complexity, since the purpose of algorithmic complexity is to quantify how performance will be impacted as the size of the input changes, not the resources required for each unit of input data.
Input size
Input size can come in one of two forms, depending on the nature of the algorithm:
- For algorithms processing a variable amount of input, input size will be the count of items or records of input data. Such as searching or sorting a list of data. The data may be directly input or may come from an external source such as a database.
- For algorithms with only a fixed amount of input data, such as a set of parameters, the input size must be calculated from the input parameters, such as for an algorithm which computes the factorial function or factors a large number. The actual value of an input parameter, or the number of digits or bits needed to represent it, may be the effective input size.
Some algorithms may require a combination of these methods.
Some algorithms may be dynamic, such as a web service which accepts requests from other nodes in the network, with each request dependent on the input data and parameters given for that particular request. Complexity can get very complex!
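A minimal sketch of the two forms of input size (hypothetical functions, chosen only for illustration):

```python
# Variable amount of input data: the input size is the item count.
def total(values):
    n = len(values)          # n = number of items or records
    return sum(values)       # O(n) work

# Fixed set of parameters: the effective input size is derived from a value.
def smallest_factor(m):
    """Trial division: the work grows with the value of m (roughly sqrt(m) trials)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m                 # m is prime

print(total([4, 5, 6]))      # n = 3 items
print(smallest_factor(91))   # 7; the value of m drives the amount of work
```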
Big-O notation
Big-O notation is a simple mathematical formula which roughly approximates or places an upper bound on the resource requirements for an algorithm based on the size of the input — the complexity of the algorithm, its algorithmic complexity.
Technically, algorithmic complexity can and should apply to both time and space (memory and storage) resource requirements, but commonly people are primarily interested in running time of an algorithm. That tradition will be maintained in this paper — the focus is on time complexity.
Algorithmic complexity is couched with the language “on the order of”, indicating the rough or approximate cost of the algorithm in terms of resource requirements. How rough? Within an order of magnitude. Precision is not attempted. A rough estimate is usually good enough.
The text “on the order of” is shortened as a capital (big) “O”, hence the name Big-O notation.
The form of Big-O notation is simply:
- O(formula)
Where the formula is any mathematical expression based on the input parameters or size of the input data for the algorithm. The input size is commonly expressed in the formula using a lower-case (little) “n”.
For example,
- O(n²)
This paper will spare non-technical readers the details. The main takeaway is that algorithmic complexity can get out of hand very quickly, especially for advanced AI or simulation algorithms or any algorithm handling lots of data or performing complex calculations, so great care is needed to try to minimize the algorithmic complexity, or to at least realize the costs being incurred.
A short summary of the common forms of Big-O notation, in increasing order of cost and declining order of performance:
- O(1) or O(k). Constant. Great, easy, fastest, trivial, simple.
- O(log log n). Double logarithmic. Relatively trivial except for very large input.
- O(log n). Logarithmic. Non-trivial but very manageable.
- O(sqrt(n)). Square root, commonly called a quadratic speedup. Faster than linear, but not as fast as logarithmic. For example, Grover’s algorithm for searching unstructured data on a quantum computer.
- O(n). Linear. Still manageable and predictable, but not optimal for large amounts of data.
- O(n log n). Linearithmic. Not quite as good as linear, but not so bad.
- O(n²). Quadratic. Now it’s getting somewhat expensive, but not terrible.
- O(n^k). Polynomial. Worse than quadratic (for k > 2), but not as bad as exponential.
- O(2^n) or O(k^n). Exponential. Try to avoid this. This is bad, except for very small amounts of data.
- O(n!). Factorial. Only if you really need it and don’t need real-time performance. Or your input data is very small.
Again, the complexities listed earlier on the list have lower cost and higher performance, while the complexities listed later on the list have higher cost and lower performance.
- Note: There is sometimes ambiguity as to whether log n represents a natural logarithm (base e), a base 10 logarithm, or a base 2 logarithm. Technically the distinction between the three is not critical since they all differ by a constant factor and Big-O notation is a rough approximation rather than mathematically precise. Generally, you can differentiate between the three possibilities by context; in the analysis of algorithms, base 2 is the most common convention. Use ln n when you explicitly intend a natural logarithm, and use log2 n when you explicitly intend a base 2 logarithm.
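A quick check of the constant-factor claim (a throwaway snippet, not part of the paper):

```python
import math

n = 1_000_000
print(math.log2(n))    # ~19.93  (base 2)
print(math.log10(n))   # 6.0     (base 10)
print(math.log(n))     # ~13.82  (natural logarithm, base e)

# The three differ only by a constant factor, for any n:
print(math.log2(n) / math.log10(n))  # ~3.32 = 1 / log10(2)
```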
As a simple but common example, consider two algorithms with input size of n = 20:
- For an algorithm of quadratic complexity, O(n²), that is 20² = 400 operations.
- For an algorithm of exponential complexity, O(2^n), that is 2²⁰ = 1,048,576, or roughly one million operations.
One million vs. 400 is a dramatic difference in algorithmic complexity.
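A small sketch which tabulates this growth for a few common complexity classes (illustrative only; the formatting is arbitrary):

```python
import math

def growth_table(sizes):
    """Print rough operation counts for common complexity classes."""
    print(f"{'n':>6} {'log n':>8} {'n log n':>10} {'n^2':>10} {'2^n':>15}")
    for n in sizes:
        print(f"{n:>6} {math.log2(n):>8.1f} {n * math.log2(n):>10.0f} "
              f"{n**2:>10} {2**n:>15}")

growth_table([10, 20, 30])
# For n = 20: n^2 = 400, while 2^n = 1,048,576 -- roughly one million.
```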
Wikipedia has several articles with further details on algorithmic complexity and Big-O notation:
- Wikipedia article on Big O notation
- Wikipedia article on Algorithmic complexity
- Wikipedia article on Computational complexity
- Wikipedia article on Computational complexity theory
- Wikipedia article on Time complexity
Polynomial complexity
Setting aside a lot of mathematical details and jargon, a polynomial is a sum of terms in which the parameters or input values are used as bases to which constant exponents are applied.
Polynomial complexity refers to algorithmic complexity which can be expressed as a polynomial.
Some examples:
- O(n²)
- O(n³)
- O(n⁴)
- O(n³ + m²)
Polynomial complexity also tends to include algorithmic complexity which is simpler than a polynomial, such as linear and logarithmic, although they are better described as having subpolynomial complexity.
Subpolynomial complexity
An algorithm whose complexity is linear or better, including logarithmic and quadratic speedup, is referred to as having subpolynomial complexity, meaning simply that its algorithmic complexity is better than an algorithm having polynomial complexity, such as O(n²).
Technically, any algorithm with algorithmic complexity of O(n^k) or better is commonly referred to as having polynomial complexity.
Sublinear complexity
Any algorithmic complexity that is less complex than linear — O(n) — is known as sublinear complexity.
This would include:
- O(1) or O(k). Constant. Great, easy, fastest, trivial, simple.
- O(log log n). Double logarithmic. Relatively trivial except for very large input.
- O(log n). Logarithmic. Non-trivial but very manageable.
- O(sqrt(n)). Square root, commonly called a quadratic speedup. Faster than linear, but not as fast as logarithmic. For example, Grover’s algorithm for searching unstructured data on a quantum computer.
Sublinear complexity is a really good thing. Even better than polynomial complexity.
Sublinear complexity is a subset of subpolynomial complexity, and will generally be better than merely subpolynomial complexity.
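As a rough sketch of what sublinear growth buys (hypothetical step-counting functions, not from the paper), compare the worst-case comparisons of a linear scan with those of a search which halves the remaining range on each step, as binary search does on sorted data:

```python
def linear_steps(n):
    """Worst-case comparisons for a linear scan of n items: n."""
    return n

def halving_steps(n):
    """Worst-case comparisons when each step halves the range: about log2(n)."""
    steps = 0
    while n > 1:
        n //= 2       # each comparison halves the remaining range
        steps += 1
    return steps

for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, linear_steps(n), halving_steps(n))
# 1,000 items: 1,000 vs ~10 steps; 1,000,000: ~20 steps; 1,000,000,000: ~30 steps
```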
Exponential complexity
In contrast to polynomial complexity, where the input parameter is used as the base and a relatively small integer as the exponent, exponential complexity reverses that order, using a relatively small constant value as the base and the input size as the exponent.
Some examples:
- O(2^n)
- O(3^n)
- O(4^n)
- O(3^n 2^n)
Exponential complexity also tends to include algorithmic complexity which is more complex than exponential complexity, such as factorial complexity.
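A minimal sketch of where O(2^n) comes from (hypothetical function, illustration only): enumerating every subset of n items produces 2^n subsets, so both the time and the space grow exponentially:

```python
from itertools import combinations

def all_subsets(items):
    """Generate every subset of the input: 2^n subsets for n items."""
    subsets = []
    for size in range(len(items) + 1):
        subsets.extend(combinations(items, size))
    return subsets

print(len(all_subsets([1, 2, 3])))        # 8 = 2^3
print(len(all_subsets(list(range(16)))))  # 65,536 = 2^16
```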
Polynomial complexity vs. exponential complexity
Generally, polynomial complexity refers to algorithms which are simpler and faster, while exponential complexity refers to algorithms which are more complex and slower.
Polynomial complexity is preferable.
Exponential complexity should be avoided.
Brute force
Programmers are expected to design clever and efficient algorithms, but oftentimes the best — or only feasible — approach is simply brute force. A brute force algorithm is an algorithm which exhaustively evaluates all possible solutions without any heuristics or cleverness to intelligently guide its selection of candidate solutions to evaluate.
A brute force algorithm may proceed sequentially through all data or all possible solutions, starting at the first and continuing until the last, without any heuristics or intelligence to change or guide the order of processing.
Whether a brute force algorithm can stop as soon as it finds a workable solution, or whether the algorithm must proceed through all possible solutions before a final decision can be made will depend on the algorithm and its requirements.
Generally brute force is the least attractive approach to finding a solution, except in four situations:
- The data size or prospective solution count is small enough that the time and resources to evaluate all possible solutions is reasonably small or acceptable.
- There is no known algorithm to achieve a solution other than exhaustive processing.
- Known algorithms are not guaranteed to achieve a sufficiently optimal or accurate solution.
- A quantum computer can be used to simultaneously evaluate all possible solutions in parallel in a single step using quantum parallelism.
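As a hedged sketch of brute force (a hypothetical subset-sum search, not an example from the paper), the algorithm below exhaustively tries candidate solutions and stops at the first workable one; with n values there are 2^n candidates, so the worst case is exponential:

```python
from itertools import combinations

def subset_with_sum(values, target):
    """Brute force: try every subset until one sums to the target.

    With n values there are 2^n candidate subsets, so the worst case
    is O(2^n), even though each individual check is cheap.
    """
    for size in range(len(values) + 1):
        for candidate in combinations(values, size):
            if sum(candidate) == target:
                return candidate   # stop at the first workable solution
    return None

print(subset_with_sum([3, 9, 8, 4, 5, 7], 15))  # (8, 7)
```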
Random and statistical methods
Another approach to computation is the use of statistical methods, such as random sampling to approximate a deterministic result. This requires generating lots of random numbers and then evaluating the values of functions based on these random input values. Repeated enough times, the results can be fairly accurate, but repetitions have a cost.
Fewer repetitions cost less but produce less accurate results. More repetitions produce more accurate results but cost more. It’s a classic cost/benefit tradeoff, which affects performance, and with it the algorithmic complexity of the algorithm.
The number of repetitions must be factored into the algorithmic complexity, but in fact the repetition count might be an input parameter so that different runs of the same algorithm may use a different number of repetitions, resulting in different performance even though the main input data may be unchanged.
Generally, changing the repetition count will only result in a linear impact on the performance, which won’t change the overall, rough algorithmic complexity of the algorithm, but it can depend on other factors as well. In short, it may have no impact, or it could have a large impact, depending on the details of the algorithm.
A common example is the Monte Carlo method for simulations.
Simulations of complex processes, including phenomena of the physical world and business and industrial processes, commonly use such methods, where there is no single deterministic result, but the result depends on input values which tend to be random rather than predictable.
Input parameters for a simulation could have a multi-dimensional effect, so that the impact on performance is not a simple linear scaling, and may result in a change in the overall algorithmic complexity — or not, depending on exactly how the parameters are used. It can get complicated.
The good news is that if performance is unacceptable, the granularity (precision) of the simulation can be reduced to make performance more acceptable. But this does not always work out as well as expected. It all depends on many factors. As a result, performance (that is, algorithmic complexity) can be problematic.
In short, modeling the algorithmic complexity for computations requiring randomness and statistical methods can be quite challenging.
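As a classic illustrative sketch (estimating pi by random sampling; not an example from the paper), the repetition count is an input parameter which trades accuracy against cost, and the cost grows linearly with the number of repetitions:

```python
import random

def estimate_pi(repetitions):
    """Estimate pi by sampling random points in the unit square."""
    inside = 0
    for _ in range(repetitions):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:   # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / repetitions

for reps in (1_000, 100_000, 1_000_000):
    print(reps, estimate_pi(reps))  # accuracy improves as repetitions grow
```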
Space complexity
Technically, algorithmic complexity is concerned not only with time, but also with space — memory. The amount of space or memory required to execute an algorithm for a given amount of input is referred to as space complexity. Most of this paper is concerned with time complexity, but memory and space complexity can be a big factor as well, if not for smaller amounts of input data, then for larger amounts of input data.
This paper won’t delve deeply into space complexity, but there are a few key points:
- An algorithm might generate a large amount of data, even for a small amount of input data.
- Even a relatively simple algorithm can generate a large amount of data.
- Even if an algorithm doesn’t generate a large amount of data for final results, a significant amount of data may be needed for intermediate results, which may be discarded when the algorithm completes but still consume resources while the algorithm is running (see the sketch after this list).
- Space or memory includes both main memory — commonly RAM or random access memory — and storage — commonly on a spinning disk or flash memory. Memory tends to be transient while storage tends to be permanent or persistent, although those distinctions are not necessarily always the case.
- Some algorithms may use very little space, while others, especially those which are data-intensive, such as data analysis, logging, real-time data, and databases, can require dramatic amounts of space.
- Memory and storage requirements are frequently an overlooked aspect of the design, development, evaluation, and deployment of algorithms and applications. Deployment of a poorly-characterized algorithm can lead to very unpleasant surprises.
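A minimal sketch of the difference between constant and linear space for the same result (hypothetical functions, for illustration only):

```python
def average_constant_space(values):
    """O(1) extra space: only a running total and a count are kept."""
    total, count = 0, 0
    for v in values:
        total += v
        count += 1
    return total / count

def average_linear_space(values):
    """O(n) extra space: every intermediate prefix sum is kept,
    even though only the final one is actually needed."""
    prefix_sums = []
    running = 0
    for v in values:
        running += v
        prefix_sums.append(running)   # intermediate results pile up
    return prefix_sums[-1] / len(prefix_sums)

data = list(range(1, 1_000_001))
print(average_constant_space(data))  # 500000.5
print(average_linear_space(data))    # 500000.5, but holding ~n extra values
```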
Combinatorial explosion
A combinatorial explosion occurs when the number of possible solutions is the product of the number of possibilities for each parameter, so that even for a relatively small number of parameters the product can be a very large number.
This will likely occur with any problem whose solution has exponential algorithmic complexity — O(2^n), or worse.
The term is also used for any problem, algorithm, or application that requires a very large amount of computing resources despite seeming to be relatively simple on the surface.
The basic idea is that a problem or application may seem relatively simple or manageable, but because multiple parameters combine to produce an exponential or factorial degree of algorithmic complexity, the amount of computing is likely to be far greater than might seem apparent at first blush to an innocent bystander.
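A tiny sketch of the arithmetic behind a combinatorial explosion (the parameter counts below are made up for illustration):

```python
from itertools import product
from math import prod

# Hypothetical design problem: six parameters, each with a modest
# number of possible values.
choices_per_parameter = [10, 12, 8, 20, 15, 9]

total_configurations = prod(choices_per_parameter)
print(total_configurations)   # 2,592,000 candidate configurations

# An exhaustive sweep touches every combination:
small_example = list(product(range(3), range(4), range(5)))
print(len(small_example))     # 60 = 3 * 4 * 5
```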
There are some points here:
- The problem may be ripe for implementation on a quantum computer where quantum parallelism can potentially avert the combinatorial explosion.
- Avoid problems, algorithms, or applications that involve a combinatorial explosion.
- Try to simplify such problems.
- Be prepared to allocate sufficient computing resources to handle the combinatorial explosion.
- For some applications, despite the fact that the logic of the algorithm is known it may be necessary to hold off and wait until hardware advances to the point where the combinatorial explosion can be more readily handled.
- Consider whether a first-fit solution might be acceptable and cost a lot less than an optimal best-fit solution.
- Consider sub-optimal but acceptable heuristics which dramatically reduce the algorithmic complexity.
N-body problems and quantum computing
Specialized heuristics, indexes, and distributed computing can be used to optimize the performance of algorithms for search and matching — delivering very impressive gains — but ultimately there is a limit to such attempts to cheat on the complexity of the real world.
Many real-world problems whose solutions have exponential algorithmic complexity are simply beyond the ability of even our fastest computers, biggest memories, largest networks, and cleverest heuristics and indexing techniques for all but the simplest cases. Sure, we keep innovating and expanding on all of those fronts, but we’re still only tackling relatively modest real-world problems.
Back in 1981, physics professor and Nobel laureate Richard Feynman noted quite simply in his paper Simulating Physics with Computers that as difficult as it was to use a traditional computer to simulate a relatively simple quantum mechanical system, a quantum mechanical system was able to perform the task effortlessly and instantaneously, so he suggested that a computer based on quantum mechanics would be a great way to simulate physics.
As he noted, it is not impossible to simulate a quantum mechanical system using traditional computers, but the algorithmic complexity is exponential, so that simulating even a system of moderate size would be exceedingly difficult or expensive. And at some point it simply wouldn’t be practical at all using all available resources.
Similarly, so-called N-body problems, such as simulating the Newtonian mechanical motion of N celestial bodies, are extremely expensive for traditional computing for other than trivial cases. Quantum computing has been suggested as a reasonable approach to N-body problems.
The underlying issue is that the system is dynamic, so that you have to juggle too many balls in the air to get anything done, which doesn’t work for other than relatively small systems. And when you cannot determine the location of a ball with any precision or reliability, traditional juggling simply doesn’t work at all.
There are plenty of real-world problems that have such complexity, such as:
- Driverless cars in a region coordinating their own movement without the use of a central traffic controller.
- Patrons at a large performance venue wishing to exit as quickly as possible. Or to be seated as rapidly as possible. All without any central traffic control or local control either.
- Protein folding.
- Optimizing travelling salesman problems.
- Optimizing complex business processes, particularly those which are geographically dispersed across many sites.
- Predicting outcomes for complex adaptive systems.
- Managing large distributed networks of Internet of Things devices when the devices may interact, without central control.
- Simulating physical systems at the level of quantum mechanics.
- Simulating chemical reactions.
- Predicting the effects of drugs and other chemicals.
- Designing new materials.
- Evaluating the chemical properties of new battery designs.
- Simulating large celestial systems at the cosmological level.
The common case for quantum computing is that the comparable classical computing algorithm has exponential complexity or worse while the quantum algorithm has polynomial complexity or better. The net effect is that the quantum algorithm offers an exponential speedup. These are generalities, so for particular problems the algorithmic complexity and quantum advantage can vary greatly.
One general issue for quantum computing is that it is probabilistic rather than deterministic, so getting a result that is at least approximately deterministic requires rerunning a quantum algorithm many times and statistically aggregating (averaging) the probabilistic results. This will of course have some negative impact on the performance of a quantum algorithm, but generally it should be less than the gains which accrue from quantum parallelism within the quantum algorithm itself. Still, it is a factor which should be analyzed carefully, both to assure the accuracy of the result and to properly assess the algorithmic complexity of the algorithm.
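A very rough classical stand-in for that idea (this is not real quantum code; the numbers and the “correct answer” are made up): run a probabilistic procedure many times and take the most frequent outcome, at a cost that is a constant factor (the shot count) on top of the algorithm itself:

```python
import random
from collections import Counter

def noisy_measurement():
    """Stand-in for one run (one 'shot') of a probabilistic quantum
    algorithm: the correct answer 42 appears only most of the time."""
    return 42 if random.random() < 0.7 else random.randint(0, 99)

def aggregated_result(shots):
    """Repeat the run many times and take the most frequent outcome."""
    counts = Counter(noisy_measurement() for _ in range(shots))
    return counts.most_common(1)[0][0]

print(aggregated_result(1000))   # almost certainly 42
```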
Quantum computing is still new, relatively unproven, and in need of significant development, but has significant potential for solving computationally hard problems.
What are P, NP, NP-complete, and NP-hard algorithms?
In short, don’t worry about them! They are of interest primarily to theoretical computer scientists, who focus on mathematical proofs of abstract notions of algorithmic complexity.
P stands for polynomial time algorithm or problem.
NP stands for non-deterministic polynomial time algorithm or problem.
The distinctions between NP, NP-complete, and NP-hard are a little too complex to even attempt to reduce them to plain language that any non-theoretical computer scientist can really make sense out of.
For more details:
- Wikipedia Computational complexity theory article.
- Wikipedia NP-completeness article.
What are BQP and QMA algorithms?
Again, in short, don’t worry about them! They are of interest primarily to theoretical computer scientists, especially those focusing on quantum computing, who work on mathematical proofs of abstract notions of algorithmic complexity.
BQP stands for bounded-error quantum polynomial time algorithm or problem.
QMA stands for Quantum Merlin Arthur algorithm or problem. It is the quantum equivalent to NP.
The distinctions between BQP, QMA, P, and NP are a little too complex to even attempt to reduce them to plain language that any non-theoretical computer scientist can really make sense out of.
For more details:
- Wikipedia Quantum complexity theory article.
- Wikipedia BQP article.
- Wikipedia QMA article.
For more of my writing on quantum: List of My Papers on Quantum Computing.