Upper bound != worst case, though I can see why the language is confusing.
When we have a timing function f(n), we pretend f has one input, n = the size of the input to the algorithm. In reality it's more like f(n, m), where m is a parameter specifying the _exact_ input to the algorithm. (Example: a sorting algorithm has many possible input lists of length n.)
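To make that concrete, here is a minimal sketch (Python, counting comparisons in insertion sort as a stand-in for running time): two inputs of the same size n can have very different costs, and that difference is exactly the m-dependence.

    def insertion_sort_comparisons(xs):
        # Count comparisons as a proxy for running time.
        xs = list(xs)
        comparisons = 0
        for i in range(1, len(xs)):
            j = i
            # Shift xs[i] left past larger elements; each loop test is one comparison.
            while j > 0:
                comparisons += 1
                if xs[j - 1] <= xs[j]:
                    break
                xs[j - 1], xs[j] = xs[j], xs[j - 1]
                j -= 1
        return comparisons

    n = 8
    print(insertion_sort_comparisons(range(n)))         # already sorted: 7 (= n - 1)
    print(insertion_sort_comparisons(range(n, 0, -1)))  # reversed: 28 (= n(n-1)/2)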
So when we write f(n) = O(g(n)), we are trying to make two simplifications:
(1) simplify away the exact input, keeping only its _size_; and
(2) simplify the actual function f to something simpler that still describes f.
I will make up an artificial example that is not perfect but is useful for explaining the ideas. Suppose that for any input size n, the parameter m ranges over [-(n + sin(n)), n + sin(n)] and the exact timing function is
f(n, m) = n + sin(n) + m
Then the first simplification is to determine how to ignore m. We could pick out the best m for n (the smallest f(n, m) for a fixed n), pick out the worst m, or take the average over all m. This is best-case, worst-case, and average-case complexity, respectively. For this function, we have:
f(n) = 0 [best case]
f(n) = n + sin(n) [avg case]
f(n) = 2(n + sin(n)) [worst case]
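If it helps, here is a quick numeric sanity check of those three values (a sketch, not a proof; it just samples m on a grid over its allowed range):

    import math

    def f(n, m):
        return n + math.sin(n) + m

    def best_avg_worst(n, samples=10001):
        r = n + math.sin(n)  # m ranges over [-r, r]
        ms = [-r + 2 * r * k / (samples - 1) for k in range(samples)]
        vals = [f(n, m) for m in ms]
        return min(vals), sum(vals) / len(vals), max(vals)

    for n in (1, 10, 100):
        print(n, best_avg_worst(n))  # ~0, ~(n + sin(n)), ~2(n + sin(n))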
Then we could get an upper bound on _any_ of these.
f(n) = O(1) [best case]
f(n) = O(n) [avg and worst case, not always the same, but they are the same in this example]
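To see why O(n) is justified, you can exhibit explicit witnesses c and n0 from the definition; c = 4 and n0 = 1 work here, since sin(n) <= 1. A spot-check of that claim:

    import math

    # f(n) <= c*n for all n >= n0 is the definition of f(n) = O(n).
    # Check it for both the avg case n + sin(n) and the worst case 2(n + sin(n)).
    c, n0 = 4, 1
    assert all(n + math.sin(n) <= c * n and 2 * (n + math.sin(n)) <= c * n
               for n in range(n0, 10001))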
If we wanted a lower bound, we could say that
f(n) = Omega(n) [avg & worst case]
because f(n) >= n - 1 for all n (in general there could be more room for flexibility on the right-hand side, but it's not needed in this example).
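Spelling that out: n + sin(n) >= n - 1 >= n/2 once n >= 2, so c = 1/2 and n0 = 2 work as witnesses in the Omega definition. A quick spot-check (again a sketch, not a proof):

    import math

    # f(n) >= c*n for all n >= n0 is the definition of f(n) = Omega(n).
    c, n0 = 0.5, 2
    assert all(n + math.sin(n) >= c * n for n in range(n0, 10001))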
The point here is that step 1 chooses a simplification across input instances, which we call worst case, average case, etc.; and step 2 chooses a simplification in how we write down the function f(n), which is what we call an upper bound for big-oh (or, for theta, both an upper and a lower bound).
Yes. Big-O is an upper bound on whatever function you apply it to, not automatically on the algorithm as a whole. If that function describes the expected (average) input, then its Big-O is the upper bound / worst case for the average. But if the function describes the worst-case input, or the algorithm over all inputs, then its Big-O is an upper bound / worst case for the entire algorithm.
In other words, the original article's point should have been about vocabulary. Big-O always expresses a worst-case (upper) bound on whatever is being measured, but sometimes it's implied that the Big-O analysis describes the algorithm's average case rather than the algorithm's worst case.