The thesis asserts this: If an algorithm A computes a partial function f from natural numbers to natural numbers, then f is partially recursive, i.e., the graph of f is recursively enumerable.
The thesis was formulated in the 1930s. The only algorithms at the time were sequential algorithms. Sequential algorithms were axiomatized in 2000. This axiomatization was used in 2008 to prove the thesis for sequential algorithms, i.e., for the case where A ranges over sequential algorithms.
These days, in addition to sequential algorithms, there are parallel algorithms, distributed algorithms, probabilistic algorithms, quantum algorithms, learning algorithms, etc.
The question whether the thesis is true in full generality has been actively discussed since the 1960s. We argue that, in full generality, the thesis cannot possibly be true.
> These days, in addition to sequential algorithms, there are parallel algorithms, distributed algorithms, probabilistic algorithms, quantum algorithms, learning algorithms, etc.
These can all be converted into sequential algorithms with some overhead. Is there an actual proof in the video or just an argument that something that can't be made sequential must exist?
A parallel OR will terminate even if one of the subclauses does not terminate, whereas sequential OR might not terminate in this case. So if you define an algorithm as a computation that terminates, parallel OR cannot be fully emulated by sequential OR.
The problem is that, as far as I know, sequential algorithms form a uniform class, because e.g. equivalence between different sequential models like Turing machines and the untyped lambda calculus can be proved. AFAIK, this is not possible for parallel models of computation; there are different models with different expressivity. (I'm not an expert on that, so take this with a grain of salt and please correct me if I'm wrong!)
Well... you can still execute the OR essentially via BFS vs DFS... Spend some "cycles" or steps executing one path, then the other, then keep alternating. Eventually you will hit the termination (if it exists).
Or essentially, you can always simulate parallel algorithms on sequential hardware.
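For concreteness, here's a minimal Python sketch of that interleaving (dovetailing) idea; the generator convention and the names are just assumptions for illustration, not anything from the talk:

    # Dovetailed "parallel OR": advance each branch one step at a time.
    # Convention (an assumption): a branch is a generator that yields
    # None while still working and finally yields True or False.

    def finishes_true():
        yield True                    # terminates immediately with True

    def never_finishes():
        while True:
            yield None                # never produces an answer

    def parallel_or(*branches):
        live = list(branches)
        while live:
            for g in list(live):
                step = next(g, False)         # one step of this branch
                if step is True:
                    return True               # any True branch decides the OR
                if step is False:
                    live.remove(g)            # this branch finished with False
        return False                          # every branch finished, none True

    print(parallel_or(never_finishes(), finishes_true()))   # True

A naive sequential OR that fully evaluates the first branch before touching the second would hang on this input, which is exactly the objection above.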
Of course you are correct if the sequential OR is executed naively :-)
...or if you force side-effect-free short-circuiting within the sequential OR execution. But then it wouldn't really be the same OR (as in the parallel case), now would it?
What are you using the acronym OR to mean? Why can't you define a sequential machine that simulates it and terminates under the same conditions?
For all the parallel modes of computation I've seen, you can pretty easily simulate them on a single machine by just keeping the full state of each machine and simulating each time unit 1 machine at a time. It's slow, but equivalence isn't about speed.
Oh man, you and overdrivetg are right, of course. I was silently assuming that A and B are atomic operations or inputs to an open system; otherwise it should be possible to emulate parallel OR.
Now this casual remark during my work time is starting to bother me a lot, and I really hope some distinguished expert could jump into the discussion. We know that the pi calculus can embed the lambda calculus (and similarly for other parallel models of computation), and I've always assumed that the opposite is not generally true: that you cannot express all parallel computations on Turing machines or in the lambda calculus such that the same input yields the same output, at least not if you allow some of the computations not to terminate.
Now I'm no longer sure about that and cannot find any foundational theorem that would prove this.
Am I the victim of a prejudice? :o Where are the computer scientists when you need them?
If you want an expert's opinion, I have a PhD in theoretical computer science. I'm not familiar with pi calculus, but parallel models of Turing machines can be fully simulated by standard Turing machines by just simulating a time step on each machine before moving on to the next time step.
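To make that concrete, here is a rough Python sketch of the lockstep simulation, with each "machine" abstracted (my simplification, to keep it short) as a step function state -> (new_state, halted) rather than a literal Turing machine:

    # One sequential loop simulating several machines "in parallel":
    # keep the full state of every machine and advance each of them by
    # one time step before moving on to the next global time step.

    def run_in_lockstep(steps, states):
        halted = [False] * len(steps)
        t = 0
        while not all(halted):           # halts only if every machine halts
            for i, step in enumerate(steps):
                if not halted[i]:
                    states[i], halted[i] = step(states[i])
            t += 1                       # one global time step simulated
        return states, t

    # Toy machines: counters that halt after different numbers of steps.
    def counter(limit):
        def step(n):
            return n + 1, n + 1 >= limit
        return step

    print(run_in_lockstep([counter(3), counter(7)], [0, 0]))   # ([3, 7], 7)

It's slow, and it only terminates once every simulated machine has terminated, but the function computed is the same.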
Since merely having a PhD in theoretical computer science doesn't necessarily make you an expert in this area, do you happen to have a reference for that?
Don't get me wrong, I do believe you, but am looking for the relevant theoretical results about this.
That's fair. I only mentioned it because you asked for a computer scientist.
One simple model of a parallel Turing machine is one with multiple tapes, each with its own read-write head. They all operate simultaneously based on the global state. This is a very general form of parallelism.
Wikipedia has an article [1] about them with some references for proofs that they're equivalent to single-tape Turing machines, but I encourage you to try to prove it yourself. It's fairly simple to prove, I'm pretty sure I've seen it as an undergraduate-level homework problem.
If you want a hint, try increasing the number of tape symbols so that you can encode the entire global state on one cell of the single-tape machine. It's a bit more work, but you can also show that two symbols are enough to simulate a Turing machine with any number of symbols.
The proof is a little trickier than I remembered. There's a bit more housekeeping to handle the fact that each head can be on a different cell of its respective tape.
Try coming up with a proof for the case where all heads move in the same direction, so they're always at the same position on their tapes, first. This can be extended to the general case with some annoying bookkeeping.
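If it helps, here is a small Python sketch of the encoding from the hint: pack a k-tape configuration into a single sequence over an enlarged alphabet, where each cell is a tuple of (symbol, head-is-here) pairs, one per simulated tape. This only shows the encoding and decoding, not the simulated transitions; the names and blank symbol are my own choices.

    BLANK = '_'

    def encode(tapes, heads):
        # Width of the combined tape: long enough for every tape and head.
        width = max(max(len(t) for t in tapes), max(heads) + 1)
        return [
            tuple((t[pos] if pos < len(t) else BLANK, heads[i] == pos)
                  for i, t in enumerate(tapes))
            for pos in range(width)
        ]

    def decode(cells):
        k = len(cells[0])
        tapes = [[cell[i][0] for cell in cells] for i in range(k)]
        heads = [next(p for p, cell in enumerate(cells) if cell[i][1])
                 for i in range(k)]
        return tapes, heads

    cells = encode([list("101"), list("abcd")], heads=[2, 0])
    print(decode(cells))   # (blank-padded tapes, [2, 0])

The enlarged alphabet has (|alphabet| * 2)^k symbols, which is still finite, so a single-tape machine can read the global state of all k tapes off the cells it visits.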
First of all, the paper that this article is reporting on doesn't actually prove a separation between quantum and classical algorithms at all. It's an oracle separation result, which makes it an important result in complexity theory, but not a statement about actual classical or quantum algorithms.
Second, the result is unrelated to the Church-Turing thesis in the first place, since it's only about relative efficiency, whereas the Church-Turing thesis is about computability.
Every quantum algorithm can be simulated by a classical computer, so quantum computing cannot possibly disprove the Church-Turing thesis. However, there's a stronger variant of the Church-Turing thesis which also takes performance into account, and it's plausible that quantum computers can contradict this stronger version (although we're very far away from actually proving that).
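To illustrate the computability point: a classical computer can always simulate a quantum algorithm by tracking the state vector with ordinary linear algebra, just with exponential overhead in the number of qubits. A toy example, assuming numpy, for a one-qubit Hadamard circuit:

    import numpy as np

    ket0 = np.array([1.0, 0.0])                    # |0>
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

    state = H @ ket0                               # apply the gate classically
    probs = np.abs(state) ** 2                     # Born rule: measurement probabilities
    print(probs)                                   # [0.5 0.5]

The same bookkeeping works for any circuit, which is why quantum computing threatens at most the efficiency variant of the thesis, not the computability one.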
I’m still unclear on whether probabilistic algorithms are “algorithms” in the strict sense: by that definition an algorithm should have deterministic behavior, which a probabilistic method does not have. But let’s say your “random” is a deterministic PRNG: does that then make the probabilistic algorithm deterministic?
Basically the distinction can be best illustrated with bozo/bogo sort. The definition many people use is:
1. Randomly shuffle the input
2. See if it is sorted, if not go to 1.
The problem, as explained to me by an algorithms lecturer a long time ago, is that the algorithm is not guaranteed to finish, and moreover the same input may or may not result in it finishing on different runs.
Per that lecturer, a “correct” bozo sort is:
1. Generate a list of every permutation of the input
2. Iterate the list until you find a sorted entry and return that.
It has computable best case, worst case, and average case (all terrible). It is guaranteed that the same input will always produce the same output (terminate vs not terminate).
Obviously you could optimize this if you had a shuffle that, when repeatedly applied to its own output, was guaranteed to eventually produce every ordering of the original input. Then you would get closer to: shuffle; check; repeat. But still note that there is no random involved :) (Both versions are sketched in code below.)
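For what it's worth, a rough Python rendering of both versions (the names are mine):

    import random
    from itertools import permutations

    def bogo_sort(xs):
        # Randomized version: shuffle until sorted.  In principle it can
        # run forever, and different runs on the same input can differ.
        xs = list(xs)
        while xs != sorted(xs):
            random.shuffle(xs)
        return xs

    def deterministic_bozo_sort(xs):
        # The lecturer's "correct" version: enumerate every permutation in
        # a fixed order and return the first sorted one.  It always
        # terminates, and the same input always behaves the same way.
        target = sorted(xs)
        for perm in permutations(xs):
            if list(perm) == target:
                return list(perm)

    print(deterministic_bozo_sort([3, 1, 2]))   # [1, 2, 3]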
See [1], which is about the 'complexity class' Bounded-error Probabilistic Polynomial time.
Similar to the P=NP conjecture, there is a BPP=P conjecture.
Also consider [2], which concerns variations of the Church-Turing thesis. Notably, it mentions probabilistic computation.
To answer your original question, indeed probabilistic algorithms are an extension of normal algorithms. One way to make the extension (and retain the property that algorithms give the same output on the same input) is to add a 'random oracle'.
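A minimal sketch of that idea (assuming Python, with hypothetical names): treat the randomness as an explicit stream of values passed in alongside the input, so the "algorithm" itself is an ordinary deterministic function of (input, stream).

    import random

    def randomized_find(items, target, rand_stream):
        # Deterministic function of (input, rand_stream): the randomness is
        # just an extra read-only input, in the spirit of a random oracle/tape.
        order = list(range(len(items)))
        for i in range(len(order) - 1, 0, -1):       # Fisher-Yates shuffle
            j = next(rand_stream) % (i + 1)          # driven by the stream
            order[i], order[j] = order[j], order[i]
        for idx in order:
            if items[idx] == target:
                return idx
        return None

    def stream(seed):
        rng = random.Random(seed)                    # a PRNG stands in for the oracle
        while True:
            yield rng.getrandbits(32)

    # Same input and same stream => same output, every time.
    print(randomized_find(['a', 'b', 'c'], 'b', stream(0)))   # 1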
Do you have a better source than "an anonymous lecturer once said..."? I've never heard such a strict definition of algorithm before.
Randomized algorithms are often used to avoid falling into worst-case scenarios and improve run-time, like with quicksort, or to avoid settling in local minima like with simulated annealing.
"Starting from an initial state and initial input (perhaps empty),[4] the instructions describe a computation that, when executed, proceeds through a finite[5] number of well-defined successive states, eventually producing "output"[6] and terminating at a final ending state"
The "finite" bit I guess is the key thing, that I think was what my lecturer would have been getting at - the canonical bozo sort is not - you could just run for ever, as you might not /ever/ produce a correctly sorted version of the input. That article also points to "randomized algorithms", taking some random input (you could image random input being an RNG seed I guess?).
That article does include monte carlo algorithms, which technically could not terminate (Montecarlo simulation of a bouncing photon could technically go for ever) so I am left confused.
The lecturer was Prof. Tadao Takaoka, who was great, and apparently (per google) was on the editorial board of the journal Algorithms (http://www.mdpi.org/algorithms/editors.htm)
I think your "correct" bozo sort example from earlier shows that there is a finite number of states (all of the permutations of the input array), just an unbounded number of state transitions.
I think for something like this, you could do asymptotic analysis of the expected run-time. Assuming all elements are distinct, there are n! permutations, so each would have a 1/n! chance of being selected each iteration. (But it's rather late, and this is a poor medium for discussing math, and someone has probably already done this, so I'm not going to go any further.)
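Carrying that one step further (still assuming uniform, independent shuffles of distinct elements): each shuffle succeeds with probability p = 1/n!, so the number of iterations is geometrically distributed and

    E[\text{iterations}] = \sum_{k \ge 1} k (1 - p)^{k-1} p = \frac{1}{p} = n!

With O(n) work per shuffle-and-check, the expected running time is on the order of n * n!, which is finite even though the worst case is unbounded.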
small thought experiment: what if we decouple the issue of "finite" from the probabilistic aspect?
Perhaps-Algorithm A:
step 1: x := sample a value from a roll of a fair D6
step 2: print x
Is Perhaps-Algorithm A an algorithm?
Can we argue that we're in a well defined state after step 1 has executed? We have some variable x bound to one value from {1, ..., 6}, with a known probability distribution, although we don't know which integer we've got.
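For concreteness, here are both readings of step 1 in Python (purely illustrative), which also ties back to the PRNG question upthread:

    import random

    # Perhaps-Algorithm A with a "real" (OS-entropy) die roll:
    x = random.SystemRandom().randint(1, 6)
    print(x)                              # some value in {1, ..., 6}

    # The same two steps with a seeded PRNG: now the whole procedure is a
    # deterministic function of the seed and always prints the same value.
    x = random.Random(42).randint(1, 6)
    print(x)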
edit: on the other hand, perhaps we're just haggling over definitions here. It is definitely the case that non-terminating algorithms exist and can be useful, and non-deterministic algorithms exist and can be useful. Even if some powerful theory and analysis restricts attention to terminating, deterministic algorithms, that doesn't mean that's all there is.
The problem is that, because it’s random, there is a chance it will never complete, so the same input alternates between halting and not halting and the average time is infinite, which is clearly nonsensical; hence it isn't an algorithm.
I think that the intuition here has to do with the dimensionality of the problem, in the sense that we have proved that in 3 dimensions a random walk will never return to its starting point, but in two dimensions it will. Mapping all probabilistic algorithms into a random-walk space of some dimension is probably extremely difficult though.
Sorry, I had it reversed in my head. For 1 and 2 dimensions they will return to the starting point with probability 1. For 3 dimensions it is simply that the return probability is not 1. Obviously a weaker result.
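A quick Monte Carlo sketch of that (Pólya's recurrence theorem), assuming we cap each walk at a fixed number of steps, so the estimate is only a lower bound on the true return probability:

    import random

    def return_fraction(dim, max_steps=1000, trials=5000):
        # Fraction of simple random walks that revisit the origin within
        # max_steps steps (a lower bound on the return probability).
        hits = 0
        for _ in range(trials):
            pos = [0] * dim
            for _ in range(max_steps):
                axis = random.randrange(dim)
                pos[axis] += random.choice((-1, 1))
                if all(c == 0 for c in pos):
                    hits += 1
                    break
        return hits / trials

    print(return_fraction(2))   # creeps toward 1 as max_steps grows
    print(return_fraction(3))   # stays near Polya's ~0.34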
Technically speaking, even physical computers have error rates and are therefore probabilistic. With a large number of trials, the observed mean approaches the expected value, which is the same kind of analytical result you get from boolean logic. Similarly, biological systems run on probabilistic algorithms, and the fact that two twins starting from the same DNA can be so remarkably similar after millions of steps is a testament to the power of probabilistic algorithms.
I don't have the time to put that together, but here's a somewhat cleaned up (i.e. times and some newlines removed) paste of the closed captions. This may be helpful if you're good at scanning a lot of text.
Without having looked at the details, why do I feel like all of the other types of algorithms should be provably equivalent to the sequential case under the assumption of local realism? That is to say, if you don't require that an algorithm must be implemented in this universe, then I suspect that many things are possible. Am I missing something here?
Conversely, those algorithms which cannot be converted to sequential ones possibly only run on exotic Turing machines, like ones with infinite tapes or infinite parallelism, and those likely don't have a physical equivalent because it looks like our universe only supports finite computation. In other words, the algorithms which violate Church-Turing might not be interesting at all, or only for analysis in an idealized setting.
Indeed. I had a similar thought, which initially seemed profound. "Maybe this means that the universe can encode the results of non sequential algorithms but no one can ever decode them. Douglas Adams was right all along!" Of course if you can't decode anything then in theory the universe is currently running an infinite number of undecodable simulations of itself right now. Related piece from Scott Aaronson [0].