Tail recursion in Python (chrispenner.ca)
193 points by zweedeend on March 17, 2018 | 80 comments



Someone recently pointed out to me that you can bypass the recursion limit with a built-in decorator, because it's basically a memoiser.

lru_cache, from the functools library.

The example given in the docs [0] is:

    import functools

    @functools.lru_cache(maxsize=None)
    def fib(n):
        if n < 2:
            return n
        return fib(n-1) + fib(n-2)
[0] https://docs.python.org/3/library/functools.html#functools.l...


This only works in specific cases (namely those where dynamic programming algorithms suffice), and does not avoid the recursion limit in general.


Don't dismiss one of my favorite higher order functions so soon :)

"Recursion + memoization provides most of the benefits of dynamic programming, including usually the same running time." -- Steven Skiena

The lru_cache decorator is great for people who are happy to let the language handle the caching of results for them, and it often leads to code which is much more concise than the dynamic programming approach. The limitation you are referring to is that the decorator uses a dictionary to cache results, and since that dictionary uses the arguments as keys, the arguments need to be hashable. That limitation can be avoided by using immutable data structures (Clojure also has a higher order function called memoize which does the same thing and has no such limitation, because the core data structures in Clojure are immutable). Python's lack of structural sharing means this approach can hurt memory and GC efficiency a bit, but that trade-off is at least worth considering :)

You still have to keep the stack depth less than sys.getrecursionlimit(), so it's no substitute for tail recursion, but it surely is a substitute for dynamic programming in a lot of cases.
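For example, a rough sketch of working around the hashability point by converting mutable arguments to tuples at the call boundary (the function and names here are made up):

    import functools

    @functools.lru_cache(maxsize=None)
    def total(xs):
        # xs must be hashable, so it's a tuple, not a list
        return xs[0] + total(xs[1:]) if xs else 0

    print(total(tuple([1, 2, 3])))  # convert the list once, at the call site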


You can only avoid the recursion limit in cases where dynamic programming would also work, as you have to explicitly call the function in reverse stack order to avoid having the stack build up. If you want fib(10000) you need to call fib(1) through fib(9999) first, as if you were implementing a dynamic programming solution.
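For instance, assuming the fib from the docs example above, a bottom-up warm-up keeps every call shallow:

    # Each fib(i) finds fib(i-1) and fib(i-2) already cached, so no call
    # recurses more than a frame or two deep.
    for i in range(10_000):
        fib(i)
    print(fib(9_999))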

This isn't dismissive. lru_cache is one of my favorites too, but it has limitations.


But that isn't a limitation of lru_cache. For example, the same higher order function in Clojure, i.e. memoize combined with recur for tail recursion, will not cause a stack overflow. The stack build-up happens because Python doesn't support tail call optimization; it's not a limitation of lru_cache. I just wanted to make that clear, because in other languages which support tail call optimization you can use similar higher order functions without any limitations. Deep recursion in Python without sys.setrecursionlimit() is probably not a good idea, and memoization can't help you there. My point was geared towards presenting this pattern of memoization via a higher order function + recursion as an alternative to dynamic programming; in languages with TCO and immutable data structures it works beautifully :)


I agree that this isn't a limitation of the Platonic ideal of an lru_cache function. I thought we were talking about actual Python code.


It's worth pointing out that Python expands the representation of numbers as needed (ending up at a bignum or similar, I believe). So any stack rewriting would have to accommodate an accumulator that starts as a small integer and expands to arbitrarily many bits. It might be easily handled - I guess all arguments are references to Python objects, and the regular code for expanding numbers could switch out the reference - but the point remains that proper tail call optimization in Python needs to deal with objects as arguments.
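(A quick demonstration that Python 3 ints are arbitrary precision out of the box:)

    n = 1
    for i in range(1, 101):
        n *= i                 # 100! with no overflow
    print(n.bit_length())      # 525 bits; the int object just grows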


This does not seem to me like a big hurdle. Surely this is the sort of thing that should be hidden in the Integer class?


It won't help unless you call it in a specific order, e.g. fib(10_000) may raise RecursionError unless you first run: for n in range(10_000): fib(n)


Right, it's a memoiser.

You side-step some recursion through previously stored results. It works well for a class of algorithms that coincides with quite a large subset of the problems where TCO would help formulate algorithms.

There are still a bunch of limits, because you're caching results, not eliminating call frames.

The first obvious drawback is performance and memory use: All results get stored in a dictionary.


Is that really tail recursion though? It seems like you are making two recursive calls to fib(). I thought tail recursion requires the recursive call to be the single, final call.

Your memoization helps, but it seems you will still run out of stack space if you call it with a big number without a warm-up.


I don’t think the OP is claiming that method is tail recursive, just pointing out that you can get away with using recursion plus an LRU cache.


The hackyness/speed issues aside:

When compiling/transpiling/whatever between languages, I have found that relying on regular procedure calls and TCO is generally a lot simpler than having to force the looping facility of one language into the semantics of another language.

The only one I can actually imagine porting other loops to is the Common Lisp loop macro, but that is probably the most flexible looping facility known to man.

Edit: and oh, cool thing: Racket and Guile have expanding stacks and don't have a recursion limit other than the whole memory of the computer. This is pretty handy when implementing something like map, since you can write a non-tail-recursive procedure so that you don't have to reverse the list at the end.


With regards to stacks that can use all of the memory: Gambit and AFAIK Chicken behave that way, too. This is one of the reasons I chose Scheme over OCaml (and Haskell) over a decade ago when looking for a new language to move to.


Haskell does not have a recursion limit. You can freely use as much memory as you want via recursion. It doesn’t even really have a stack in the traditional sense. The STG machine does use a stack for evaluation, but it’s often completely different from what you might expect if function calls in Haskell corresponded directly to C-style function calls. There is a default limit on the physical stack size, but it’s something like 512MB and you can change it with a command line flag.

https://wiki.haskell.org/Stack_overflow


Well, both Racket and Guile dynamically grow/shrink the stack. Chicken does not: it turns everything into tail calls and copies the stack when it's full, discarding whatever is no longer in scope (simplified).

Gambit seems to also not grow the stack dynamically, but I could be wrong.


First, I'm talking about the stack in Scheme (the high level language), since that's what we are talking about here (you gave map as an example); whether there's a C stack used underneath somewhere only matters in this context if its size is tied to the stack size available to Scheme programs. Also, some might argue that Scheme needs to implement call/cc and hence "can't use a stack" for storing Scheme call frames as that would not be efficient, which is correct if you tie the word "stack" to implementations as a single array only. A singly linked list can also work as a stack[1]. Since Scheme gives first class access to continuations, the "call stack" is sometimes correspondingly called the "continuation stack" instead, which then makes more sense.

[1] https://en.wikipedia.org/wiki/Stack_(abstract_data_type)

Gambit definitely does grow the Scheme continuation stack; if you let it grow infinitely, it increases memory use of the whole process until it swaps or runs into a memory limit set via ulimit -v; in the latter case the Gambit runtime throws an out of memory exception in some situations, or reports an out of memory error and exits the system in others. If the procedure returns, the memory is given back first to the heap and then at some point (if not re-used) to the OS.

With regards to Chicken, as you say, it transforms the code into continuation passing style, allocates every continuation frame first on the C stack and then copies surviving frames into a second zone (it basically uses a generational garbage collector with 2 generations). Each long term continuation frame is essentially allocated on the heap (or whatever it is that the second zone is allocated from). Hence I expect that there is no limit on the size of the continuation stack in Chicken, either.


Even Python doesn't need to have a stack limit - just make sure the C stack is large enough (e.g. using ulimit or pthread_attr_setstacksize) and use `sys.setrecursionlimit(1000000000)`.
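A rough sketch of that approach using the threading module (the 512 MiB stack figure and the depths are arbitrary; size them for your workload):

    import sys
    import threading

    def deep(n):
        return 0 if n == 0 else 1 + deep(n - 1)

    def main():
        sys.setrecursionlimit(1_000_000)
        print(deep(100_000))   # far past the default limit of 1000

    # stack_size must be set before the thread is started
    threading.stack_size(512 * 1024 * 1024)
    t = threading.Thread(target=main)
    t.start()
    t.join()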


Making the C stack large enough doesn't solve it on 32-bit architectures with enough physical RAM, where you can't/don't want to waste address space. And on 64-bit architectures address space isn't a problem, but the memory from a temporarily large stack can't be re-used without swapping the old stack contents out, which is slow.


Yes, you could make the stack larger, or, you could avoid needing to keep a gigantic useless stack in memory with this technique in the first place.


> Racket and Guile have expanding stacks and don't have a recursion limit other than the whole memory of the computer

I'm not familiar with how these two in particular work internally, but this may actually be more a side effect related to the implementation of call/cc than recursion. A popular technique is to truncate the stack when a continuation is captured. So you obviously need a stack that can expand.


It was not by accident, but it might have something to do with the delimited continuations implemented for Guile 2.2.


To add onto the point about expanding stacks: What's especially nice about this feature is that you don't need to tune your algorithms to be tail recursive when they could be expressed more clearly as non-tail recursion. Functions like map would actually be less efficient on average if they were tail recursive, because you would need to re-iterate the list to reverse it.
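To make the trade-off concrete, here's a rough Python sketch using (head, rest) tuples as cons cells, with the tail-call shape written as loops since Python has no TCO:

    def reverse(lst):
        acc = None
        while lst is not None:
            acc, lst = (lst[0], acc), lst[1]
        return acc

    def map_tail(f, lst):
        acc = None
        while lst is not None:                 # the tail-recursive shape, as a loop
            acc, lst = (f(lst[0]), acc), lst[1]
        return reverse(acc)                    # results come out backwards: extra pass

    print(map_tail(str, (1, (2, (3, None)))))  # ('1', ('2', ('3', None)))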


With Guile and Racket, a non-linear reverse! at the end of a map is as fast as doing a non-tail-recursive map.

There are trade-offs for both. The TCO'd map is a lot faster to restore when using continuations, but is not multi-shot continuation safe.


This can also be done using trampolines without using try/catch method: https://github.com/0x65/trampoline



JS fully disabled in this day and age?


I've noticed a shift over the last while in how privacy-protective people are becoming "out-group" and a little weird.

I mean, I personally don't care; I've always been a little weird. But it is funny to see technical preferences as a signaling mechanism.

Funny, that is, until it hits a certain point... http://www.wired.co.uk/article/chinese-government-social-cre...


"Blacklist all by default, whitelist as needed" is how we build most secure systems right? I'll admit it feels strange applying that construct to day to day browsing.


battle is over, privacy lost


Where can I buy your browsing history?


I started using a "Quick Javascript Switcher" extension some years ago to easily opt in for certain pages while having JS disabled by default.

This was one of the best quality-of-life decisions in terms of web browsing I have ever made.

The vast majority of pages that I randomly access (e.g. from Hacker News) are text based and usually work just fine without JS. But the time until I can start reading is much shorter (less jumping around of content), and I don't get the growth hackers' modals shoved down my throat two paragraphs in. The pages I use regularly are usually whitelisted.


Yep, a cheap way to minimize ads, tracking and browser exploits.

It also avoids downloading JS libraries bigger than Quake while on the go.


It's more common than you think.


But still pretty uncommon.


Only on Hacker News :)


Flash fully disabled in this day and age? ;-)

More like "disabled by default," actually. It's mostly ads/tracking, popovers, and other annoyances, and it's easy to selectively turn it back on where you really need it. This approach isn't for the general public yet.


No page runs JavaScript for me until I enable it with NoScript. It's a shame that Firefox Focus on Android doesn't allow extensions or disabling JS; that makes the whole thing pointless.


Opt-in rather than no opt-out?


I fully disable it.


> def tail_factorial(n, accumulator=1):

> if n == 0: return 1

> else: return tail_factorial(n-1, accumulator * n)

The second line should be "if n == 0: return accumulator"


0! == 1

EDIT: Oops. As pointed out below, the code is indeed incorrect, and my comment is irrelevant.


True, but irrelevant. For all values of n > 1, that function will return 1, which is clearly not what the author intended.


Your code is still allocating a new stack frame anyway, so no optimization is happening. You are simply avoiding a stack overflow, which is not the purpose of tail-call optimization.

I'm not sure if there is any advantage when the language/compiler does not provide proper tail-call optimization.


It's a gross exaggeration to say there's no advantage. Who decided that stack frame re-use is "the purpose" of tail-call optimization, while not blowing the stack is not? It seems to me that being able to run the function at all is more important than whether it runs quickly.


> It turns out that most recursive functions can be reworked into the tail-call form.

This statement at the beginning is not entirely correct. A more accurate statement would be that all recursive programs that are _iterative_ (i.e. loops in disguise) can be rewritten in a tail-call form. That is, there must be a single chain of function calls.

Inherently recursive procedures cannot be converted into a tail-call form.
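A rough illustration (assuming a binary tree with left/right attributes): with two recursive calls per frame, neither call is in tail position, so there is no single chain of calls to turn into a loop without adding an explicit stack or continuations.

    from collections import namedtuple
    Node = namedtuple("Node", "left right")

    def tree_size(node):
        if node is None:
            return 0
        # the addition happens after both calls return,
        # so neither call is a tail call
        return 1 + tree_size(node.left) + tree_size(node.right)

    print(tree_size(Node(Node(None, None), None)))  # 2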


A patch that implements TCO in Python with explicit syntax like 'return from f(x)' could likely get accepted, ending these hacks.


Would it? My impression is that Guido is fairly against any such thing occurring [0].

> So let me defend my position (which is that I don't want TRE in the language). If you want a short answer, it's simply unpythonic.

[0] http://neopythonic.blogspot.com.au/2009/04/tail-recursion-el...


His primary concern is with implicit tail recursion.

I tried making such a patch in the past, but got stuck in the muck of trying to update the grammar file in a way that wouldn't complain about ambiguity.

The main thing you get from tail calls vs. loops is the case of mutually recursive functions.


His primary concern seems more to be stack traces.

At the time, an explicit style, with patch, was proposed to python-ideas. [0] It was based around continuation-passing style, and the conclusion reached then by the community was the same: TCO, explicit or not, isn't wanted in Python.

> And that's exactly the point -- the algorithms to which TCO can be applied are precisely the ones that are not typically expressed using recursion in Python. - Greg Ewing [1]

> <sarcasm>Perhaps we should implement "come from" and "go to" while we're at it. Oh, let's not leave out "alter" (for those of you old enough to have used COBOL) as well! </sarcasm> - Gerald Britton [2]

Feel free to try again, maybe things have changed.

To be clear, I wish Python did have a mechanism to express these sorts of problems, but I don't think the Python team themselves want them. This issue has come up more than a few times, and the dev team have never been satisfied that Python really needs it.

[0] https://mail.python.org/pipermail/python-ideas/2009-May/0044...

[1] https://mail.python.org/pipermail/python-ideas/2009-May/0045...

[2] https://mail.python.org/pipermail/python-ideas/2009-May/0045...


It's actually not likely at ALL. Guido van Rossum said[0] on multiple occasions that it's un-pythonic and it won't happen.

Edit: I didn't see shakna's comment.

[0] http://neopythonic.blogspot.de/2009/04/tail-recursion-elimin...


That would be great, especially as it doubles as an annotation/assertion that TCO is both expected and required at that specific point in the code.


I experimented with something similar to this way back[1], but took a slightly different approach - you can replace the reference to the function itself inside the function with a new function[2], one that returns a 'Recurse' object. That way it looks like it's calling the original method but really it's doing your own thing.

1. https://tomforb.es/adding-tail-call-optimization-to-python/

2. https://gist.github.com/orf/41746c53b8eda5b988c5#file-tail_c...
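A simplified sketch of the idea (the linked posts rebind the function's own name behind the scenes; here the marker object is returned explicitly, and all names are illustrative):

    class Recurse:
        """Stands in for 'call me again with these arguments'."""
        def __init__(self, *args, **kwargs):
            self.args, self.kwargs = args, kwargs

    def tail_recursive(f):
        def wrapper(*args, **kwargs):
            while True:                        # the trampoline loop
                result = f(*args, **kwargs)
                if not isinstance(result, Recurse):
                    return result
                args, kwargs = result.args, result.kwargs
        return wrapper

    @tail_recursive
    def factorial(n, acc=1):
        return acc if n == 0 else Recurse(n - 1, acc * n)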


I'm not a pythonista, but this code seems to get rid of the recursion limitation of the interpreter. Does it actually "optimize" things and make the function take constant space as it calls itself?


It takes constant space since it is not even recursive; the decorator makes it a non-recursive function with a loop.

It effectively side-steps the recursion limit in Python. For runs under the limit, it'd be interesting to see whether it's any faster; it trades function call overhead for exception handling overhead.

By the way, the first example, where it has `return 1`, is wrong. It should `return accumulator`. Clicking through the GitHub link, someone already suggested this fix in December.


You can also do this by rewriting functions with a decorator.

https://github.com/Bogdanp/tcopy


  def tail_factorial(n, accumulator=1):
    if n == 0: return 1
    else: return tail_factorial(n-1, accumulator * n)
This just returns 1 every time.


It should be:

  def tail_factorial(n, accumulator=1):
    if n == 0: return accumulator
    else: return tail_factorial(n-1, accumulator * n)


This article and the other comments here are interesting, but some are trying to be a bit too clever. The original article isn't too bad, but one of the other comments suggests re-writing the contents of the function at run time, which I really don't think is a practical suggestion (think about debugging such a thing).

If I wanted to do this in practice, I'd just write the trampoline out explicitly, unless I wanted to do it a huge number of times. Doing it this way only takes a couple of extra lines of code but I think that's worth it for the improvement in explicitness, which is a big help for future maintainers (possibly me!).

    from functools import partial

    def _tail_factorial(n, accumulator):
        if n == 0: 
            return accumulator
        else: 
            return partial(_tail_factorial, n - 1, accumulator * n)

    def factorial(n):
        result = partial(_tail_factorial, n, 1)
        while isinstance(result, partial):
            result = result()
        return result


Tail recursion is a programming idea left over from the LISP era. It's from when iteration constructs were "while" and "for", and there were no "do this to all that stuff" primitives. Python doesn't really need it.


Tail recursion is unrelated to WHILE and FOR.

Scheme also did not just introduce tail recursion, but full tail call optimization.

Python sure doesn't need it; it already has more complex iteration machinery like generators.


Tail calls aren't always just used for some simple iteration. For example, you could have several mutually recursive functions calling each other in tail position. If you wanted to turn that into a loop, you'd have to roll all those functions into a single loop body, which would be made even less elegant by the lack of a goto statement. (TCO essentially turns a call into a goto whenever possible.)
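A concrete case in Python terms:

    def is_even(n):
        return True if n == 0 else is_odd(n - 1)

    def is_odd(n):
        return False if n == 0 else is_even(n - 1)

    # is_even(10) is fine; is_even(10**6) raises RecursionError in CPython,
    # while a language with TCO runs it in constant stack space.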


Lots of languages can express it better though - even without gotos. For example, in Python you can do:

    while some_condition:
        x = one_generator.send(y)    # generator objects, primed with next() first
        y = other_generator.send(x)
where the generators yield values back and forth. No need for goto, no TCO, no magic.

Even in languages like C, a nicer way to express it may be via two explicit state machines rather than going full Duff's device on the problem.


Python's generators are more magic - similar to some kind of COME FROM mechanism.


Weird comparison. COME FROM has no indication on the other side that it will happen; generators are pretty explicit with yield. On the calling side they can be explicit with a next() call.


A generator may have multiple yields. If you call next(), execution "comes from" that call to the most recent yield, based on the current execution context; the yield then waits until execution comes back to it.

The idea of function calls is much simpler - no yield magic necessary.


I used it to play with some functional programming in Python https://github.com/lion137/Functional---Python


> def tail_factorial(n, accumulator=1):

> if n == 0: return 1

> else: return tail_factorial(n-1, accumulator * n)

Does this ever return the accumulator?

[ed: ah, no. I see the first comment on the article is about this bug; it should return accumulator, not 1]


Tail recursion is a bad idea in multicore land. You end up with a one-sided tree structure that can't be processed in parallel.


Interesting use of exceptions.


Indeed, although it's generally a bad idea to misappropriate the exception throwing/handling mechanism for other purposes, as it's probably less well optimised, performance-wise, than other parts of a VM.


Not in Python. Exceptions for flow control are not looked down upon unless it's gratuitous usage; many frameworks do exactly this.


Even the language itself does this: if a generator that is being processed by a for loop returns (rather than yields), the language will raise a StopIteration exception, which the for loop will catch and use as a signal that it should exit.
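Roughly what the for loop does under the hood (a simplified sketch):

    def gen():
        yield 1
        yield 2
        return                    # triggers StopIteration below

    it = iter(gen())
    while True:
        try:
            item = next(it)       # raises StopIteration once gen() returns
        except StopIteration:
            break                 # exactly what 'for item in gen()' does for you
        print(item)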


Yes! The more I dive into general Python libraries, the more I see patterns like `try: import pylib2` / `except ImportError: pylib2 = None`, etc.


This is the same as recur in Clojure. It's not general TCO, though, which is much more powerful.

I do think it's a shame that Python doesn't have general TCO. It's said to be unpythonic because it means there will be two ways to do things. But some things are so easily expressed as a recursion but require considerable thought to be turned into a loop.


The nice thing about recur in Clojure is that it won't even compile if the call isn't in the tail position. I've inadvertently made a code change that moved the recur call out of the tail position and the error became immediately obvious. With TCO you might not even notice until your stack blows up on a deep nesting.


> But some things are so easily expressed as a recursion but require considerable thought to be turned into a loop.

Do you have some examples of problems+solutions where TCO works fine (in a language with TCO), but the manual translation is hard(ish)?

I wonder in part after reading the Julia thread on TCO, and the difficulties with providing guarantees in the general case:

https://github.com/JuliaLang/julia/issues/4964


Usually, I implement state machines with mutually tail recursive functions. Each function represents one state.


Right. The general rewrite would be a loop with a switch, and state functions that return the next state? (I was going to say state functions that call back into a step function, but I guess that'd still build a call stack.)
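Something like this rough sketch, with made-up state functions, e.g. rewriting a mutually recursive even/odd pair:

    def even_state(n):
        return (None, True) if n == 0 else (odd_state, n - 1)

    def odd_state(n):
        return (None, False) if n == 0 else (even_state, n - 1)

    def run(state, payload):
        while state is not None:      # the driver loop replaces tail calls
            state, payload = state(payload)
        return payload

    print(run(even_state, 1_000_000))  # True, with constant stack depth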


> I do think it's a shame that Python doesn't have general TCO. It's said to be unpythonic because it means there will be two ways to do things.

The usual complaint I hear is about stack traces, not “two ways to do things”, which Python rather often provides anyway.



