Sunday, July 7, 2013

Stackless Clojure with core.async

Let's start off by writing a length function that computes the length of a sequence.

(defn length [l]
  (if (empty? l) 
    0
    (inc (length (rest l)))))

In many dialects of Scheme, this recursive strategy would be the idiomatic way to write the function. However, in Clojure, this gives us a stack overflow:

=> (time (length (range 20000)))
StackOverflowError   clojure.lang.ChunkedCons.more (ChunkedCons.java:48)

Clojure's stack limitation is something I often wrestle with. One idiomatic way to translate this code to Clojure is to write the function accumulator-style, using loop and recur to avoid stack consumption.

(defn accumulator-style-length [l]
  (loop [size 0
         l l]
    (if (empty? l) size
      (recur (inc size) (rest l)))))

But sometimes the translation to accumulator-style is not so straightforward, and I find myself wishing for a way to eliminate stack consumption while preserving the flow of the original code.

When I first read the announcement of core.async, I was struck by the comment that behind the scenes, it is converting go blocks into a combination of state machines and continuations. It occurred to me that this had a lot of similarity to the kind of continuation-passing transform one can do to turn stack-consuming code into tail-recursive code.

Just for fun, I wanted to see if core.async could be used to preserve the shape of a recursive function, while removing the stack limitation. Here's what I came up with:

(use 'clojure.core.async)

(defn go-length [l]
  (go (if (empty? l)
        0
        (inc (<! (go-length (rest l)))))))

(defn length [l]
  (<!! (go-length l)))

Notice that go-length follows the shape of the original recursive length function. We've wrapped the recursive call in <! and wrapped the entire body of the function in a go block. This function alone, however, will not produce the result we need. So we use this as a helper function to the length function which simply extracts the value from the call to go-length.

=> (time (length (range 20000)))
"Elapsed time: 54.570256 msecs"
20000

Hallelujah! It works. No stack overflow. Let's compare against the accumulator version:

=> (time (accumulator-style-length (range 20000)))
"Elapsed time: 1.500303 msecs"
20000

OK, so we're not going to win any performance awards with our core.async version of length, which means that this little stunt will probably remain more in the realm of "nifty little trick" than "practical way to code recursive functions", but I still think it's very cool that it works.

3 comments:

  1. "In many dialects of Scheme, this recursive strategy would be the idiomatic way to write the function. However, in Clojure, this gives us a stack overflow:"

    Really? I'd assume the idiomatic way to write this in scheme would be the tail recursive way, with a local function featuring an accumulator.

    ReplyDelete
    Replies
    1. Yes, really. Modern dialects of Scheme, like Racket, use clever tricks so that the size of the stack is limited only by your overall memory, so there's little reason not to use the straightforward non-tail-position recursion. And in fact, somewhat counterintuitively, the non-tail-position recursion implementation can be faster than the tail-position version.

      Delete
  2. Well, the revised report on scheme *requires* a scheme implementation to perform tail-call optimization.

    It does NOT require a scheme implementation to do clever trampoline/CPS transformations on generally recursive functions.

    Thus, the *idiomatic scheme way* must not rely on such implementation-defined capabilities.

    The idiomatic racket way might very well be what you describe.

    The idiomatic scheme (and lisp) way has to be to craft your recursions with tail-calls.

    ReplyDelete