<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7321273324412229482</id><updated>2011-12-28T15:42:06.350-08:00</updated><title type='text'>Thoughts on Programming</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7321273324412229482.post-5754408362657468752</id><published>2011-11-20T02:54:00.000-08:00</published><updated>2011-11-20T03:40:41.105-08:00</updated><title type='text'>Review of 2011 free Stanford online classes</title><content type='html'>Over the summer, Stanford announced that they would be offering their AI class online for free. 
It made headlines, and a few weeks later they announced that they would be offering their intro to databases class and their machine learning class as well.&lt;br /&gt;&lt;br /&gt;I've been working through the material for all three classes so that I would know whether they were worth recommending to the high school students I work with, and also to satisfy my own personal curiosity about how the online classes would be conducted. Summary: The database and machine learning classes are excellent. Ironically, the AI class is pretty bad, even though it was the poster child for this wave of online offerings.&lt;br /&gt;&lt;br /&gt;The database class is the most accessible. I believe it is a freshman class at Stanford, and I think most CS-oriented high schoolers would do just fine with it. As with many CS classes, it certainly helps to have a strong background in discrete math. Specifically, prior exposure to mathematical logic, set theory, and relations makes it significantly easier to follow the discussions of relational algebra and relational design theory. The videos are fast-paced and interesting. The randomized quizzes that you can take over and over until you get 100% are a brilliant way to empower students to keep working until they have achieved mastery. The online homework system for practicing queries against a live database works quite well, and the exercises cover a nice range of difficulty from easy to hard. The teacher's weekly "screenside chats" and vibrant forum community really make it feel like you're "taking a class" rather than just working through a sterile set of videos and exercises. The material is well organized and is generally posted two to three weeks ahead of time for those who want to get ahead. All in all, it's the best example I've ever seen of what online education can potentially be.&lt;br /&gt;&lt;br /&gt;The machine learning class is of similarly high quality. It shares the same video technology and the same quiz engine. 
The machine learning class also features weekly programming assignments, using the free language Octave. The programming write-ups are very clear, and you can keep submitting your program until you get it perfect. The submission process is very easy. Unfortunately, I won't be able to recommend this class to many high school students. This class takes a very math-centric approach to machine learning, and I think to fully appreciate the material you need to have a certain comfort level with the basics of linear algebra, and it helps to have seen multivariate calculus. I doubt many high school students have that mathematical background.&lt;br /&gt;&lt;br /&gt;Interestingly, just a few months ago, I watched some videos on "iTunes University" of the Stanford machine learning class, taught by the same professor (Andrew Ng). It is instructive to contrast my experience watching those classroom videos with my experience in the online class. The classroom videos tended to be quite long and slow, with the professor scrawling long mathematical derivations on multiple blackboards. Without being able to see and do the related homework and programming assignments, it became difficult to follow the material. In contrast, the online course videos are much more briskly paced (because the instructors know you can pause or rewatch the video if you don't get something), and the assignments do a great job of solidifying the knowledge before moving on to the next topic. It's amazing how much better the overall experience is with the online class than with just watching the videos of the classroom lectures.&lt;br /&gt;&lt;br /&gt;As I said up top, the AI class is astonishingly bad compared to the other two. This is all the more surprising given that it is the one that gained the most widespread attention when it was announced. The website is much more poorly organized than the sites for the other two classes. 
The videos are poor quality - I mean this in both the literal sense (the video image is of a dimly lit piece of paper and the audio is muffled) and the content sense (the pace is much slower, failing to take advantage of the medium's ability to be paused or rewound). The questions interspersed in the video don't seem to be particularly well chosen to solidify knowledge; instead the questions are often just prompts to motivate the next topic -- you're not really expected to know the answer to the question when it is asked. This means that the only means to really solidify the knowledge is the homework quiz. These quizzes are poorly presented (rather than a clearly expressed, written statement, you have to listen to the instructor verbally explain the question) and there is no immediate feedback. Unlike the other two classes, the quiz is not randomized, so there is one set of questions and then you must wait a week to compare your answers against the correct answers (and the mechanism for checking your answers is somewhat clunky). The whole thing seems like the profs weren't ready to go prime time with this class. Three weeks into the class, the classroom forum section was still "coming soon", for example. In fact, when I last looked, they had completely punted on the forum section, and the page just said to "use Reddit" instead. Also, the videos tend to be posted quite late. Honestly, if I had nothing to compare it to, I might think it was okay, but relative to the other two classes, it is mediocre at best.&lt;br /&gt;&lt;br /&gt;If I were to judge the AI class solely in terms of its content, rather than on its presentation, my review wouldn't be any better. The class is really a breadth survey of various topics in AI, with no programming to back it up. Unless you're going to dig in and actually program some of these things, I really don't see the point. 
I believe the actual Stanford version of the class offered a programming component, but it was dropped from the online class for logistical reasons. This is understandable, but it really takes away from the value of the class. One reason the machine learning class is so much better is that its instructors did find a way to incorporate programming assignments.</content><link rel='replies' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/5754408362657468752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://programming-puzzler.blogspot.com/2011/11/review-of-2011-free-stanford-online.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/5754408362657468752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/5754408362657468752'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/2011/11/review-of-2011-free-stanford-online.html' title='Review of 2011 free Stanford online classes'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7321273324412229482.post-292293396258078667</id><published>2010-08-10T22:13:00.000-07:00</published><updated>2010-08-10T22:20:39.162-07:00</updated><title type='text'>Racket vs. 
Clojure</title><content type='html'>I've been asked by several people to explain why I use Clojure for my professional work rather than Racket.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;ABOUT RACKET&lt;/h2&gt;&lt;br /&gt;I have been using Racket (a dialect of Scheme) for several years to teach kids how to program.  Although Racket is a great first language, it's definitely not a "toy language".  In fact, Racket offers a number of interesting features not found in other languages, making it an attractive option for real-world work.  Racket puts into practice state-of-the-art research on macros, continuations, contracts, and interoperation between statically and dynamically typed code.  The integrated Scribble system makes it easy to provide high-quality documentation and/or write literate programs.  It comes with a pleasant, lightweight IDE complete with an integrated debugger and profiler (as well as innovative features such as a specialized macro debugger).&lt;br /&gt;&lt;br /&gt;I'm a fan of functional programming and dynamic typing.  I know how to write and think in Racket from my many years teaching it, so with all these features, it should be a slam dunk for me to use it professionally, right?&lt;br /&gt;&lt;br /&gt;Well, no....&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;IT'S ALL ABOUT THE DATA STRUCTURES&lt;/h2&gt;&lt;br /&gt;I have discovered that for me, the #1 factor that determines my programming productivity is the set of data structures that are built into the language and are easy to work with.  For many years, Python set the standard for me, offering easy syntax to manipulate extensible arrays (called lists in Python), hash tables (called dictionaries in Python), tuples (an immutable collection that can serve as keys in a hash table), and in recent versions of Python, sets (mutable and immutable), heaps, and queues.&lt;br /&gt;&lt;br /&gt;Racket, as a dialect of Scheme, places the greatest importance on singly-linked lists.  
OK, that's a reasonable starting point -- you can do a lot with linked lists.  It also offers a vector, which is an old-fashioned non-extensible array that is fixed in length.  (Who wants fixed-length arrays as a primary data structure any more?  Even the C++ STL offers an extensible vector...)&lt;br /&gt;&lt;br /&gt;Vectors are mutable, which is both a plus and a minus.  On the plus side, mutability allows you to efficiently write certain classes of algorithms that are hard to write with linked lists.  A vector serves a purpose that is different from that of a linked list, so there is value to having both in the language.  The huge minus is that Racket simply isn't oriented towards working conveniently with mutable vectors.  Working with mutable data structures conveniently demands certain kinds of control structures, and certain kinds of syntax.  You can write vector-based algorithms in Racket, but they look verbose and ugly.  Which would you rather read:&lt;br /&gt;a[i]+=3 or (vector-set! a i (+ (vector-ref a i) 3)) ?&lt;br /&gt;But even if you can get past the more verbose syntax, there's still the fundamental issue that all the patterns change when you move from using a list to using a vector.  The way of working with them is so fundamentally different that there is no easy way to change code from using one to using the other.&lt;br /&gt;&lt;br /&gt;Racket goes further than most Scheme implementations in providing built-in data structures.  It also offers, for example, hash tables (and recently sets were added).  But the interface for interacting with hash tables is a total mess.  The literals for expressing hash tables use dotted pairs.  If you want to construct hash tables using the for/hash syntax, you need to use "values".  If you want to iterate through all the key/value pairs of a hash table, it would be nice if there were an easy way to recursively process the sequence of key/value pairs the way you would process a list.  
Unfortunately, Racket provides no built-in lazy list/stream, so you'd need to realize the entire list.  But even if that's what you want to do, Racket doesn't provide a built-in function to give you back the list of keys, values, or pairs in a hash table.  Instead, you're encouraged to iterate through the pairs using an idiosyncratic version of its for construct, using a specific deconstructing pattern match style to capture the sequence of key/value pairs that is used nowhere else in Racket.  (Speaking of for loops, why on earth did they decide to make the parallel for loop the common behavior, and require a longer name (for*) for the more useful nested loop version?)  Put simply, using hash tables in Racket is frequently awkward and filled with idiosyncrasies that are hard to remember.&lt;br /&gt;&lt;br /&gt;There are downloadable libraries that offer an assortment of other data structures, but since these libraries are made by a variety of individuals, and ported from a variety of other Scheme implementations, the interfaces for interacting with those data structures are even more inconsistent than the built-ins, which are already far from ideal.&lt;br /&gt;&lt;br /&gt;I'm sure many programmers can live with the awkwardness of the built-in data structures to get the other cool features that Racket offers, but for me, it's a deal breaker.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;ENTER CLOJURE&lt;/h2&gt;&lt;br /&gt;Clojure gets data structures right.  There's a good assortment of collection types built in: lists, lazy lists, vectors, hash tables, sets, sorted hash tables, sorted sets, and queues. ALL the built-in data structures are persistent/immutable.  That's right, even the *vectors* are persistent.  For my work, persistent vectors are a huge asset, and now that I've experienced them in Clojure, I'm frustrated with any language that doesn't offer a similar data structure (and very few do).  
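&lt;br /&gt;&lt;br /&gt;To make "persistent" concrete for readers coming from Python, here is a minimal sketch that imitates a non-destructive vector update with tuples. The helper name assoc_tuple is hypothetical (borrowed loosely from Clojure's assoc, not from any Python library), and the sketch copies the whole tuple on every update, whereas Clojure's vectors can share much of their internal structure with the original. It only illustrates the semantics: the old version survives the update, and both versions can serve as hash table keys.&lt;br /&gt;&lt;br /&gt;

```python
# Sketch only: assoc_tuple is a hypothetical helper, not a library function.
def assoc_tuple(vec, index, value):
    """Return a copy of vec with position index replaced by value."""
    if index not in range(len(vec)):
        raise IndexError("index out of range")
    # Full O(n) copy; the original tuple is never modified.
    return vec[:index] + (value,) + vec[index + 1:]


original = (1, 2, 3)
updated = assoc_tuple(original, 0, 99)

# Both versions remain usable, and both are hashable dictionary keys.
labels = {original: "before", updated: "after"}
print(original, updated, labels[original])
```

&lt;br /&gt;&lt;br /&gt;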
The consistency of working only with persistent structures is a big deal -- it means you use the exact same patterns and idioms to work with all the structures.  Vectors are just as easy to work with as lists.  Equality is simplified.  Everything can be used as a key in a hash table.&lt;br /&gt;&lt;br /&gt;Data structures in Clojure get a little bit of syntactic support.  Not a tremendous amount, but every little bit helps.  Code is a little easier to read when [1 2 3] stands out as a vector, or {:a 1, :b 2, :c 3} stands out as a hash table.  Lookups are a bit more terse than in Racket -- (v 0) instead of (vector-ref v 0).  Hash tables are sufficiently lightweight in Clojure that you can use them where you'd use Racket's structs defined with define-struct, and then use one consistent lookup syntax rather than type-specific accessors (e.g., (:age person) rather than (person-age person)).  This gets to be more important as you deal with structures within structures, which can quickly get unwieldy in Racket, but is easy enough in Clojure using -&gt; or get-in.  Also, by representing structured data in Clojure as a hash table, you can easily create non-destructive updates of your "objects" with certain fields changed.  Again, this works just as well with nested data.  (Racket structs may offer immutable updates in future versions, but none of the proposals I've seen address the issue of updating nested structured data.)  Furthermore, Clojure's associative update function (assoc) can handle multiple updates in one function call -- contrast (assoc h :a 1 :b 2) with (hash-set (hash-set h 'a 1) 'b 2).&lt;br /&gt;&lt;br /&gt;Even better, the process for iterating through any of these collections is consistent.  All of Clojure's collections can be treated as if they were a list, and you can write algorithms to traverse them using the same pattern of empty?/first/rest that you'd use on a list.  
This means that all the powerful higher-order functions like map/filter/reduce work just as well on a vector as a list.  You can also create a new collection type, and hook into the built-in sequence interface, and all the built-in sequencing functions will automatically work just as well for your collection.&lt;br /&gt;&lt;br /&gt;Although the sequencing functions work on any collection, they generally produce lazy lists, which means you can use good old recursion to solve many of the same problems you'd tackle with for/break or while/break in other languages.  For example, (first (filter even? coll)) will give you the first even number in your collection (whether a list, vector, set, etc.) and it will do so in a space-efficient manner -- it doesn't need to generate an intermediate list of *all* the even numbers in your collection.  Some garbage is generated along the way, but it can be garbage collected immediately and with relatively little overhead.  Clojure also makes it easy to "pour" these lazy sequences into the collection of your choice via into.  Racket's lack of a built-in lazy list makes it difficult to use map/filter/etc. for general processing of collections.  If you use map/filter/etc., you potentially generate a lot of intermediate lists.  You can use a stream library, but it was probably designed for other Scheme dialects with a naming scheme for the API that doesn't match Racket's built-in list functions or integrate well with Racket's other sequencing constructs.  So often you end up writing the function you need from scratch (e.g., find-first-even-number) rather than composing existing building blocks.  In some special cases, you can use one of the new for constructs, like in this case, for/first.&lt;br /&gt;&lt;br /&gt;A polymorphic approach is applied through most of Clojure's design.  assoc works on vectors, hash tables, sorted hash tables, and any other "associative" collection.  And again, you can hook into this with custom collections.  
This is far easier to remember (and more concise to write) than the proliferation of vector-set!, hash-set, etc. you'd find in Racket.  It also makes the various collections more interchangeable in Clojure, making it easier to test different alternatives for performance implications with fewer, more localized changes to one's code.&lt;br /&gt;&lt;br /&gt;Summary:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Clojure provides a full complement of the (immutable!) data structures you need for everyday programming and a bit of syntactic support for making those manipulations more concise and pleasant.&lt;br /&gt;&lt;li&gt; All of the collections are manipulated by a small number of polymorphic functions that are easy to remember and use.&lt;br /&gt;&lt;li&gt; Traversals over all collections are uniformly accomplished by a sequence abstraction that works like a lazy list, which means that Clojure's higher order sequence functions also apply to all collections.&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;CLOJURE'S NOT PERFECT&lt;/h2&gt;&lt;br /&gt;The IDEs available for Clojure all have significant drawbacks.  You can get work done in them, but any of the IDEs will probably be a disappointment relative to what you're used to from other languages (including Racket).&lt;br /&gt;&lt;br /&gt;Debugging is difficult -- every error generates a ridiculously long stack trace that lists 500 Java functions along with (maybe, if you're lucky) the actual Clojure function where things went awry.  Many of Clojure's core functions are written with a philosophy that they make no guarantees about what they do with bad input.  They might error, or they might just return some spurious answer that causes something to blow up far, far away from the true origin of the problem.&lt;br /&gt;&lt;br /&gt;Clojure inherits numerous limitations and idiosyncrasies from Java.  No tail-call optimization, no continuations.  
Methods are not true closures, and can't be passed directly to higher-order functions.  Proliferation of nil and null pointer exceptions.  Slow numeric performance.  Compromises with the way hashing and equality work for certain things to achieve Java compatibility.  Slow startup time.&lt;br /&gt;&lt;br /&gt;Some people love Clojure specifically because it sits on top of Java and gives them access to their favorite Java libraries.  Frankly, I have yet to find a Java library I'd actually want to use.  Something about Java seems to turn every library into an insanely complex explosion of classes, and Java programmers mistakenly seem to think that JavaDoc-produced lists of every single class and method constitute "good documentation".  So for me, the Java interop is more of a nuisance than a help.&lt;br /&gt;&lt;br /&gt;Clojure has a number of cool new ideas, but many of them are unproven, and only time will tell whether they are truly valuable.  Some people get excited about these features, but I feel fairly neutral about them until they are more road-tested.  For example:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Clojure's STM implementation - seems promising, but some reports suggest that under certain contention scenarios, longer transactions never complete because they keep getting preempted by shorter transactions.&lt;br /&gt;&lt;li&gt; agents - if the agent can't keep up with the requests demanded of it, the agent's "mailbox" will eventually exhaust all resources.  
Perhaps this approach is too brittle for real-world development?&lt;br /&gt;&lt;li&gt; vars - provide thread isolation, but interact poorly with the whole lazy sequence paradigm that Clojure is built around.&lt;br /&gt;&lt;li&gt; multimethods - Clojure provides a multimethod system that is far simpler than, say, CLOS, but it requires you to explicitly choose preferences when there are inheritance conflicts, and early reports suggest that this limits extensibility.&lt;br /&gt;&lt;li&gt; protocols - This is an interesting variation on "interfaces", but it's not clear how easy it will be to compose implementations out of partial, default implementations.&lt;br /&gt;&lt;li&gt; transients - Nice idea for speeding up single-threaded use of persistent data structures.  Transients don't respond to all the same interfaces as their persistent counterparts, though, limiting their usefulness.  Transients are already being rethought and are likely to be reworked into something new.&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;So it's hard for me to get excited about these aspects of Clojure when it remains to be seen how well these features will hold up under real-world use.&lt;br /&gt;&lt;br /&gt;I'm sure that for many programmers, Clojure's drawbacks or unproven ideas would be a deal breaker.  We all care about different things.  
But for me, Clojure's clean, coherent design of the API for working with the built-in data structures is so good that, overall, I prefer working in Clojure to working in Racket.</content><link rel='replies' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/292293396258078667/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://programming-puzzler.blogspot.com/2010/08/racket-vs-clojure.html#comment-form' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/292293396258078667'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/292293396258078667'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/2010/08/racket-vs-clojure.html' title='Racket vs. Clojure'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7321273324412229482.post-8355022397555906727</id><published>2010-07-26T19:59:00.000-07:00</published><updated>2010-07-26T20:59:32.273-07:00</updated><title type='text'>Translating Code from Python and Scheme to Clojure</title><content type='html'>When coming to Clojure from another language, it takes a while before you start "thinking in Clojure".  
While ramping up, it helps to understand how to solve a problem in a language you're already familiar with, and then translate the code in some methodical fashion into Clojure.&lt;br /&gt;&lt;br /&gt;This article will look at a simple function, remove-first, and look at how you would implement that function in Python and Scheme, and then how to methodically transform those implementations into Clojure.  In all of these implementations, I'm going to ignore ways to write the function using shortcuts provided by the standard library, and focus on the implementations using standard iteration and/or recursive techniques.  This will provide the clearest example of how the translation process works and can generalize to other types of functions.&lt;br /&gt;&lt;br /&gt;Problem Statement: remove-first takes an item and a collection, and returns a new collection which is identical to the original, except the first instance of item (if any) has been removed from the collection.  If item is not in the collection, the new collection should be identical to the original (since nothing needs to be removed).&lt;br /&gt;&lt;br /&gt;First, let's look at the Python implementation.  Python's primary collection data structure is called a "list", but this is a bit of a misnomer, because in most languages, the term "list" is used to describe some sort of linked or doubly-linked list.  Python's list is nothing of the sort.  Python's list  allows fast (destructive) insertion and removal at the back end of the list, and fast lookup by index.  In most languages, this would be called an extensible array or extensible vector (in Java, it's called an ArrayList).&lt;br /&gt;&lt;br /&gt;The problem statement for remove-first calls for returning a new copy of the collection with the first instance of item removed.  Is this really idiomatic for Python?  
It is certainly possible to write remove-first as a destructive function that actually modifies the original collection by removing the first instance of item.  In fact, such a destructive method is built-in to the list class (list.remove(item)).  But removing from the middle of a Python list is not an especially efficient operation, and Python has a culture of slices, comprehensions, and many other list operations that return fresh copies.  So yes, I think it is reasonable to talk about how to write a non-destructive removal in Python.&lt;br /&gt;&lt;br /&gt;Now remember, for the purposes of this article, we're trying to look at how to translate iterative algorithms, so it's a cheat to use built-in constructs that work around this.&lt;br /&gt;&lt;br /&gt;So this doesn't count, because it uses the built-in destructive removal:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    newColl = coll[:]  # copies the collection&lt;br /&gt;    try:&lt;br /&gt;        newColl.remove(itemToRemove)  # throws an error if item is not present&lt;br /&gt;        return newColl&lt;br /&gt;    except:&lt;br /&gt;        return newColl&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Nor does this, although this is arguably the most idiomatic Python version of removeFirst, because it uses the built in index function which handles the iteration behind the scenes:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    try:&lt;br /&gt;        i = coll.index(itemToRemove)&lt;br /&gt;   
     return coll[:i] + coll[i+1:]&lt;br /&gt;    except:&lt;br /&gt;        return coll[:]&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In Python, the standard practice for writing such a function in an iterative fashion is to create a new list, and then iterate over the items of the initial list, adding the appropriate ones to the new list, and then returning the new list at the end.  This pattern of setting up an accumulator, and then using a for loop to add things to the accumulator, is a common one in Python.  For this particular problem, in English you might say, "I'm going to go through the items, adding them one at a time to the new collection.  If I hit one that matches the item to remove, I skip it and add the rest of the items directly to the new collection."&lt;br /&gt;&lt;br /&gt;But is it better to iterate directly over the items, or is it better to iterate over the indices and access the items through the indices?  When possible, it's preferred to iterate directly over the items, but unfortunately, Python has no good way to express "the rest of the items" once you hit an item that matches the one to remove.  You can only get at "the rest of the items" if you know the index in order to take a slice from there to the end.  
So iterating through items versus iterating through indices yields slightly different strategies for this particular problem.&lt;br /&gt;&lt;br /&gt;If you really wanted to iterate directly over the items, you'd probably need to use a flag to track whether you've already removed the first occurrence of the item:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    newCollection = []&lt;br /&gt;    alreadyRemovedItem = False&lt;br /&gt;    for item in coll:&lt;br /&gt;        if (alreadyRemovedItem):&lt;br /&gt;            # We've already removed the first instance of item&lt;br /&gt;            # so we're just in &amp;quot;copy&amp;quot; mode&lt;br /&gt;            newCollection.append(item)&lt;br /&gt;        else:&lt;br /&gt;            # We need to test whether the item matches itemToRemove&lt;br /&gt;            if (item == itemToRemove):&lt;br /&gt;                # don't copy this item over, but set alreadyRemovedItem flag&lt;br /&gt;                alreadyRemovedItem = True&lt;br /&gt;            else:&lt;br /&gt;                newCollection.append(item)&lt;br /&gt;    return newCollection&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;which could be refactored by reorganizing the if/else branches into the shorter:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    newCollection = []&lt;br /&gt;    alreadyRemovedItem = False&lt;br /&gt;    for item in coll:&lt;br /&gt;        if (alreadyRemovedItem or item != 
itemToRemove):&lt;br /&gt;            newCollection.append(item)&lt;br /&gt;        else:&lt;br /&gt;            alreadyRemovedItem = True&lt;br /&gt;    return newCollection&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Alternatively, if you iterate through the indices, then you can use a list slice to capture the notion of "the rest of the list".&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    newCollection = []&lt;br /&gt;    for index in range(len(coll)):&lt;br /&gt;        item = coll[index]&lt;br /&gt;        if (item == itemToRemove):&lt;br /&gt;            # skip this item and rather than add the rest of the items&lt;br /&gt;            # one by one, we can just add the rest to newCollection&lt;br /&gt;            # all in one step&lt;br /&gt;            newCollection.extend(coll[index+1:])&lt;br /&gt;            # We're done now&lt;br /&gt;            return newCollection&lt;br /&gt;        else:&lt;br /&gt;            newCollection.append(item)&lt;br /&gt;    return newCollection&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Admittedly, this last version is a bit of a cheat by my own definition of avoiding built-ins that bypass iteration, since that's effectively what extend is doing.  
But I don't mind it as much since we at least had to iterate until we found the item to remove, so it sufficiently demonstrates iteration techniques.&lt;br /&gt;&lt;br /&gt;I think both of these versions (iterating through items and iterating through indices) are good examples of common patterns that occur in Python, and are worth knowing how to translate to Clojure.&lt;br /&gt;&lt;br /&gt;Let's begin by translating the version that iterates through items.&lt;br /&gt;&lt;br /&gt;Clojure has a built-in data structure that corresponds very closely to Python's lists.  In Clojure, it is called a vector.  Like Python lists, it allows fast access to an element by index, and allows fast insertion and removal at the back end of the collection.  The main difference is that a Clojure vector, like all built-in Clojure data structures, is persistent and immutable.  That means that any operation that "changes" a Clojure vector actually returns a new vector -- the original remains unchanged.  At first, this might sound wildly inefficient, but altered Clojure vectors can share a lot of internal structure with their source, precisely because of this guarantee of immutability, so it's actually pretty fast.&lt;br /&gt;&lt;br /&gt;But this requires a different way of thinking about algorithms.  We can't just create a new empty vector and destructively add things to it and return it, as we do with Python.  (Well, of course, since Clojure sits on top of Java, you can easily use Clojure to create a Java ArrayList and use exactly the same pattern as Python, but you're here to learn "the Clojure way", right?)&lt;br /&gt;&lt;br /&gt;Perhaps the most naive way to translate the code is to simulate a mutable vector in Clojure by wrapping a vector in some sort of mutable reference type (e.g., an atom).  You'd need to do the same thing to the already-removed-item flag.  The @ sign is then used to look at the current contents of the mutable reference.  
This is very bad form for this kind of algorithm, and not particularly efficient, but it works, and has an almost exact parallel to the Python code:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;; This is bad style, don't do this!!!&lt;br /&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (let [new-collection (atom []),&lt;br /&gt;        already-removed-item (atom false)]&lt;br /&gt;    ; doseq is just like Python's for loop&lt;br /&gt;    (doseq [item coll]&lt;br /&gt;      (if (or @already-removed-item (not= item item-to-remove))&lt;br /&gt;        (swap! new-collection conj item)  ; just like append&lt;br /&gt;        (reset! already-removed-item true)))  ; sets flag to true&lt;br /&gt;    @new-collection))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The better way to tackle this translation is to closely analyze the original, identify any accumulator or variable that changes as you loop, and then thread those through the loop.  I'll explain shortly what I mean by threading values through the loop, but there's one other issue that needs to be considered.  You've already seen Clojure's doseq construct in action as a counterpart to Python's for loop, but it's only relevant for triggering destructive or side-effect-filled actions -- it's not the right tool for the job when trying to build up a persistent Clojure vector.  Clojure has a for construct, but it's not a general looping construct; rather, it corresponds to Python's list comprehensions and generator expressions.&lt;br /&gt;&lt;br /&gt;Clojure really only has one general-purpose looping construct, known as loop/recur.  
In time, you'll be able to see how to go directly from a Python-style for loop to a Clojure loop/recur, but initially, it's far easier to see how to go from a Python-style while loop to a Clojure loop/recur.  So as an intermediary step in translating our Python code to Clojure, let's begin by rewriting the Python for loop into a while loop.  Here is one way to do that rewrite:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;    newCollection = []&lt;br /&gt;    alreadyRemovedItem = False&lt;br /&gt;    collIterator = iter(coll)&lt;br /&gt;    while(True):&lt;br /&gt;        try:&lt;br /&gt;            item = next(collIterator)&lt;br /&gt;            if (alreadyRemovedItem or item != itemToRemove):&lt;br /&gt;                newCollection.append(item)&lt;br /&gt;            else:&lt;br /&gt;                alreadyRemovedItem = True&lt;br /&gt;        except StopIteration:&lt;br /&gt;            # next() raises StopIteration when the end of the list is reached&lt;br /&gt;            # so we're done&lt;br /&gt;            return newCollection&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The advantage of rewriting as a while loop is that it helps us identify a couple of very important things.  First, it lets us see the exit condition.  The exit condition occurs when you've reached the end of the list, at which point newCollection holds the answer and can be returned.  Second, it makes it a tad easier to analyze which things from outside the loop are being updated while inside the loop.  The assignment to item is just creating a local variable within a given iteration of the loop, so that's not really the kind of thing we're looking for.  
But we should take note that newCollection, initialized outside the loop, is destructively extended within the loop, and the flag alreadyRemovedItem can also change from within the loop.  Furthermore, it's now quite obvious that something needs to track the iteration through coll; this is done by collIterator, which is advanced each time through the while loop by the call to next().  So collIterator, newCollection and alreadyRemovedItem are the things we're going to need to thread through our Clojure loop/recur structure.&lt;br /&gt;&lt;br /&gt;Once these things have been identified, we're ready to tackle the Clojure translation.  Iterators in Clojure work quite differently than in Python.  Almost any collection in Clojure can be converted to a "seq" (short for sequence) by calling a function called, you guessed it, seq.  For the moment, go ahead and think of it as an iterator.  The iterator will be nil when the collection is exhausted.  You call first to get the item the iterator is pointing at, and next to (non-destructively) advance the iterator.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;; Start a loop, identifying and initializing the things that will change in the loop&lt;br /&gt;  (loop [coll-iterator (seq coll),&lt;br /&gt;         new-collection [],&lt;br /&gt;         already-removed-item false]&lt;br /&gt;    ; coll-iterator is nil when you're at the end of the collection.&lt;br /&gt;    ; Clojure treats all non-nil, non-false values as true.&lt;br /&gt;    (if coll-iterator&lt;br /&gt;      ; We haven't reached the end of the collection&lt;br /&gt;      (let [item (first coll-iterator)]&lt;br /&gt;        (if (or already-removed-item (not= item item-to-remove))&lt;br /&gt;     
     ; update new-collection and advance coll-iterator.  We do this using recur&lt;br /&gt;          ; to jump back to loop, rebinding coll-iterator to (next coll-iterator),&lt;br /&gt;          ; new-collection to (conj new-collection item), and&lt;br /&gt;          ; leaving already-removed-item unchanged&lt;br /&gt;          (recur (next coll-iterator) (conj new-collection item) already-removed-item)&lt;br /&gt;&lt;br /&gt;          ; else, advance iterator, leave new-collection unchanged,&lt;br /&gt;          ; and set already-removed-item to true&lt;br /&gt;          (recur (next coll-iterator) new-collection true)))&lt;br /&gt;      ; We have reached the end of collection, so new-collection is the answer&lt;br /&gt;      new-collection)))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;      &lt;br /&gt;Note that nothing destructive is happening here; it's all mutation-free.  (next coll-iterator) actually returns a new iterator object, (conj new-collection item) actually creates a new vector with item appended to the end.  Clojure makes these operations cheap, and recur lets us pass the new objects back to the top of the loop and reuse the names given in the loop construct.&lt;br /&gt;&lt;br /&gt;Now I'll let you in on a little secret.  Rather than thinking of (seq coll) as returning an iterator, you can think of it as returning a linked-list-style view of the collection.  Better yet, virtually all sequential-style functions in Clojure call seq implicitly, so that means, for all practical purposes, you can pretend that any Clojure collection is a linked list.  Lists are able to answer three important questions:  are you empty, what is your first element, and what are the rest of your elements?  These correspond to empty?, first, and rest in Clojure.  So just by thinking of our collection as a list, we have an extremely powerful way to iterate through it using recursion.  
We can rewrite the above Clojure code with this in mind:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (loop [coll coll,  ; no explicit call to seq is needed, &lt;br /&gt;                     ; we can reuse the name coll for clarity&lt;br /&gt;         new-collection [],&lt;br /&gt;         already-removed-item false]&lt;br /&gt;    (if (empty? coll)&lt;br /&gt;      new-collection&lt;br /&gt;      (let [item (first coll)]&lt;br /&gt;        (if (or already-removed-item (not= item item-to-remove))&lt;br /&gt;          (recur (rest coll) (conj new-collection item) already-removed-item)&lt;br /&gt;          (recur (rest coll) new-collection true))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This concludes our mechanical translation of the iterate-through-items Python version.  It may look odd to you if you've never seen this kind of looping before.  Some programmers actively prefer this style of looping because it makes it abundantly clear which things are changing each time through the loop, and how, and precisely what the exit condition is and what value you exit with.  In other words, it's arguably more explicit and easier to analyze a loop/recur structure than a for loop that's mucking around with mutable objects located outside the loop.  There's truth to this, but I also sympathize with those who find loop/recur to be less readable.  For loops nest well and allow for some pretty intricate control flow with judicious use of continue and break; complex nested for loops like that can be hard to analyze, but the equivalent loop/recur can be even worse.  
Nevertheless, loop/recur is the way general looping is done in Clojure, so for now, we'll just accept it along with its advantages and disadvantages and move on.&lt;br /&gt;&lt;br /&gt;Now it's time to look at the iterate-through-indices version.  Again, we begin by converting the Python for loop to a Python while loop.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;def removeFirst(itemToRemove, coll):&lt;br /&gt;  newCollection = []&lt;br /&gt;  index = 0&lt;br /&gt;  while (True):&lt;br /&gt;    if (index == len(coll)):&lt;br /&gt;      # We're done now&lt;br /&gt;      return newCollection&lt;br /&gt;    else:&lt;br /&gt;      item = coll[index]&lt;br /&gt;      if (item == itemToRemove):&lt;br /&gt;        # Add the rest of the elements all at once and we're done.&lt;br /&gt;        newCollection.extend(coll[index+1:])&lt;br /&gt;        return newCollection&lt;br /&gt;      else:&lt;br /&gt;        newCollection.append(item)&lt;br /&gt;        index += 1&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This time, we see the things that change while looping are newCollection and index.  
The above code can now be mechanically translated to:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (loop [new-collection [],&lt;br /&gt;         index 0]&lt;br /&gt;    (if (= index (count coll))&lt;br /&gt;      new-collection&lt;br /&gt;      (let [item (coll index)]&lt;br /&gt;        (if (= item item-to-remove)&lt;br /&gt;          ; into is like Python's extend, subvec is like Python's slice&lt;br /&gt;          (into new-collection (subvec coll (inc index)))&lt;br /&gt;          (recur (conj new-collection item) (inc index)))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice that in Clojure code we don't need an explicit return; the value of the last expression evaluated is returned implicitly.  Also note that this version requires coll to be a vector, since it looks up items with (coll index) and slices with subvec.&lt;br /&gt;&lt;br /&gt;In Python we had two basic strategies, iterate by items and iterate by indices, and both had an analog in Clojure.  
But remember, the reason why we needed two strategies in Python was that there was no way to capture the "add the rest of the items to the collection" concept in the version that iterated through items, so we were forced to choose between iterating through items with a flag to go into "copy items until reaching the end of the list" mode, and iterating through indices so we could take a slice.&lt;br /&gt;&lt;br /&gt;But Clojure offers us a way to fuse these two strategies together, because its basic iteration mechanism (the linked-list-style view) DOES make it extremely easy to work with the notion of "the rest of the items" without any index manipulation or slicing.&lt;br /&gt;&lt;br /&gt;Fusing the two Python strategies in Clojure, we get this:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (loop [coll coll,&lt;br /&gt;         new-collection []]&lt;br /&gt;    (if (empty? coll)&lt;br /&gt;      new-collection&lt;br /&gt;      (let [item (first coll)]&lt;br /&gt;        (if (= item item-to-remove)&lt;br /&gt;          ; Extend new-collection with rest of coll and return in one step&lt;br /&gt;          (into new-collection (rest coll))&lt;br /&gt;          (recur (rest coll) (conj new-collection item)))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;There's one downside to this implementation, namely, we're paying a performance penalty for gradually building up this vector by extending an immutable object one item at a time, when we don't care about and will never use the intervening steps between the empty vector and the final new collection.  It's not a huge penalty, but it's real.  Fortunately, Clojure offers a "recipe" to convert such functions.  
Upon entering the loop, you initialize the thing you're building to a transient (somewhat more mutable) vector.  Then you append to it using conj! rather than conj, and finally you convert it back to immutable at the end using persistent!.  The result looks like this:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (loop [coll coll,&lt;br /&gt;         new-collection (transient [])]&lt;br /&gt;    (if (empty? coll)&lt;br /&gt;      (persistent! new-collection)&lt;br /&gt;      (let [item (first coll)]&lt;br /&gt;        (if (= item item-to-remove)&lt;br /&gt;          (into (persistent! new-collection) (rest coll))&lt;br /&gt;          (recur (rest coll) (conj! new-collection item)))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note that the overall shape of the code has not changed; we've just added a few annotations that improve performance.  But remember that this step is optional; the previous version is perfectly fine for most purposes.&lt;br /&gt;&lt;br /&gt;Although we were mainly trying to mimic the Python code, it's worth noting that Clojure code is highly polymorphic.  Whereas the Python code only takes Python lists (and maybe some of the above versions would take strings), this Clojure algorithm will work on any collection (arrays, lists, lazy lists, vectors, sets, maps, strings, etc.) because all have that wonderful property of being viewable as lists.  However, the returned collection is specifically a vector, no matter the input, so depending on the context, the polymorphism of the input may have limited utility.&lt;br /&gt;&lt;br /&gt;Now it's time to move on to Scheme.  
We'll look at a standard Scheme implementation of remove-first, and see how to translate that into Clojure.  These examples have been tested in the Racket dialect of Scheme.&lt;br /&gt;&lt;br /&gt;In Scheme, the most basic, native collection type is the linked list.  The three fundamental operations on a Scheme list are empty?, first, and rest (sound familiar?) and you can also non-destructively add an item to the front of the list with (cons item list).  Adding to the back of a list is a slow operation, so the strategy of building up a new collection by adding to the back is not a particularly desirable one in Scheme.  Instead, the strategy is to use recursion, essentially allowing the call stack to build the sequence of items that need to be added, eventually, to the front of an empty list.  It sounds confusing, but once you have your head wrapped around recursion, it all makes perfect sense (and if you don't understand recursion, head directly to htdp.org).&lt;br /&gt;&lt;br /&gt;In any case, a typical Scheme implementation looks like this:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(define (remove-first item-to-remove coll)&lt;br /&gt; (if (empty? coll)&lt;br /&gt;   empty&lt;br /&gt;   (let ([item (first coll)])&lt;br /&gt;     (if (equal? 
item item-to-remove)&lt;br /&gt;         (rest coll)&lt;br /&gt;         (cons (first coll) (remove-first item-to-remove (rest coll)))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Clojure also has a list collection, and the translation is about as straightforward as it could possibly be, requiring only a couple of syntactic changes:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (if (empty? coll)&lt;br /&gt;    ()  ; literal name for empty list&lt;br /&gt;    (let [item (first coll)]&lt;br /&gt;      (if (= item item-to-remove)&lt;br /&gt;        (rest coll)&lt;br /&gt;        (cons (first coll) (remove-first item-to-remove (rest coll)))))))&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But there's a catch.  Because Scheme relies on this kind of programming style, Scheme implementations are designed so that the stack has rather huge limits, basically being limited by your overall memory rather than some specific call-stack memory limitation.  In other words, if the resulting list is small enough to fit in your computer's memory, then in all likelihood the call stack necessary to process it with recursion will fit as well.  So call stack limitations are mostly a non-issue in Scheme.&lt;br /&gt;&lt;br /&gt;However, Clojure is limited to Java stack limits, so this style of writing will definitely place a limit on the size of collection that can be processed by this function.  Fortunately, there is a rather simple solution.  Clojure offers seamless interoperation between lazy lists and regular lists.  
Lazy lists solve the stack problem by avoiding the recursive step, returning immediately with a list-like object that can be probed on-demand for first and rest information by the consumer.  Further elements will be computed as needed, and will be driven by the consumer's looping process.&lt;br /&gt;&lt;br /&gt;This is accomplished by wrapping a call to lazy-seq around some part of the computation.  There are at least three reasonable places to place the call to lazy-seq.  You can put lazy-seq around the full body of the function.  &lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (lazy-seq (if (empty? coll)&lt;br /&gt;              ()&lt;br /&gt;              (let [item (first coll)]&lt;br /&gt;                (if (= item item-to-remove)&lt;br /&gt;                  (rest coll)&lt;br /&gt;                  (cons (first coll) (remove-first item-to-remove (rest coll))))))))&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;You can place it around the cons.&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (if (empty? 
coll)&lt;br /&gt;    ()&lt;br /&gt;    (let [item (first coll)]&lt;br /&gt;      (if (= item item-to-remove)&lt;br /&gt;        (rest coll)&lt;br /&gt;        (lazy-seq (cons (first coll) (remove-first item-to-remove (rest coll))))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;You can place it around the recursive call to remove-first.&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(defn remove-first [item-to-remove coll]&lt;br /&gt;  (if (empty? coll)&lt;br /&gt;    ()  ; literal name for empty list&lt;br /&gt;    (let [item (first coll)]&lt;br /&gt;      (if (= item item-to-remove)&lt;br /&gt;        (rest coll)&lt;br /&gt;        (cons (first coll) (lazy-seq (remove-first item-to-remove (rest coll))))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Each choice results in slightly different laziness behavior, i.e., when various elements are computed, but the overall semantics of the sequence remains the same and stack overflows will be avoided.  Placing the lazy-seq around the recursive function call will cause remove-first to compute the first element right away, and delay the rest.  Placing the lazy-seq around the full body will prevent any computation until it is asked for by a consumer.  Placing the lazy-seq around the cons results in immediate behavior for the empty-list and removable-item-at-front-of-list cases, and delayed behavior otherwise.&lt;br /&gt;&lt;br /&gt;All are acceptable choices, but preferences vary.  
Placing lazy-seq around the full body is probably the most common style you'll see in Clojure, although I tend to place it where the laziness is actually required (like around the recursive call, or around the cons).&lt;br /&gt;&lt;br /&gt;Converting remove-first so that it returns lazy lists definitely adds some overhead compared to returning a strict list.  However, this overhead pays for itself if you ever end up using just part of the list, because no time is spent generating the parts you don't need.  There's something very refreshing, freeing, and efficient about writing functions that find the first object with some particular property by using the strategy of taking the first item from the list of ALL objects with that particular property, knowing full well that the complete list will never be generated.  Lazy lists can be used as an alternative to many traditional control structures (the above example of taking the first item from a lazy list of objects satisfying a given description is an elegant substitute for something that would require a for-loop-break iteration in a traditional language).  Generally speaking, lazy lists are more useful than strict lists, and for that reason, lazy lists are the norm in Clojure rather than the exception.&lt;br /&gt;&lt;br /&gt;Since Clojure offers both vectors (similar to Python's lists) and lists (similar to Scheme's lists), we have seen that it is possible to convert both algorithmic styles into Clojure.  With Python, the main thing that needed to be dealt with was adapting the algorithm to build an immutable, rather than a mutable, vector.  This was further complicated by the fact that Clojure's general loop/recur looping construct doesn't exactly match up with Python's for loop construct and a conversion process is needed.  But the final result captured the spirit of the Python code well.  
Coming from Scheme was easier, and the only real modification that was necessary was to output a lazy list rather than a strict list.  Both versions can take any collection as an input, but one produces vectors and the other produces lazy lists.  Both implementations are legitimate choices, depending on the desired usage.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7321273324412229482-8355022397555906727?l=programming-puzzler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/8355022397555906727/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://programming-puzzler.blogspot.com/2010/07/translating-code-from-python-and-scheme.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/8355022397555906727'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/8355022397555906727'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/2010/07/translating-code-from-python-and-scheme.html' title='Translating Code from Python and Scheme to Clojure'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7321273324412229482.post-1681475109701823660</id><published>2009-04-21T01:39:00.000-07:00</published><updated>2009-04-21T03:53:56.508-07:00</updated><title type='text'>ADTs in Clojure</title><content type='html'>Today on the Clojure mailing list, a user asked:&lt;br /&gt;"Is the concept of Abstract Data 
Types useful in Clojure?&lt;br /&gt;If yes, how would you implement one?"&lt;br /&gt;&lt;br /&gt;This is an important question because coding to interfaces and abstractions, rather than concrete data, is a key aspect of writing scalable programs.&lt;br /&gt;&lt;br /&gt;My favorite treatment of ADTs is in &lt;a href="http://www.info.ucl.ac.be/~pvr/book.html"&gt;Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi&lt;/a&gt;.  This book shows all the variations of ADTs: stateful and declarative (aka functional), bundled and unbundled, open and secure.  I'm going to walk through a few of these variations in Clojure.&lt;br /&gt;&lt;br /&gt;Let's consider a classic stack ADT.  Since Clojure primarily emphasizes functional programming, I'm going to focus on functional implementations of this ADT, i.e., versions in which pushing and popping a stack are non-destructive and actually return a new stack.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Open unbundled&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Now the whole point of an ADT is to separate the implementation from the interface, but for starters, let's implement a stack ADT using an underlying implementation of a list.  In an "unbundled" implementation, the stack data is completely separate from the functions that operate on it.  It is "open" because we make no attempt to hide the implementation details of the stack.  We are simply going to trust the programmer to use our interface and ignore those details.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(defn stack-new [] nil)&lt;br /&gt;(defn stack-push [s e] (cons e s))&lt;br /&gt;(defn stack-top [s] (first s))&lt;br /&gt;(defn stack-pop [s] (rest s))&lt;br /&gt;(defn stack-empty? [s] (empty? 
s))&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This open, unbundled implementation is probably the simplest way to code an ADT in Clojure.  Because it is the simplest, this is probably the most common kind of coding you'll see in Clojure, and thus many would argue it is the most idiomatic.  But just because it's the easiest to code doesn't mean it's the best.&lt;br /&gt;&lt;br /&gt;The most obvious problem with an open implementation is that it is a leaky abstraction.  In other words, client code can easily see that this stack is implemented as a list.  And it would be all too easy to forget that this implementation is supposed to be hidden, and call a list function on our stack (e.g., count).&lt;br /&gt;&lt;br /&gt;I have seen several comments on the Clojure mailing list suggesting that it is inevitable that any ADT written in Clojure will be inherently leaky.  But that is not the case...&lt;br /&gt;&lt;br /&gt;&lt;h2&gt; Secure unbundled &lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;The idea is to wrap the data structure in something that effectively locks it with a unique key, so it can only be unwrapped with the correct key.  The only functions that know the key are the ones that serve as the interface for the ADT, so nothing else can inspect or tamper with the contents of the data structure.  
For optimum security you'd need to use cryptographic techniques, but for illustration purposes, I've simply used gensym to generate a unique key.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;(let [security-key (gensym),&lt;br /&gt;      wrap (fn [x]&lt;br /&gt;             (fn [k]&lt;br /&gt;               (if (= security-key k) x (throw (new Exception))))),&lt;br /&gt;      unwrap (fn [x] (x security-key))]&lt;br /&gt;&lt;br /&gt;  (defn stack-new [] (wrap nil))&lt;br /&gt;  (defn stack-push [s e] (wrap (cons e (unwrap s))))&lt;br /&gt;  (defn stack-top [s] (first (unwrap s)))&lt;br /&gt;  ;; defined before stack-pop, which calls it&lt;br /&gt;  (defn stack-empty? [s] (empty? (unwrap s)))&lt;br /&gt;  (defn stack-pop [s] (if (stack-empty? s) nil&lt;br /&gt;                          (wrap (rest (unwrap s))))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice how wrap and unwrap are local to these functions.  If you play around with these stack functions, you'll see that the stacks you generate appear to the outside world as functions, and you need to pass them the right key to get at the innards.  In this case, a gensym is relatively easy to forge, but you should get the idea.  The innards are very well protected and there is essentially no way to manipulate the stack other than through the approved interface functions.&lt;br /&gt;&lt;br /&gt;One problem with unbundled ADTs is that they lack polymorphism.  If you have multiple stack implementations, each one will require its own set of interface functions with different names (or the same names in different namespaces).  If you want to write a function that operates over any stack implementation, it will first require you to pass in some sort of map of the interface functions to use.  
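&lt;br /&gt;&lt;br /&gt;As a rough sketch of what that might look like (the names list-stack-ops, vector-stack-ops, push-all, and top-after-pushing are hypothetical, invented only for this illustration, and the open representation is used to keep it short), each implementation supplies a map of its interface functions, and generic code takes that map as an extra argument:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;;; Hypothetical interface maps, one per implementation (illustration only).&lt;br /&gt;(def list-stack-ops&lt;br /&gt;  {:new (fn [] nil)&lt;br /&gt;   :push (fn [s e] (cons e s))&lt;br /&gt;   :top first&lt;br /&gt;   :pop rest&lt;br /&gt;   :empty? empty?})&lt;br /&gt;&lt;br /&gt;(def vector-stack-ops&lt;br /&gt;  {:new (fn [] [])&lt;br /&gt;   :push conj&lt;br /&gt;   :top peek&lt;br /&gt;   :pop pop&lt;br /&gt;   :empty? empty?})&lt;br /&gt;&lt;br /&gt;;; Generic client code: the caller must hand in the matching map of operations.&lt;br /&gt;(defn push-all [ops s coll]&lt;br /&gt;  (reduce (:push ops) s coll))&lt;br /&gt;&lt;br /&gt;(defn top-after-pushing [ops coll]&lt;br /&gt;  ((:top ops) (push-all ops ((:new ops)) coll)))&lt;br /&gt;&lt;br /&gt;(top-after-pushing list-stack-ops [1 2 3])&lt;br /&gt;(top-after-pushing vector-stack-ops [1 2 3])&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;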
Although this technique is relatively common in functional languages (especially in the ML family), it's a fairly clunky way to achieve polymorphism.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt; Secure bundled &lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;The next logical step is to bundle the data along with the functions that know how to operate on it.  This looks something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(let [stack-object (fn stack-object [s]&lt;br /&gt;       (let [push (fn [e] (stack-object (cons e s))),&lt;br /&gt;             top (fn [] (first s)),&lt;br /&gt;             pop (fn [] (if (seq s) (stack-object (rest s)) nil)),&lt;br /&gt;             empty? (fn [] (empty? s))]&lt;br /&gt;         {:push push, :top top, :pop pop, :empty? empty?}))]&lt;br /&gt;  (defn stack-new [] (stack-object nil)))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In this implementation, a stack is an associative map of four interface functions.  The actual data of the stack is hidden by lexical scoping so that only the interface functions can see it.  &lt;br /&gt;&lt;br /&gt;The big problem here is that the syntax for manipulating these bundled stacks is fairly unpleasant.  
For example,&lt;br /&gt;&lt;code&gt;&lt;br /&gt;(def stack1 (stack-new))&lt;br /&gt;(def stack2 ((stack1 :push) 2))&lt;br /&gt;(def stack2-top ((stack2 :top)))&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;You can improve the readability by providing interface functions that look like the unbundled version:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;(defn stack-push [s e] ((s :push) e))&lt;br /&gt;(defn stack-top [s] ((s :top)))&lt;br /&gt;(defn stack-pop [s] ((s :pop)))&lt;br /&gt;(defn stack-empty? [s] ((s :empty?)))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But there's a big difference between this and the unbundled version.  Mainly, we get polymorphism.  If you have two different concrete implementations of the ADT, your client code doesn't care.  When you call stack-push, for example, the function looks up the correct push function for this given implementation in the bundle, i.e., (s :push) and calls &lt;i&gt;that&lt;/i&gt; to do the pushing.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt; Secure bundled, another way &lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;You may have noticed that the secure bundled approach is essentially the way that OO languages provide ADTs.  Since Clojure interoperates with Java, it stands to reason that you should be able to use Java to implement your ADTs.  Yes, you can.&lt;br /&gt;&lt;br /&gt;The first step is to define your interface.  
You can either do this directly in Java, or use Clojure's ability to generate Java interfaces with something like this:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(gen-interface&lt;br /&gt; :name adt.IStack&lt;br /&gt; :methods [[push [Object] adt.IStack]&lt;br /&gt;           [top [] Object]&lt;br /&gt;           [pop [] adt.IStack]&lt;br /&gt;           [empty? [] Boolean]])&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This code would need to be compiled with Clojure's compile function, and that requires a bit of tricky setting up of classpaths and namespaces, but it's doable.  Then, the bundled ADT looks very similar to above:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(let [stack-object (fn stack-object [s]&lt;br /&gt;       (proxy [adt.IStack] []&lt;br /&gt;         (push [e] (stack-object (cons e s)))&lt;br /&gt;         (top [] (first s))&lt;br /&gt;         (pop [] (if (seq s) (stack-object (rest s)) nil))&lt;br /&gt;         (empty? [] (empty? s))))]&lt;br /&gt;  (defn stack-new [] (stack-object nil)))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;You could also make it even more Java-like by using gen-class.  The dot syntax is a bit cleaner than the map-based syntax of the previous version, for example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;(. s top)&lt;br /&gt;(. 
s push 2)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;And as before, you can clean it up even more by creating unbundled interface functions that actually call the bundled versions behind the scenes.&lt;br /&gt;&lt;br /&gt;This is how most of the Clojure core interfaces and data structures are implemented, so in some sense, one could argue that this is the most idiomatic approach of all.  However, I think many Clojurians would prefer to get away from the Java approach if there is a better way.&lt;br /&gt;&lt;br /&gt;And bundled versions are not without their problems.  The CTM book gives a great example of a collection ADT and the challenge of writing a union function on the two collections.  The problem with the bundled version is that bundles must dispatch on one input, so the union function, when written from the perspective of the first collection, doesn't have access to the private parts of the second collection.  A version of the ADT that can see the innards of both inputs might be considerably more efficient.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt; Secure unbundled, revisited &lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;If the primary limitation of bundled ADT implementations is their single-dispatch nature, perhaps there is a way to go back to the secure unbundled version, but leverage Clojure's multimethods to gain a more sophisticated kind of polymorphism.&lt;br /&gt;&lt;br /&gt;In this final, most sophisticated variant, I'm going to go ahead and show two concrete implementations, the list-based one we've been working with as well as a vector-based implementation, so you can see how the two implementations coexist side by side.  &lt;br /&gt;&lt;br /&gt;First, we define the interface as multimethods that dispatch on the type of stack.  Notice how we don't include the constructor stack-new as part of the polymorphic interface.  
We'll need a separate constructor for each concrete implementation.&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(defmulti stack-push (fn [s e] (type s)))&lt;br /&gt;(defmulti stack-top type)&lt;br /&gt;(defmulti stack-pop type)&lt;br /&gt;(defmulti stack-empty? type)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Since we're dispatching on type, we can't quite use the same wrapping representation as before (because functions can't have metadata).  This time, the wrapped representation of the stack will be a map with one field (:wrapped-stack) and metadata with the appropriate type.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;code&gt;&lt;br /&gt;(let [security-key (gensym),&lt;br /&gt;      wrap (fn [x]&lt;br /&gt;            (with-meta&lt;br /&gt;              {:wrapped-stack&lt;br /&gt;               (fn [k]&lt;br /&gt;                 (if (= security-key k) x (throw (new Exception))))}&lt;br /&gt;              {:type ::list-stack})),&lt;br /&gt;      unwrap (fn [x] ((x :wrapped-stack) security-key))]&lt;br /&gt;&lt;br /&gt;  (defn list-stack-new [] (wrap nil))&lt;br /&gt;  (defmethod stack-push ::list-stack [s e] (wrap (cons e (unwrap s))))&lt;br /&gt;  (defmethod stack-top ::list-stack [s] (first (unwrap s)))&lt;br /&gt;  (defmethod stack-pop ::list-stack [s] (if (stack-empty? s) nil&lt;br /&gt;                                            (wrap (rest (unwrap s)))))&lt;br /&gt;  (defmethod stack-empty? ::list-stack [s] (empty? 
(unwrap s))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The vector-based version is almost the same:&lt;br /&gt;&lt;pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%"&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;(let [security-key (gensym),&lt;br /&gt;      wrap (fn [x]&lt;br /&gt;            (with-meta&lt;br /&gt;              {:wrapped-stack&lt;br /&gt;               (fn [k]&lt;br /&gt;                 (if (= security-key k) x (throw (new Exception))))}&lt;br /&gt;              {:type ::vector-stack})),&lt;br /&gt;      unwrap (fn [x] ((x :wrapped-stack) security-key))]&lt;br /&gt;&lt;br /&gt;  (defn vector-stack-new [] (wrap []))&lt;br /&gt;  (defmethod stack-push ::vector-stack [s e] (wrap (conj (unwrap s) e)))&lt;br /&gt;  (defmethod stack-top ::vector-stack [s] (peek (unwrap s)))&lt;br /&gt;  (defmethod stack-pop ::vector-stack [s] (if (stack-empty? s) nil&lt;br /&gt;                                              (wrap (pop (unwrap s)))))&lt;br /&gt;  (defmethod stack-empty? ::vector-stack [s] (empty? (unwrap s))))&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In fact, they are so similar, it seems clear you could write a macro to abstract out some of the commonalities.&lt;br /&gt;&lt;br /&gt;In my mind, this is definitely the most interesting implementation of ADTs.  By using multimethods, there is the potential to implement ADTs that would be rather difficult to implement efficiently in other languages.  Unfortunately, it is also quite clear that this version is considerably more work to write than the naive, open unbundled version we started out with.&lt;br /&gt;&lt;br /&gt;I would very much like to see additional syntactic support for making secure, unbundled ADTs easier to write.  
Something that can simplify or eliminate the need for explicit wrapping and unwrapping would be essential, and of course, a better system for generating security keys than gensym.  It's not clear to me whether this support could be provided entirely by a library, or whether additional constructs in the core would be needed.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt; What about equality? &lt;/h3&gt;&lt;br /&gt;Another big open question in my mind is, "What happens when you try to add equality?"  When you use Clojure's open structures to store your data (like maps), you get equality and hash codes for free.  But how hard would it be to add equality and hash code functionality to these secure implementations of ADTs?  Is it easier with some of these implementation styles than others?  I haven't played around with this aspect of Clojure enough to give an answer yet.  I'd love to hear comments from those who have.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7321273324412229482-1681475109701823660?l=programming-puzzler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/1681475109701823660/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://programming-puzzler.blogspot.com/2009/04/adts-in-clojure.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/1681475109701823660'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/1681475109701823660'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/2009/04/adts-in-clojure.html' title='ADTs in Clojure'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7321273324412229482.post-2301700338629261294</id><published>2009-01-08T01:08:00.000-08:00</published><updated>2009-01-08T02:29:20.667-08:00</updated><title type='text'>Laziness in Clojure – Traps, workarounds, and experimental hacks</title><content type='html'>&lt;span style="font-size:180%;"&gt;The power of laziness&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Consider the following snippet of Clojure code, which is similar to code found in several Clojure blogs, and code found in Clojure's contributed code base:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;(def whole-numbers (iterate inc 0))&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;This defines whole-numbers to be a lazy sequence of all the numbers 0, 1, 2, ….  Laziness is a powerful concept, and it allows us to simulate an infinite sequence.  As you need more numbers, more are generated.&lt;br /&gt;&lt;br /&gt;So for example, let's say you want to find the first whole number that satisfies a given predicate function (let's call it pred).  In an imperative language, you might do something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;i = 0&lt;br /&gt;while not pred(i):&lt;br /&gt;  i += 1&lt;br /&gt;return i&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;On the other hand, with laziness, you can do something like this:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;(first (filter pred whole-numbers))&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Gotcha!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;But there is a subtle problem with the definition of whole-numbers.  Do you see it?  I'll give you a hint:  (iterate inc 0) does in fact produce the lazy sequence of whole numbers.&lt;br /&gt;&lt;br /&gt;The problem is that we gave a name to it.  Huh?  How could that make a difference?  
Yes, as surprising as it may seem, (first (filter pred (iterate inc 0))) works just fine, whereas (first (filter pred whole-numbers)) is likely to run slowly, and may even crash your program at runtime for certain inputs.&lt;br /&gt;&lt;br /&gt;How is this possible?  How could giving something a name break your code?  That's just crazy, right?&lt;br /&gt;&lt;br /&gt;Well, this pitfall stems from a design decision in Clojure that all lazy sequences are cached.  In other words, as Clojure expands the sequence, it automatically caches the values in a linked list, so that they never have to be computed again.  The next time you traverse the sequence, you are just seeing the values that were previously cached.&lt;br /&gt;&lt;br /&gt;Caching has a number of benefits.  If your sequence is very computation-intensive, and you plan to traverse it multiple times, then caching can give a huge performance boost.  If your sequence represents a traversal of some sort of non-persistent data structure,  caching is essential to ensure that repeat calls to first and rest always yield the same result.&lt;br /&gt;&lt;br /&gt;But Clojure caches all lazy sequences, which creates a number of traps for the unwary.  The “unnamed” version of the code works because the garbage collector collects the cached cells as it goes.  (Even though it is somewhat wasteful to cache all these cells and immediately throw them away, Java's collection of short-lived garbage is very fast, and it's not as much of a performance hit as you might expect).  However, when you give a name to the whole-numbers, the garbage collector can't do any collection.  As you traverse the whole-numbers sequence, you'll have a huge performance hit as massive gobs of memory are allocated.  
And eventually, you'll go too far, run out of memory, and your program crashes.&lt;br /&gt;&lt;br /&gt;To see this in action, go compare the following on your Clojure setup:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;(nth (iterate inc 0) 20000000)&lt;br /&gt;(nth whole-numbers 20000000)  &lt;br /&gt;;uses the above def of whole-numbers&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;On my machine, the first example completes in just a few seconds.  The other crashes.&lt;br /&gt;&lt;br /&gt;Sometimes cached lazy sequences are very useful, but in this case, caching just gets in the way.  No one would ever want to cache the whole-numbers sequence.  It's significantly faster to generate the sequence via incrementing every time you need to traverse the sequence, than it is to cache and hold the values.  Furthermore, because the sequence represents an infinite sequence, there's no upper limit to the memory consumption.  If you use a named version of the whole numbers in your program, there's a good chance that eventually, your program will break with certain large enough inputs.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Workarounds&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Programming is all about abstracting out commonalities in code, so it's rather frustrating that you can't give a name to the whole-numbers sequence and need to type (iterate inc 0) everywhere.  This is a fairly short definition, but you can imagine how with a more complex sequence, giving a name might really be essential.&lt;br /&gt;&lt;br /&gt;Fortunately, there is a workaround.  Instead of giving the whole-numbers sequence a name, you give a name to a function that knows how to produce the sequence.  
In other words,&lt;br /&gt;&lt;br /&gt;&lt;code&gt;(defn whole-numbers [] (iterate inc 0))&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Now, you have to change all your uses of whole-numbers to a function call as well:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;(nth (whole-numbers) 20000000)  ;This works!&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The reason this works is that every time you call this function, it produces a brand new sequence of whole-numbers, which is unnamed, so the garbage collector can collect as it goes.&lt;br /&gt;&lt;br /&gt;So when you're writing code in Clojure, every time you create a lazy sequence, you need to ask yourself two questions:&lt;br /&gt;&lt;br /&gt;1.    What will happen to my code if the entire sequence becomes realized in memory?  Is the sequence too big to be accommodated?&lt;br /&gt;2.    Is it cheaper to generate the items in the sequence from scratch each time, than to allocate the memory necessary to cache the items?&lt;br /&gt;&lt;br /&gt;If the answer to either of these questions is yes, then you should avoid naming your sequence, or wrap it in a function.&lt;br /&gt;&lt;br /&gt;If you plan to program in Clojure, you need to be aware of this pitfall, and you must know the workaround.  From what I can tell from the various blogs and posts on Clojure's google group, many new users are falling into this trap and naming potentially large sequences, creating fragile code with a very real danger of failure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Not Satisfied&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are several reasons why I find this workaround unsatisfying.&lt;br /&gt;&lt;br /&gt;1.    It imposes a significant cognitive burden because it not only affects the way you name a sequence (by wrapping it in a function), but it also affects every place you use the sequence.  
In fact, once you have a library filled with a combination of regular lazy sequences, and some of these function-wrapped lazy sequences, then every time you use one of your sequences, you have to remember which kind it is in order to call it correctly.&lt;br /&gt;2.    It's not perfect.  Although wrapping a sequence in a function prevents any global var from pointing at the sequence, it is still possible for some function that manipulates the sequence to accidentally hold onto a reference to some portion of the sequence, causing a memory crash.  For example,&lt;a href="http://groups.google.com/group/clojure/browse_thread/thread/15f0463d96d8f4f0/4e255850cf2d29f5?q=lazy&amp;amp;lnk=ol&amp;amp;"&gt; it was recently pointed out on the google group&lt;/a&gt; that when filtering a large sequence (even if the sequence is unnamed), the filter function, as it is filtering, holds onto a portion of the sequence as it scans ahead to find the next element for the filtered sequence.  If the elements that pass the filtering test are spread out too far, the program crashes.  Several people looked closely at the code, and couldn't figure out why it was crashing.  Eventually, Rich Hickey, the designer of the language, pointed out what was going on.  He plans to write a more complicated version of filter in a future version of Clojure that will avoid this particular problem, but that's not really the point.  The concern here is that even knowing the function-wrapping trick, cached lazy sequences represent a certain kind of danger that is difficult to isolate and understand.  When several Clojure programmers have trouble finding the source of a memory crash in a one-line piece of code, you can imagine how difficult it will be on a large body of code.&lt;br /&gt;3.    If you know for sure in advance that you'd rather not have the sequence be cached, there is currently no way to express that in Clojure.  
This workaround doesn't actually suppress caching, it just makes it so the garbage collector can throw away the cached values right away.  You're still incurring a (small, but measurable) performance penalty from the unnecessary caching.&lt;br /&gt;4.    Even if the workaround worked reliably, this is a pretty big “gotcha” for newcomers to the language.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Is there another way?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A few weeks ago, &lt;a href="http://groups.google.com/group/clojure/msg/23f782085801f784"&gt;I posted about this topic on the Clojure google group&lt;/a&gt;, questioning the wisdom of this design decision to have all lazy sequences cache their values.  I argued that it would be a cleaner design if lazy sequences did not cache their values, unless the programmer explicitly asked for the sequences to be cached.  Basically, my argument revolved around two main points.  First, it's easier to convert an uncached sequence to a cached sequence than vice versa.  Second, if you forget to cache something that should be cached, it's merely a performance optimization problem, but if you forget to uncache something that needs to be uncached, your program can crash.  So, it's safer if the language defaults to uncached.&lt;br /&gt;&lt;br /&gt;Rich Hickey, the designer of Clojure, responded by saying that he had already experimented with uncached lazy sequences, and that such a choice causes a different set of problems, with performance and with code breaking – problems which he found to be even more common than the ones I've raised here.  He encouraged me to try my own experiments, and report back.  
The rest of this blog post goes into more details about the nature of my experiments since that thread.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Four categories of sequences&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Before taking the plunge and doing some experimenting, my first step was to analyze the various use cases of lazy sequences that need to be dealt with.  I found that lazy sequences fall into one of four categories:&lt;br /&gt;1.    A sequence where the function to generate the sequence is very fast to compute, and the function is a “pure” function in the sense that it always generates the same output.  For this use case, you never want to cache, and it becomes especially important if the sequence is large.&lt;br /&gt;2.    A sequence that will only be traversed once.  If you really know for sure in advance that you will only be traversing it once, then you're better off not caching.  Again, this is especially important if the sequence is large.&lt;br /&gt;3.    A sequence where it is slow to compute successive elements, and you'll possibly need to do this more than once.  Caching in this case is important for performance.&lt;br /&gt;4.    A sequence where the function to generate successive elements is not guaranteed to return the same values.  This can come up with Java interop, with I/O, and other sequence views of a non-persistent data structure.  Caching is essential here to impose a sane, consistent view on the data.&lt;br /&gt;&lt;br /&gt;RH seemed especially concerned about how category 4 sequences break without caching.  Now I'll be the first to admit that I haven't written a huge body of work in Clojure, but looking through my code from the past few weeks, I discovered that I didn't have any category 4 sequences in my code.  First, I don't tend to deal with Java interop;  I write entirely using the core Clojure data structures, which are all immutable.  
So any lazy sequence I write is guaranteed to be consistent, even without caching.  I also don't do much with I/O.  I tend to just write functions that I can use interactively from the REPL.&lt;br /&gt;&lt;br /&gt;Now if I were going to create a lazy sequence from an ephemeral source, I would almost certainly be using one of Clojure's built-in functions that do this conversion for you, such as line-seq, resultset-seq, file-seq, iterator-seq, enumeration-seq, etc., rather than using lazy-cons directly.  So as long as those functions return a cached sequence, I pretty much don't have to worry about category 4.  Furthermore, RH has said that he is working on a completely new abstraction (tentatively called streams), that (as I understand it) is a better fit for I/O and other ephemeral sources than the sequence abstraction.  I speculate that once he has developed this new abstraction, the concerns about category 4 sequences will largely go away.  People will generally write “streams” over ephemeral sources, and then convert them to cached lazy sequences with a one-time call to stream-seq.  So as long as stream-seq builds a lazy sequence, category 4 is well supported, and we can analyze the relative merits of cached vs. uncached sequences for categories 1-3 separately.&lt;br /&gt;&lt;br /&gt;Since category 4 sequences aren't really present in my own code base, my experiments mainly revolve around trying to discover what feels natural for categories 1-3.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Experiment #1 – Totally Uncached&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Since RH experimented with uncached laziness in a previous version of Clojure, the code for an uncached lazy sequence builder is right there in the existing Clojure codebase, but the corresponding macro (lazy-seq as opposed to lazy-cons) has been commented out.  
So for my first experiment, I wanted to look at what it would feel like to code in an environment where everything in Clojure built from lazy-cons (except the ones that represent ephemeral “streams”) is uncached.&lt;br /&gt;&lt;br /&gt;To do this, I created a namespace called uncached, in which I copied over most of Clojure's core constructs that use lazy-cons (but not file-seq, enumeration-seq, etc.).  Within this namespace, I modified lazy-cons to create an uncached LazySeq rather than LazyCons.  In other words,&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;(defmacro lazy-cons [first-expr rest-expr]&lt;br /&gt;(list 'new 'clojure.lang.LazySeq&lt;br /&gt;(list `fn (list [] first-expr) (list [(gensym)] rest-expr))))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So anything I import from this namespace will build an uncached sequence.  To make the experiment as extreme as possible, I then went to every bit of Clojure code I've written (which I admit isn't that much, but hey, experiments have to start somewhere), and excluded all these functions from the core, using my uncached imported functions instead.&lt;br /&gt;&lt;br /&gt;I found that by using uncached sequences, my code felt a little zippier from a performance standpoint.  I found that the vast majority of the sequences that I construct are only used for one pass.  Furthermore, most of my sequences are very long (possibly infinite), and can be generated quickly, so the uncached behavior was a great default fit for me.&lt;br /&gt;&lt;br /&gt;One interesting example from my code is a function I wrote that produces a sequence of all the permutations of a given collection.  Now generating the next permutation in the sequence is not exactly a trivial operation.  So caching does in fact speed things up if you're going to traverse the permutation sequence multiple times.  
However, what I discovered is that there's such a huge time hit from allocating the memory for caching purposes the first time through, that you'd have to traverse the permutation sequence at least 20 times to begin to make up for the time lost from caching.  Even so, caching becomes completely impractical once you hit permutations of 10+ items, so I've concluded that a permutations sequence should just be uncached.&lt;br /&gt;&lt;br /&gt;Now at one point, I applied a filter to my permutation sequence, to extract permutations that had a certain property.  This filtered sequence is something that did in fact make sense to cache, provided I intended to use it more than once.  Fortunately, the Clojure API already includes a function called cache-seq which does exactly that.  I found it very easy to get the caching behavior I wanted for this specific case – at the point where I defined the filtered sequence, I wrapped it in a call to cache-seq.  Alternatively, I could have called vec on the sequence to explicitly realize the sequence.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;(def fs (cache-seq (filter pred (permutations (range 10)))))&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So, at least in my own code, the default of not caching sequences worked rather well.  There was one instance where I needed to cache the sequence, and it was easy to accomplish that.  But again, I need to admit that I've only written a small amount of Clojure code (probably no more than 2kloc).  So I can't claim this proves anything.  
I'm providing my &lt;a href="http://www.filepanda.com/file/1pycyhc2ynvb/"&gt;simple uncached library&lt;/a&gt; so that others can also try this very interesting experiment.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Experiment #2 – Take your pick&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If we assume that defaulting to uncached isn't right for everyone, there's still the open question of what it would be like to program in a version of Clojure that offers a choice between cached and uncached versions of lazy-cons and its core sequence functions.  To explore this option, I made use of the same “uncached” library, but rather than excluding the core functions and overriding them with the uncached versions, I just “require”d my uncached library so that both versions of functions were available to me.  So I could call lazy-cons or uncached/lazy-cons, map or uncached/map, for or uncached/for.&lt;br /&gt;&lt;br /&gt;One really nice aspect of Clojure's design is the way that anything that adheres to the sequence interface works just fine with all the sequence functions.  So the amazing thing is that whether you choose to build a cached or uncached sequence, it makes not one bit of difference to the consumers of your sequence.  So once you make your decision as to whether a sequence should be cached or uncached, you can basically just forget about it and everything works seamlessly as you pass that sequence around.&lt;br /&gt;&lt;br /&gt;Despite that, at first it felt like a burden to have to constantly think about whether I needed a cached or uncached version of a sequence.  But then again, I had already been doing similar analysis to avoid getting burned by a memory crash from caching, so really it wasn't much different than before.  The main difference was that now I could really specify that I wanted something uncached, rather than using the function-wrapping workaround.  
Consuming the two types of sequences was now equally easy, and I got a slight performance boost as well.&lt;br /&gt;&lt;br /&gt;I also noticed something rather interesting in the patterns of when I tended to call cached vs. uncached versions of the core functions.&lt;br /&gt;&lt;br /&gt;For some of the functions, I was always calling the uncached versions, namely cycle, repeat, replicate, interleave, interpose, take, take-while, butlast, concat, and lazy-cat.  And as I think about it further, I honestly can't think of any time you'd want a cached version of these functions.  Remember that if the underlying sequences you are operating on are cached, these functions will be equally persistent, so it's really a question of how time-consuming their operations are, and these have very little overhead.  For this reason, I believe that, even if Clojure makes no other changes in its approach to laziness, it would be a simple, non-breaking, but significant improvement to change the above core functions to internally use lazy-seq as opposed to lazy-cons.&lt;br /&gt;&lt;br /&gt;On the other hand, I found that distinct, filter, remove, and drop-while were the most likely to need to be cached.&lt;br /&gt;&lt;br /&gt;If everything cleanly fell into the category of either definitely needing to be cached, or definitely needing to not be cached, things would be simple.  Alas, that is not the case.  For things like map, for, drop, and take-nth, it all totally depends on how complex the functions are (or how big the n is).&lt;br /&gt;&lt;br /&gt;So for those functions, it is very useful to be able to choose cached or uncached.  But this raises the question of what will happen when other programmers start creating sequence-producing functions.  In some cases they'll be able to make an executive decision in advance as to whether the resulting sequence is cached or uncached.  But what about the cases where the consumer will need to be able to make a choice?  
Do we expect the programmer to provide both a cached and an uncached version?&lt;br /&gt;&lt;br /&gt;Contrast this with experiment #1, in which lazy-cons always produces uncached sequences.  With such behavior, the programmer of a new sequence-producing function just uses (uncached) lazy-cons – the consumer knows it will be uncached, and can easily turn it into cached at point of naming, if necessary.&lt;br /&gt;&lt;br /&gt;Summarizing Experiment #2, I'll say that I really liked having added control, and the ability to select cached or uncached sequences, but I just can't imagine how people will easily write libraries that provide both options.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Experiment #3 – Intelligent auto-selection&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Since some of the sequences should clearly be cached, and some clearly not cached, it would be ideal if the borderline cases could be chosen intelligently by the language in order to completely remove the cognitive burden of constantly having to choose.  At first, I thought maybe a scheme would work in which the cached/uncached behavior of the lazy-cons depends on the nature of the thing you're consing onto.  But this isn't really useful.  The desired cached/uncached behavior depends more on the complexity of the delayed function.  After some experimentation, I feel that it is not possible to automate the decision.  So this experiment was definitely a failure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Experiment #4 – Uncaching a cached sequence&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The major problem with Experiment #2 is that it forces library writers to supply two flavors of their sequence-generating functions, which is impractical.  So I tried to get really clever.  For this experiment, I went back to the standard lazy-cons behavior, i.e., caching by default.  But then, I tried to write a macro that would suppress caching for any sequence built with lazy-cons.  
I did this by setting up a global *cached* var that has a root binding of true.  The lazy-cons macro then caches or not, according to the current value of *cached*.  A special uncached macro binds the var to false.  Like this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;(def *cached* true)&lt;br /&gt;&lt;br /&gt;(defmacro lazy-cons&lt;br /&gt;[first-expr rest-expr]&lt;br /&gt;(list 'if '*cached*&lt;br /&gt;    (list 'new 'clojure.lang.LazyCons&lt;br /&gt;      (list `fn (list [] first-expr)&lt;br /&gt;        (list [(gensym)] rest-expr)))&lt;br /&gt;    (list 'new 'clojure.lang.LazySeq&lt;br /&gt;      (list `fn (list [] (list 'binding ['*cached* 'false]&lt;br /&gt;                                first-expr))&lt;br /&gt;        (list [(gensym)] (list 'binding ['*cached* 'false]&lt;br /&gt;                                rest-expr))))))&lt;br /&gt;&lt;br /&gt;(defmacro uncached [&amp;amp; rst]&lt;br /&gt;`(binding [*cached* false]&lt;br /&gt;  ~@rst))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This basically works, in the sense that you can say something like (uncached (map * (iterate inc 0) (iterate inc 1))) and the uncached macro affects all the calls to lazy-cons within map and iterate, so you've forced this thing to be uncached “all the way down”.  But the way my macro works, uncached sequences become extremely slow.  Because bindings aren't captured by the closures, the instruction to rebind *cached* has to be threaded through the delayed closures.  This noticeably hinders the performance of uncached sequences.  If you flipped it around and made uncached the default, then cached sequences would suffer the performance hit.  Is there a better way to write this macro?  If not, I must declare this experiment to be a failure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:180%;"&gt;Conclusions&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I find Clojure's current cache-by-default-with-no-option-for-uncached laziness to be unsatisfying.  
I genuinely hope there is a better solution, and I want to help find it.  Clearly sequences generated from ephemeral stream-like entities must always be cached.  But dealing with the other types of sequences, I find that my experiments with uncached-by-default-with-option-for-cached laziness turned out to be quite pleasant.  This may very well be a function of my own programming niche, so I've provided a &lt;a href="http://www.filepanda.com/file/1pycyhc2ynvb/"&gt;simple uncached library&lt;/a&gt; so others can try to replicate this experiment with their own code.  If more people report success with uncached-by-default, maybe a stronger case can be made for change.&lt;br /&gt;&lt;br /&gt;My other experiments were less successful, although I learned quite a bit from trying them, which is why I reported on those experiments as well.  Most importantly, I gained a deeper understanding of what types of functions tend to produce sequences that need to be cached and which ones tend to produce sequences that should be uncached.  
This suggests that, at a minimum, some of the core library functions would benefit from being changed to produce uncached sequences.&lt;br /&gt;&lt;br /&gt;Perhaps someone else will see a way to turn one of these approaches into something workable, or provide an entirely new solution.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7321273324412229482-2301700338629261294?l=programming-puzzler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://programming-puzzler.blogspot.com/feeds/2301700338629261294/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://programming-puzzler.blogspot.com/2009/01/laziness-in-clojure-traps-workarounds.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/2301700338629261294'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7321273324412229482/posts/default/2301700338629261294'/><link rel='alternate' type='text/html' href='http://programming-puzzler.blogspot.com/2009/01/laziness-in-clojure-traps-workarounds.html' title='Laziness in Clojure – Traps, workarounds, and experimental hacks'/><author><name>Puzzler</name><uri>http://www.blogger.com/profile/05992502488191304160</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry></feed>
