
First of all, there's the issue of asymptotic complexity.

Usage of various data structures determines the asymptotic complexity of whatever you're doing. However, the thing that some people don't understand about asymptotic complexity is that this metric doesn't measure performance, but rather growth.

Usage of a linked list may mean that searches in it will always be O(n). However, depending on the case, this is worse than logarithmic or O(1) alternatives only for large enough values of n, because for small values the constant factor plays a role too. Say, for instance, that you're searching for some value in a linked list of 100 items tops. Traversing it is probably going to be faster than a recursive search, without tail-call optimization, over a tree.
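To make the constant-factor point concrete, here's a minimal sketch (the `find` name is illustrative, not from any library): a plain linear search over an immutable list, which for tiny inputs has very little per-step overhead.

```scala
// Linear search over an immutable linked list: O(n) worst case,
// but for small n the tiny constant factor often wins in practice.
def find(xs: List[Int], target: Int): Boolean = xs match {
  case Nil          => false
  case head :: tail => head == target || find(tail, target)
}

val small = List(7, 3, 9, 1)
println(find(small, 9))  // true
println(find(small, 42)) // false
```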

Priceless is the moment you realise that thinking in terms of asymptotic complexity is the most important thing you can do for performance.

It's rather pointless to worry about things like cache locality if you don't first optimize the algorithms used. For example, a quick-sort is going to be more efficient than a bubble sort for most cases, and a bubble sort is going to be more efficient than a quick-sort for nearly sorted lists, no matter how much branch prediction or cache locality you throw at it.
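The nearly-sorted case hinges on the early-exit trick: a bubble sort that stops when a pass makes no swaps does a single O(n) scan over already-sorted input. A minimal sketch (illustrative, in-place on an `Array`):

```scala
// Bubble sort with early exit: O(n) on already/nearly sorted input,
// which is exactly where it can beat quick-sort's O(n log n).
def bubbleSort(a: Array[Int]): Array[Int] = {
  var swapped = true
  var limit   = a.length - 1
  while (swapped) {
    swapped = false
    var i = 0
    while (i < limit) {
      if (a(i) > a(i + 1)) {
        val tmp = a(i); a(i) = a(i + 1); a(i + 1) = tmp
        swapped = true
      }
      i += 1
    }
    limit -= 1 // the largest remaining element has bubbled to the end
  }
  a
}

println(bubbleSort(Array(1, 2, 4, 3, 5)).mkString(", ")) // 1, 2, 3, 4, 5
```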

For a real-world example, think of databases like MySQL. Insert performance in most databases, MySQL included, deteriorates at an exponential rate, even though most of them are written in hard-core C with every CPU optimization you can think of. This means that at scale, one moment your database server is running fine, and the next moment it's gone. By comparison, with a database where inserts degrade linearly, you can notice problems months in advance.

All one can accomplish with CPU or GPU optimizations is improving the constant factor. This constant factor can be significant indeed, but at large scale it pales in comparison with the speed benefits you get from proper algorithms.

Going further, after you get your algorithms right, which is much easier to do with clean code that uses the right data-structures, you can then easily optimize the underlying implementation of those data-structures. For lists, behind the interface of "push()" and "pop()" or of "queue()" and "dequeue()", you can use arrays instead, or linked lists where the items are arrays, or balanced binary search trees, or freaking Patricia tries, or whatever floats your boat, as long as you maintain the FIFO or LIFO contract.
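A sketch of what "maintain the contract, swap the backend" can look like (the `Fifo`/`TwoListFifo` names are hypothetical, not from any library): callers only ever see `queue()` and `dequeue()`, so the backing structure is free to change.

```scala
// A FIFO contract: callers depend only on queue()/dequeue(), so the
// backing data-structure can be swapped without touching call sites.
trait Fifo[A] {
  def queue(x: A): Fifo[A]
  def dequeue: (A, Fifo[A])
}

// One possible backend: the classic two-list ("banker's") queue,
// giving amortized O(1) for both operations.
final case class TwoListFifo[A](front: List[A], back: List[A]) extends Fifo[A] {
  def queue(x: A): Fifo[A] = TwoListFifo(front, x :: back)
  def dequeue: (A, Fifo[A]) = front match {
    case h :: t => (h, TwoListFifo(t, back))
    case Nil => back.reverse match { // pay the O(n) reversal rarely
      case h :: t => (h, TwoListFifo(t, Nil))
      case Nil    => throw new NoSuchElementException("dequeue on empty queue")
    }
  }
}
```

Any other backend honoring the same trait — an array ring buffer, a tree — could be dropped in without changing client code.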

So that's why the advice is stupid: it's not putting things into context.

In fact, I would tell people - try not to use linked lists, because the notion of head and tail is an imperative concept that leaked into the functional world. Which really means it's not a future-proof concept and you'll have problems optimizing it, because you're still thinking in terms of how, versus what.




>In fact, I would tell people - try not to use linked lists, because the notion of head and tail is an imperative concept that leaked into the functional world.

Citation needed. Haskell's most basic data structure is a singly-linked list. The concept of a linked list seems to fit quite well with functional programming, as adding an element is simply a matter of making a new cell that points back to the unchanged previous list. And linked lists work very well with recursion, as you can process the head, then move on to the rest of the list.
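That "points back to the unchanged previous list" property is structural sharing, and it's easy to observe directly in Scala:

```scala
// Prepending to an immutable list copies nothing: the new cons cell
// simply points at the old list, which remains shared and unchanged.
val xs = List(2, 3, 4)
val ys = 1 :: xs       // O(1): allocates exactly one new cell

println(ys)            // List(1, 2, 3, 4)
println(ys.tail eq xs) // true: the tail IS the original list, by reference
```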

Are you aware of a functional datatype that can store an arbitrary amount of information and is simpler than a linked list?


I got this idea from Guy Steele's keynote at Scala Days, named "What Fortress and Scala can learn from each other". For reference, Guy Steele worked on multiple programming languages, including Scheme, and he can explain it better than me:

http://skillsmatter.com/podcast/scala/scala-days-keynote-301...

The gist of the matter is this: lists preserve insertion order (instead of the elements themselves having a natural order) and can only be accessed sequentially. You cannot easily split a list in two, or into multiple chunks, which means we'll have a hard time automatically parallelizing computations performed on linked lists.

Think of this simple expression:

    someList.filter(_ % 2 == 0).map(_ + 10)

If the underlying data-structure were a balanced binary tree, you could easily split the work across multiple threads. Because you're specifying the "what" instead of the "how", you can let the platform choose the best execution strategy, like splitting the work across threads or offloading it to a GPU. Linked lists, on the other hand, have the "how" encoded in them.
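A tree makes the split point explicit. In this hedged sketch (a hypothetical minimal `Tree` type, not Scala's standard collections), the two recursive calls in `mapTree` touch disjoint subtrees, so a runtime could legally fork them:

```scala
// A balanced tree exposes an obvious split point: its two subtrees.
sealed trait Tree[+A]
case object Leaf extends Tree[Nothing]
final case class Node[A](left: Tree[A], value: A, right: Tree[A]) extends Tree[A]

def mapTree[A, B](t: Tree[A])(f: A => B): Tree[B] = t match {
  case Leaf => Leaf
  case Node(l, v, r) =>
    // These two recursive calls are independent: a smart runtime
    // could run them on separate threads (or offload them) safely.
    Node(mapTree(l)(f), f(v), mapTree(r)(f))
}
```

With a linked list, by contrast, reaching the midpoint already costs O(n), so the sequential "how" is baked into the structure itself.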

Maybe it helps to think of data-structures for what they really are: frozen algorithms.

Guy Steele is working on some pretty cool stuff in Fortress. Check that video out.


That makes sense. Still, I don't think it's fair to say that linked lists are an imperative concept. While they lack one of the most mentioned advantages of functional programming, they still feel 'natural' in functional programming, not like they were borrowed from the imperative world.

Furthermore, I am not aware of another functional data structure which has O(1) append time, which allows us to construct lists element by element easily.


You can have amortized O(1) with trees, for example. Appending elements to the end of a Vector in Scala or Clojure is done in amortized O(1), a Vector being a tree with a high branching factor. And I'm not sure how the immutable Queue is implemented in Scala, but it's also O(1) for queue() and dequeue(), and it can't be a list.
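The Vector case is directly observable in Scala: append with `:+` is effectively constant time, and, being persistent, it leaves the original untouched.

```scala
// Scala's immutable Vector is a 32-way branching tree (a trie),
// so appending at the end is amortized O(1), unlike List append.
val v = Vector(1, 2, 3)
val w = v :+ 4         // amortized O(1); List's :+ would be O(n)

println(w)             // Vector(1, 2, 3, 4)
println(v)             // Vector(1, 2, 3): the original is unchanged
```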

It's true though, Lists are easy to implement, easy to reason about and speaking of frozen algorithms, you can view a linked list as being a frozen tail-recursion :-)


Linked lists are a good fit for functional languages, and arrays aren't a great fit: this can be a major obstacle to writing Haskell code that is both idiomatic and efficient.



