January 25, 2004

LinkedList vs. ArrayList clarification

My last entry about ArrayList vs. LinkedList performance (when adding only to the end of the list) seemed to be misunderstood by a few people.

In the post's comments, Robert Watkins said "The size of the array counts," and explained why. This isn't true - a major point of my original post was that no matter what the initial size is, and no matter how many items you add to its end, ArrayList is always at least twice as fast than LinkedList.

Robert also said "If you have to build the array from scratch, without knowing the length in advance, AND the list is long (in the millions), the linked list will be faster." This isn't true either - LinkedList's performance compared to ArrayList decreased in my tests as the number of elements increased.

Robert also said "if your major use is to turn the ArrayList into an array, I suggest you check out the toArray() method on the List interface, rather than iterating." If you look at the source code, ArrayList does not explicitly override toArray, so list.toArray calls AbstractCollection.toArray, which calls iterator() and iterates to build the array. Using toArray is still iterating.

Most importantly, my tests were only relevant for situations where you are using a List to add items to the end, and iterate over the list, and nothing more. This kind of use is common for lists of listeners as well as lists being used temporarily to build arrays.

Also, while ArrayList might be twice as fast as LinkedList for append operations, the difference is still only something like a few nanoseconds per list, for small lists (under 50 elements). This means you probably should just use whichever one you think makes your code look nicer.

Posted by keith at January 25, 2004 01:32 PM | TrackBack
Comments

Well, a thousand reasoned opinions doesn't equal one case of jumping in and finding out, so I'm going to defer to you. :)

Some points, though:
1) Appending to a linked list really should be constant time, as should iterating. I don't know why you're seeing a decrease in performance.
2) Appending to an array list is constant time, EXCEPT when the list has to resize. Iterating is always constant time.
3) Given "enough" resize operations, the linked list should eventually exceed the array list. Of course, that "enough" might be a massive amount. :)
4) I never said that the List.toArrays() method didn't iterate, I just meant it makes the code look nicer (one line instead of three).
5) In general, LinkedList is really only preferred when you're changing elements in the middle. For a simple queue or stack, ArrayList is preferred. Of course, your code should code to the interface, so it's really only the construction that cares about this.

Posted by: Robert Watkins at January 25, 2004 05:48 PM

Robert, you're right that appending to LinkedList is constant time. But because ArrayList doubles its size whenever it needs to grow, the amortized cost of appending to ArrayList is also constant. (N insertions will cost O(N) for both lists -- for any value of N.)

The performance decrease on LinkedList happens for several reasons:
1) LinkedList uses more memory.
2) LinkedList allocates memory from all over the heap. Therefore the locality of reference is not going to be that good and less stuff is found from your cache.
3) LinkedList has lots of small nodes that the GC must process, whereas ArrayList has one big continuous array.

Now all these are related to the memory management of your virtual machine. If you populate your LinkedList in a loop using copying collector (with cheap contiguous allocation), your results are going to look very different than if you use mark-and-sweep collector and allocate more data in addition to the list-nodes. (Still the ArrayList is probably going to win.)

Morale? Allocation patterns in benchmarks are so different from real world patterns that it is hard to make good general statements based on benchmarks. Even more so, if you don't know what happens behind the curtains.

Posted by: Juha Komulainen at January 27, 2004 10:47 AM

Jack Shirazi's excellent book "Java Performance Tuning" contains a chapter on LinkedList vs. ArrayList performance comparison. He gives the results of benchmarking for various operations (insertion in the beginning/midpoint/end of the list, internal iteration, iteration using ListIterator), various size of the list, and for six various flavours of JVMs.

Short resume is that ArrayList outperform LinkedList in most cases. I'd recommend you to read this chapter. And the whole book too, it contains a lot of interesting (and sometimes even shocking) stuff.

Posted by: Alexei Barantsev at January 30, 2004 01:57 AM

An ArrayList is faster than a LinkedList when used for only indexing, calling get(i). But a LinkedList out performs an ArrayList when used for modifying the data structure, calling remove(obj), contains(obj), addFirst(obj), addLast(obj).

1. Calling remove(obj) in an ArrayList is slower than calling remove(obj) in a LinkedList.

2. Calling contains(obj) in an ArrayList is slower than calling contains(obj) in a LinkedList.

3. You should try not to call remove(i) in an ArrayList or in a LinkedList. When you call this in an ArrayList, all of the items after the one that you have just deleted have to move down by one. Imagine calling it on an ArrayList of size 50,000! Say you call remove(1), then 49,999 items need to move down by one! Calling it on a LinkedList is even worse, because first the LinkedList must be converter to recognize indexing, then the same thing that happened to the ArrayList happens to the LinkedList.

4. Calling remove(obj) on a LinkedList is very fast, because all the LinkedList needs to do is point the previous node to the next node and the current node is deleted! Whereas calling remove(obj) on an ArrayList, first the list needs to get converter to recognize nodes, etc.

5. Never use any of the xxx(i) methods in a LinkedList. Remember, a LinkedList points to nodes, not to indecies.

6. Never use any of the xxx(obj) methods in an ArrayList, with the exception of a few like add(obj). ArrayLists point to indecies, not to nodes.

Posted by: Tony at April 1, 2004 04:01 PM

So which would be better to implement a circular list a LinkedList or an ArrayList?

Posted by: Pamela Trout at June 10, 2004 05:05 PM

Anyone know where I can find more information?

Posted by: Stacy at August 17, 2004 07:27 AM
Post a comment