Arrays and ArrayLists

The differences between Arrays, ArrayLists, and Linked Lists can be confusing for a beginning programmer, especially if you come from a language like Python which abstracts these concepts away from you. The purpose of this post is to explain Arrays and ArrayLists – we’ll leave Linked Lists for a future post.

These are extremely important concepts if you want to be able to do well in a technical interview or be a good software engineer.

By the end of this post you will be able to answer:

What is the time complexity of removing an item from the end of an ArrayList?

What about inserting an item onto the end?

If you can’t answer the above questions, then this post is for you.


The elements of an Array all live next to each other in memory.

Creating two arrays looks like this:

Array my_birthday = [7, 23, 1996];
Array squares = [4, 9];

Inside your computer’s memory, this looks like:

Arrays and Strings Images (2).jpg

The computer’s memory is a big table of numbers. Given a memory address, our computer can return the item at that address almost instantly.

Arrays and Strings Images (3)

 

We can also store new values at a given location in a similar manner.

Why do we care about how an array is stored in memory? This mental model is key to understanding the properties of arrays intuitively.

Element Accessing

When I type the following in a script, what happens under the hood?

my_birthday[0]; // 7
my_birthday[1]; // 23
my_birthday[2]; // 1996

To get the 0th item in my array, the computer gets the first 32 bits (because each element is a 32 bit integer) after 100 (the starting address of the array). The bits at this location would be 00000000000000000000000000000111 in binary, which is 7 in decimal.
Because each element is a 32 bit integer, I know that the 1st item starts 32 bits after the 0th, at address 132. The 2nd item is at location 100 + 2*32 = 164. If there were 100 items in the array, the 100th item would be location 100 + 100*32 = 3300. Figuring out the location, and thus accessing the 1000th item of an array takes nearly the same time as accessing the 1st.

The time to access an element in an array is independent of its position in the array, or the length of the array.

This is pretty sweet. Even if your array is millions of elements long, retrieving the first item takes the same amount of time as retrieving the last. In further articles, we’ll look at some other data structures that do not have this property.

Element Setting

Lets say we want to make me 4 years older. We can update the value of an array with the following syntax in most languages:

my_birthday[2] = 1992;

Under the hood, this works very similarly as element accessing. First we calculate that we need to change the 32 bits starting at 100 + 2*32 = 164, and we set those bits to the binary representation of the integer “1992.” Like before, the time this takes is independent of the length of the array.

Arrays are really great at setting and getting values, but there is one large restriction that I haven’t yet mentioned.


Arrays are fixed-length.

The number of items in an array is set in stone as soon as it is created. Arrays have no insert or delete method. Lets say I want to add 16 onto my array of squares. I can’t add it directly to the end, because in reality the data in our memory is very tightly packed and there is likely a conflict with some other piece of data. Instead, I need to create a new array with enough room on the end, and then copy each element of the array over to an unallocated section of memory.

Arrays and Strings Images (5)

This will take much longer for an array of 10,000 elements than for an array of 100 elements, because many more elements must be copied over. This is horribly inefficient and no-good. Luckily this is where our friend the ArrayList comes to the rescue.

The ArrayList

Under the hood, an ArrayList is just an Array, except it has methods for inserting and deleting elements. In addition, this insert method works in a way that drastically reduces the amount of times we’ll need to copy the underlying Array to a new location. It works like this:

Whenever you need to add an item onto the end, if there is no previously allocated space, all the elements are copied to a new location and twice as much memory is allocated as before.

Arrays and Strings Images (8).jpg

Most of the time there will be unallocated space on the end, in which case the time cost is next to nothing. Whenever the array needs to be expanded, all of the items are copied over, which is still slow. However, these expansions are fairly rare, so the on average appending onto an ArrayList is very fast. For example, if 100 items were added onto the end of a 4 element array, it would only have to be expanded 5 times, at sizes 4, 8, 16, 32, and 64. The other 95 appends will be practically free.

A Word on Time Complexity

When talking about the running time of algorithms, we often use “big O” notation. Big O means “on the order of”. The time complexity of copying every element from an array of length n is O(n) because the running time is proportional to n, the length of the array.

If the running time of an operation is independent of it’s the length of the array, then we say that it’s O(1). This is the case for accessing an element of an array. When there is extra unallocated space on the end of an ArrayList, appending is O(1). In fact, we say that appending onto an ArrayList is O(1) on average because it needs to be expanded so rarely (Building up an array of length 1 billion requires just 30 expansions).

Deleting

Deleting an element from an ArrayList requires moving over every other element.

Arrays and Strings Images (7)

Deleting the 2nd element of a 5 element ArrayList is much quicker than if the ArrayList were 1000 elements longer, so the time it takes to delete an element is (on average) proportional to the length of the ArrayList, or O(n).


When would you use an ArrayList?

All the time! Whenever you’re gathering data, an ArrayList will usually be your friend. It has all the great fast getting/setting properties of Arrays, except you can also add items to it really fast.

ArrayLists are hugely important in many languages. In Python, the basic List is actually an ArrayList. 

Consider the following:

  1. In what situations is an ArrayList better than an Array?
  2. In what situations is an ArrayList similar to an Array?
  3. Are there any situations in which an ArrayList could be worse than an Array?

In Conclusion

You’ve hopefully learned the following:

  1. The mechanics of getting/setting items from an Array.
  2. A useful memory model for thinking about Arrays and other data structures.
  3. The Basics of Big O Notation.
  4. What ArrayLists are and how they differ from Arrays.

 

See a mistake? Have any questions? Please leave a comment below!

 

Leave a Reply