The shortcomings of std::collections::LinkedList

Posted by Rahul Singh on January 10th, 2021 Tagged Rust, LinkedList

A lot has already been written about LinkedLists in Rust and the general consensus is that there isn’t really any way to build one. The LinkedList in the Rust Standard Library is the most well known implementation with several shortcomings. Furthermore it seems like no one has really solved the problem of building a useful linkedlist in rust.

I'll walk through the main issues with std::collections::LinkedList and in a future series of posts I'll describe an implementation that overcomes these.

You can only add or remove elements to the ends

The std::collections::LinkedList provides methods to push elements to the front -push_front() - or the back - push_back()- of the list. You can also remove elements from either end - pop_front() , pop_back().

use std::collections::LinkedList;

let mut ll = LinkedList::new();

ll.push_front(2);
ll.push_front(1);
assert_eq!(ll.len(), 2);

ll.pop_front();
ll.pop_back();

assert_eq!(ll.len(), 0);

However the list doesn't provide any methods to insert, remove or modify the elements in the middle of the list. Its not even possible to traverse the list from the beginning to get to a specific node and then insert a node after that (a basic principle in linkedlists).

These operations would only be possible if every node returned a reference to it when it was pushed to the list but the push_front() and push_back() methods do not return a reference.

use std::collections::LinkedList;
  
// Create a LinkedList with some values
let mut list: LinkedList = LinkedList::new();
list.push_back(10);
list.push_back(20);
list.push_back(30);
list.push_back(40);
list.push_back(50);
    
println!("Original list: {:?}", list);
    
// Error Can't Insert an element between 20 and 30
list.insert_after(20, 25);
  
error[E0599]: no method named `insert_after` found for struct `LinkedList` in the current scope
  --> src/main.rs:18:10
   |
18 |     list.insert_after(20, 25);
   |          ^^^^^^
   |

Even if the API was updated to return references, it still wouldn't work because the borrow checker would make those references invalid the moment the list was mutated. There is no way in safe Rust to return long lived references to elements of the list that survive mutations of the list.

Removing an element is an O(n) operation

The remove() method on the list is designed to remove an element at a given index. The operation unfortunately runs in linear time for the same reasons noted above.

use std::collections::LinkedList;

// Create a LinkedList with some values
let mut list: LinkedList = LinkedList::new();
list.push_back(10);
list.push_back(20);
list.push_back(30);
list.push_back(40);
list.push_back(50);
     
// Remove element at index 2 (value 30)
// This is an O(n) operation
list.remove(2);
    
// Remove last element
// This is an O(n) operation
let last_index = list.len() - 1;
list.remove(last_index);

In other languages such as Java, Go or C++ its easy to design a linkedlist that returns pointers or references to the nodes in the list and those pointers can be used at any time to remove a given element in constant time. The LinkedList in rust allows none of this.

Reversing the List is O(2n) in time and space

Reversing a list can't be done in place so the only way to reverse a list is to remove all the elements and then push them back in reverse order.

let mut list = LinkedList::::new();
    list.push_front(1);
    list.push_front(2);
    list.push_front(3);
    list.push_front(4);
    list.push_front(5);

    let mut tmp_list = LinkedList::::new();
    let mut cur: Option = list.pop_front();
    while cur.is_some() {
        tmp_list.push_front(cur.unwrap());
        cur = list.pop_front();
    }

    cur = tmp_list.pop_front();
    while cur.is_some() {
        list.push_back(cur.unwrap());
        cur = tmp_list.pop_front();
    }

The optimal solution is to reverse the list in place by using two pointers starting from each end and simply swapping elements and decrementing the tail pointer and incrementing the head. That requires the ability to move elements around in the middle of the list which std::collections::LinkedList doesn't support.

Cursors: The flawed solution

The solution to the problem seems to be to use Cursors. The RFC was written in 2018 and accepted in 2019. The idea behind the cursor is that it can be used to traverse the list till a particular node and then used to mutate that node or add elements before or after it.

However traversing the list using a cursor is an O(n) operation and the cursors are immediately invalidated as soon as the list is mutated. This means every operation to update a node in the middle of the list still takes linear time.

// Create a new linked list with some initial values
let mut list: LinkedList = LinkedList::new();
list.push_back(1);
list.push_back(2);
list.push_back(3);
list.push_back(4);
list.push_back(5);

println!("Original list: {:?}", list);

// Create a cursor at the front of the list
let mut cursor = list.cursor_front_mut();

// Now use the cursor to move around the list and update elements
// as needed.

// Move to the second element (index 1)
cursor.move_next();

// Insert a new element before the current position
cursor.insert_before(10);
// Insert a new element after the current position
cursor.insert_after(20);

//Mutating the list invalidates the cursor
list.push_back(7);
// Compile Error - Cursor is now invalid
cursor.move_next();

21 |     let mut cursor = list.cursor_front_mut();
   |                      ---- first mutable borrow occurs here
...
35 |     list.push_back(7);
   |     ^^^^ second mutable borrow occurs here
36 |     //Cursor is now invalid
37 |     cursor.move_next();
   |     ------ first borrow later used here

A Cursor can't survive mutations of the list and thus cannot be used to implement O(1) mutations of the list.

Cursors can't also be used simultaneously (the borrowing rules prevent that) which makes reversing a linkedlist in place with two cursors impossible.

let mut list = LinkedList::::new();
list.push_front(1);
list.push_front(2);
list.push_front(3);
list.push_front(4);
list.push_front(5);

let len = list.len();
// Create a cursor at the front of the list
let mut cursor_front = list.cursor_front_mut();
// Create a cursor at the back of the list.
// COMPILE ERROR! - can't have two cursors simultaneously
let mut cursor_back = list.cursor_back_mut();

for _ in 0..len / 2 {
    //Can't implement a swap method because two 
    //simultaneous cursors are not allowed
    swap(cursor_front, cursor_back);
    cursor_front.move_next();
    cursor_back.move_prev();
}
  
error[E0499]: cannot borrow `list` as mutable more than once at a time
  --> src/main.rs:44:27
   |
10 |     let mut cursor_front = list.cursor_front_mut();
   |                            ---- first mutable borrow occurs here
...
13 |     let mut cursor_back = list.cursor_back_mut();
   |                           ^^^^ second mutable borrow occurs here

Cursors clearly don't really solve the problem.

The need for long lived references

The question that naturally arises is why do we need long lived references at all? Why not simply navigate to the node when you need to?

The reason that long lived references to nodes are useful is that they can then be stored in other data structures - for example in a Map when building an LRUCache. This concept applies to not just LinkedLists but to linked data structures in general. In the case of trees, long lived pointers to nodes in a tree can be used to store pointers to sub-trees that match a particular criteria which in turn enables constant time access to that sub-tree.

The many ways it doesn't work

The Rust book goes through lots of examples so there is a lot of work been done around building LinkedLists.

However there is an elegant solution to this problem.

arrsingh.com

Distributed Systems & Artificial Intelligence