icaoberg / Data structures - Splay tree

Created Sun, 03 May 2026 00:00:00 +0000 Modified Sun, 03 May 2026 09:37:14 -0400

A splay tree is a self-adjusting binary search tree with one defining behavior: every time you access a node — whether you are searching, inserting, or deleting — that node is moved to the root. This movement is called splaying, and it is what makes the data structure both elegant and practically useful.

Splay trees were introduced by Daniel Sleator and Robert Tarjan in 1985. Unlike AVL trees or red-black trees, they require no extra bookkeeping per node — no height, no color, no balance factor. The structure stays efficient through the splay operation alone.


Definition

A splay tree is a binary search tree (BST) that satisfies the standard BST property — for every node, all keys in the left subtree are smaller and all keys in the right subtree are larger — with the additional invariant that every access moves the accessed node to the root via a sequence of rotations.

There is no explicit balance guarantee. At any point, a splay tree might look completely unbalanced. But the splay operation ensures that frequently accessed nodes migrate toward the root and rarely accessed nodes drift toward the leaves. Over a sequence of operations, this self-adjusting behavior produces amortized O(log n) time per operation.

Key properties

  • No stored balance information — nodes contain only a key, a value, and left/right child pointers
  • Self-adjusting — the structure reorganizes itself on every access
  • Amortized efficient — any sequence of m operations on a tree with n nodes costs O((m + n) log n)
  • Working-set property — recently accessed items are fast to access again

The Splay Operation

Splaying moves a target node x to the root through repeated rotations. The operation applies one of three cases at each step, chosen based on the relationship between x, its parent p, and its grandparent g.

Case 1: Zig (x’s parent is the root)

When p is the root, perform a single rotation: rotate x over p. This is the terminal step — it happens at most once per splay.

    p              x
   / \            / \
  x   C    →    A   p
 / \                / \
A   B              B   C

Case 2: Zig-zig (x and p are both left children, or both right children)

When x and p are on the same side, rotate p over g first, then rotate x over p. This is the key distinction from a naive “rotate up” strategy — it keeps the tree from becoming a degenerate chain on repeated access to the same path.

      g              x
     / \            / \
    p   D          A   p
   / \        →       / \
  x   C              B   g
 / \                     / \
A   B                   C   D

Case 3: Zig-zag (x is a left child and p is a right child, or vice versa)

When x and p are on opposite sides, rotate x over p, then rotate x over g. This is equivalent to a double rotation (same as in AVL trees).

    g              x
   / \            / \
  p   D          p   g
 / \        →   / \ / \
A   x          A  B C  D
   / \
  B   C

The splay terminates when x reaches the root.


Common Operations

To search for key k:

  1. Walk the tree from the root following BST rules — go left if k is smaller, right if larger.
  2. If k is found, splay that node to the root.
  3. If k is not found, splay the last node visited (the node where the search terminated) to the root.

The splay on a failed search is not wasted work — it still moves a nearby node to the root, which improves locality for future accesses.

def search(tree, k):
    node = bst_find(tree.root, k)   # standard BST walk
    tree.root = splay(node)          # move found (or last) node to root
    return tree.root if tree.root.key == k else None

Time complexity: O(log n) amortized.

Insert

To insert key k:

  1. Search for k as above, which splays the closest node to the root.
  2. If k already exists, done.
  3. Otherwise, split the tree at the root:
    • If k < root.key: detach the left subtree. Create a new node with key k, make the old root’s left subtree its left child, and the old root its right child.
    • If k > root.key: symmetric.
def insert(tree, k):
    if tree.root is None:
        tree.root = Node(k)
        return

    tree.root = splay_to_closest(tree.root, k)

    if tree.root.key == k:
        return  # already present

    new_node = Node(k)
    if k < tree.root.key:
        new_node.right = tree.root
        new_node.left = tree.root.left
        tree.root.left = None
    else:
        new_node.left = tree.root
        new_node.right = tree.root.right
        tree.root.right = None

    tree.root = new_node

Time complexity: O(log n) amortized.

Delete

To delete key k:

  1. Splay k to the root. If not found, done.
  2. Remove the root, leaving a left subtree L and a right subtree R.
  3. Splay the maximum element of L to the root of L. After this splay, L’s root has no right child.
  4. Attach R as the right child of L’s root.
def delete(tree, k):
    tree.root = splay(tree.root, k)
    if tree.root.key != k:
        return  # not found

    left = tree.root.left
    right = tree.root.right

    if left is None:
        tree.root = right
    else:
        # splay max of left subtree to left's root
        left = splay(left, k)   # k > all keys in left, so walks to rightmost
        left.right = right
        tree.root = left

Time complexity: O(log n) amortized.

Split

Split the tree into two trees: one containing all keys ≤ k and one containing all keys > k.

  1. Splay k (or the predecessor of k) to the root.
  2. Detach the right child.
def split(tree, k):
    tree.root = splay(tree.root, k)
    if tree.root.key <= k:
        right = tree.root.right
        tree.root.right = None
        return tree, SplayTree(right)
    else:
        left = tree.root.left
        tree.root.left = None
        return SplayTree(left), tree

Time complexity: O(log n) amortized.

Join

Merge two trees L and R where every key in L is smaller than every key in R.

  1. Splay the maximum element of L to L’s root. It will have no right child.
  2. Set L.root.right = R.root.
def join(left_tree, right_tree):
    if left_tree.root is None:
        return right_tree
    # splay the maximum of the left tree to its root
    left_tree.root = splay_max(left_tree.root)
    left_tree.root.right = right_tree.root
    return left_tree

Time complexity: O(log n) amortized.


Worked Example

Starting with an empty splay tree, insert the keys 10, 20, 5, 15 in order, then search for 5.

Insert 10 — tree is empty, 10 becomes the root.

10

Insert 20 — splay closest to 20 (which is 10). Since 20 > 10, new node becomes root with 10 as left child.

  20
 /
10

Insert 5 — splay closest to 5 (which is 10, reached by going left from 20). Since 5 < 10, split: new node 5 gets 10 as right child, nothing on the left.

  5
   \
   20
   /
  10

Insert 15 — search from root (5). 15 > 5, go right to 20. 15 < 20, go left to 10. 15 > 10, terminate at 10. Splay 10 to root via zig-zag rotations. Then insert 15 between 10 and 20.

After splaying 10:

    10
   /  \
  5   20

Insert 15 (15 > 10, split right): 15 becomes root, 10 on left, 20 on right.

    15
   /  \
  10  20
 /
5

Search for 5 — walk: 5 < 15 → go left to 10; 5 < 10 → go left to 5; found. Splay 5 to root.

5’s parent is 10, 10’s parent is 15 — both are left children (zig-zig). Rotate 10 over 15, then rotate 5 over 10:

  5
   \
   10
    \
    15
      \
      20

5 is now the root. A subsequent search for 5 takes O(1).


Time Complexity

Operation Amortized Worst case (single op)
Search O(log n) O(n)
Insert O(log n) O(n)
Delete O(log n) O(n)
Split O(log n) O(n)
Join O(log n) O(n)

The worst-case per-operation cost is O(n), which can happen when accessing nodes in sorted order on a freshly built tree. However, the amortized cost — averaged over any sequence of operations — is O(log n). Tarjan’s potential function analysis formalizes this guarantee.

Splay trees also satisfy a stronger property: if item i was last accessed t(i) operations ago, accessing it again costs O(log t(i)). This is the working-set theorem, and it makes splay trees particularly effective for workloads with temporal locality.


When to Use a Splay Tree

Good fit:

  • Workloads with strong temporal locality — the same items accessed repeatedly in short windows
  • Caches and LRU-like structures where recently accessed items should be fast to retrieve
  • Applications where simplicity of implementation matters — no balance metadata to maintain
  • Situations where split and join are frequent operations

Poor fit:

  • Real-time or hard latency-bounded systems — the O(n) worst case on a single operation can be a problem
  • Uniform random access patterns — splay trees offer no advantage over a static BST when access is random
  • Read-heavy workloads where writes are rare but you still want predictable per-operation bounds — a red-black tree or AVL tree may be preferable

References

  1. Sleator, D. D., & Tarjan, R. E. (1985). Self-Adjusting Binary Search Trees. Journal of the ACM, 32(3), 652–686.

  2. Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press. Chapter 17 (Amortized Analysis).

  3. Sedgewick, R., & Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley. Chapter 3.

  4. Allen, B., & Munro, I. (1978). Self-Organizing Search Trees. Journal of the ACM, 25(4), 526–535.