Arpith Siromoney đź’¬

Fibonacci Heaps

Fibonacci heaps are an example of a clever data structure, partly because they maintain a fine balance between being reasonably promising (amortised O(log n)) and yet slightly vague, being little more than a collection of trees. What I’m going to describe here is a bit non-standard but should give you a good idea about how they work.

These heaps are useful as priority queues, so one of the operations required will be adding new nodes, this is easily done by adding a new tree to the collection. It would also be helpful to increase the priority of a node, but if it winds up with a higher priority than its parent, cut it off and add it to the collection of trees. Note that when we want to merge trees in the collection, we can make the lower priority root-node the child of the higher priority root; this ensures the highest priority guys are always on top, which makes it easy to keep track of the “top” of the heap: it’s got to be one of the roots. When removing the top (and letting him enter the temple, or whatever), we’re going to have to deal with his children, so we’ll (as always) add them to the collection. We’ll also merge trees only if they have the same degree. Does this tell us anything about the size of the trees?

It certainly tells us that child[i].degree = i. We can always say we won’t let a parent lose more than one child (how?), which means the degree is never less than i-1. Let’s look at the size of the tree whose root has k children. size(k) can’t be smaller than 2 + sum(size(i-1) for i in range(1, k-1)), that is, 2 + sum(size(i) for i in range(k-2)). This means the size of trees with k-degree-ed roots is no smaller than the k’th Fibonacci number (aha!). And a nice property of Fibonacci numbers is that the k’th number can’t be smaller than the golden ratio raised to the k’th power (remember that F[1] = 2, in our twisted scheme of things).

size(k) >= F[k] >= GOLDEN-RATIO^k

The clever bit is that this bounds the degree of any node in the heap, because it cannot be bigger than the log of its size, so we can think of MAX-DEGREE as log(N) where N is the size of the heap and, of course, the log is to the base GOLDEN-RATIO. This means there are only log many children to deal with when re-computing the new top-of-heap. But before we can talk about performance, I’ll need to describe a marking scheme to make sure that your child[i] never has less than i-1 children. Whenever a child is cut from its parent, we’ll mark the parent, and if the parent was already marked, we’ll cut it (and unmark it) from its parent and add it to the collection (where else), and so on upwards. For the amortised analysis, our potential function will be t + 2m where t is the number of trees and m is the number of marked nodes. Adding a new node increases t by 1, and doesn’t actually cost us very much, so it’s a constant time operation. Extracting the minimum costs us O(degree + t), but since the new t can’t be more than MAX-DEGREE, the amortised cost is O(MAX-DEGREE) which, as we saw, is O(log(N)). Increasing the priority of a node can involve c cascading cuts, but each of them unmarks a node, so the change in potential is c - 2c, and the operation runs in constant amortised time.

That’s about it, really, so a nice thing to end with is a question; how fast can we delete?