Description
Last episode, we learned about AVL trees, a type of self-balancing binary search tree that follows a golden rule: no single leaf in the tree should have a significantly longer path from the root node than any other leaf on the tree. In this episode, we learn about a pattern that we can use to programmatically figure out the minimum number of nodes we’ll need to create any given height-balanced AVL tree, which leads us to the Fibonacci sequence, and relates to the "golden ratio" you might know about from fine art! Trust us, this is really neat stuff.
Show Notes
This episode of the Basecs podcast is based on Vaidehi's blog post, Finding Fibonacci In Golden Trees from her basecs blog series.
Transcript
[00:00:02] SY: Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.
[00:00:08] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:11] SY: And she is the brilliant mind behind the Base.cs Podcast series. Today, we’re talking about…
[00:00:15] VJ: AVL trees again.
[00:00:19] SY: So let’s do a recap of AVL trees. What are they?
[00:00:22] VJ: So AVL trees, otherwise known to me as smarty pants trees, are a kind of tree that is self-balancing. We’ll remember that tree structures have a special characteristic where if they’re not balanced, you lose the ability to search through them and know that everything will be sort of equally distributed. AVL trees are sort of a solution to that problem, the problem of unbalanced trees. And the reason that they’re smarty pants trees is because when you deal with an AVL tree, whether it’s like inserting a new element or removing an element, the tree inherently will reorganize itself through rotations in order to balance itself. That’s kind of the short version of what an AVL tree is.
[00:01:08] SY: And there is a relationship between the height and the number of nodes, right? Like the height can’t be longer than something or shorter than something, something like that. Is that right?
[00:01:19] VJ: Yeah, that’s true. What you’re thinking of is one of the rules of what makes a tree balanced. In our last episode, we talked about this, where a tree is balanced if there is up to a difference of one level. That means you can add a node as long as the difference in the right subtree and left subtree is not more than one level. However, if you exceed that, now the tree is unbalanced so you have to keep in mind the height of each subtree when you’re adding a node because if you add a node and now one subtree is greater than the other by a height level of more than one, it’s unbalanced.
[00:01:59] SY: So is there a way to quantify that relationship, the relationship between the number of nodes and the tree height?
[00:02:07] VJ: There sort of is, yeah. There’s an interesting pattern that sort of emerges when you look at the height of the tree, of any AVL tree, and the number of nodes that you need to create a tree of that height. What I mean by this is if you’re going to add nodes to an AVL tree, you want to make sure that if you’re, for example, creating a tree with a height of three, which means three levels, you want to make sure you’re not breaking the rule we just talked about, right, the levels rule. So you need to know the minimum number of nodes that you need to create that tree.
[00:02:45] SY: Okay.
[00:02:46] VJ: Because if you don’t have that minimum number of nodes, you actually can’t create a balanced tree of that height.
[00:02:50] SY: Okay.
[00:02:50] VJ: So there’s a relationship between the height of the tree and the minimum number of nodes you need to create that tree of that height.
[00:02:58] SY: Okay, that makes sense. So for example, if we have a tree height of one, which means we basically only have a root node, the minimum number of nodes is, well, it’s just one, right?
[00:03:08] VJ: Yeah, exactly.
[00:03:09] SY: So if I want to make a tree with a height of two, then I would have my root note and then I would have my left child and then I’m kind of done. I have my two levels. So the minimum number of nodes I need is two.
[00:03:22] VJ: Exactly. I do want to point out you could have had three and that still would have been an AVL tree that’s balanced because you have a root node on the left child and the right child. In most of our examples, we seem to do root node, left child, right child. So we use three nodes, but I just want to remind ourselves, we’re not talking about how many nodes we can use. We’re talking about the minimum and at a minimum number, you need two because you can have a root node, and you can have a child. The difference is one and that’s okay. And that’s the minimum you need to create a tree with two levels, a tree with a height of two and that’s still fine. We’re just talking about that minimum relationship. We could always add more and it will still work.
[00:04:03] SY: Okay, let’s keep going. So if we have a tree height of three, then we have our root note. We have our left child. To keep it balanced, we have to add our right child and then we can add a left grandchild and that gets us to our third level. So we need a total of one, two, three, four nodes.
[00:04:25] VJ: Exactly. Shall we try to do four?
[00:04:29] SY: Yeah, let’s try to do four. Okay, I think it’s getting a little complicated. Let’s see if we can do this one. Okay, so we have our root node. We have our left child, right child. On our left child, we have first grandchild on the left, the second grandchild on our left. Now, we’ve kind of filled up that subtree or that node. Then we move on to our root’s right child. And then that has two children of its own, this left child and right child. Now we go all the way over to our left grandchild and that grandchild has a left child of its own. That is how we get to our fourth level. We just started our fourth level.
[00:05:06] VJ: Yes. I do want to point out one thing, which is that you actually don’t need two children on the right child of the root node. So you were saying you have the root node. We have the left child, right child. The left child has its own two children, which are the grandchildren. The right child just needs actually one node because, remember, we want the level to be no more than one and the level that we’re talking about, we have the root note is level one, the two children is level two, the grandchildren is level three, and then you’re going to add one more great-grandchild to the left child of the left child.
That just means if you want to add a great-grandchild, you need to make sure that the right subtree has something in level three before you add to level four, right? So you don’t need to have a second child for the right child. You just need to have one and the reason I want to poke at that is because we’re talking about minimum. I totally understand why, inherently, you want to add more, you want to fill it up, but you actually don’t have to in order to make it balanced. You just need the one left child for that right subtree.
[00:06:16] SY: Okay. So if we add all those up, we have our one root node, we have our left child. We have our right child so I’m just going to count one, two, three, four, five, six, which is on level three and then we finally get to level four, which is seven.
[00:06:29] VJ: Exactly. So we’re talking about an AVL tree that has a height of four because we have four levels and the minimum number of nodes you need to create that tree is seven nodes.
[00:06:42] SY: You mentioned that there is a pattern which isn’t super obvious to me. So we started with a height of one, a minimum node of one, a height of two, a minimum node of two, a height of three, a minimum node of four, which was kind of a different pattern. A height of four, minimum node of seven, so, I’m not really seeing like a clear pattern here. What’s the pattern that you were talking about?
[00:07:03] VJ: Let’s look at the tree we just did. The tree we just did was a height of four with seven nodes. And so for now I kind of want to do something weird. I want to sort of dismember our tree for a second. Let’s ignore the root node.
[00:07:16] SY: Uh-oh, sounds painful.
[00:07:17] VJ: I know. Let’s look at just the left subtree and let’s just imagine what that actually looks like without anything else. So to recap, we have the root node, which we’re going to ignore and then we’re just going to look at the left subtree. Remember that we talked about the left subtree being one node which is now our root node, because we’ve dismembered this tree.
[00:07:42] SY: Ignoring the real root node.
[00:07:43] VJ: Exactly, but this node was the left child of the original root, but we have one node at the top of the root node. It has a left child and a right child, and its left child has a left child. It’s still balanced because the difference in levels is no more than one. We filled up the children level, level two, and now we’ve added a node to level three. So the total number of nodes here is one for the root, two, three for the children and four for the one grandchild, so just the left subtree has four nodes. Now, let’s look at the other side of our dismembered tree, which is the right subtree. This has just two nodes. It has one parent node and one child node, still balanced. And the number of nodes here is two, one parent, one child, two levels. The number of nodes we’re talking about in our right subtree is two.
Now, if we look at the left subtree and the right subtree together, even though they’re still dismembered, these two trees are actually things we’ve seen before. The left subtree is exactly the minimum number of nodes you need to make a tree with a height of three and the right subtree is exactly the number of minimum nodes you need to create a tree with a height of two. So to create a tree with a height of four, you need a minimum balanced tree with a height of three and a minimum balance street with a height of two. Basically, what we have here is the two previous minimum trees used to create the next. Does that make sense?
[00:09:24] SY: Where are the other nodes, though?
[00:09:25] VJ: Yeah, right. Okay, so good point. We can’t leave this tree totally dismembered forever. We need to give it its head back. So really, we have the minimum nodes for a height of three and the minimum nodes for a height of two plus the root node. We have the two previous minimum height balanced trees plus one for the root node. This is sort of the pattern here with the height of an AVL tree. We can basically sort of abstract this out because if you were to do the next tree in our pattern here where we did like the height of five, we could create a tree with the height of five by summing the tree with a height of four and three. Those are the two preceding trees. So the pattern is that the height of a tree N where N is the height, it could be anything, the height of a tree N is equivalent to the sum of the heights of the trees that preceded it plus one.
[00:10:24] SY: Okay. Why is that useful? What does that help us do?
[00:10:28] VJ: Well, what we’ve basically stumbled upon here is the formula for figuring out the minimum number of nodes to create an AVL tree of any height we want, which means that if you want to create an AVL tree of a certain height, we now know how many nodes we need to make it and also, we now know based on the number of nodes that we have whether or not we can create a tree with a new level or whether it needs to be of the same level because we now know like “Oh, this is the base number I need to create a tree with a height of five or 17,” or any number we want. So we’ve created this pattern basically and we’ve derived a formula from it because we’ve realized that there’s a correlation here between the number of nodes you need to create a tree of a certain size.
[00:11:22] SY: Okay, so we’re basically taking the sum of two numbers in order to get the next number. Is there a name for this pattern?
[00:11:29] VJ: There is. All the math nerds are going to get excited. So there is a sequence that you probably heard of that is actually coming into play here. It’s rearing its head once again. The Fibonacci sequence and for those of us who need a little refresher, the Fibonacci sequence is just summing two numbers to get the next number. The Fibonacci sequence starts with two numbers zero and one, and in order to find the next number, you just add the two numbers that come before it. For example, zero is the first number. One is the next number. You add them together, you get one then you have one and one you add them together, you get two. You add two and one together, you get three and you can just keep building it on and on and on forever.
[00:12:10] SY: Kind of forever. Yeah.
[00:12:12] VJ: We’re sort of doing that here with our trees because in order to create an AVL tree with a height of five, we took an AVL tree with a height of four and three, and we added one to it. And now we have an AVL tree with a height of five.
[00:12:28] SY: So does that mean that we can do that for an AVL tree with a height of six? I look at the AVL tree for the height of four and five to get the thing for six?
[00:12:37] VJ: Exactly, yes. So if you have a height of six, you’re going to use the formula where the left subtree is the height of six minus one which is the height of five, plus the height of six minus two, which is four, so you do tree with height of five plus tree with a height of four and then you add one to it for the root node. That’s how you figure out the height of the tree using this Fibonacci sequence ideology.
[00:13:04] SY: Okay. So if I remember correctly, the Fibonacci sequence has to do with the golden ratio, right?
[00:13:09] VJ: It does. Oh my goodness! I love the golden ratio. The golden ratio is like another way to visualize the Fibonacci sequence and it’s pretty cool. It’s like in mathematical terms, you might hear it referred to as Phi. It’s like a Greek symbol and the idea with the golden ratio is that two quantities have the golden ratio if the ratio between those two quantities is the same as the ratio of the two elements that are summed together when they’re compared to the two larger quantities, which is really confusing I just said a lot of words, but basically, this is the part where I wish we had a visual because I think if you saw the golden ratio drawn out in a lot of like famous paintings and in art and stuff, you’ll see it where basically if you divide a line into two parts, the whole length of that line divided the larger part is the same ratio. That’s the basic idea and you see the golden ratio in science and art and they teach it in drawing. It’s really cool.
[00:14:15] SY: It’s that swirly thing, right? It kind of looks like a seashell.
[00:14:19] VJ: Yeah, that’s golden spiral, which is literally the golden ratio, just broken down into little bits.
[00:14:26] SY: Okay.
[00:14:26] VJ: I highly encourage people to Google it. It’s really cool. It’s hard to describe with just words.
[00:14:34] SY: So how is this golden ratio related back to AVL trees? How do we bring it all together?
[00:14:39] VJ: The interesting thing with the golden ratio and the height of a tree and the number of nodes, which are the two things we’ve been talking about, if we throw in the golden ratio here, you’ll actually notice that there’s a relationship with the height of a tree and the number of nodes you need to get to it. We’ve already talked about Fibonacci, but in terms of the golden ratio, there’s a logarithmic relationship happening here. Basically, if you take the log base golden ratio, so I’m going to just say log base phi, if you do log base phi of the number of nodes you need to create a tree of height N, you will get the actual height of the tree.
For example, if you do log base phi 12, the number you that that gives you is approximately five and five is the height of the tree that you can create with 12 nodes. So log base phi 12 gives you five which basically means that if you take the log of the number of nodes that you need to create the tree, you get the height of the tree. What that tells us is that as the number of nodes in an AVL tree grows linearly, the height grows logarithmically and we’ve talked about logs a lot. So I won’t get too much into it, but basically, if you start graphing out the number of nodes in a tree and the height, you’ll notice you get a logarithmic line.
[00:16:17] SY: Interesting. So if I’m trying to figure out how to balance a tree and I’m trying to figure out how many nodes I need, I can basically use the golden ratio to help me do that.
[00:16:27] VJ: Yeah, it’s pretty cool because I mean, honestly, you may never even end up having to do this, but it’s cool to realize there’s a lot of math that goes into figuring out this height balanced tree and it’s an interesting property. It’s an interesting characteristic of AVL trees that you can find the golden ratio inside of this data structure randomly, which until I learned about it, I had no idea that it was right there under my nose.
[00:16:53] SY: Very cool. And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. The link to that is in your show notes.
[00:17:02] VJ: Bye everyone.
[00:17:02] SY: Thanks for listening. See you next week.
Thank you for supporting the show!