Description
When you're dealing with data structures like trees, the balance of its "leaves" (data/nodes) matters. The moment a tree becomes unbalanced, it loses its efficiency, much like a real life tree bending to the weight of one side, unable to efficiently stand tall and grab the light of the sun. Don't let your garden grow full of lopsided saplings, and make sure to plant some AVL trees--your efficient runtime hangs in the balance.
Show Notes
This episode of the Basecs podcast is based on Vaidehi's blog post, The Little AVL Tree That Could from her basecs blog series.
Transcript
[00:00:02] SY: (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.
[00:00:09] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:12] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we’re talking about.
[00:00:16] VJ: AVL Trees.
[00:00:18] SY: This season of the Base.cs Podcast is brought to you by Square. If you’re designing a website or building a mobile app, you’re probably going to want to take payments at some point. Square APIs allow you to easily implement their payment form on your website. And with their In-App Payments SDK, you can add it to your mobile app too. You don’t have to worry about dealing with PCI compliance because they’ll take care of that for you. Don’t let anything come between you and your money. Start building with Square over at squareup.com/go/codenewbie.
[00:00:53] SY: AVL trees, what's an AVL tree? Let's start there.
[00:00:56] VJ: It's kind of like building upon a lot of the structures we've already talked about in this series. So we've talked about binary trees and binary search trees and AVL trees are kind of a fancy version of that. There’s specifically a type of balanced tree or a self-balancing tree.
[00:01:14] SY: So we've talked about trees a bunch of times on this show. Quick reminder what a tree is?
[00:01:20] VJ: A tree is a data structure where all your data is stored in nodes and you have a root node. And from the root node, you have children nodes that come from it. So you have a root node. It has children nodes. And usually if you're dealing with binary trees, binary search trees, you have a root node and every node has only two children. So there's kind of like…
[00:01:42] SY: A right and a left.
[00:01:43] VJ: Exactly. And there's sort of like a cascading effect where like, you know, each level gets a little wider if you visualize it. That's your basic tree structure.
[00:01:51] SY: Yeah.
[00:01:51] VJ: And a balanced tree is a tree structure where one level mirrors the other. And what I mean by that is you don't have one left subtree that's way longer or bigger than the right subtree. They're sort of balanced on either end. And I like to imagine it like if you had scales, you know, those old school scales where you weigh things. If you had a balanced tree, your two scales wouldn't really have too much difference in them. Everything sort of like weigh evenly. So if you have a left subtree and a right subtree, they shouldn't really be two different.
[00:02:24] SY: That would be for example. Okay, so let's say you and I were sisters, right?
[00:02:28] VJ: I mean, we’re basically.
[00:02:29] SY: We’re basically sisters with different mister. Yeah, basically. And so if we were sisters and when we had, you know, like our different misters, same mama. So if we had shared a mom and then she had the two of us, we would be a balanced tree. Then if you had a kid and I had a kid, still a balanced tree.
[00:02:48] VJ: Yeah. And even if we had two kids, still balanced.
[00:02:51] SY: Right, because two and two. If you had a kid and then I did not have a kid, would we still be balanced?
[00:02:58] VJ: That's still balanced because the difference between the left subtree and the right subtree, in this case you and me. The difference between the left and right cannot be more than one level. So if I have a kid, it's a difference of one, but that’s the maximum. So far, we're still good.
[00:03:16] SY: But if you had a kid and then that kid had a kid and I still had no kids, so now you're like two ahead of me. That's a problem.
[00:03:25] VJ: And I think I’m pretty old too. I don't know how this is happening.
[00:03:29] SY: We're both pretty old at this point, but that wouldn't be an unbalanced tree.
[00:03:34] VJ: Yeah, that would be unbalanced because now the difference between the left subtree and the right subtree is more than one level.
[00:03:42] SY: Difference of two. Okay. So it sounds like this balancing situation is pretty important. Why does it matter?
[00:03:48] VJ: So it matters a lot because if a tree isn't balanced you lose one of the key characteristics of that tree, which is being able to search through it and rely on the fact that everything sort of divided equally. And what I mean by that is you'll remember in our binary search trees, we talked about logarithmic runtime and how every time you search through the tree you're eliminating half of the data that you need to look at. But if you have an unbalanced tree, let's say you have a root node and like if you're the left subtree, there's only one node on the left, but then let's say I have a kid and a grandkid and another grandkid and another grandkid and it's like a cascading effect and let's say the right subtree has like eight nodes in it now. Now your tree is unbalanced. And more importantly, say you want to search for the node that is like the leaf node of the right subtree, you basically have to check all the nodes in the right and it's like not really efficient and it's kind of silly because now you're not leveraging what a tree data structure can provide.
[00:04:56] SY: Okay, that makes sense. So balancing is super, super important. So are all trees or all binary search trees, are they naturally balanced or most of them unbalanced or how does that work?
[00:05:09] VJ: So it depends on how you add your data because if you are adding your largest number first to your tree and then you're adding more nodes, but they're all smaller, well, now you have an unbalanced tree. But if you are adding your data to your tree structure in a sort of random way or like somehow it's equally divided so that you have nodes that are both larger and smaller than the root node, well, now it's balanced. But I guess the key point here is that you don't know if your binary search tree is going to end up balanced or not because you don't know necessarily what the data that you input is going to look like or how would you know if it's going to be evenly distributed on both sides. It's kind of hard to tell that unless you know your data ahead of time.
[00:05:52] SY: Oh, no. Okay. So then what do we do? Is anything we can do to make sure it's balanced?
[00:05:57] VJ: Aha. This is where you can have a smartypants tree, as I call them.
[00:06:02] SY: Smartypants tree. They’re so smart.
[00:06:06] VJ: So smart. So this is where you can have something like an AVL tree or basically any kind of self-balancing tree was created to solve this problem. And this like something that you'll see a couple times in computer science, like the AVL tree we're talking about today is one solution to this problem, but self-balancing trees is a very real problem. You'll see it all over when you look deep into places that use tree data structures. My point is really that you want to use a data structure that always allows you to be certain that your structure will be balanced, which means, since you don't know what your data input is going to look like, you need to have a structure that can handle any kind of input and self-balance its data, its content as required.
[00:06:54] SY: Interesting. Okay, a self-balancing tree. So how does that work? Because as you mentioned earlier, if I wanted to add some data to a tree, I won't necessarily know where it'll end up. So let's say I’ll put it in some random place and it turns out to not be balanced, how would the tree balance itself?
[00:07:16] VJ: Well, we have nodes, right? And nodes are connected to each other by links or edges in our tree, just like a graph. Okay, say we put something in a place that makes the tree unbalanced as you suggested, we can still move those nodes around. What we need to do is be a little thoughtful when we insert something or delete something because it's not enough to just find the place that it should be. You kind of have to insert the node and then look at whether inserting it throws off the balance.
[00:07:53] SY: Okay.
[00:07:53] VJ: And if it does, then you need to move things around or rotate things, if you will.
[00:07:59] SY: Okay. So there's kind of like a moment of introspection that happens. So like I put my node in a place then I go, “Wait a minute. What did I just do?” And is this in its appropriate place? And how is this affecting my overall tree and my overall life? And then if it throws things all off, then I have the option of fixing it and you said that's called a rotation.
[00:08:21] VJ: Yeah. And the reason that it's called a rotation is if you kind of visualize what you can do in a tree, you can move nodes around but also you can rotate things so that if you have a line of nodes, let's say you have three nodes in a line, a root node and then a child and then the child has a child, you can move it around so that the child becomes the parent and the parent becomes a child, and therefore a sibling of the child's child.
[00:08:53] SY: I think we're probably going to need an example for this one.
[00:08:56] VJ: Yeah, let's do it. Let's do an example. Let's say you have a tree. You know what? I like the simple example of just three nodes. So let's say you have a tree that has a root note of one.
[00:09:07] SY: Okay.
[00:09:07] VJ: And it has a right child that is two.
[00:09:12] SY: Okay.
[00:09:13] VJ: And that child, the right child has a right child that is three.
[00:09:17] SY: Okay.
[00:09:18] VJ: Now first of all, is this tree balanced or unbalanced?
[00:09:22] SY: It is unbalanced because on the right side, we have a grandchild situation, and on the left, we have nothing. So that's two levels difference. And so that is unbalanced.
[00:09:33] VJ: Totally. And like ignoring what these nodes are and the value of the nodes, how could we theoretically balance it?
[00:09:42] SY: So we want there to be basically a left child, right? Like if we take one of the nodes and then just put them on the left side of one so that there's only one child and no grandchildren, then it would be balanced.
[00:09:57] VJ: Yeah, that's one thing we could do. You could also just insert some sort of left child, but that’s not a good solution. Don't do that because now I'm telling you to solve a problem with another problem. But even just if you're visually thinking and you're not even caring about what those nodes actually contain, you just need to make sure that there's no more than one level of difference. So what you said is exactly right. We just need a left child.
[00:10:23] SY: Okay.
[00:10:24] VJ: So what we can do is we can basically rotate these nodes so that we end up with a left child. And because we only have three nodes, we're going to end up with a left child and no grandchild.
[00:10:38] SY: How do we do that?
[00:10:39] VJ: So what we can do is we can make two, the right child of one, and a parent of three, we can make that the new root node and we can take one and make it the left child of two. And if we do this, we're still following the rules of this binary search tree, right? Because one is still less than two and three is still greater than two and now it's balanced because you have exactly two children of the root node, and you don't have any grandchildren throwing it off and you actually have that left child. So that solves that problem.
[00:11:15] SY: Okay. So just to recap. We started by saying we have the node one and then its child was node two and then its child was node three. So we have one, two, three going straight down. Now we've made node two our parent so that now it's node two. Then on the left, we have node one and then on the right we have node three.
[00:11:38] VJ: Exactly. So what we did was we inserted a node into the right subtree, like we had one and two and we inserted three and now everything is unbalanced. So we inserted a node into the right subtree of the right subtree and everything became unbalanced and then what we did is we rotated a node down or down to the left, which is basically another term for it is called a left rotation. And now we have only one level instead of two levels, and instead of being unbalanced, we're balanced.
[00:12:15] SY: Here's a question. If was the mom and then you are my kid and then let's say Levi, our producer, is like your kid, right? I have a relationship with Levi where I'm the grandma, but if that is unbalanced and then we do our rotation and then now you're the mom and then Levi and I are brother and sister, that's kind of weird, right? Because like I used to be the grandma and now I'm the sibling of the person who used to be our grandchild. Do we not care about the relationships between the nodes in that way?
[00:12:50] VJ: Well, what we care about is that the rules of the tree aren't being violated. So we're still following the rule that the root node must be greater than the left child and less than the right child and that was like an important rule to follow. We didn't break it. And the other thing I guess that's worth mentioning is that when you do this rotation, you just need to make sure that you rearrange the pointer so that you don't become orphaned.
[00:13:18] SY: Oh, right, because being orphaned is bad, right?
[00:13:20] VJ: Yes, and also so that I, if I'm number two, the new parent who was the middle generation, we also have to make sure that there's still some reference to me. Otherwise, you could both be two children that don't have any connection to a root node. So you just have to like make sure you rearrange the pointers so that first I need to make sure that I'm the root node and you need to become my left child.
[00:13:43] SY: Okay. So the fact that I'm the grandma is not something that we need to preserve.
[00:13:47] VJ: Yeah, your grandmaness is like ephemeral. It can change. You could become a great-grandchild. You could become a great grandma. It depends on if we added more nodes into the situation.
[00:13:58] SY: I can be anything I want to be.
[00:14:00] VJ: Yeah.
[00:14:01] SY: Yeah.
[00:14:02] VJ: State is transient. Your identity changes.
[00:14:07] SY: Yes. Okay. I'm just so flexible. That's what I'm saying, flexible, open-minded. I can be whatever I want to be. In that example when we had the right child having a right child, that is called a rotation. What happens if it's on the left side? What happens if I have my root note and then I have a left child and then that left child has its own left child? Are we still doing a rotation?
[00:14:29] VJ: It's still a rotation. In both situations, you’re going to have to do a rotation. I want to clarify one thing which is that when you insert a node into the right subtree of a right subtree, you're doing a left rotation.
[00:14:41] SY: That's confusing.
[00:14:42] VJ: I know. It's weird. It's like I insert to the right, but I'm going to have to rotate over to the left, but it kind of makes because if you imagine I'm your right child and you're the grandmother, you have to slide down and I slide up so you're getting rotated to the left and I'm like sitting on top as the root node and you're like, “Whoa!”
[00:15:03] SY: That is what I sound like. Well done.
[00:15:04] VJ: I was waiting for your reaction to my sound.
[00:15:05] SY: Accurate. Accurate.
[00:15:09] VJ: So then to answer your question, if you had three of us in a row, let's say instead of one, two, three, you're ten, I'm nine and Levi eight. So you're the grandmother. I'm your left child, nine, and Levi is my left child, eight. So Levi gets inserted into the equation. Now the tree is unbalanced because now we have three of us in a row. I’m your left, he’s my life. So now what happens is you have to rotate down to the right and I need to rotate up.
[00:15:41] SY: There you go. Okay.
[00:15:42] VJ: So that is a right rotation. So now what will happen is I'm still the parent. Levi's my left child. You're my right child. So it would be nine as the root node and eight as the left child, ten as the right child.
[00:15:55] SY: Okay. So we have our left rotations and our right rotations. Okay. I'm going to switch it up. Are you ready?
[00:15:59] VJ: Yes.
[00:16:00] SY: What if I'm the grandma, you're my left child…
[00:16:05] VJ: I don't like where this is going. Why can't you just keep it simple? Just give me a left child.
[00:16:14] SY: This is growth. This is all about growth and just pushing past our comfort zone. That's what this is.
[00:16:19] VJ: You also can't control your children. Can you? This is a lesson.
[00:16:21] SY: No, you can’t. They decide what they want to be. It's all lots of fun. So you are my left child and then you have a right child named Levi, what do you do then? Do you still do a left rotation? Can you do a left rotation?
[00:16:36] VJ: You can do a left rotation. However, if you do a left rotation, what's going to happen is I'm going to become Levi's left child and he's going to become your left child. Let me add some numbers instead of names.
[00:16:53] SY: Okay.
[00:16:54] VJ: That might help. So let's say your 3 and I'm 1 and Levi is 2. So yeah, when you were 3, you're the root node and I'm your left child, our 3 just looks like 3 and then 1 is the left child of 3. Now Levi gets added and let's say there's like other nodes, it doesn't even matter what other nodes are in the situation, but he gets added as my right child. So he is my right child. So he's the number 2.
[00:17:18] SY: So if we visualize it, it's like we drew like a left arrow. It's like you made a less than symbol. So you have you. Who's the grandma again? I forgot who grandma was.
[00:17:27] VJ: You are. You’re 3 at the top.
[00:17:30] SY: Okay. At the top.
[00:17:32] VJ: You’re just so wise. So you're 3 and I'm on the left, 1, and Levi's below, on the right, 2. So we have 3 and then to the left is 1 and then to the right is 2. So if we do a left rotation on this left subtree, then now Levi is going to slide up and I'm going to slide down.
[00:17:52] SY: And when you slide down, where do you go?
[00:17:55] VJ: I become his left child because remember we're doing a left rotation.
[00:17:58] SY: Oh, right, right, right.
[00:17:59] VJ: I'm rotating down to the left. So that means I've become his left child. So now I've kind of downgraded. I'm the youngest one.
[00:18:08] SY: Okay.
[00:18:10] VJ: You're still grandma, you’re 3.
[00:18:12] SY: I’m still a grandma Yeah.
[00:18:14] VJ: Levi is your left child, 2, and I'm his left child, 1.
[00:18:18] SY: Okay. So we're still unbalanced because we have nothing on the right side.
[00:18:21] VJ: Yes. I mean we've like we took a tree and then we like move things around and we like haven't really solved the problem.
[00:18:27] SY: Okay. Okay.
[00:18:27] VJ: Now actually we're in a situation where we have seen this before because what happens when you have three nodes one after the other as like a chain on the left, what do you do?
[00:18:37] SY: Okay, if it's a chain on the left and I need to get things over to my right, I would do a right rotation?
[00:18:43] VJ: Yeah. So you’re just basically… now you would go down instead of 3 with the left child 2, with the left child 1, we’ll do a right rotation where the older node, the root node, goes down and the middle one becomes the new root. So now 3 slides down and 2 slides up.
[00:19:04] SY: So when we have that less than symbol, we have to do two things, we have to do two rotations. First, we do a left rotation and then now everything is on the left and then we have to do a right rotation to get that right child and now we are balanced.
[00:19:21] VJ: Exactly. You did two rotations and it's funny because the first rotation actually like somehow makes me feel worse because you're like, “Oh, now I'm in like a worse unbalanced situation,” where that one just created a chain.
[00:19:36] SY: Yeah. Yeah. Yeah.
[00:19:37] VJ: But you kind of have to do that so that you can do the easier right rotation.
[00:19:42] SY: And what is that called?
[00:19:43] VJ: So we first did a left rotation followed by a right rotation. So this is called a left-right rotation, which is known as a type of double rotation. So we talked about single rotations first, right? And there were only two of them. It was either left or right. And a double rotation is basically when you do two single rotations together, therefore it's a double rotation. So you can have a left-right rotation or you can have a right-left rotation. And in a right-left rotation, we don't have to go through the example, but instead of a less than symbol, it would be a greater than symbol. So you could have something like 8, 10, 9. You would make it unbalanced into a chain and then you would do a right rotation and then a left rotation and then you would end up with the three nodes evenly balanced with a root node and two children.
[00:20:32] SY: That is not as complicated as I… because it's basically just swapping.
[00:20:37] VJ: Yeah.
[00:20:38] SY: Right? Like we're just swapping things. We're literally just being like, “Oh, you have a new position now.”
[00:20:41] VJ: Yeah.
[00:20:42] SY: Or, “You have a new office. Go to your new office.” And then eventually the office will be in the right shape.
[00:20:46] VJ: There's a lot of structural like reorganization or restructuring in this situation.
[00:20:52] SY: Yeah. Yeah. It's just a little bit of reorg. That’s all it is.
[00:20:55] VJ: Yeah.
[00:20:56] SY: That's not too bad. Okay. So how does this actually work in real life? Let's say I start with an unbalanced tree and then is this double rotation something that I would write a command for? Do I like to run the script to make this happen? How do we actually do these rotations?
[00:21:12] VJ: Well, if you're using an AVL tree structure in all likelihood, you're probably not going to be writing it. You're probably going to be using like… I don't know, some database storage situation or something else that under the hood uses a self-balancing tree and maybe the people who wrote it decided to use an AVL tree. I don't actually even know if any library uses AVL trees, probably someone does. In any case, you'll probably be using something that uses an AVL tree under the hood because that's how they decided to solve the self-balancing situation. And when you use it, maybe unintentionally or you're using it under the hood, it's doing the self-balancing for you when you insert a node. So it's kind of baked into the way that it works. It's part of how that structure is implemented. And it's also worth noting that it's not just when you insert a node. When you delete a node, if a tree becomes unbalanced, you also need to introspect sort of then like check yourself.
[00:22:12] SY: Please go check yourself. Okay. That makes sense. It makes sense that it would be tied to an insertion or a deletion of a node.
[00:22:18] VJ: Yeah. It's just inherent to the structure itself, which is kind of fun because if anyone out there is feeling brave, you can create binary tree and write the implementation and then try to make it self-balancing, try to make it like introspective, and see if you can do that extra little bit of logic because I think I'd be kind of fun.
[00:22:36] SY: Yeah, code those rotations. Yeah. It’s a lot of fun. And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. The link to that is in your show notes. Also want to give a huge shout-out to Square. Square has APIs and SDKs to make taking payments easy whether you’re building a mobile app in iOS, Android, Flutter, or React Native. If you want to embed a checkout experience into your website, it’s super easy. Square has processed billions of transactions so you know it’s tried and true. Start building with Square over at squareup.com/go/codenewbie. This episode was edited and mixed by Levi Sharpe.
[00:23:17] VJ: Bye everyone.
[00:23:18] SY: Thanks for listening. See you next week.
Thank you for supporting the show!