Description
What is selection sort? How does this algorithm work? And just as importantly, how does it perform? We use broken books and cookies to tell you all about it!
Show Notes
This episode of the Basecs podcast is based on Vaidehi's blog post, Exponentially Easy Selection Sort from her basecs blog series.
Transcript
[00:00:00] SY: Okay. I’ve got a very exciting announcement. We finally have a date for Codeland, our annual conference. It’s happening July 22nd in New York City. Early bird tickets are now on sale. For just 99 bucks, you get breakfast, lunch, dinner, a coding workshop, technical talks and after-party, and of course an amazing community. Sale ends March 10th. Go to codelandconf.com for more info. Link is in your show notes.
[00:00:30] (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, founder of CodeNewbie.
[00:00:37] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:40] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we’re talking about...
[00:00:45] VJ: Selection Sort.
[00:00:46] SY: This season of the Base.cs Podcast is brought to you by our wonderful friends at DigitalOcean. DigitalOcean provides the easiest cloud platform to deploy, manage, and scale applications of any size. They remove infrastructure friction and provide predictability so you can spend more time building what you love. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes.
[00:01:14] This season is also brought to you by the fine folks at Linode. If you’ve got a personal project, a small business or a big business with lots of data, Linode offers you secure hosting for all your infrastructure needs. They are a Linux cloud hosting provider where you can get a new server up and running in under a minute. Plans start at one gigabytes of RAM for just five bucks a month. And with the promo code CodeNewbie2019, you can get a $20 credit. So go to linode.com/codenewbie and give it a try. Also, they’re hiring. Check out their jobs at linode.com/careers. Link is in your show notes.
[00:01:52] SY: Okay, let’s get started. Okay. Well, today we are talking about algorithms and I feel like we talked about algorithms a bunch last season and we described algorithms as very simply a set of instructions. You know how I feel about that definition, right?
[00:02:08] VJ: Yeah. I know you don’t like it. It’s okay. We can skip it. We’ll just say it and just put it under the rug. It’s fine.
[00:02:14] SY: Okay, where it’s safe and can’t hurt me. Okay, great. So it’s a set of instructions. And today like you said we’re talking about one particular algorithm called “Selection Sort.” So if algorithms truly are a set of instructions, what are the instructions of a selection sort telling us to do?
[00:02:34] VJ: Well, the instructions of what the selection sort algorithm actually does is it basically works through a list of unsorted elements and it basically follows the rule of finding the smallest element and then it puts it at the front of the list and it just continues to do that with all of the unsorted elements. So another way that you can kind of think about it is that it’s iterating through a bunch of elements, finding the smallest one and then moving it. And the moving is actually a bit interesting because what it’s doing is it’s swapping elements. And through the process of running a selection sort algorithm, you actually start to notice that the simple act of finding the smallest value in the unsorted list starts to make two separate buckets of information. You start to have the elements that have been sorted and the elements that have not been sorted.
[00:03:30] SY: Okay. So why do we need the two lists? Like it would just like sort things just as they are. Why do we have to make a new thing?
[00:03:41] VJ: Well, it’s basically the idea of separating items that we know we’ve looked at, we know that we’ve sorted, and the ones that we aren’t sure where they belong yet. Like if you are sorting, what’s a good example? Like let’s just say you are sorting a book whose pages you know came out and all you have to go off of are the page numbers.
[00:04:01] SY: Okay. That’s a really crappy book. That’s like a poorly made book.
[00:04:05] VJ: Yeah. Well, or like maybe the glue fell apart and it’s very old.
[00:04:11] SY: The stitching came undone. Okay. All right. I’ll go with it. I’ll go with it. Sure. Sure.
[00:04:15] VJ: And so now you just have like instead of a book you have a bunch of pages and you were like trying to sort the pages together so that you have a cohesive, you know narrative, a whole book. And if it’s just like sorting papers, then the way that I would do it, maybe you would do it like this too, is I would probably start with you know try to find number one and then find the next smallest one in the list and be like, “Okay, here’s two and then go through and find the next smallest one.” And if I’m pulling every single page from the smallest number because we are starting with the smallest because we’re trying to go in you know order, numerical order, the way that a book would, then by default every time I get the smallest number page from my unsorted pile and I put it in my pile of papers that are sorted, well, I kind of have two lists, right? I have two buckets, the sorted ones, the ones that probably aren’t that big to begin with. But I’m slowly you know compiling the pages and the unsorted heap over there where I’m like, “I don’t know what order they’re in,” but I’m just going through each time, passing through my list and just grabbing the next page. So I’ll just systematically work my way through the book.
[00:05:20] SY: Okay. All right. I get that. That makes sense. Okay. So you talked about this process. It kind of sounds like there’s a couple steps involved. Well, first you need a list of elements to start with. You have to go through them, then you find the smallest one, and then you kind of put it aside and you work your way through the list of elements. Can we walk through an example of that? Because I get the big picture, but I still want to like break it down into steps.
[00:05:47] VJ: Yeah. Sure. Let’s do it.
[00:05:48] SY: Okay. So let’s say we have a list of elements. Let’s do numbers, make it nice and easy, and let’s say the list is 4, 3, 1, 2, and then I say, “Selection sort go. Do your thing.” What does it do first?
[00:06:04] VJ: Well, if it was me and you, we could probably just look at 4, 3, 1, 2 and be like, “Oh, one goes first.”
[00:06:09] SY: Yeah. “I don’t need an algorithm. I got my brain.” That’s me in computer science class. “I don’t need no algorithm. I got my brain.” A plus. Okay. So if we were us, then we’d be able to use our brains.
[00:06:29] VJ: Right. But a machine is a machine.
[00:06:31] SY: Yes.
[00:06:32] VJ: It doesn’t have the kind of quick thing where we can just scan through a list of numbers and we just know. It doesn’t know what’s the smallest number to pull out from that list from 4, 3, 1, 2. So what it starts off doing is it assumes that the smallest number on the list is the one that you start off with.
[00:06:49] SY: Okay.
[00:06:50] VJ: It’s just like, “All right, the first number I have, I’m going to assume that this is the smallest and maybe it’s in the right spot. I don’t know.” And in this case, the smallest number is from the perspective of the algorithm the number 4. So you can almost think of it as like if the algorithm was a person, if there’s like a personification of the algorithm, there would be like a little Post-It note where I would be like, “Here’s my smallest number. Here’s my first number.”
[00:07:11] SY: It’s so wrong.
[00:07:13] VJ: Well, from a code perspective, what you’d probably have if you were coding this algorithm you would have a variable or some sort of reference and you’d say like, “Here’s my smallest number.” It’s currently pointing to the first element in my array of numbers.
[00:07:26] SY: Okay.
[00:07:27] VJ: So now you know what the smallest number is. It’s the first one in the list. Now the algorithm has to basically iterate through the whole data set until it finds the actual smallest number.
[00:07:38] SY: Okay.
[00:07:38] VJ: So in this situation, it will go from 4 and it will look at 3 and it will say is 3 is smaller than 4. No. Oh, sorry.
[00:07:46] SY: Wait. No. Yes. Vaidehi, you have to know these things.
[00:07:52] VJ: I’m going to start that sentence again.
[00:07:54] SY: Okay. This is why we need the algorithm.
[00:07:58] VJ: I don’t know. Oh man. What it’s going to do is it’s going to go through the list and it’s going to compare 4 with the next numbers in the list. It’ll compare 4 and 3 and it’ll say, “Oh, 3 is smaller.”
[00:08:12] SY: Okay. So after the 4 and 3 comparison, after the algorithm decides actually 3 is my smallest number, that pointer that used to be pointing to 4 as the smallest number, now we move that so that it’s now pointing to 3 as the smallest number.
[00:08:27] VJ: Right. So now it’ll say, “Oh, 3 is smaller than 4 so now I know 3 is the smallest.”
[00:08:32] SY: Okay.
[00:08:33] VJ: Then it’ll compare 3 and 1 and it’ll say, “Oh, which one’s smaller? Oh, 1 is actually smaller.” So now this is now what I think is the smallest number in the list so far. Of course, keep in mind it doesn’t know that 3 is smaller than 4 and greater than 1. All it knows is what’s the smallest.
[00:08:49] SY: Just comparing two at a time? Okay.
[00:08:50] VJ: Right. Right. It doesn’t have any concept of the whole list right now. It’s not like, “Oh, I know this is the smallest of all of them.” It’s not really looking at what’s bigger and what’s smaller than what. It’s just looking at two at a time and it’s basically going to keep doing that so it will compare 1 and 2 and say, “Oh, 1 is still smaller,” 1 is the smallest item in the whole data set.
[00:09:09] SY: And then what does it do?
[00:09:11] VJ: Well, so now it’s found the smallest item and it knows where it started.
[00:09:14] SY: Okay.
[00:09:15] VJ: It knows that it started with 4 and it knows that it’s trying to for example sort in ascending order. So it wants the smallest item at the front and the largest at the end and it needs to numerically build up in order.
[00:09:28] SY: Okay.
[00:09:29] VJ: Now all it’s going to do is it’s just going to say, “I know the smallest number. Now it’s in the wrong place. It needs to be at the front of the list.” So I’m just going to swap it.
[00:09:37] SY: That’s right. Okay.
[00:09:38] VJ: All I’m doing is swapping the smallest thing. The algorithm is going to work with that basic thesis. It’s going to find the smallest unsorted number. It’s going to swap it with the front of the list, with whatever’s at the front of the list, and then it’s basically figured out, “Here’s the smallest number. I’ve put it aside.” So now it knows that the item at the front of the list is the smallest in the whole data set. It still doesn’t know the rest of the data set and whether it’s in order, what’s the next biggest, next smallest, it has no idea, right? It’s not really keeping track of that. So in our heads we can kind of imagine, on the left side of the list is the sorted list and the right side is the unsorted list. So this is the two buckets that I was talking about before, the two piles or the two lists that the selection sort algorithm uses.
[00:10:28] SY: Okay. So just to confirm. Our original list said 4, 3, 1, 2. The algorithm and its first pass, its first iteration, I guess, has identified that 1 is the smallest number and it did a great job. And so now we’re going to swap out 1 and 4 because we need to put 1 at the beginning? Okay. So now our new list says 1, 3, 4, 2.
[00:10:53] VJ: Yup. And keep in mind, the unsorted ones on the right side, like if we’re Imagining the left side is the 1.
[00:11:00] SY: Right. It’s like a division now. Yeah.
[00:11:01] VJ: Yeah, there’s like a little bit of a partition. The ones on the right side, they’re not in the right place, but they should be, like 4 shouldn’t have been moved to the third position really, but again the algorithm is just focusing on what’s the smallest, move it to the front of the list.
[00:11:17] SY: Okay. All right. Can I take a stab at what happens next?
[00:11:21] VJ: Yeah.
[00:11:21] SY: Okay. So we have our new, it’s not our new list, it’s technically the same list, but we have our first pass list, which is 1, 3, 4, 2. So then the algorithm, can -- I think I can skip the first one, right? Because we handled that already.
[00:11:38] VJ: Yeah. So basically if you are coding this you’d probably have a loop or you’d have some sort of like some sort of iterative you know enumerator type thing that’s going by the index. So it would say, “Oh, I’m done with the first index in the list.” So now what is the number that is at the front of the list? It’s the number 3.
[00:11:57] SY: Okay, that makes sense. Okay. So my algorithm starts at 3 this time and then it says, “Okay, is 3 smaller than 4?” where I’m comparing those two. “Yes, it is the smallest one.” So my smallest number is still 3. Then I go from, oh wait? Then do I swap it at this point?
[00:12:16] VJ: Well, you’re not done iterating yet.
[00:12:17] SY: Okay. So 3 is my smallest, then I compare 3 and 2 and then I say, “Actually 2 is my smallest number.” So now that 2 is my smallest number, now I swap the 3 and the 2.
[00:12:29] VJ: Yup.
[00:12:30] SY: Oh, snap! I like this little algorithm. It’s like Choo-Choo Train. It’s swapping things out and making a little list as it goes. It’s like a zipper. I don’t know why it’s like a zipper, but it just feels like a zipper.
[00:12:42] VJ: That is a good kind of visual metaphor. You know I’ve always envisioned it with like playing cards where I imagine cards just being like, “Du, du, du, du, I’m going to hop to the front. Du, du, du, du!” That sound by the way is when you’re like going through the deck. You’re just going through the deck and then one card is like, “I’m at the front now.” And then you go through the deck, you go through the deck, and then you’re like, “Oh, I’m at the front.” But I like the zipper thing too. That’s a cool visual.
[00:13:07] SY: And has a sound effect too. zzzzz, is that what a zipper does?
[00:13:13] VJ: Between your zip and my du, du, du, du, we have some musical talent here.
[00:13:21] SY: It will be a base PS, The Production Studio. Okay. This is a technical podcast. Okay. So at the end of our second iteration, we now have 1, 2 and then our 4 and our 3. Okay. So now we have one more pass. So we have our index 0 is taken, we’ve already taken care of that, don’t have to worry about that, index 1, already taken care of that. Now we’re on index 2, which is where our 4 is. We only have like two things to compare. So we have our 4 and our 3 that are left and we say, “Which one is smaller? Our 4 or our 3?” 3 is the smallest of the two. So then we swap that and then we’re done. We have our 1, 2, 3, 4.
[00:14:16] VJ: Yeah. What’s kind of cool is that there were, what, four elements, right? There are four elements, 4, 3, 1, 2 is what we started with. We didn’t actually ever have to do like a comparison the last time.
[00:14:28] SY: Yeah.
[00:14:29] VJ: So there were four elements, but we only have to do these three comparisons because by default when you got to the fourth element, you’re like, “Well, you’re the last one. I know I’ve found the smallest one every time I’ve iterated so you must be the biggest.”
[00:14:40] SY: Okay. That’s not as complicated and scary as I expected. Okay. Can I tell you how I really feel? Can I tell you how I really feel?
[00:14:48] VJ: Yeah.
[00:14:49] SY: This algorithm is just so silly. It’s a silly little algorithm. It’s like checking along like, “Which one is smaller? I have a new small one. I have a new small one.”
[00:14:58] VJ: So yeah, just like that. See, now you’re making the same sound effect where you’re just like wobble, wobble, wobble.
[00:15:02] SY: That itself is spot on. It just seems like a little silly toy. It’s adorable. Okay. So if we break down the steps that we just talked about or just walked through, we have setting the smallest number to be the first element in the list. That’s like the main accomplishment of this algorithm.
[00:15:29] VJ: It is. But wait, the main accomplishment is literally just taking everything at face value and being like, “Well, you’re the first number. You’re at the front.” That’s wonderful. I never thought about it like that, but now this algorithm is really, really silly.
[00:15:46] SY: It’s so silly. It’s just a silly algorithm. So its whole thing is like, yeah, finding its new smallest one and putting it at the front. It’s kind of like when you watch a baby like organize toys and it’s like, “What are you doing?” And it’s just like putting it in random order and it makes sense to the toy and it’s really cute to watch, but like your whole life is just putting toys in order. I don’t have kids. Okay.
[00:16:07] VJ: But this would be a great way of teaching them like algorithms and be like, “Find your smallest toy, put it at the front. Find your smallest toy, put it at the front.” I also don’t have kids obviously.
[00:16:21] SY: Okay, but in the context of, you know, we talked about at the very beginning of this episode, algorithm is about following instructions and so we have to have some type of repeatable pattern, right? We have to have something we’re doing over and over again. So one of the things we’re doing over and over again is setting the smallest number and putting it at the front over and over again. Okay. Cool.
[00:16:38] VJ: I’m glad this list was like only four items because if it had been any longer, I felt like by the end you would have been pretty bored. You were like, “Oh, we’re doing this again. Really? The smallest number again?”
[00:16:49] SY: Exactly. And then the other thing we’re doing is we’re swapping things. That’s kind of important. We’re swapping one thing with another thing. Another thing we’re doing is moving on to the next thing, right? We’re moving like one up like you were saying.
[00:17:06] VJ: We’re incrementing the index basically.
[00:17:08] SY: Incrementing, that’s the word.
[00:17:10] VJ: Every time we iterate.
[00:17:11] SY: Moving one up. You know how you move one up? Moving one up.
[00:17:16] VJ: Oh, man! First episode of the season! Can you feel it? Can you feel the burn?
[00:17:27] SY: Okay. And then you just pretty much keep doing that until you’re at the end of the list.
[00:17:31] VJ: Yeah. Once you get to the last element by default you know you’ve sorted it using these steps. So you’re done.
[00:17:37] SY: Okay. So I see why this is considered a simple algorithm. Now the good thing about it though is it’s pretty -- like once we break it down it’s not that complex. I mean, it’s simple, like it’s pretty easy to understand and to see how it works, but how does it perform? How does it perform in terms of our nice Big O notation stuff? What’s this algorithm? How is it doing?
[00:18:01] VJ: Well, it’s not the best. Well, for example, with this algorithm we could kind of actually break down the Big O notation into two different pieces. There are two main things that are happening. There are things that involve comparing two elements in the list where N is the number of elements and then you also have the number of times you have to move elements in the list. That’s the swapping, right?
[00:18:27] SY: Oh, interesting. Yeah. Yeah. Yeah.
[00:18:30] VJ: So again, that’s related to how many elements you have in the list.
[00:18:33] SY: Okay.
[00:18:34] VJ: So for math reasons, there’s some fancy algebra where you can basically determine that the number of comparisons that you’ll make using selection sort is going to be approximately N squared over 2. That’s the number of potential comparisons using n number of elements.
[00:18:54] SY: Sure. That’s what I was thinking.
[00:18:59] VJ: Yeah, obviously. Well, I guess I’ll leave you with a little food for thought. The reason that this math works out is because when we had four elements we ended up doing three comparisons. Right? And if you had five elements you would end up doing four comparisons. So there’s a rule of, when you have n number of elements you end up doing N minus 1 passes through the whole list while you’re iterating because you have to do a certain number of comparisons every time to get to the end to determine is this the smallest. So that’s the number of comparisons. The worst-case scenario for how many times you’ll actually have to swap elements if you end up with a list where you’re swapping every single time, so there could be a situation where maybe in one pass through everything is in the same order. So you maybe don’t swap every time. But in our situation and in a worst-case scenario, you might iterate through the whole list and swap every element per pass. That’s a potential of n number of swaps, which means swapping per element.
[00:20:01] SY: That’s true.
[00:20:02] VJ: And the way that this basically works out with Big O notation is a running time of o n-squared, which basically means as the number of elements that we’re sorting, number of elements n increases, the running time for the algorithm quadruples.
[00:20:21] SY: Oh, wow! Oh no.
[00:20:26] VJ: Oh my God.
[00:20:28] SY: That sounds terrible. Oh, no, no, no. I don’t like that.
[00:20:33] VJ: IT’s not great.
[00:20:34] SY: No, my brain is way better.
[00:20:37] VJ: It’s all wonderful, but you understand like this was easy, but it’s not performance. So it’s like easy to understand, but is it going to be great as the list gets bigger? This was fine because we only had four elements, but you know for 40, 400.
[00:20:51] SY: Yes. It’s like eating cookies for dinner is easy, but you’re going to pay for it.
[00:20:57] VJ: I mean especially if it’s more than 4.
[00:21:02] SY: Then you get slower. You get slower. Oh my God. Oh my God. Oh my God. No, this cookie comparison is going to work. Okay. Ready? Ready?
[00:21:07] VJ: I’m so ready.
[00:21:08] SY: Okay. Big O notation is all about like it’s going to take longer, right? It’s like it’s more stuff. It’s a pain, not fast, right? When you eat a lot of cookies, then you gain weight and you physically slow down. You know I’m saying? They’re the same.
[00:21:23] VJ: You’re saying I should only eat a few cookies.
[00:21:26] SY: Yes, you’ve got to keep it under like two cookies. Okay. Two cookies is fine. Any more than two cookies you become o n-squared. Okay?
[00:21:35] VJ: It’s okay. I’ll just eat two cookies.
[00:21:38] SY: That’s all you need.
[00:21:39] VJ: And I’ll know which one is bigger because I’m not a machine. I can just look at it and I’ll just say they’re the smaller one, move it to the front and eat it first.
[00:21:48] SY: Move it to the front and eat it first. There you go. There’s one thing you learned from this episode.
[00:21:50] VJ: Eat the small cookies first. Oh, man!
[00:21:59] SY: Oh my goodness!
[00:22:00] VJ: I’m so hungry now.
[00:22:02] SY: And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link is in your show notes. Also want to give a huge thank you to Linode for sponsoring the show. Try their powerful servers built for all your infrastructure needs. Getting started with a shiny new server takes less than one minute so you can get up and running in no time. Get a $20 credit by using the promo code CodeNewbie2019. Just go to linode.com/codenewbie for more info. Also want to give a huge shout-out to DigitalOcean. They are the easiest way to deploy, manage, and scale your application. Everything about it was built with simplicity at the forefront: setting, deploying, even billing. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes. Vaidehi, you want to say goodbye?
[00:22:56] VJ: Bye everyone!
[00:22:57] SY: Thanks for listening. See you next week.
BLOOPERS
[00:23:03] VJ: I don’t have any cookies though. I blame you.
[00:23:05] SY: That’s okay. You’re in Portland. It’s fine. They won’t be helpful to me. Do you like how I made your hunger about me?
[00:23:13] VJ: Yeah.
[00:23:14] SY: Do you like that? That’s what friends are for.
Thank you for supporting the show!