Description
We've been talking a lot about the differences between compilers and interpreters, how each of them works, and how one of them, the compiler, led to the creation of the other, the interpreter. Now we get into the just-in-time compiler, or JIT, which is a fusion of the interpreter and the compiler, each a type of translator in its own right. A just-in-time compiler has many of the benefits of both of these translation techniques, all rolled up into one.
Show Notes
This episode of the Basecs podcast is based on Vaidehi's blog post, A Most Perfect Union: Just-In-Time Compilers from her basecs blog series.
Transcript
[00:00:02] SY: Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.
[00:00:08] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:12] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today, we’re talking about…
[00:00:16] VJ: Just-in-Time Compilers.
[00:00:19] SY: This season of the Base.cs Podcast is brought to you by DigitalOcean and Heroku. DigitalOcean’s cloud infrastructure is optimized to make it super intuitive and easy to build faster and scale easier. The pricing is predictable. They have flexible configurations and excellent customer support. Get started on DigitalOcean for free with the free $100 credit at DO.co/basecs. That’s DO.co/basecs.
[00:00:47] Did you know you can build, run, and operate applications entirely from the cloud? With Heroku’s intuitive platform, you can streamline development using the most popular open source languages, so you don’t have to worry about backend logistics. Also, you’re not locked into the service. So why not start building your apps today with Heroku?
[00:01:12] SY: Okay. So we’ve talked about compilers in the past. Can we do a quick recap of what a compiler is?
[00:01:18] VJ: A compiler is basically a program that is kind of like a translator for the code that we write and the programs that we write and the code that needs to be run by the machine because the machine can’t completely understand what we type out. It needs to be compiled down. I mean, it needs to be translated into machine code, and the compiler is one of the translator friends that we have that helps us do that.
[00:01:44] SY: Okay. Very cool. And so now we’re talking about a just-in-time compiler. What do we mean by just-in-time?
[00:01:51] VJ: I guess in order for us to talk about just-in-time compilers, we also should probably do a recap of interpreters.
[00:01:58] SY: Okay.
[00:01:59] VJ: So I said that a compiler is a translator between the source code that we write and the machine code for our machines, but really what it does is something important, which is it translates all of our source code into a binary file first, which is an executable file, and that’s the way that it does the translation and gives it to our machine. But an interpreter, it’s also a translator, but it does something different. An interpreter basically executes pieces of code and translates them as our program is running. So it doesn’t create this executable file. It sort of translates as you go, as you run your code. So it will sometimes have to retranslate our code multiple times, which has downsides, and we’ve talked about that in previous episodes, and it still does do the work of translating our code to something that the machine can understand, but it does it in a very different way. It doesn’t do it all at once. It does it piece by piece, not in one fell swoop the way that a compiler does.
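To make that recap concrete, here is a toy sketch in Python of when translation happens in each approach. The miniature instruction format and the helpers (translate, compile_program, interpret) are all made up for illustration; this is not how real compilers or interpreters are built, it only shows the timing difference.

```python
# A toy "language" where each instruction just adds a number to a running total.
# This only illustrates *when* translation happens, not how real translators work.

source_program = ["add 1", "add 2", "add 3"]

def translate(instruction):
    """Pretend translation step: turn one source instruction into a callable."""
    _, amount = instruction.split()
    return lambda total: total + int(amount)

def compile_program(source):
    # Compiler-style: translate the whole program once, up front,
    # and hand back an "executable" (here, just a list of callables).
    return [translate(line) for line in source]

def run_compiled(executable):
    total = 0
    for step in executable:  # no translation happens at run time anymore
        total = step(total)
    return total

def interpret(source):
    # Interpreter-style: translate each line at the moment it runs,
    # re-translating every time the program is run.
    total = 0
    for line in source:
        total = translate(line)(total)
    return total

print(run_compiled(compile_program(source_program)))  # 6
print(interpret(source_program))                      # 6
```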
[00:03:00] SY: Okay. We’re talking about a just-in-time compiler. So why did we need to know about interpreters?
[00:03:06] VJ: Because a just-in-time compiler, also known as a JIT, J-I-T, JIT, it’s the fusion of a compiler and an interpreter, which is why I was like, “We got to recap both because I’m going to combine them together and smoosh them into this little JIT.”
[00:03:21] SY: Smooshed into this.
[00:03:25] VJ: And every time I see the words JIT, I think of a gnat because JIT reminds me of zit and my brain confuses zits and gnats.
[00:03:36] SY: Those are very different, Vaidehi.
[00:03:37] VJ: I know. But they’re just weird words. So anyways, in my head, I think…
[00:03:42] SY: Do you mix the concept of a zit and a gnat?
[00:03:46] VJ: No, not a concept. Like when someone has a zit, I’ll be like, “Oh, you have a little gnat.” And then I’ll be like, “No, wait. No, zit.” And then I’ll be like, “Oh, this annoying zit is flying around my face.” And I’m like, “No, wait, gnat.” I don’t know, man. Somewhere in my brain, the wires are all messed up.
[00:04:01] SY: That’s amazing. All crossed up. Okay. All right. Cool. So now we can add JIT to your world of confusion. Okay. We have the JIT, which you said is a fusion of a compiler and an interpreter. What does that mean?
[00:04:14] VJ: It’s sort of like the best of both worlds. So what I mean by this is a JIT sort of does the good things that a compiler does, and it does the good things that an interpreter does, and it picks the best parts of both. And so what I mean by that is one of the things a compiler does is it makes code really fast to execute once it’s been compiled, but then it has some downsides and an interpreter sort of makes up for those downsides. It’s not necessarily the fastest thing. But you can translate code like sort of one line at a time and then you know exactly, if something goes wrong, where in that source code it came from, and a compiler can’t do that. So a JIT pulls the benefits of an interpreter and a compiler, and it sort of becomes its own thing. And the way that it does that is it fundamentally acts as an interpreter first, which means that it’s going to run the code in line and interpret it and translate it line by line. However, if it finds code that’s called many, many times, and it’s like invoked repeatedly, it behaves sort of like its other parent. If the interpreter is one parent, its other parent is the compiler. So then it sort of pulls from its other parent and it decides to do some compilation when it realizes that it’s necessary. So that’s what I mean by it’s like a fusion because it behaves like an interpreter, but it knows when it needs to compile and be smart about it when the time comes.
[00:05:46] SY: Interesting. Okay. So how does it know when the time has come? How does it know when it actually makes sense to compile?
[00:05:54] VJ: Basically, the way that a JIT works, I hope like no computer scientists are like listening to this and being like, “Excuse me, my whole life’s work is a JIT,” because I’m about to try to reduce it down to like one sentence. Basically a JIT sort of asks itself, “Can I keep interpreting this code directly or should I go ahead and compile this code so that I don’t need to repeat this translation again and again?” And you asked a great question, which is, “How does it know? Do I need to compile it or not?” And the way that it knows that is by sort of monitoring the code. It’s like profiling the code and keeping an eye on the code and like how many times something is being called. And the JIT is sort of like really smart, self-aware. It’s like the suave, sexy spy that’s like watching the code and it’s like, “Oh!” This is not working. I’m going to abandon that weird voice.
[00:06:54] SY: You sexy zit. It could be a zit. It could be a gnat. We don’t really know.
[00:07:00] VJ: But basically you can kind of think of it as like this character that’s sort of like just monitoring what’s going on, like in the shadows and it’s like, “Hmm, this line of code is called a lot. Interesting. I’m going to watch it. I’m going to keep an eye on it. And if I notice it’s called like a lot, a lot, maybe I should go and optimize this and just compile it.” But if it sees another line, it’s like, “Oh, this one’s only called twice. Seems silly to optimize and compile this. I’m just going to translate this in line as it runs.”
[00:07:28] SY: So how is it doing these designations? Is it just keeping a tally and then if it hits a certain threshold, it compiles, or what are the categories in which it thinks about it?
[00:07:41] VJ: So the way that a JIT sort of categorizes different lines of code in a program is whether or not the code, that line of code, is called often. It’s hard for me to give you an exact number because different implementations of just-in-time compilers will do this differently, but the theory behind it is sort of the same. So the idea is if there’s like a line of code in a program that’s called kind of often, like pretty often, the JIT will designate it as like something called warm code where it’s like, “It seems like it’s frequently used,” but you don’t necessarily know if it’s like the most used line or not, but it seems like warm, like potentially could be optimized. However, if there’s a line of code that’s called like many times, like let’s say you have the first line of code in a program, like say it’s called like so many times, like every time the program’s run or like maybe that line is called 10 times more than every other line, that’s designated as hot code because it’s got like a high heat index.
[00:08:44] SY: Nice.
[00:08:45] VJ: And then sometimes the JIT could also notice that like there could be like one line of code in the whole program that’s like maybe an empty variable, for example, or like a variable that’s assigned and never used. And maybe that’s never called at all. And that’s something called dead code. The important things here really are the hot and the warm code because if you know that some line of code or some piece of logic is like hot, then that’s sort of a prime use case for optimization because you know that you’re going to call this line again and again and again and again, and you’re going to translate it in line as the program is run again and again and again. And it would be great if you could, instead of interpreting in that scenario, compile it so that you don’t need to do it so many times. You just sort of do it once.
[00:09:33] SY: Okay. So we’ve got dead code, which doesn’t need to be compiled. We’ve got hot code, which definitely needs to be compiled, and then we’ve got warm code, which I guess could be compiled or how do we think about that? How do we decide which one and when to compile?
[00:09:49] VJ: It will depend on the type of JIT and like the way that it was implemented. For example, you could say that any code that’s warm code is something that was called at least five times and then maybe hot code is something that’s called more than eight times or more than ten times. So that designation could change based on who implemented the JIT. And there’s a little bit of a risk factor when it comes to warm code, because let’s say that you have a line of code that’s called four times, and we decided, “Oh, okay, if we see it called five times, we know it’s warm, we should optimize it.” There’s a risk that on the fifth time that the JIT sees it, it’s like, “Oh! I got to optimize this. Great! I’m going to go optimize.” And then what if it’s never called more than five times? Now you’ve sort of optimized for no reason.
[00:10:41] SY: Yeah. Yeah.
[00:10:41] VJ: So there is a slight risk that you can do some premature optimization, but most modern-day JITs like generally try to be clever about that. But theoretically, you could do that and you have to like consider the downside of optimizing when you maybe don’t need to, which could be a problem with warm code.
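To make the tally-and-threshold idea concrete, here is a minimal sketch, assuming made-up thresholds of five calls for warm and ten for hot. The jit_call wrapper, the fake_compile stand-in, and the counts are all hypothetical; real JITs profile far more carefully and emit actual machine code, this only shows the bookkeeping.

```python
# Minimal sketch: count how often something is called, classify it as
# cold / warm / hot, and switch from "interpreting" to a cached "compiled"
# version once it crosses the hot threshold.

WARM_THRESHOLD = 5   # hypothetical: seen at least 5 times -> warm
HOT_THRESHOLD = 10   # hypothetical: seen at least 10 times -> hot

call_counts = {}     # tally of how many times each function has been seen
compiled_cache = {}  # functions we've already "compiled"

def classify(name):
    count = call_counts.get(name, 0)
    if count >= HOT_THRESHOLD:
        return "hot"
    if count >= WARM_THRESHOLD:
        return "warm"
    return "cold"    # code never seen at all would effectively be dead code

def fake_compile(func):
    # Stand-in for real compilation: a real JIT would emit optimized
    # machine code here. We just return the function unchanged.
    return func

def jit_call(name, func, *args):
    call_counts[name] = call_counts.get(name, 0) + 1

    if name in compiled_cache:
        return compiled_cache[name](*args)         # reuse the compiled version

    if classify(name) == "hot":
        compiled_cache[name] = fake_compile(func)  # compile once, reuse forever
        return compiled_cache[name](*args)

    # Warm and cold code keep getting "interpreted" in this sketch;
    # a real JIT might baseline-compile warm code instead.
    return func(*args)

def square(x):
    return x * x

for i in range(20):
    jit_call("square", square, i)

print(classify("square"))          # 'hot'
print("square" in compiled_cache)  # True
```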
[00:11:02] SY: So I see my code, I put it into my three categories. I decide to optimize it. Is there just one way to optimize it or how do we think about optimization?
[00:11:11] VJ: So there’s basically two ways that a JIT can approach any kind of compilation and optimization of a line of code. The first one is sort of the quick and dirty way, and that’s called “Baseline Optimization”. And the way to think about it is that this is just like the quick way to optimize it. It may actually not be super performant. It’s just sort of like, “It’s the baseline.” It’s like the simple way that you could optimize it. There’s a downside of this because if you use a baseline optimization for something that’s hot code, that actually doesn’t really help because if it’s not optimized for speed and it’s not super performant, now you’ve actually made the hot code take longer and you can actually increase your runtime linearly.
[00:12:01] SY: Oh, wow!
[00:12:01] VJ: So the way I like to think about it is that like a baseline optimization is sort of like editing an essay for grammar and punctuation.
[00:12:09] SY: Okay.
[00:12:10] VJ: So it’s not really going to change what you’re saying in the essay, but you’re just like…
[00:12:14] SY: That’s helpful. It’s like cleans it up a little bit.
[00:12:16] VJ: It’s sort of helpful, but like it’s not going to change your grade probably. If your essay is not that good and you add some commas…
[00:12:25] SY: You’re screwed.
[00:12:26] VJ: Yeah, it’s fine. Don’t expect a miracle is what I’m saying.
[00:12:29] SY: Yeah.
[00:12:31] VJ: But on the other end, you have something called “Optimizing Compilation”. Sorry. There’s so many words that are so similar, but it’s called optimizing compilation. But I call it opt-compiling because that’s sort of the short form. And this is like the equivalent of not editing an essay for punctuation and grammar, but like for clarity and for readability and like maybe changing like the whole structure of it.
[00:12:55] SY: Yeah. So we’re going deeper on this one.
[00:12:57] VJ: Yeah. And importantly, it’s like way more of a time commitment. So opt-compiling, if we think about it as a way of approaching compilation of code, you’re basically doing some upfront investment in optimizing efficiently. So you’re like thinking about performance and you’re thinking about how can I compile this code in the most performant way, not just like, “Let me just think of the most quick and dirty solution.” So this one is kind of nice in that if you take the time to do some opt-compiling of a certain line of code, let’s say it’s something that’s like called 10 times in a program, yes, it takes a little bit of time upfront, but basically now what you’ve done is you have made it easy for your JIT to run that code in a constant amount of time because you’ve compiled it. So even if that code is run ten times or a hundred times, because you compiled it already, it takes O of 1 constant amount of time to run that line. So it takes a little bit of upfront investment, but I mean, you get O of 1 runtime, which is pretty great, and it just requires that upfront investment. Now it’s important to say you probably don’t want to opt-compile a line of code that is warm or you definitely don’t want to opt-compile something that’s like not even warm or God forbid dead code. You don’t want to compile that because it like takes so much upfront work.
[00:14:26] SY: Investment. Yeah.
[00:14:27] VJ: And then if you’re not really running that code, why did you do all that work?
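One way to see why hot code is the prime candidate for opt-compiling is a quick back-of-the-envelope comparison. The upfront and per-call costs below are invented numbers, purely to illustrate how the bigger upfront investment only pays off when a line is actually run many times.

```python
# Made-up cost model: interpreting re-translates on every call; baseline
# compilation is cheap up front but each call stays slow-ish; opt-compiling
# costs more up front but every later call is fast (the "compile once,
# then O(1) per run" idea).

INTERPRET_PER_CALL = 10              # hypothetical time units
BASELINE_UPFRONT, BASELINE_PER_CALL = 5, 6
OPT_UPFRONT, OPT_PER_CALL = 50, 1

def total_cost(upfront, per_call, calls):
    return upfront + per_call * calls

for calls in (2, 10, 1000):
    print(
        f"{calls:>5} calls | "
        f"interpret: {total_cost(0, INTERPRET_PER_CALL, calls):>6} | "
        f"baseline: {total_cost(BASELINE_UPFRONT, BASELINE_PER_CALL, calls):>6} | "
        f"opt-compile: {total_cost(OPT_UPFRONT, OPT_PER_CALL, calls):>6}"
    )
# For 2 calls, the upfront opt-compiling work isn't worth it; for 1000 calls
# it wins easily, which is why dead and cold code shouldn't be opt-compiled
# but hot code should.
```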
[00:14:30] SY: Right. Right. Yeah. Well, when done right, that opt-compiling sounds legit.
[00:14:37] VJ: I love it.
[00:14:38] SY: Do you like that? Do you like that? That was all Levi. That was entirely our producer. That wasn’t me, but I thought that was amazing.
[00:14:44] VJ: Oh, really? I love it. Oh my God, Levi! It’s so good.
[00:14:45] SY: That’s amazing. That’s good. Yeah. Well done, Levi. Well done.
[00:14:49] VJ: We have our episode title.
[00:14:51] SY: Oh my God, we do! Legit. Amazing.
[00:14:56] VJ: It’s quite French. I wish I could do a French accent because I could imagine. Oh my God! It’s so perfect because like a JIT could be like a little like a French gnat or French zit.
[00:15:06] SY: Is that actually what it means or are you just making this up?
[00:15:09] VJ: No. Doesn’t “la” mean “the” in French?
[00:15:15] SY: Sure. It sounds right. It sounds like that could be true.
[00:15:20] VJ: Anyways. There’s some sort of opportunity for some sort of like rude French accent here, but I can’t do that. But I could imagine our kids can be very French and like prancing around Paris with a baguette or something.
[00:15:32] SY: Holding a croissant.
[00:15:33] VJ: Yeah.
[00:15:34] SY: Yeah. Very nice. With a little hat.
[00:15:38] VJ: Like a beret?
[00:15:39] SY: There you go. I was like, “What’s the word? The hat with the thing?”
[00:15:42] VJ: Bonjour.
[00:15:43] SY: The hat with the thing. I’m so sorry to all of our French listeners. This is very frustrating for you. Okay. So we’ve got our legit JIT and it’s figuring out if it should do opt-compiling or if it should do baseline optimization or if it should just do nothing, which is totally fine as well. But here’s a question I have. So I know that one of the benefits of using an interpreter is that because you are running your code, I guess, in like real time or you’re executing your code in real time, that you get to catch errors along the way versus when you compile code, you’re like compiling it first and it’s not until you run it later that you get to see your errors. So if I’m compiling parts of my code, doesn’t that mean I lose the debugging superpower of interpreters?
[00:16:36] VJ: You would think so. But remember that a JIT is like sort of an interpreter first and then it compiles when necessary. So even when we optimize and compile some code that we run often, we actually still are able to reference where that compiled code came from. So let’s say like you have a program and one line or like one function in that program is hot, so the JIT decided, “I’m going to optimize it and I’m going to use opt-compiling. So I’m going to compile it ahead of time.” So now when that function or that line is run, even though you’re using code that was compiled, just because it was that one section that was compiled doesn’t mean that you don’t know where in the program it was happening because you’re still running everything else in line, right? It’s just that one little function or one line that is going to point to some compiled code. But you still know like, “Oh, it was that line or it was that function,” because the whole thing isn’t being compiled. So you actually have preserved context because this is being run during runtime. So you still can debug any errors or any issues from that compiled code because you know, “Oh, it’s coming from this line or this function.” So you kind of like fix that problem that compilers had and you’ve got that benefit that interpreters give you, which is like ease of debugging and having that preserved context.
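Here is a minimal sketch of that preserved-context idea, assuming a hypothetical compile_with_context helper and a made-up source location. It only illustrates how a compiled piece of code can keep a pointer back to where it came from, so an error from the compiled version still reports the original line rather than anonymous machine code.

```python
# Sketch: when "compiling" a hot function, remember where it came from,
# so errors raised later still point back at the original source.

compiled_cache = {}

def compile_with_context(name, func, source_location):
    def compiled(*args):
        try:
            return func(*args)  # pretend this is the compiled, faster version
        except Exception as err:
            # The error still references the original source location.
            raise RuntimeError(f"error in {name} ({source_location}): {err}")
    compiled_cache[name] = compiled
    return compiled

def divide(a, b):
    return a / b

# "app.py, line 12" is a made-up location for illustration.
compile_with_context("divide", divide, "app.py, line 12")

try:
    compiled_cache["divide"](1, 0)
except RuntimeError as err:
    print(err)  # error in divide (app.py, line 12): division by zero
```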
[00:18:04] SY: Very cool.
[00:18:05] VJ: So I just want to add one more thing. We sort of already touched on it, which is that sometimes your JIT could decide to pre-optimize a piece of code that doesn’t need to be optimized or like for example, maybe your JIT tries to optimize something that’s never even going to be called again. That is just a reality, not just of just-in-time compilers, but it’s just something that is part of dynamic translation, which is any kind of compilation that happens while the program is running, and there’s not too much more that I really want to say about that, except that if you ever see that word, you now know that a JIT is a type of…
[00:18:46] SY: It’s a JIT.
[00:18:47] VJ: It’s legit.
[00:18:49] SY: It is legit.
[00:18:50] VJ: That a JIT is a type of dynamic translation, and it’s not the only one. There are other kinds too.
[00:18:55] SY: And now that we have finished covering a JIT, do you know what that means?
[00:19:02] VJ: No. What does it mean?
[00:19:04] SY: We have covered the very last topic in the Base.cs Series.
[00:19:12] VJ: Oh my gosh! I thought you were going to ask me some sort of hard technical question. I was like, “I don’t know. I’m scared.” And you’re like, “No, it just means we’re done.”
[00:19:22] SY: No. It just means we’re done. It means we’re done. I mean that this is our last concept. Next week, we’re going to do a little bit of a look back and we’ll share some of our favorite parts of our episodes and we’ll play some little parts of our episodes and we’ll have a nice farewell.
[00:19:37] VJ: Nice. I think it’s kind of perfect that we’re ending all the technical stuff with the JIT because it’s like this beautiful marriage between the compiler and the interpreter. And as we learned in the past few episodes, compilers and interpreters, they’re sort of like this perfect fusion of data structures and algorithms combined, right? Because we learned how compilers require all these interesting data structures under the hood, and now we’ve learned how you need to be able to have an algorithm internally to decide like, “Oh, is something hot code? Is it warm code? Do I optimize it? Which way do I optimize it?” Like that’s just an algorithm that the JIT implements and it’s cool that we’re talking about these relatively complex topics, but they’re really just using the building blocks that we’ve been learning about all these years.
[00:20:32] SY: Yeah. Oh my God! It’s been years. It’s been three years.
[00:20:35] VJ: I know.
[00:20:37] SY: Wow!
[00:20:37] VJ: So much!
[00:20:39] SY: Yeah. We’ve covered so many topics. Yup. Well, that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link to that is in your show notes. I also want to give a huge shout-out to DigitalOcean and Heroku. DigitalOcean is a simple developer-friendly cloud platform, which makes managing and scaling apps easy with an intuitive API, multiple storage options, integrated firewalls, load balancers, and more. There’s also a robust community that provides over 2,000 tutorials to help you stay up to date with the latest open source software, languages, and frameworks. Get started on DigitalOcean for free with a free $100 credit at DO.co/basecs. That’s DO.co/basecs. There’s a reason over nine million apps have been created and run on Heroku’s cloud service. They not only manage over two million data stores, but they make over 175 add-on services available to you. So whether you’re creating a personal app, a free app, or an enterprise app, start building today with Heroku. This episode was edited and mixed by Levi Sharpe.
[00:21:52] VJ: Bye everyone.
[00:21:53] SY: Thanks for listening. See you next week.
Thank you for supporting the show!