12.5 - Temporaries - Code generation - [compilers theory] by alex aiken

In the last couple of videos we have talked about code generation for simple programming language and I mentioned at the end of the last video that realistic compilers do things a bit differently and in particular they do a better job of keeping values and registers and of managing the temporaries that have to be stored in the activation record. We're actually going to talk about both of those problems. In this particular video, we're only going to talk about the second one and so we're going to be covering a better ways for compilers to manage temporary values. So the biggest idea which we've already seen is to keep temporaries in the activation record. Now, this is not as efficient as keeping temporaries in registers but that's the subject of a future video, we're not going to talk about that today. What we're going to talk about is improving the language we manage temporaries that happened to be in the activation record for whatever reasons. So why, it doesn't matter why we want them to be into activation record, but given that it's there, that's the most efficient code that we can generate And, the improvement that we're going to make Is have the co-generator assign a fixed location In the activation record for each temporaries. We're going to pre-allocate memory or a spot in the activation record for each temporary and then we will be able to save and restore the temporary without having to do the stack pointer manipulations. So, let's take a look at the [inaudible] program for a simple programming language. Here is the Fibonacci function again and let me change colors to something that says more contrast and let's think about how many temporaries we need to evaluate this functions. So, this function body when it executes we'll need a certain number of temporaries and if we know how many temporaries that needs in advance then we could allocate the space for those in the activation record rather having to do push and pop, pushing and popping from the stack at runtime. So, let's take a look and if then else is going t o involve a temporary because it always do this predicate comparison here, we're going to have to evaluate the, the first argument to the predicate and then save the result of that while we evaluate the second argument to the predicate. So this one involve one temporary, we'll need one temporary for that predicate. Similarly for this predicate, to evaluate it since it's a two argument operation in comparison, we'll also need one temporary for that. 1010. There's this expression over here which is kind of complicated. How many temporaries will we need for these? Well, remember how this works. So, evaluate the first expression and then we save the results of that so this will require one temporary for the result of the called fib going to have to be saved and only evaluate the plus And while we are evaluating the call, the fib though is actually, before we evaluate to call the fib, we have to evaluate the argument of fib and that involve the subtractions. We also need one temporary here for the subtraction. Okay And now we have about the second side of the, this edition here. Well this also involves a subtraction. Okay So, we got to have one temporary here to hold on to the value x while we're evaluating the minus to compute the value of the argument before we call [inaudible]. Okay? So how many temporaries do we need in total? While we need one here for the predicate, but notice that once the predicate is decided, once we know the answer to whether this predicate is true or false, we don't need that temporary anymore. So in fact, that temporary can be reclaimed; we don't need the space for that temporary anymore by the time we get to the false branch. And again, once this predicate is evaluated, we don't need the space for that temporary anymore, okay? So now we're down to the plus. The first thing that happens is we evaluate the argument to this first call the fib. Once that's evaluated, we don't need the temporary for it anymore. Now the results of fib has to be saved somewhere while we do the plus, okay? And then we'r e going to have to evaluate the argument to the second call of fib and then notice that this happens while we still need this temporary here so in fact, we need both of these temporaries at the same time. Okay because while we're evaluating this argument, the second call of fib, we still need to be holding on to the first argument to the plus. And so in fact this particular function can be evaluated with just two temporaries. So all the space we need to compute the value of this function body. So in general, we can define a function nt of e that computes a number of temporaries needed to evaluate e1 + e2. So, that's going to need at least as many temporaries as e1. Okay, so if we need a number of temporary's k to evaluate e1, let's have at least k temporaries to evaluate the whole expression And then, we'll also need at least as many temporaries as it's needed to evaluate the two+1 because we have to hold on to the value of e2 while we are evaluating so we have to hold on the value of e1 while we're evaluating the two. Okay And it's going to be the maximum. Over these two so it'll be the maximum number with between the maximum number of temporaries need to evaluate a one and one + the number of temporaries to evaluate two. That would be the total number of temporaries, the minimum number of temporaries needed to evaluate e1 + e2 And the reason is a max instead of a sum. Is that once we've evaluate e1 we don't need any of the space that was used to evaluate e1 anymore. All those temporaries are done. All we need is the answer. We don't need the immediate results and that means that the temporaries that were used to evaluate e1 can be reused to evaluate e2. So, generalizing from that one example, here is the system of equations that subscribes the number of temporaries needed to evaluate every kind of expression in our little language. So, let's take a look. So, we already talked about e1+e2 is just the max of over the number or temporaries to value of e1 and one + number of temporaries to value of e2. So, e1-e2 is exactly the same thing because the same structure is a different computational operation but is a binary operation and we have to save the value of e1 while evaluated e2. So, it's the same formula. [inaudible] Now for if and else well what do we need? We need one, I'm sorry we need, it's going to max again. It's going to be max over some number of different quantities. How many temporaries might we need? Well, we might need as many temporaries or as needed to evaluate the value of e1 and we certainly need at least as many, alright. So, if you want to take a certain number of temporaries, the whole f and l is going to require at least as many temporaries. Now of course, once e1 is done evaluating, we don't need its temporaries anymore. And, and we can evaluate e2, okay. And while we are evaluating e2, we have to hold on. To the results of e1, that's where the one plus comes from. So, to that, while we're evaluating e2, we need one plus the number of temporaries to evaluating two to hold all the temporaries of the computation. And then once the predicate is done, we don't need any of those temporaries anymore at all ad we're going to evaluate either e3 or e4. And so then, we just need however many temporaries each of those requires and whatever the maximum is over these four quantities, that's the minimum number of temporaries we can get away with to evaluate the entire if then else. Let's take a look at a function call. So that the space needed for the function call is number of temporaries, the max over the number of temporaries to evaluate anyone of the arguments and this is actually an interesting case because notice. That we don't need, we don't have anywhere in this formula space for the results for the e1 through en Of course once we've evaluated the e1 then we need to save it somewhere and so you would think that we might see some numbers in here representing the temporary space needed to hold on to the results of the evaluating these expressions. And the reason that we don't have that in here is that. Even though those values are saved, they are indeed saved; they're not saved in the current activation record The space where the results of e1 and the results of all, any of the arguments. Yeah, again, is saved in the new activation record that we're building And so, the space for the, the results of e1 through en is that those values are stored in new activation record and that storage of current activation record and we're trying to compute the number of temporaries needed to evaluate inside of the current activation And then for integer, that doesn't take any space at all to require any temporaries I mean. So there's zero temporaries required for that and also for a variable reference so it requires no temporaries. So now let's go through our example and work out systematically using the equations. How many temporaries we will need? Okay? So, here for this if then else, remember it was going to be the max over the number required to evaluate e1, well that zero. One + the number to evaluate e2 which is the second expression in the predicate so that would be one, because the number one requires zero temporaries and the one, the we have one hold on to x, all right? And then max over the branches. So, to evaluate zero requires Zero temporaries and now. We have to compute. The number required here. Okay so once again to evaluate the first expression if and else requires zero temporaries to evaluate the second one we require one. One + the number required, one + zero to evaluate that constant we got zero temporaries and now for the last expression how many will this one will require. Well this is going to require zero for this guy. One for the second argument so to evaluate fib is going to require one temporary, okay and then it's going to be one plus over here. We have to hold on to the results there. The value of x - two so how much that going to require? That is going to require the max of zero and one + zero okay so this would be one alright so we have over here we have one + one = two okay and now we're taking the max over two and one. So that's two, okay? And this is the last expression in the, our if and else. So clearly, this if then else here will require two temporaries okay? Because the max over the number required for either part of the predicate, the then branch and the else branch And now, this whole expression. Requires two temporaries and that'll be the max of the four components of the outer if then else And so then for, for the entire expression we get two temporaries. Once it computed the number of temporaries required to evaluate the function value, we can add that much space to the activation record. So, now our activation record is going to require two + n + nt (e) elements And so, the two of course are for the return address for the frame pointer. The n is for the n argument of the function And then, the rest of it is just the space required for the temporaries And now we can talk about how we're going to layout the activation record. We'll leave the first part of it the same, so everything up to the return address is laid out just before. First the color string pointer then the and arguments in reverse order, and then the return address. And then after the return address come the and locations or the nt(e), excuse me, locations for the temporaries. Now, that we know how many temporaries or intermediate values we need to evaluate a function and we also know where those intermediate value is going to be stored in activation record. The last thing we need to know in order to code generation is how many temporaries are in use at each point in the program. Change colors here and so the way we're going to do that is we're going to add a new argument to co-generation which is the position of the next available temporary. So as temporaries gets used up, this argument, the co-generation will change while other expressions to save their values and save places without stepping on temporaries that are already having saved by all other expressions. And as you'll see in a, in a moment here when we do an example, the temporary area of the activation r ecord is going to be used like a small fixed size stack. Essentially, we're going to have the same stack discipline that we had before only all the computation on the stack pointer, all the discussion, all the computation of what all sets to use has already been done by the compiler. So, what we used to do by pushing and popping element from the stack in the generated code allow that computation has been moved into the compiler and all that happens now is a bunch of stores and load. To fix off that from the frame pointer So let's take a look at how this works. Here's the code that we had for e1 + e2 under the old scheme where we didn't have a separate area in the activation records for temporaries. So we would generate a code for e1, and then we would save the results of e1 on the stack, and that would be done by saving the value of the accumulator under the stack and then we would have to adjust the stack pointer And then after we had evaluated the two, then we would load the results of e1 back into a temporary register, we could do the add And then we could pop the value off of the stack, the intermediate value off of the stack Down to the new scheme. Co-generations going to take a second argument saying what is the position of the next available temporary so what is the position of the next unused temporary inside of the activation record and so now we generate code for e1 and we pass along the argument okay because e1 may itself have some temporaries that it needs to store And, and then after you [inaudible] to evaluating, now we just do a direct store into, into the activation record at all set empty from the frame pointer And so now as we have to do in store, we have to save e1 in the activation record so we have it for later on but we have to do any manipulation of the stacks. So, we could place two instructions here by one And then we generate code for e2 but now. We just save the temporary value at position at, at all set empty from the frame pointer so the next available temporary would be of address empty o r offset, excuse me, nt+4 And then after each was evaluating, now we have to load the value of e1 back into a temporary and again that was all set NT from the frame pointer of the current activation record and then we can do the add and once again we saved the manipulation of the stack pointers. So this code sequence here is two instructions shorter than the one we had before and this actually substantially more efficient.