Greetings comrades!
[laughter]
We're talking about JavaScript. There actually won't be a lot of JavaScript tonight. There
is a whole lot of back story that I think we need to get to before we get to JavaScript,
so tonight is going to be Volume One: The Early Years. I'll start with some of my own
history, but first we'll start with Woody Allen.
In 1969 when I was in high school, Woody Allen made a movie called Take the Money and Run.
In the movie he plays Virgil Starkwell, and in this scene he's interviewing with an accounting
company for a job, something for which he is completely unqualified. The interviewer
asks him: 'do you have any experience running high-speed automatic digital computers?' And
Virgil answers 'yes, my aunt has one.' He says 'what does your aunt do?' and he says
'I can't recall.' It's funny that a completely nerdy young man would have less experience
with computers than his aunt, it's just ridiculous.
Except when the movie came out it was funny for a completely different reason. The reason
it was funny in 1969 was that computers were incredibly expensive at that time, they cost
millions of dollars, they required a room as big as a house, literally, with a raised
floor and special air-conditioning and fire suppression systems. They required a lot of
people to operate them and to manage them, and to perform maintenance on them. They took
a lot of power to operate. It's sort of like the expense of running a colo for one CPU.
So computers would be only available to very large corporations, large government agencies,
very well endowed Universities, and nobody else. It was completely impossible that anybody's
aunt would have one. So in order to understand this story, in order to understand why it's
funny, you need to understand the context behind it, and that's what we're going to
be doing tonight.
As far as my own story, I wanted to have a computer but they were just not available.
I only knew two people who actually had them. One of them was Napoleon Solo in The Man From
U.N.C.L.E.; he was able to have one because he worked for the United Network Command for
Law and Enforcement and they had a lot of money, so he had a really nice looking computer
there. The other person was Batman.
[laughter]
He had his Bat-computer, and that was because he had access to the wealth of millionaire
Bruce Wayne. I didn't have resources like that, so I decided I was going to build my
own. At that time I had absolutely no idea what computers were or what they did, or why
I wanted it, I just knew that I wanted it. I couldn't even identify all the components
of one except that I knew they had consoles that had lots of lights and buttons on them.
I thought, I'll start with that: I'll make a console and then I'll work out the rest
of it.
I found some pieces of particle board and a saw and I sketched out what it was going
to look like, and started sawing. I sawed, and sawed, and sawed. The particle board was
really, really hard, and the saw was really, really dull. I sawed for what must have been
at least two minutes, and then I gave up. OK, I'm not going to do that. So I probably
went into the house and watched television after that. At that time, even at that tender
age, it was already obvious that I was going to be a software guy.
[laughter]
Having established my credentials, my qualifications for giving this talk, we will now proceed.
There's a lot of history I'm going to give you tonight, and I think it's really interesting
stuff. I'm going to spend this time with you today because I think it's important that
you know it too. Most of us are not aware of the history of our own field, and you need
to be, because there's a lot of rich material here. But there are a couple of recurring
themes that I'm going to identify.
The first is that people who should be the first to recognize the value of an innovation
are often the last, and we're going to see lots of cases where this has occurred. Obsolete
technologies fade away very slowly. We like to think that innovation causes everything
to move over; it doesn't. Sometimes we step forward, and sometimes we step backwards,
and sometimes we step both ways at the same time. Sometimes stepping backwards is the
right thing to do, sometimes it's a bad thing to do, and in the moment we never seem to know
the difference.
There's also a myth of inevitability, that the reason things are the way they are is
because they had to be as a consequence of everything that happened before, the reason
we have the things that we have is that's just the best way that it could have worked
out, and that's absolutely not the case. I hope to demonstrate that as well. So I'm going
to be weaving some threads together.
Let's start with the Jacquard Loom. Joseph Marie Jacquard perfected the automatic
loom in 1801, and this is what it looked like. He adapted player piano technology to automate
the operation of a loom, so each card represents the pattern on one row of the thing that's
being woven. In the holes and spaces, instead of causing hammers to go down onto strings
and back up again, instead it controlled the movement of threads within the loom. He didn't
do the continuous roll that the player pianos did because that was just way too difficult
to edit. Instead he had each row as a separate unit and he would then sew them together,
so it was a much easier thing to create.
It was an extremely effective device: a weaver using a Jacquard Loom could out-perform
a master weaver and an apprentice on a draw loom by over an order of magnitude in efficiency, which is
just an amazing difference. The Jacquard Loom became very successful, and these principles
were applied to other forms of automation.
Whenever you have a big breakthrough like that, there are always other things that get
invented as a consequence, and one of the big ones was industrial sabotage. Because
it turned out the weavers didn't like this; he was completely upsetting the way that
they used to live. So they went around and would find Jacquard Looms and destroy them.
That's what they did. Now, there were some people that did identify other uses for this.
For example, Babbage and Lovelace recognized the potential of using punched cards for moving
data in and out of their computing engines. Unfortunately Babbage never finished his machine,
so that's sort of a dead end.
The story picks up with the Hollerith Card Tabulating Machine in 1890. The United States
Constitution requires a census every ten years in which we go and count all the citizens,
and we use that information to determine the composition of the House of Representatives.
The country had gotten so big by the late 19th century that it was becoming more and
more difficult to count how many of us there are. Herman Hollerith invented a machine and
showed it to the Census Bureau and they accepted his bid.
This is how the 1890 census was done. A clerk would take a questionnaire, which you can
see in his left hand, and copy it onto a punch card using this pantograph punch machine.
We'll zoom in on that card. What a card was, in a Hollerith system, was a set of field
sets, each field set containing a number of radio buttons. The radio button would be on
if it had a hole punched in it, and it would be off if it didn't. In that way, he can code
a lot of information on one card, and do it very efficiently.
Then he had his Tabulating Machine, which was in two parts. There was the main part,
which had the dials that did the counting, and then the sorting cabinet. The operator
would take a card, put it in the machine, and push down an array of pins. If the pins
went through a hole they would touch a pool of mercury completing a circuit, which would
then cause one of the dials to advance and would also open up one of the drawers on the
sorting machine, and then you'd drop the card in and get the next card and do it again.
It seems really tedious, but it was a whole lot better than the previous system which
was all done with paper and pen. The census was successful, and then Hollerith went on
with other inventors as well, to apply this to business.
Well into the '70s, IBM and other companies were still operating this kind of equipment.
It's very, very successful. This is an accounting machine. It did everything that the previous
machine did, but it's more automated in that you can put a whole deck of cards into it
and run them all through at once, you didn't have to have someone feeding it one card at
a time. Here's another example of its operation.
This is a card. In fact, this is a card, this is what they looked like. The form factor
of the card was based on the size of the dollar bill at the time. Hollerith chose that because
at the time there were lots of off-the-shelf stacking trays that you could just put these
into, so that reduced some of his engineering cost. In a later phase of the card, the way
information was stored on it was re-organized. Instead of the random set of fieldsets we
saw before, it became a simpler column-and-row situation. There are eighty columns on the
card, each column containing potentially twelve punches.
Has anyone ever heard of the eighty-character limit? We still have that today, and you probably
wondered where that came from. Why eighty, why an eighty limit? This is the limit. You
can't put more than eighty characters on one of these cards. This has been obsolete for
a long, long, long time, but we still have the eighty character limit.
This card shows the Hollerith code. The first ten punches show how you encode a number.
Digit zero is the zero punch, one is the one punch, and so on. To do letters using what
is called the Hollerith code, you take one punch from the top three (that's sometimes
called the zone) and then one punch of the lower nine. Three times nine is twenty-seven,
there are twenty-six letters, so it just fits nicely. There's one code left over, and the
code that was left over was the zero one punch. It was decided not to use that one, because
you had two punches that were next to each other and the machines could sometimes get
a little brutal and tear away that little piece of paper and damage the card.
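A small JavaScript sketch of the letter mapping just described, purely as an illustration (the zone rows are the ones conventionally labeled 12, 11, and 0 in Hollerith's scheme):

    // One zone punch (row 12, 11, or 0) plus one digit punch (1-9) per letter,
    // with the 0-1 combination left unused, exactly as described above.
    function hollerithLetter(ch) {
      const n = ch.toUpperCase().charCodeAt(0) - 65;  // A = 0 ... Z = 25
      if (n < 0 || n > 25) {
        throw new Error("letters only");
      }
      if (n < 9) {
        return { zone: 12, digit: n + 1 };            // A-I -> 12-1 .. 12-9
      }
      if (n < 18) {
        return { zone: 11, digit: n - 8 };            // J-R -> 11-1 .. 11-9
      }
      return { zone: 0, digit: n - 16 };              // S-Z -> 0-2 .. 0-9
    }

    hollerithLetter("A");  // { zone: 12, digit: 1 }
    hollerithLetter("S");  // { zone: 0, digit: 2 } - the 0-1 slot is the one left unused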
The use of these machines to run business was called Unit Record Management, and it
was really, really successful. This is what allowed the modern corporation to evolve.
Without this kind of data processing equipment, corporations just could not have become big;
they only could have become as big as their bookkeepers could have managed. And these cards were used
for everything. You'd have a card representing an account, a customer, an order, an invoice,
a payment, a personnel record, a detail item in any of those things. In some cases they
were sent directly to the people at home. For example, you would get a bill from a company
and they would send along a card, and you would send the card back with payment, and
when it was received an operator would punch how much you paid onto the card and then put
it back into the system.
The problem with these is that they're really fragile, and you're giving this important
data record to your customer or sending it through the mail and then getting it back,
so they very often came with instructions on them such as 'do not fold, spindle, or
mutilate' telling people not to mess these up, because if they did they could make life
extremely difficult for the operators.
This is a keypunch machine, and this is a more modern way of making cards. IBM built
these well into the '70s, I think. In fact, my very first experience in programming was
with one of these, in the basement of a library in San Francisco State University. The way
you managed a program on these things was your program would be a deck of cards, one
card per statement or line of your program, and if you wanted to modify your program you
had to go through the deck, find the card you wanted to change, pull it out, put it
in the machine, dupe it, modify it, replace it, put it back in, and so on. If you wanted
to rearrange lines in your program you actually pulled cards out and reordered them.
It was an extremely fragile way of putting programs together, and over time you learned
some gymnastic tricks to try to make it easier. For example, if you wanted to take a card
and make another one in which you deleted some of the characters, the way you do that…
First, there are two card stations on the machine. There's the read station and the
write station, so you're always punching at the write station. If there's a card in the
read station and you push the dupe button then it reads whatever's at that column and
punches it on the next one and advances both cards. If you want to do a deletion you hold
your thumb down on the card in the read station so it can't advance, and push spacebar a couple
of times to advance the other one. Really nasty stuff, but that's how you did it.
So the punch card was an amazing device. It served the purpose of memory, which eventually
got replaced with RAM and core. It was storage, which eventually got replaced with disk. It
was archive; if you wanted to keep something for a long time you would send a box of cards
to the salt mine and they'd keep it there. Eventually we found better ways of doing that,
but I'm sure that deep underground somewhere you can find lots and lots of punch cards.
It was a network. If you were at the field office and needed to get some records back
to headquarters, you'd take a deck of cards and put them on a train, and they would get
sent back. Eventually we figured out we could use wires to do that, and it got a lot better,
but for a time the way you did that was you'd mail a box of cards.
The last of the functions to finally get replaced was user interface. Cards were used for user
interface long after these other functions went away. You would have thought that would
be the first to go away because that's the thing it does worst, but it actually happened
in the other order.
The counting machines were programmable, and they were programmed in a data flow sort of
way. You'd have a bunch of data sources which could be columns on cards, and then you could
direct them to registers and to calculation units and to sinks like the card punch or
the printer. Your program would be on a punch card or a punch board, and you could replace
boards and that would change the program in the machine. These were invented fairly early
in the 20th century and remained current for a long, long time. For a long time this was
how you did programming. You've heard of spaghetti code? This is where it was invented.
Eventually these Unit Record Machines were replaced by mainframes, by digital computers.
They came online after World War II. There was a lot of research during the war in cryptography
and weapons development, and when the war was over a lot of that stuff spilled out into
the commercial sector. A surprisingly large number of companies started building computers.
It was really obvious that that was the way a lot of things were going to be done going
forward. Even so, these machines started coming online publicly in the late '40s, early '50s,
well into the '60s, and record machines continued to work into the '70s. So just because the
good new technology is available doesn't mean it immediately displaces the crappy old technology.
These computers were based on the Stored Program concept which said that instead of having
a plugboard, or some other external programming source, the program is stored in the same
memory as the data, and there are going to be some really interesting implications for
that. The chief one was that over time, the program may modify itself in order to change
or improve its behavior, and eventually, after a large enough series of modifications, the
program will become intelligent and perhaps even conscious, and eventually become our
masters. And that would be a good thing. So there's a lot of research into artificial
intelligence to try to bring that about. Unfortunately it didn't come about because the way our brains
work is just way harder than we can imagine. You'd think if our brains worked right we
would be able to imagine how we work, but we don't.
Instead we had to program them in a different way, and we came up with assembly language.
First we had to use machine codes, where there would be a digit for each thing that the machine
knew how to do and a bunch of digits for each cell in memory. That was just way too hard
to organize, so the first software tool, the first program to make programming easier,
was the assembler using something called assembly language.
We don't know why it's called assembly language. The word assembly doesn't make any sense there.
From what I've been able to figure out, the early programs did a lot of things. They would
do things that we now call linkers and binders and loaders and other things; those all happened
in one program. Eventually those features got teased out into other applications, and
the word assembly was left for the one thing it did that had nothing to do with assembling,
but we still call it that.
Here we've got a hypothetical machine. In the left column we have statement labels.
We're going to load the accumulator with whatever word is at the INTERX variable. We'll then
subtract from that the variable called COUNT4. We'll then skip if the result was zero. If
we didn't skip we will jump to ABORT27, and if we did skip we will jump to a subroutine
called CALCKHJ. That JSR is probably the most important instruction in the machine. It was
recognized very early on that the set of opcodes that the machine provides is
never going to be adequate for all the things we want to do, so we want to be able to create
our own opcodes, and that's what the jump to subroutine did. It would jump to a piece
of code and when that piece of code was finished, it would then jump back.
And there were lots of different ways that a machine could do that. One was that it would
remember the address of that instruction and put that in a register someplace, so when
we came back we could use that register to find out where the program resumes. Another
way it could be done is that the program modifies itself. It will take a location in memory
and change it to be a jump instruction to the place where we want to resume, and then
when the subroutine is done it will jump to that instruction. Much later, the stack was
discovered in which we have a place in memory where we can keep track of those addresses,
and it's much more convenient. So we see a lot of that in modern machines, but it came
much later.
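A toy JavaScript sketch of that third approach, just to make the idea concrete (the numbers stand in for instruction addresses; none of this comes from the talk):

    // Calling a subroutine pushes the address to resume at; returning pops it,
    // so subroutines can call other subroutines to any depth.
    const returnStack = [];

    function jsr(currentAddress, subroutineAddress) {
      returnStack.push(currentAddress + 1);  // remember where to come back to
      return subroutineAddress;              // jump to the subroutine
    }

    function rts() {
      return returnStack.pop();              // resume at the saved address
    }

    let pc = jsr(4, 100);  // call the subroutine at "address" 100 from instruction 4
    // ... the subroutine runs, perhaps calling further subroutines ...
    pc = rts();            // pc is 5 again: the instruction after the call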
All throughout the mainframe period we saw enormous architectural variety. There were
a lot of really clever people building machines. Often you'd have multiple architectures within
each manufacturing company. I don't know how many different models of computers IBM made,
but during the early years there were a lot of them, and there were lots of smaller competitors
who had at least as many, sometimes more.
They varied on things like word size, number types, would they use signed magnitude or
ones complement or twos complement, how many registers they would have, whether they were
special purpose or general purpose. They might have base registers or index registers; enormous
variety in instruction sets. It was an amazing period, and all machine designers were learning
from each other. There was brilliant, brilliant work done for many, many years, basically
trying to drive price performance. These machines were extremely difficult to make,
so they were trying to figure out the best way of putting them together
so that you could get the most work out of them.
Here's an example of a mainframe. In the front left we've got the disk drives. They probably
couldn't contain as much information as whatever you've got in your pocket right now, but that
was the main online storage in its day. Behind them we've got the punch card equipment:
the punch card reader, the punch card punch. Behind that we've got the tape drives, and
way in the back we've got the memory cabinet. You might have 8k or so in a box about the
size of a refrigerator, and as many of those as you could afford was how much memory your
computer had. That in the middle there is the console, which is where the operator works.
Here's another IBM computer. I mean, check out the console. The console has lots and
lots of lights on it, and lots of switches and buttons and knobs. That was the thing
I was hoping to build. And if you were a programmer, that's really where you wanted to be working
because sitting there you could see the contents of every register, if you could read binary.
They had a light for each bit, and if you could work that out then you could see exactly
what was happening in the machine. You could single step the machine, there's a full debugger
there.
The problem was that they never let programmers in the room because they didn't trust them.
Also, the machine time was just so expensive. You had to justify the cost of this extremely
expensive machine and plant that you just couldn't afford to have the downtime that
a programmer would have, sitting there trying to single step through his program. Also notice
the great Mad Men fashions that were in vogue at the time.
This was maybe my favorite machine of that era: the Control Data 6600. Designed
by Seymour Cray, it was for a time the fastest computer in the world. The thing I liked most
about it was the console. Instead of all the lights and buttons, it had a simple keyboard
and two round CRTs, so it could do real time displays. Really, really nice looking machine.
It was too expensive to let programmers sit down at the console and do work, so the way
most programmers worked was in batch mode, where you would take your job and make it
in the form of a deck of cards. The first card would be the job card which identified
what you were doing. You might have an account card, and then a card to tell the operator
what tapes to mount, and then a card indicating that you wanted the FORTRAN compiler. Then
you'd have your FORTRAN program, and then an end of file card, and then your data, and
then another end of file card, and an end of job card.
You'd take all of that and you'd put it in a tray, and then you'd wait a couple of hours.
Eventually a number of jobs would get put into the tray and an operator would get around
to taking them all out and taking the rubber bands off and putting them in the card reader,
and they'd all get read into disk. Then they would take the jobs one by one, or sometimes
several if it was a multi-processing machine, and run them. The results would go to a line
printer and then the operator would take all the decks and put all the rubber bands back
on and match them with the print outs and put them in a bin. So you come back a couple
of hours later and pull the thing out and find out you're missing a comma. You go OK,
and go back to the keypunch machine and fix that comma and submit it, and then next day
you found out that you missed another comma. It was a really unproductive way of getting
things done. They call the process 'submission', when you would submit a job, and it was submit
in both senses.
There was an ideal way that the analysts thought that this process would work. First the analyst
would write the specifications and draw the flow charts that describe the application.
Then the programmer would code a program, probably into assembly language, based on
the flow charts. He would hand his coding pages to a keypuncher. The keypuncher would
then sit at the keypunch machine and punch them in. Just in case the keypuncher made
a mistake, they would take that deck and the coding forms and give it to a second keypuncher
who's working at a slightly modified keypunch machine called a verifier machine, re-type
everything, and if any character mismatches the card is destroyed, and then it has to
be re-punched. Then assuming it gets through that process, it's given to the operator,
and then the operator will run it. If there's a bug then you call a meeting, because nobody
is in charge of the whole thing. I can't imagine that this ever worked, but this was the official
way that it was documented.
So what's a bug? Bugs, as far as I can tell, were invented by Thomas Edison. He invented
a lot of other stuff, but he also invented the bug, and this is the documentation. The
story is from the Pall Mall Gazette in 1889: "Mr. Edison, I was informed, had been up the
two previous nights discovering a bug in his phonograph.” His phonograph was a device
which would record sound and recover sound from a cylinder and a stylus, so the friction
against the stylus would either create grooves or produce sounds as it followed the groove.
I suspect that Mr. Edison's machine had a chirp in it that sounded something like crickets
or something, and he couldn't figure out where the noise was coming from, so it was bugs.
It was sort of a standing American joke for a long time about the crazy inventor who will
become wealthy once he can get the bugs out of his invention.
A real bug was discovered by Grace Hopper. During World War II she was working on ballistics
tables for the military and one day her calculator stopped working. They opened up panel F and
they found a moth smashed in a relay. She pulled it out and put it in her notebook with
the notation: "first actual case of bug being found.” Her notebook is now in the Smithsonian.
We'll get back to Grace in a little while.
The batch mode was not good for programmers. It was designed specifically to try to optimize
the use of machine time, not to optimize the use of human time. So another mode was developed
called timesharing, in which you'd have lots of users who could use the machine simultaneously.
Each would get a fraction of the resources of the machine, but if the applications are
interactive enough then each person gets the appearance that they've got the use of a whole
machine. That turns out to be much, much better.
The way that you accessed it was through a device that was a whole lot less interesting
than the console, but was good enough, and that was the Teletype machine. This is a model
33 Teletype. It's an uppercase only system, and it can work online and offline. Offline
means that instead of sending characters to a computer, you're sending them to its local
paper tape punch. Sometimes online time was too expensive, so you could type your program
in offline and then when it was ready you would log in and then have it read your paper
tape, and that way reduce your connect time charges.
Now, program preparation on paper tape is even harder than on punch cards because if
you make a mistake you can't throw that card away and replace it, it's one continuous band.
One affordance that it did provide was that there's a backspace button on the punch, and
when you push that the punch would go back one character. You could then push the delete
button, and the code for delete is all ones, so it would go 'jink' and completely punch out whatever
was in that column. The convention was that if the mainframe saw a code which was all
punched out, that meant that you had deleted that code and that it should just ignore it.
So if you're ever wondering why your terminal has both a backspace and a delete button,
this is why.
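A tiny JavaScript sketch of that convention, just as an illustration: the all-ones code is 127, which is exactly ASCII DEL, and the reader simply skips it.

    const DEL = 0x7f;  // every bit punched out

    function stripRubouts(codes) {
      return codes.filter((c) => c !== DEL);
    }

    // "COT", where a mistyped character was backspaced over and rubbed out:
    stripRubouts([0x43, DEL, 0x4f, 0x54]);  // [0x43, 0x4f, 0x54]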
The Teletype was really slow. It printed at ten characters a second, so you had to be
pretty economical in terms of what kind of information you wanted to give to the user.
In terms of accessibility, this is probably the best system we ever had; one that gave
the best parity between sighted people and blind people. You could take a voice synthesizer,
like the Votrax, and put it on the line between the computer and the terminal, and it will
say the name of every character that comes down the wire. So a blind person can be aware
of everything that comes out, and gets exactly the same information that a sighted person
would. In the years since we've made lots of advancements in terms of the way you can use
machines which have all tended to work very badly against blind people. So everything
I'm about to say after this point works against them.
The character set used by the terminal was ASCII. In fact, for the model 33 it was what
I called half-ASCII, because it was only upper case. Eventually machines allowed us to do
lower case, and the ASCII set recognized that. It contained 128 characters, which was just
enough to do English. As a typewriter replacement, it had pretty much all the keys and characters
that a typewriter would have. For people with other languages, though, it was not adequate.
For people in other countries using other languages, they would replace some characters,
and that made it very difficult for doing interoperation between one country and another.
Also, for Asian countries, the seven bit thing didn't work at all, so they had to come up
with double byte character sets, which made things even more difficult.
Finally, that was solved with UNICODE. UNICODE attempted to take all of the national character
sets and combine them into one character set; a really brilliant thing. Then later Thompson
gave us UTF-8, which was an 8 bit encoding which was ideal for devices like Teletypes
and everything else we did. Today we've got UTF-8, which should be the one way that all
characters are transmitted on the network, but just because we have the best possible
one way doesn't mean that everybody's doing that yet.
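As a quick illustration of why UTF-8 fits so naturally on top of the old seven bit world, here is what it does to a short string (a sketch using the standard TextEncoder API; the sample text is my own):

    // ASCII characters stay one byte each; everything else becomes a multi-byte
    // sequence, so plain old seven bit text is already valid UTF-8.
    const bytes = new TextEncoder().encode("A\u00fc\u20ac");  // "A", "ü", "€"
    const hex = Array.from(bytes, (b) => b.toString(16));
    // ["41", "c3", "bc", "e2", "82", "ac"] - one byte, two bytes, three bytes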
One thing that is odd about ASCII is that it has a carriage return character and a line
feed character. This was to model the way that Teletypes actually worked, where the
carriage return character would take the print element and push it over to the left. The
line feed character would take the platen and spin it one line. So most lines are going
to end with going back and rolling the paper, and it took two separate codes to do that.
Most timesharing systems didn't require people to type in both codes; generally they would
allow people to hit the return key, and then they would echo the line feed, just because
there's no reason to make people type both characters. Also, other devices don't work
that way. Most other printers of the time would just take a line of text and print it
and advance; there was no way to separate the carriage return from the line feed function.
So this was a pretty device specific thing.
Most systems that adopted ASCII as their character set chose one or the other. The systems that
tended to be more hardware focused in their orientation tended to pick line feed, and
the systems that tended to be more human focused tended to pick carriage return, and that was
fine until they needed to interoperate. Then you'd have a committee of people, some using
line feed, some using carriage return ó how do you resolve that? You could just pick one.
You could even flip a coin, because it really doesn't matter. But these committees could
not decide. Nobody wanted to be the guy who got it wrong, and nobody wanted to be the
guy who had to change, so they came up with a mutually disagreeable compromise, which
is: We will always require both. So that's the way the internet protocols work. We haven't
been using Teletype machines in I don't know how many years (they're decades obsolete),
but we're still forcing both sets of control codes to be transmitted in HTTP because of
this Teletype heritage.
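That compromise is easy to see if you ever assemble a request by hand; a minimal sketch (the host name is made up):

    const CRLF = "\r\n";  // both control codes, always, as the committees decided
    const request =
      "GET / HTTP/1.1" + CRLF +
      "Host: example.com" + CRLF +
      CRLF;               // a blank line, also CR LF, ends the headers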
I would get into arguments with guys in the basement at the Computer Center. On our campus
we had Teletype machines, so we could do timesharing, and we also had the batch system, and we would
argue about which was better. It's obvious that the timesharing system was better. It
was designed to use human time more effectively, and it was just the right thing to do. In
effect, everything we're doing today looks much more like timesharing than it does like
batch, so history bears us out. But there were people there who were vigorously arguing
that batch mode was the right way to do it, that timesharing was a fad… I'm trying
to think of what their arguments were. They didn't make any sense at all. The main one
came down to discipline, that the discipline that batch mode required, where you had to
think the whole thing through and submit flawless programs to the computer because if it was
buggy you would never, ever get it to work. That call to discipline was the biggest advantage
of batch mode.
It was another example of where you have a technology that was developed by programmers
for programmers, and there were programmers who were rejecting it and thought that they
were well reasoned in their rejection. What it really was, when you scrape it all the
way down, was that while they intellectually understood what timesharing did, they had never tried
it and so they never understood it. They assumed that they were being very successful
in their current endeavors without ever having understood it; therefore it was not important
to understand it; therefore they could reject out of hand any argument that required
that understanding. I continue to see that happening over and over again through pretty
much everything that we do.
One of the big benefits of timesharing was that it provided the first social network.
All of these innovations happened first in timesharing: file sharing, email, distributed
computing, computing as a service, chat, blogs, open source development. That all happened
on the mainframes a long time ago. We think this is all fairly current stuff. In a few
minutes I'll show you why we think it's current stuff, but this stuff all happened back in
the '60s and '70s. We had games on the mainframes, both single player and multi-player games.
It turns out that games are a really important place for technology development. There's
some really good work in terms of user interface design, program construction, algorithm development,
that was all motivated by games.
Then finally, security. Timesharing systems had a huge security problem because they had
lots of people running programs in the same memory, and integrity demanded that they be
able to keep all of that stuff separate and not interfere with each other. So there was
a lot of work in that era to try to figure out how to do that, and then to try to do
the even harder thing after that which was to allow those programs to sometimes cooperate,
because we started to identify the need for collaborative applications. The timesharing
machines were just starting to figure that stuff out when they were destroyed.
One of the other things that happens in timesharing is that you need an editor, and a paper tape
editor doesn't make it. You need to be able to edit online. You can't do what you did
with cards (take a card out, change it and put it back), so they wrote programs which
allowed you to do that, where you could load a file and then go to a particular line in
the file, replace that line, insert some more lines after that line, and so on. So almost
every system had an edit program in it; it might have been called Ed or QED or some
variation on that, but everyone had one. At MIT they called it TECO. They then figured
out how to add keyboard macros to it, and that became EMACS. VI also came out of ed.
These text editors are still in wide use, still very popular, but they are dinosaurs
left over from the timesharing era.
The next step was replacing the Teletype with CRT terminals. CRT terminals were eventually
much cheaper, they use less paper, and eventually they allowed for onscreen editing in which
they could display a page of information and you could cursor around on it. For example,
this terminal had some arrow keys on some of the letters to help in designing software
that would do that. If any of you are VI users and have ever wondered how it could ever have possibly
made sense for H to go that way and L to go that way, this is where it happened. Again,
this is a timesharing era dinosaur which still exists in the current age.
Now, IBM was never able to get timesharing right. Timesharing requires that you be able
to switch from one process to another on a keystroke basis, and their software was just
not adequate to do it. Rather than fix their software architecture, they invented a new
piece of hardware that they called the 3270. At the time they called it an intelligent
terminal, but today we'd probably call it something else. The way the 3270 worked was
you would take a page of data and ship it down from the mainframe into the terminal, and
it would show up on the screen. Some of the screen will be full of characters which are
part of the display, and some of the characters are reserved as fields. So the user can then
type stuff directly into the field locally, then hit the submit button, and then all the
data in those fields gets sent back to the mainframe.
Does that sound at all familiar to anybody? Does that sound like a form application? This
is where that came from. When the World Wide Web came up there were a bunch of dinosaurs
who said oh yeah, I remember that, and that's how we got a lot of what we've got today.
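In today's terms the shape is familiar; here is a rough sketch of the same round trip (the field names are invented for illustration):

    // Fill in the reserved fields locally, with no traffic back to the
    // "mainframe", then send everything in a single submission.
    const fields = { name: "", quantity: "" };
    fields.name = "Ada";      // typed locally at the terminal
    fields.quantity = "3";

    function submit(f) {
      return JSON.stringify(f);  // one transmission carrying all the field data
    }

    submit(fields);  // '{"name":"Ada","quantity":"3"}'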
Now, while all of that was going on one of the smartest guys who's ever lived, Doug Engelbart,
was working at SRI on the Human Augmentation Project. Like me, early on he recognized the
potential of computers, but unlike me he was able to do some really, really important work
on it, which he demonstrated in 1968 at the Joint Computer Conference. It was the most
amazing demo anybody had ever seen. He demonstrated hypertext, he demonstrated onscreen displays,
he demonstrated groupware, he demonstrated video conferencing, to-do lists, outline processing.
It just goes on and on and on, all these things that he wasn't just theorizing, that he was
doing and showing live.
About the only thing anybody paid attention to was the mouse. He also invented the mouse,
and he demonstrated that as part of this demo. He also had a chorded keyboard where he had
five keys that he could play like a piano and do keys very quickly that way. He had
five buttons here, three on the mouse, and he could type ASCII with both hands while
he was moving. His theory was: let me take a couple of hours to train somebody in the
system, and I can allow them to do amazing things and be incredibly effective. The world
decided it didn't want to work that hard, but it's just amazing what he did. His lab
was one of the first two sites on the ARPANET, which eventually grew to consume all of the
networks of the world.
At the time he was doing this, everybody else was on punch cards. You just can't imagine
what a profound shock this was to see him showing the future in San Francisco like that.
I highly recommend you see it ó it's available on YouTube. Go search for Doug Engelbart,
The Mother of All Demos. It's out there, and it's just amazing.
We still have not caught up to all of Engelbart's vision. What a number of people have done
over time is take some little bit of what Engelbart was doing but hasn't been fully
adopted yet and work on that. Some people have gotten rich and famous doing that. And
there's still a lot that Engelbart was doing that we haven't caught up with yet. You can
do that, too. I highly, highly recommend that you check out Engelbart.
OK, next is minicomputers. There were a number of developments that allowed for repackaging
some of the stuff that had been in the mainframes into a much smaller, less expensive form factor,
and these became minicomputers. A whole new class of companies started making these, companies
like Digital Equipment and Data General and Basic Four, a bunch of them, and created many
new markets for computers. In some cases they went into companies which already had mainframes
but there would be operations within them who found that the data processing departments
were not responsive to them.
When they first got the computers it was 'great, now we can do things because we have computers',
and then a surprisingly short time after that it's 'we can't do these things
because we have computers'. So people would try to get around the system by finding some
cheap box that they could put in their own department. In some cases they would end up
in small businesses and in small colleges and places that formerly hadn't been able
to afford computers at all. They started to show up in places that were new. Again, we
saw an explosion in CPU architecture, an amazing amount of creativity in the ways that the designers
and engineers came up with to get work out of these amazing little machines.
The next step was microcomputers. This began in a collaborative project between a memory
startup called Intel and a company that was making intelligent terminals, similar to the
ones we saw earlier, called Datapoint. Datapoint, at that time, was making their terminals completely
out of discrete components, and they were kind of expensive and they also got really
hot, because all of those components created a lot of heat. So they came up with a design
for a little CPU, and they figured if they had that CPU they could reduce the part count
on their terminal significantly, make it a lot cheaper, and make it work better. You'd
run a little program inside that little chip that would look at the keyboard, look to see
if anything was being pressed, and look at the serial port and see if any characters
were coming in, and based on what it was finding it would cause things to happen and then put
them on the screen or send them on the wire.
So Intel developed a device called the 8008, and Datapoint was very successful with that.
They also sold it to the public, and they were also very successful with that. It then
got improved into something called the 8080, and then another startup that spun out of
Intel called Zilog improved it again and called it the Z80.
In addition to that family, Motorola had the 6800, and there was another chip called the
6502 which was kind of based on that design but was much, much cheaper. So we started
seeing an explosion in 8 bit CPU architecture. They went into all kinds of devices, including
into computers. The Apple II had a 6502 in it, and it was the Apple II that put an end
to timesharing because the economies of personal computing were just so overwhelmingly better
than what you could get with a timesharing system. There were some trade-offs in that
you didn't have access to the network anymore, but if all you wanted to do was compute some
models you could do it much cheaper on an Apple II than you could through a timesharing
bureau.
This is the register set for the Z80. It had several 8 bit registers: A, B, C, D, E, H,
and L. It had two 16 bit index registers: IX and IY. It had a stack pointer that was
also 16 bits, and a program counter. All Z80s had this, and it was a very nice way of writing
programs in assembly language. To back up just a little bit… No, I'm not going
to back up, I'll do that later.
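Just to picture it as data, here is that register set written out as a plain JavaScript object; a description of what the talk lists, not an emulator:

    const z80 = {
      a: 0, b: 0, c: 0, d: 0, e: 0, h: 0, l: 0,  // the 8 bit registers
      ix: 0, iy: 0,                              // the two 16 bit index registers
      sp: 0xffff,                                // the 16 bit stack pointer
      pc: 0x0000                                 // the program counter
    };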
There was then the 16 bit generation. Again, a number of companies came up with very interesting
designs. The Motorola 68000, Zilog the Z8000, National Semiconductor the 32000, which I
think was the best of that generation in terms of instruction-set elegance. If you were writing
a code generator, or if you had to write in assembly language (which you shouldn't have had to anymore
by this time, but if you had to), it was clearly the best thought out of all of them.
Intel went in a different direction than the others, though. They came up with an architecture
called the 432, which they called a micro mainframe, where they tried to take a whole
lot of the functionality that you would expect to find in an operating system and push it
all the way down into the silicon, so garbage collection would be happening in the CPU transparently.
They designed it to be programmed exclusively in high level languages, primarily Ada. They
took Ada, which was a language being developed for the Defense Department, and extended it
to make it object-oriented. So they had that support in their CPU. Very forward-looking
design.
It was one of these designs which just went wild with all the things that they could do,
and they never properly accounted for the cost of all the things that they were doing.
So the basic CPU ended up having to be split onto two chips because it was too big to put
on one, and it turned out to be really, really slow. So it was very expensive, very slow,
it turned out that people couldn't figure out how to write programs for it, and it was
a total disaster for Intel. It looked like they were going to miss out on the 16 bit
generation and they were probably going to go out of business completely. So they had
to very quickly figure out: How do we get into the 16 bit race having stumbled so badly
on the 432? They decided to go back to the 8080, which had been, and continued to be,
a big success for them, and try to capture the business of the 8 bits by making a new machine
that was assembly language compatible with the old one.
This is a contrast of the Z80 register set and the 8086 register set. Very, very similar.
They changed some of the names of things, but basically it's very easy to see the Z80
heritage in the 86 instruction set. So they very quickly threw this thing together and
tossed it out onto the market. They didn't design it to be good, they designed it to
be compatible. It turned out that compatibility didn't really matter. The thing that ultimately
sold it was that it was cheap, so it went into devices like this one: the IBM PC.
IBM had looked at what Apple was doing, the effect Apple was having on their mainframes,
and they decided they needed to get into the personal computer business. They built the
machine and called it 'the' personal computer, they sort of took over the space, and they
put Intel's chip in it. Then they went to a company that was best known for its crappy
basic interpreter, a company that knew nothing about operating systems, and got them to make
an operating system for them. That was MS-DOS, and that went into that machine. There were
a lot of other companies who also made similar machines, and most of them failed. The only
ones that succeeded were the ones who made machines that were exactly the same as this
machine, what were called clones. The clones set the new standard for cheap computers ever
since.
That was followed by another generation, the 32-bit generation. There were lots of really
elegant designs out there that were really good. Intel decided, again, to play compatibility,
so for the 386 they put in a mode which simply took each of the existing registers and changed
its size from 16 to 32 bits. This was done again in the 64-bit generation, AMD this
time doing the design. It took each of the Z80 registers and pushed them out to 64 bits.
Without question, the worst CPU architecture we have is the Intel architecture. Intel has
always been very much aware of this and embarrassed by it. It improved a little with each generation
(the 386 is significantly better than the 286), but still at its root there's
an 8080 in there, and there's just a lot of awfulness as a result of that. To manage its
embarrassment, Intel has pursued a lot of other architectures that were actually quite
elegant. There was the 960 which was really good, there was the 860 which was also very
good, and the Itanium. But the market said no, we don't want that, we want the bad stuff,
we want the compatible stuff.
And who is making those decisions? It's programmers. Programmers say no, we don't want the machine
that is best for programmers, we want the crappy one because that's what we're used
to. That's the way we do it. So even though we think we're very knowledgeable about the
work that we do, as a community we are historically quite bad at understanding what we do and
what we need in order to do what we do effectively.
One of the reasons why microprocessors ended up destroying the mainframes and the minicomputers
and eventually became everything was because of a prediction made by Gordon Moore, who
was at Intel. He hypothesized that the complexity for minimum component costs has increased
at roughly a factor of two per year, and he just assumed that that would go on, perhaps
at a slightly slower rate. He thought that it would go on for ten years. It's gone on
for forty years now. It's just amazing that for every two years we get a doubling in the
efficiency of semiconductors.
This prediction was called Moore's Law, and it has held for an amazingly long time and
is likely to continue to go for a while further. It's not really a law, it's a prediction that
became a self-fulfilling prophecy. If you're an engineer at Intel, you're shown a point
on his graph and told this is where you need to be in three years, come back when you can
hit that point. They have to do amazing superhero kinds of stunts in order to accomplish that
level of performance, and when they turn it in it's like yeah, OK, we knew you were going
to do that. It's nothing special. It seems to me pretty thankless to be doing that kind
of engineering at Intel. It can't hold forever. Everybody knows that eventually Moore's Law
is going to fail, but it's still holding. It's got a lot of life in it yet.
The other thing we've seen is an end to CPU innovation. We used to see a lot of really
radical new designs happening all the time, but we don't see that happening anymore. Basically
we've got three architectures that we use for most of our stuff: virtually all the computers
are on Intel, most of the game platforms are on Power PCs, most of the mobile devices are
on ARM, and that's it. Nobody's making new stuff, nothing radical, it's just refinements
of stuff that's been happening for several decades.
We're doing even worse in operating systems. It used to be that every model of every machine
had its own operating system, and that came with a lot of obvious inefficiency, so we've
pushed that down and now we have just two: we've got Unix which was developed in the
'70s, and we've got Windows that was developed in the '80s. Of the two, Unix is obviously
the better one, but there's no innovation happening in operating systems. Basically
we've been rewriting the same systems for 40 years. That's just not where we do innovation.
Where we do innovation is in programming languages, and that's been going on for quite a long
time. In the '50s, everything was assembly language unless it was still punchboards and plugboards,
which were still going on too. There was interest in research in automatic programming, because
the perception was that programming's just way too hard and we need to figure out a way
to make it easier, so we'll make it easier by having the computer do most of the work
for us. We'd already seen with assembly language a start to doing that, and we wanted to go
further so that instead of writing a program you instead tell the computer what the program's
supposed to do and then the computer will write the program for you. Brilliant, that
should be easy. There was a lot of work and experimentation on that, and the result of
that experimentation was called FORTRAN.
You might be looking at FORTRAN and thinking this kind of looks like a program, and in
fact it is. Automatic programming didn't work, because it turns out the description of a
program in sufficient detail to do what you intended to do is still a program. What they
succeeded in doing was raising the level of abstraction. Instead of dealing with memory
cells and opcodes we're now dealing with things which look more like the problem domain, so
you can be much more productive in this language which is a really good thing. But this doesn't
replace programming, it's just another kind of programming. We've seen this happen over
and over. Right now there's a lot of interesting work happening in domain-specific languages.
There are some theorists who think that working in those very specific languages you're not
really programming, but in fact you are. You're just programming at a different level; sometimes
a more appropriate, productive level, which is good.
This is a FORTRAN program. FORTRAN arrived in the late '50s. Here we've got a subroutine.
Subroutines are very similar to modern functions. FORTRAN did not allow for recursion, but in other
ways it's very much like our current functions. The if statement looks a little odd. What
it means is: if N is negative or zero, jump to statement 10; otherwise, if it's positive,
jump to statement 8. Eventually FORTRAN came up with a better way of writing if
statements, but even this if statement looks quite a lot like the C if statement, and that
similarity is not accidental.
It also has a do loop which allows you to do something a certain number of times. In
this case, we will iterate from here to statement 9, each time varying i from 1 to N. That's
how you read that statement. Inside the loop it's taking the ith element of the array. Square brackets
hadn't been introduced into mainframe character sets, so that character wasn't available, so they used
parentheses for subscripting. But they did use the asterisk for multiplication. I don't know
if you ever wondered why we do that, why we don't use an X or a dot or something else
instead. The reason is that the early mainframe character sets didn't have those characters
in them; they were designed for business applications, with character sets that
looked like typewriters, and so FORTRAN established the convention that you use the asterisk to mean
multiplication. And that's still the case in virtually all languages now.
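Translated very loosely into modern terms, the control flow being described looks something like this sketch; the names and the arithmetic are invented, only the shape matches the slide:

    function subroutineLikeTheSlide(data, n) {
      if (n <= 0) {                       // the "negative or zero" branch: statement 10
        return 0;
      }
      let total = 0;                      // the positive branch: statement 8
      for (let i = 1; i <= n; i += 1) {   // the DO loop, varying i from 1 to N
        total = total + data[i - 1] * 2;  // parentheses became brackets; * is still multiply
      }
      return total;                       // the loop ran "to statement 9" in the FORTRAN
    }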
Another language was COBOL. COBOL was developed by Grace Hopper, who you remember earlier
discovered the bug. COBOL was an attempt to make programs look more like English. At first
the hope was that anybody would be able to write business applications, but that turned
out not to be the case. Then there was a secondary hope that at least anybody ought to be able
to read one of these programs to understand what it does. This was particularly hoped
for by management, because management had little trust or understanding of what programmers
were doing, and the thought was that if they could read what they were doing then it would
be a little easier to keep control over the operation. But that didn't really work either,
because there's a lot of subtlety in programming in any language which is not readily apparent
to most people.
BASIC was a slightly later language. It was designed specifically for timesharing. It
was developed at Dartmouth College by Kemeny and Kurtz. They did a really clever thing:
they started with FORTRAN and stripped it down and stripped it down into the simplest
possible language so that anybody could use it without much training at all. It was very
quick to learn. They also came up with a clever way of editing programs. They came up with
the line number, so you give every statement a line number and if you want to change that
statement you simply type the line number again and the new statement that replaces
it. They used the same line number as the destination for jumps, so there's a certain
kind of economy there.
Here we have a Hello World game where at line 20 the program will print the string: 'What's
your name?' and then read from the terminal whatever you typed followed by carriage return.
BASIC, even though it was a really primitive language, had the best string processing and
the best text processing of any language in its generation. It hardly does anything, but
it just does the right things. It's got a way of representing a literal string, a way
of concatenating a few together, a way of teasing them apart, inputting them and printing
them out. That's all you need, and it did that. That's been followed by virtually every
language since then, and in that sense BASIC was incredibly influential.
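Here is a small JavaScript sketch of that editing model (not real BASIC; the statements are just strings): the program is a map from line number to statement, so retyping a number replaces that line, and the same numbers double as jump targets.

    const program = new Map();
    program.set(10, 'PRINT "HELLO"');
    program.set(20, 'INPUT "WHAT IS YOUR NAME"; N$');
    program.set(20, 'INPUT "WHAT IS YOUR NAME?"; N$');  // retyping line 20 replaces it

    // LIST: the statements come back in line-number order.
    for (const [line, statement] of [...program].sort((a, b) => a[0] - b[0])) {
      console.log(line, statement);
    }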
The other thing BASIC did was it sort of crystallized the input/output relationship. Here I want
to interact with the user so I will print and I will input, which means that my program
stops until the specific thing that I asked for is delivered by the user. So it's an extremely
modal thing, in that the operator has to somehow figure out how to convince the program to
get to the place where it's ready to ask for the thing that the user wants to tell it. Later we
discovered that this was a really bad way to write programs, but it took a long time
to figure that out.
BASIC influenced a number of other languages. There was Business BASIC that ran on the small
business minicomputers, which added database functionality to their file systems so you
could store values and retrieve values by keys into files and pull them out and do operations
on them. It was much more pleasant than COBOL for most people, and it was very cheap and
very popular. Microsoft was started on Microsoft BASIC, and that eventually evolved into Visual
BASIC which for a few years was the most popular programming language in the world, although
it has been replaced by another that we'll get to a little bit later.
A really important language which came out in 1960 was ALGOL 60. It is the best design
by committee in the history of programming languages. A bunch of really smart guys got
together and came up with a language for use in expressing algorithms for publication,
but while they were at it they also made it actually work in practice. So it defines a couple of languages, a reference language and a publication language, but it worked, and
it was popular within its sphere. There were a number of machines that were designed specifically
to use ALGOL as their basic language. It introduced the notion of structured programming and blocks.
We have blocks in modern programming languages, most of them use curly braces, and that came
from ALGOL. ALGOL used the words begin and end instead of curly braces because again,
curly braces weren't available at that time, they were invented later. But that's where
we got that stuff. It was an extremely important language, a very influential language, but
unfortunately there were lesser languages which tended to be much more popular.
One big debate that happened partially as a consequence of ALGOL was the structured
programming debate. Dijkstra wrote a famous letter entitled 'Go To Statement Considered Harmful',
and Dijkstra claimed that programs like this are just too hard to follow when they get
complicated. You got the things bouncing here, there, there, there, you can't keep track
of what the program's doing, it doesn't scale sufficiently well to allow us to write programs
of sufficient complexity, and that we would be better off if we simply stopped using GOTO
and used the other features that ALGOL had provided. It turned out he was right. But
at the time this was an extremely contentious idea, that programmers would have an easier
time managing the complexity of their programs if they don't use this feature.
Who was most enraged by this suggestion? Programmers. This debate went on for literally a decade,
for two decades, for a generation, arguing about whether GOTO should be eliminated or
not. Ultimately we got rid of it, and that was the right thing to do. I think it's not
coincidental that it took a generation to do it, because basically we had to come up
and train a whole new set of people who were not stuck in the previous idea. Again, who
better should have understood the value of structuring your programs in such a way that
they could scale better? Programmers, of all people, should have understood the value of that argument, and programmers were the least able to accept it.
From that generation, FORTRAN, COBOL and ALGOL, each of these languages was pretty specialized.
In particular, FORTRAN was intended just for scientific processing, and COBOL was intended
just for business processing, and there was interest in trying to make a common language
that could do both. At that time they didn't recognize that there were other things, as
well, that would actually dwarf both of those applications, but it's still early yet. There
was PL/1 developed at IBM, there was the Combined Programming Language that was developed in
England, ALGOL 68 that was developed in Europe; they all wanted to be the über-language
that would do everything. All these languages had some partial success, but none of them
fulfilled the promise of being the language that would bind us all.
In fact, there was a reaction after that to try to scale it back and come up with simpler
languages. There was a dialect of CPL called BCPL, the Basic Combined Programming Language, which was
a simplification similar to what had happened with BASIC and FORTRAN. Strip it down, strip
it down, make the simplest possible language that works, and then you've got a language
that works. BCPL was very successful within its niche, and we'll see more of that in a
moment.
Then the designers of ALGOL looked at what was happening in ALGOL 68 with horror and said
no, that's not the way. ALGOL actually got it right, and there are some who consider
ALGOL an improvement on most of its successors, which in fact was true.
Taking that approach, Wirth came up with Pascal, which was extremely popular. He designed it
as a teaching language, but a lot of people put it to work as a general programming language.
Unfortunately there were a couple of significant design problems which interfered with its
larger mission. One was that it wasn't modular enough, so it assumed that a whole program
was one unit, and that turned out not to work practically. A bigger problem was its type
system. Types were intended to make programming easier, but in this case they made programming significantly harder, because the dimension of an array, the number of elements in it,
was considered part of its type. So if you wanted to write a function that could deal
with an array, it could only deal with arrays of one fixed size, and that turned out not
to work very well.
BCPL inspired Ken Thompson to make another language called B, which basically took the
good ideas that were in BCPL but gave them FORTRAN syntax, which wasn't necessarily an improvement. But he did, so that's what it got. Then Dennis Ritchie took B, which was mainly a typeless language, took some of the good ideas in Pascal, being more selective about what he borrowed from its type system, and made C. C was incredibly popular, and it's become the most important implementation language of all. Virtually all languages since then are either based on C or are implemented in C. C has been
an extremely successful language.
While all this is going on, there's still assembly language happening. Again, there's the debate: which should we be using, assembly language or high level languages?
There were people arguing on both sides: that the high level languages make you much more productive, that the number of lines of code you can write in a day is pretty much constant, and if you're writing in a high level language those lines get more work done than lines of assembly language. That's the basic argument for the high level languages. The
argument for assembly language was… I don't know. There wasn't a good reason,
but they'd argue about it, and they'd argue on and on and on.
Now, it turned out there was a good reason for assembly language: you get systems like
this. This is the Atari 2600, the VCS Video Computer System. This is the first computer
most people had in their house. It had a 6502 in it, it was really cheap, and it ran games
that you could play on your TV set. It was impossible to program this machine in any
high level language, you had to be working in assembly language, and I'll show you why.
The machine contained a 6502 CPU. Actually, 6507, but it's the same instruction set, which
had a very small number of registers. It had an 8 bit accumulator, two 8 bit index registers,
an 8 bit stack pointer, a flags register, and a 16 bit program counter. That was it,
that's all the registers you'd get.
There's no code generator that knows how to write efficient code with that. Worse than
that, there was no software or firmware built into the console, so the only code there that
could run was what was supplied on the game cartridge. People would go to Sears, buy the cartridge, plug it in, and play. The cartridge had 4K in it, and that's not a misprint. 4000 characters were on the cartridge, so all of your program, all of your static data, all of your visuals, bitmaps, text, everything had to fit in that 4K. Later they came up with an 8K cartridge, but 8K isn't enough more than 4K to make writing this in C worth thinking about.
In addition, the console has some RAM in it. It has 128 bytes of RAM. Again, I'm saying
this very precisely: 128 bytes. You can count them: 1, 2, 3, up to 128, and that's
all you get. That has to include all of the dynamic state of the game including any dynamic
bit of imagery that you're getting ready to hand to the video shift registers, any music
that you're playing. You've got to be keeping track of the note list and the durations and
all that stuff, that's got to be in RAM. Your subroutine stack is in that same RAM. The
guys who could write for the VCS were heroes.
[laughter]
They would do amazing stuff. There were like 30 variations in the tank cartridge. You can
toggle the console button and play 30 variations of this game where these two guys go round
and shoot at each other, and it was all implemented in 4K, and it's in color. When the game's not running it will cycle the colors on the TV so that you don't get burn-in on the phosphor of your set. All of that is happening in the cartridge. 4K. 128 bytes. Amazing.
So you had to do that in assembly language, there was just no way you could do it any
other way. The incentive is you can get your program in millions of homes, so that's a
good thing to do. You can't get them there writing in FORTRAN. Specialized systems like
this just weren't compatible with high level languages, so you had these throwbacks which
kept assembly language useful long after high level languages became dominant.
ALGOL went on and had some other influences. ALGOL was in 1960. In 1967 Simula was developed
in Norway. Simula was the first object oriented language; Simula added classes and objects
to ALGOL. That language had a big influence on Alan Kay, who went to Xerox PARC, and in 1972 he started working on a programming language for kids based on the object-oriented idea.
The name of the language was Smalltalk. Alan and his lab spent a lot of time working on
this language. It went through several generations, a lot of testing, brilliant work, great implementations.
They published it eight years later: Smalltalk 80. A great language, the first truly modern
object oriented language.
Part of the motivation was a system they called the Dynabook, which was going to be a portable personal computer. As part of that work, they took what Engelbart had
been doing in timesharing systems and applied it to personal computers. They adopted some
things that Engelbart did very obviously, things like mice, and his approach to interactive
displays. They took it a little bit further and came up with bitmap displays with overlapping
windows, they invented window systems for that. Basically the modern user interface
was developed at Xerox as part of the Smalltalk project. Also at Xerox at the same time they
came up with local area networking and Ethernet and laser printers and a whole lot of stuff
that we take for granted today. Xerox tried to commercialize all this stuff but never
really understood what their labs had developed for them, so those projects failed.
Smalltalk itself looks a little alien. Here we've got a statement, and we'll set the result to either the string 'greater' or the string 'less or equal', depending on the relation between
A and B. The way they would describe this working is they would send the greater than
message to A, passing B as a parameter, and the result of that will be an object which
is either the true object or the false object. Whichever it is, they will then call that object with the ifTrue: ifFalse: method, and depending on which object it is, it'll evaluate one branch or the other. And they'd use this language where, instead of saying we're invoking a method (that terminology hadn't been invented yet), they'd say we're sending a message.
I don't know why it is, but a lot of programmers just couldn't get used to this syntax. What's
going on there is really a method invocation, but it's not a dot and a name and then some
parentheses, it's these key words with colons and then values in between them. This is actually
more readable if you understand what's going on, because it's self-documenting to the extent that it tells you what each parameter is doing, which is something we don't have in the conventional notation: all you have is a comma, and it doesn't tell you what anything is. So this may be a superior notation, but it was profoundly rejected. By who? By us,
by the programmers, because we couldn't understand it.
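To make the comparison concrete, here's a rough sketch of that same decision in the dot-and-parentheses world; the helper ifTrueIfFalse is a made-up name, there only to mimic the Smalltalk shape.

    // Conventional notation: operators and a ternary, with anonymous,
    // comma-separated values.
    var a = 3, b = 5;
    var result = a > b ? "greater" : "less or equal";

    // Closer to the Smalltalk shape: pass the two outcomes in as functions.
    // The name ifTrueIfFalse is invented for illustration.
    function ifTrueIfFalse(condition, whenTrue, whenFalse) {
        return condition ? whenTrue() : whenFalse();
    }
    var result2 = ifTrueIfFalse(a > b,
        function () { return "greater"; },
        function () { return "less or equal"; });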
As a consequence, Smalltalk never made it commercially. But despite that, Smalltalk
has been extremely effective as an influencer. Under the influence of Smalltalk, we've seen
C evolve into Objective C, C++, and Eiffel. And then C++ inspired Java, which inspired
C#. So basically every language since then has taken ideas from Smalltalk, combined with
the crappy syntax of C, and that's basically the modern world.
All this took a long time. From Simula in 1967 to Java in 1995, and now many years later still, it took a while for object orientation to become just the way you do things. Again, there were
debates: we don't need objects, objects don't make sense, they're just a lot of overhead,
they don't really do anything for you. Who was making those arguments? Programmers were
making those arguments, thinking they knew what objects were but having no experience with them. But eventually we all figured it out and we took that next step
forward.
Software development comes in leaps, and our leaps are much farther apart than the ones the hardware experiences. Moore's Law lets the hardware leap every two years; we leap more like every
twenty years. Again, basically we need a generation to retire before we can get the good new ideas
going, so despite the fact that we're always talking about innovation and how we love innovation
and we're always innovating, we tend to be extremely conservative in the way we adopt
new technology.
Smalltalk had some other influences as well. One of them was a language called Self, which
was also developed at Xerox PARC and eventually moved to Sun Labs, worked on by Ungar and
Smith. Brilliant language. It took the Smalltalk idea and took the classes out, so instead
of having classes which define sets of instances you just have the objects themselves, and
you allow one object to inherit from another object. That greatly simplified the language.
Part of their motivation for doing that was to allow them to go faster; they were trying
to figure out how to make Smalltalk or a Smalltalk-like language run as fast as C.
The thing we know Self best for is the stuff they did in performance. They did amazing
work in garbage collection systems: generational scavenging came out of this language. The
hot-spot technology that made Java acceptable came out of the Self project. The V8 system
that's being used at Google also came out of the Self project. So Self was a big influence
on performance. But also it did a really good job of demonstrating the idea of prototypes,
where you don't need a class, you simply have an object that inherits from an object. That
turns out to be a really powerful idea. It's a more recent idea than classical object orientation, which is why it was rejected out of hand by most programmers: again, it's new and unfamiliar. If I don't know about it and I'm such a hotshot guy, then it can't
be important. It turned out it's very important, and we saw it first in Self.
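JavaScript inherited that idea, so a small sketch of it is easy to write down; the object names here are invented for illustration.

    // One object inheriting directly from another object, Self style.
    // No class anywhere.
    var cat = {
        legs: 4,
        speak: function () { return "meow"; }
    };

    var tiger = Object.create(cat);     // tiger's prototype is cat
    tiger.speak = function () { return "roar"; };

    tiger.legs;      // 4, inherited from cat
    tiger.speak();   // "roar", overridden on tiger itself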
That brings us to the Actor Model. This is another kind of indirect spin off from Smalltalk.
Smalltalk would talk about how you send a message to an object. Carl Hewitt at MIT listened to the way they were describing it and said: well, that's not what you're doing. These
are just invocations, you're not sending a message. But what if you were sending a message?
What if each of these objects was an independent process, let's call it an actor, and the only
way they could communicate with each other is to send messages? The messages will be
asynchronous, you just send the message like you're sending an email, and every actor will
have a queue of incoming messages that it can then process in order. What kind of programming
model would you have?
It turns out you would have a model with really interesting properties. It scales really well
because you can take all these things and put them on one CPU or put them on a million
CPUs and they work exactly the same. It also had really interesting security properties
in that each of these was a separate process that was completely sealed, so nothing could
interfere with it. So they all protect their boundaries. Any actor can only talk to other
actors that it has knowledge of; if it doesn't know their address it can't send them an email.
It beautifully demonstrated the capability principles, and it's really good stuff.
It turns out if we step back from things, a lot of things that we're already familiar
with are already in the actor model. For example, modern desktop applications are all built
around an event loop. That event loop looks very much like the message queue that an actor
would have, so an application is an actor. Looking at the web, a web service is an actor.
It's something you send a message to and it may send you a message. So the actor model
is more familiar than we may realize, but it was still pretty new and again, too radical
for most programmers.
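Here's a toy sketch of that shape in JavaScript; everything in it (makeActor, send, the greeter) is invented for illustration, not a real actor system.

    // A toy actor: a private queue of incoming messages, and send() as
    // the only way to reach it. Delivery is always asynchronous.
    function makeActor(behavior) {
        var queue = [];
        var scheduled = false;
        function drain() {
            scheduled = false;
            while (queue.length > 0) {
                behavior(queue.shift());      // process messages in order
            }
        }
        return {
            send: function (message) {
                queue.push(message);          // like dropping mail in a box
                if (!scheduled) {
                    scheduled = true;
                    setTimeout(drain, 0);
                }
            }
        };
    }

    var greeter = makeActor(function (message) {
        console.log("hello, " + message.name);
    });
    greeter.send({name: "world"});            // no reply, no return value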
But there were a couple of programmers at MIT who wanted to understand it better, so
they took Hewitt's actor model and implemented a part of it in LISP and created another language
that looked a lot like LISP but had slightly different semantics. The thing that they discovered
was the actor dispatch model looked exactly like their function dispatch model, and that
functions and messages were the same thing, which completely surprised them. They weren't
expecting that at all. They kind of refined that idea and came up with the language called
Scheme, which is sort of the perfection of LISP.
LISP was the artificial intelligence language that had been developed at MIT in 1958, and
they got it right. Part of what happened was that they needed tail recursion in order to let you keep calling things that you never expect to return, without running out of memory. It also allowed for lexical closures, so that if a function is nested inside of another function, it gets access to everything the outer function has, even if the outer function has already returned. So there are all these really intricate nested actor patterns that fell out of the work, really brilliant stuff. Scheme then went on to influence a lot of language designers. It also influenced LISP, so Common LISP owes a lot to things
that Scheme figured out.
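That closure behavior is easy to sketch, because it's exactly what JavaScript ended up with; makeCounter is a made-up name.

    // The outer function has returned, but the inner function still has
    // access to its variable. That's a closure.
    function makeCounter() {
        var count = 0;              // survives after makeCounter returns
        return function () {
            count += 1;
            return count;
        };
    }

    var tick = makeCounter();
    tick();   // 1
    tick();   // 2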
The actor model also influenced the design of a data flow language called Joule. Joule
had been designed specifically for security applications. There was then another project
that took Joule, gave it Java syntax, and created a new language called E. E is the
language that demonstrates the object capability model, which turns out to be the savior of
secure systems going forward. There's a lot of work now in trying to make JavaScript into a secure language, and it's all deeply informed by the work that happened in E. You may have heard of Caja; it's something that we're using here at Yahoo! in order to secure applications. Caja was developed at Google based on work that had happened in E.
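The heart of the object capability idea fits in a few lines. This is only a sketch, and every name in it (makeAppendOnly, makeLogger, the file object and its methods) is made up: authority is just a reference, so you limit damage by limiting what you hand out.

    // The logger is handed an append-only wrapper, never the file itself,
    // so it can add lines but it cannot read them back or erase them.
    function makeAppendOnly(file) {          // file: a made-up object with
        return {                             // read() and write() methods
            append: function (line) {
                file.write(file.read() + line + "\n");
            }
        };
    }

    function makeLogger(sink) {              // sink is all the authority it gets
        return {
            log: function (message) {
                sink.append(new Date().toISOString() + " " + message);
            }
        };
    }

    // var logger = makeLogger(makeAppendOnly(someFile));
    // logger can append to the log; it holds no reference that would let
    // it read, rewrite, or delete anything else.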
OK, let's take a little detour now. Xerox had done this brilliant work with Smalltalk
and the Dynabook and was unable to commercialize it. Steve Jobs got a demo of it and immediately
understood the potential of this stuff, so he took it back to Apple and eventually Apple
produced a device called the Macintosh. It had a 68000 processor in it and 128 kilobytes of RAM (so it was 1000 times better than a VCS), but it was still too small, so initially
you couldn't program this machine in anything but assembly language. It had the bitmap displays
and a mouse and a lot of the stuff that had been demonstrated at Xerox, but it was still
hard to program, particularly for programmers who were used to the BASIC model where you input and print. You don't do that on this device, because the program has to be running
all the time and the user has to be able to click anywhere and have it be meaningful.
So the old stop and wait for input model just doesn't work. Apple gave people advice on
how to write their applications, but it was really difficult to get programmers doing
that.
So Bill Atkinson came up with a really interesting application. He had written MacPaint and QuickDraw,
and he came up with this little database tool which he thought was going to make it easy
for people to make applications. They added a little scripting language to it and then
suddenly stuff that had been so difficult about Mac programming became easy enough that
non-programmers could do it. That was called HyperCard. For a while HyperCard was free on all Macintoshes, and it was extremely successful. It was imagined to be the future of software,
that all applications from this point on were going to be HyperCard stacks. HyperCard was
going to be the way everything was going to be made going forward.
What is HyperCard? HyperCard's basically a file format of stuff that can be displayed
visually, and it has a very small set of types in it. There's the stack, which can contain
any number of backgrounds and cards. There's a background which contains an image, and
maybe some buttons and fields that get shared by every card that has that background. You
can have cards which can use one of those backgrounds and can also have buttons and
fields on them. A button is a clickable area that can have text or an image, and a field is a thing with text that you can type into. Many of these things we have on
web pages, but this was an earlier model.
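If you sketched that containment model as JavaScript object literals, it might look something like this; the field names are entirely made up, just to show the shape.

    // A made-up sketch of the HyperCard containment model: a stack holds
    // backgrounds and cards; cards refer to a shared background; both
    // backgrounds and cards can carry buttons and fields.
    var stack = {
        backgrounds: [{
            name: "plain",
            image: "paper.bitmap",
            buttons: [],                          // shared by every card below
            fields: []
        }],
        cards: [{
            background: "plain",                  // which background to use
            buttons: [{name: "Next"}],            // a clickable area
            fields: [{name: "Name", text: ""}]    // a thing you type into
        }]
    };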
The whole thing was a little IDE in that you could type in Command B and that would make
a new button and open up a dialog. Then you could give the button a name, and you could configure it, telling it what kind of button it was going to be. You could also then click on its link, and that would take you to another page in which you could set its script.
What did its script look like? Well, its script could look something like this. You'd say:
'on mouseUp'… Some of you might be going: woah, 'on mouseUp'? That sounds eerily familiar.
This is where all that stuff came from. HyperTalk wanted to look like English. Their motivation
was a little bit different than COBOL's, but similar. They wanted to make the language
easy to teach by making it look familiar. Here we say 'set the location of card button
x to pos', whereas a modern language would probably write something like 'card.buttons[x].location = pos'.
Both would do the same thing, but that's how you write it in HyperTalk. HyperTalk is trying
to look wordy.
One of the disadvantages of HyperTalk is that you can't ever see the whole program, because
all of the handlers are nested inside of their individual components. So you never get the
big overview. But the plus side of that (because everything's a trade-off) is that
if you put a script in a button and then move that button onto a different card or into
a different stack, that button will still work because the script travels with the button.
Also there was a delegation model in that I could put this script in a button and then
if I click on the button then this script will run. Or I could put it in the card that
the button is on and then if the button doesn't handle it, it delegates to the card. Again,
that might seem very familiar to some of you, and we'll see more of that in future evenings.
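It's essentially the way clicks bubble in the browser today; here's a sketch, with made-up element ids.

    // If the button doesn't handle the click itself, the click falls
    // through to the card (here, a containing element). Ids are made up.
    document.getElementById("card").onclick = function () {
        console.log("the card handled the click");
    };

    document.getElementById("button").onclick = function (event) {
        console.log("the button handled the click");
        event.stopPropagation();    // remove this and the card gets it too
    };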
HyperCard had stacks of cards containing buttons, images, text fields. It didn't anticipate
color. It was strictly a one-bit black and white system, so it didn't always look very
good. That may have been because Bill Atkinson was colorblind and didn't see the need for
it. It had things you could click on and then go to something else, go to a different stack
or a different card. But it didn't allow you to put the links inside of the text fields;
that was an obvious thing, but they just never figured out how to express that. Probably
the biggest limitation was that it didn't anticipate networking, so everything was expected
to be distributed on floppy disk. Also it had a terrible security model, because if you loaded someone's stack onto your machine and it came from an evil person, they owned your machine. It didn't protect you from that kind of stuff. Almost overnight HyperCard
just sort of collapsed. It had been the biggest thing anyone had ever seen and then it virtually
disappeared.
Winding back a little bit more, going back to Engelbart's system. Engelbart's system
at SRI was not just a demo, it actually worked, it was a real system. SRI sold his system to a timesharing company called Tymshare. Then Tymshare was sold to McDonnell Douglas,
and then McDonnell Douglas buried it, so unfortunately that stuff didn't go forward. It died inside
that corporation.
Engelbart was a big influence on Ted Nelson, and Ted Nelson came up with an extremely ambitious
hypertext system called Xanadu. In fact, Nelson invented the term 'hypertext'. His system
had bidirectional links in it, and transclusions, and inclusions, and a payment system, and
all kinds of stuff that he considered to be necessary. He had a brilliant team of engineers
building this stuff, but they never finished it. Xanadu had a small influence on HyperCard.
Basically that influence was the name ó the 'hyper' in HyperCard was lifted from the hyper
in hypertext, but that was about the only similarity.
Tim Berners-Lee's World Wide Web was also influenced by Xanadu, except he really didn't know very much about Xanadu, and he knew nothing about Engelbart. But as a result it was really simple, because he had never thought of all the really complicated things he could do, and because it was really simple he was able to implement it. It turns out
that getting the thing done counts more than just about anything else.
The World Wide Web itself was influential. After Sir Tim published his specs, a lot of
people started imitating it. The most famous of those was the Mosaic project at the University
of Illinois at Urbana-Champaign. They developed the Mosaic browser. At that time there were
a handful of protocols that were all contending to be the popular front end of the Internet, and this team couldn't decide which of those was going to win, so they made a program that
could implement all of them. It could do Gopher, and WAIS, and everything. They called it Mosaic
because it was made up of all those different pieces. It turned out that the web component
was the one that people liked, because they added an image tag to it so that even though
it wasn't what everybody wanted, it could be made to look exactly like what everybody
wanted, and that was enough to send it to the moon. So Mosaic and the Web became extremely
popular after that.
Then that team split into two separate start ups, Netscape and Spyglass. Netscape announced
that they were going to destroy Microsoft, so Microsoft bought Spyglass and turned it
into Internet Explorer. Netscape had an idea to take the ideas in HyperCard, particularly
that easy to use program model that was event driven, based on buttons and fields, and put
that into the browser. They hired this guy to do it: that's Brendan Eich. Brilliant
guy. They hired him out of Silicon Graphics, they asked him what he wanted to do, and he said he wanted to write a Scheme interpreter, because he'd been reading about Scheme and thought that it was really cool. So they said great, they hired him, and then said 'but you can't
do Scheme, that's just too weird looking, people won't like that. Make it look more
like Java.' So he designed a language that looked more like Java.
Basically, he took these components. He took the syntax of Java, he took the function model
of Scheme (which was brilliant, one of the best ideas in the history of programming languages), and he took the prototype objects from Self. He put them together in a really interesting
way, really fast; he completed the whole thing in a couple of weeks. It's a shame that he
wasn't given the freedom that Xerox had to spend a decade to get this right. Instead
of ten years it was more like ten days, and that was it. I challenge any language designer
to come up with a brand new design from scratch in ten days and then release it to the world
and call it done and see what happens with that.
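You can see all three ingredients in a few lines of the result; this is only a sketch, with made-up names.

    // C/Java syntax: curly braces, semicolons, dots.
    // Scheme's functions: functions are values and they close over variables.
    // Self's prototypes: objects inherit directly from other objects.
    var animal = {
        describe: function () {
            return this.name + " has " + this.legs + " legs";
        }
    };

    function makeAnimal(name, legs) {        // a function used as a factory
        var it = Object.create(animal);      // prototypal inheritance, no class
        it.name = name;
        it.legs = legs;
        return it;
    }

    var spider = makeAnimal("spider", 8);
    spider.describe();    // "spider has 8 legs"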
One of the consequences of it was that there are parts of it that are just awful. If they'd
had more time they probably would have recognized that and fixed it, but they didn't. Netscape
was not a company that had time to get it right, which is why there's no longer a Netscape.
[laughter]
But despite that, there is absolutely deep profound brilliance in this language, and
this language is succeeding in places where many other languages have failed because of
that brilliance; it's not accidental that JavaScript has become the most popular programming
language in the world.
Many people may not remember that the language of the browser was supposed to be Java. Java
Applets in 1995 were the hottest thing anyone had ever seen, and they were going to rule
the world. They were hotter than HyperCard. It was going to be big. And the Java community
doesn't remember this, but Java failed on its face, hard, total, complete failure. They
managed to find a niche on the server side, so there's good in Java and it survives. Good
for them. But the thing that Java was intended to do, the thing they told the world this
is what it's all about, Java totally failed. In that same venue, JavaScript is succeeding brilliantly. So the argument that it's just luck, that it's only doing so well because it's in the browser, completely ignores history, because Java was in there first and got every break. Just being in the browser was not enough to assure success. We'll be
talking more about what the language got right and wrong in future episodes.
In 1969 Jean Sammet wrote a brilliant book called Programming Languages: History and
Fundamentals, which was basically a survey on all of the work on automatic programming
that had happened in the '50s and '60s. She covered over 100 languages, which she describes
in her book, because that was a time of amazing innovation. I'm very happy that we are, again,
in another of those periods of innovation; we've got a lot of interesting languages now,
including some pretty wild designs like Haskell, Erlang, and Scala, which are all getting attention.
And there are lots of other languages which are also getting attention.
One thing that's different now from the '50s and '60s is that there are a lot of computers out there, and there are a lot of people writing programs now. Even if you have a minor language, it's possible to get a community of people big enough to do useful things, to do a lot of group work. You've got a group large enough to justify writing books, which was something
we didn't have back in the '50s and '60s. So I think this is a great time to be a programmer.
We have lots of choices, and we need to be smart about making those choices and be open
to accepting the new ideas, because there are a lot of new ideas out there that we shouldn't
be rejecting just because they're unfamiliar and we don't see the need for them. There
are actually a lot of good ideas in all of these languages, not least of which is JavaScript,
which will be the subject going forward.
There's much, much more of this history. The history of computers and software and programming
languages is incredibly rich. I was only able to scratch the surface of it in these two
hours, but I highly recommend that you take a deeper look at it. Next time we'll come
back here and do Chapter 2, and we'll look at JavaScript, I promise.
Thank you, and good night.
[applause]