Greetings comrades!
[laughter]
We're talking about JavaScript. There actually won't be a lot of JavaScript tonight. There
is a whole lot of back story that I think we need to get to before we get to JavaScript,
so tonight is going to be Volume One: The Early Years. I'll start with some of my own
history, but first we'll start with Woody Allen.
In 1969 when I was in high school, Woody Allen made a movie called Take the Money and Run.
In the movie he plays Virgil Starkwell, and in this scene he's interviewing with an accounting
company for a job, something for which he is completely unqualified. The interviewer
asks him: 'do you have any experience running high-speed automatic digital computers?' And
Virgil answers 'yes, my aunt has one.' He says 'what does your aunt do?' and he says
'I can't recall.' It's funny that a completely nerdy young man would have less experience
with computers than his aunt, it's just ridiculous.
Except when the movie came out it was funny for a completely different reason. The reason
it was funny in 1969 was that computers were incredibly expensive at that time, they cost
millions of dollars, they required a room as big as a house, literally, with a raised
floor and special air-conditioning and fire suppression systems. They required a lot of
people to operate them and to manage them, and to perform maintenance on them. They took
a lot of power to operate. It's sort of like the expense of running a colo for one CPU.
So computers would be only available to very large corporations, large government agencies,
very well endowed Universities, and nobody else. It was completely impossible that anybody's
aunt would have one. So in order to understand this story, in order to understand why it's
funny, you need to understand the context behind it, and that's what we're going to
be doing tonight.
As far as my own story, I wanted to have a computer but they were just not available.
I only knew two people who actually had them. One of them was Napoleon Solo in The Man From
U.N.C.L.E.; he was able to have one because he worked for the United Network Command for
Law and Enforcement and they had a lot of money, so he had a really nice looking computer
there. The other person was Batman.
[laughter]
He had his Bat-computer, and that was because he had access to the wealth of millionaire
Bruce Wayne. I didn't have resources like that, so I decided I was going to build my
own. At that time I had absolutely no idea what computers were or what they did, or why
I wanted it, I just knew that I wanted it. I couldn't even identify all the components
of one except that I knew they had consoles that had lots of lights and buttons on them.
I thought, I'll start with that: I'll make a console and then I'll work out the rest
of it.
I found some pieces of particle board and a saw and I sketched out what it was going
to look like, and started sawing. I sawed, and sawed, and sawed. The particle board was
really, really hard, and the saw was really, really dull. I sawed for what must have been
at least two minutes, and then I gave up. OK, I'm not going to do that. So I probably
went into the house and watched television after that. At that time, even at that tender
age, it was already obvious that I was going to be a software guy.
[laughter]
Having established my credentials, my qualifications for giving this talk, we will now proceed.
There's a lot of history I'm going to give you tonight, and I think it's really interesting
stuff. I'm going to spend this time with you today because I think it's important that
you know it too. Most of us are not aware of the history of our own field, and you need
to be, because there's a lot of rich material here. But there are a couple of recurring
themes that I'm going to identify.
The first is that people who should be the first to recognize the value of an innovation
are often the last, and we're going to see lots of cases where this has occurred. Obsolete
technologies fade away very slowly. We like to think that innovation causes everything
to move over; it doesn't. Sometimes we step forward, and sometimes we step backwards,
and sometimes we step both ways at the same time. Sometimes stepping backwards is the
right thing to do, sometimes it's a bad thing to do, and in the moment we never seem to know
the difference.
There's also a myth of inevitability, that the reason things are the way they are is
because they had to be as a consequence of everything that happened before, the reason
we have the things that we have is that's just the best way that it could have worked
out, and that's absolutely not the case. I hope to demonstrate that as well. So I'm going
to be weaving some threads together.
Let's start with the Jacquard Loom. Joseph Marie Jacquard perfected the automatic
loom in 1801, and this is what it looked like. He adapted player piano technology to automate
the operation of a loom, so each card represents the pattern on one row of the thing that's
being woven. In the holes and spaces, instead of causing hammers to go down onto strings
and back up again, instead it controlled the movement of threads within the loom. He didn't
do the continuous roll that the player pianos did because that was just way too difficult
to edit. Instead he had each row as a separate unit and he would then sew them together,
so it was a much easier thing to create.
It was an extremely effective device: a weaver using a Jacquard Loom could out-perform
a master weaver and an apprentice on a draw loom by over an order of magnitude in efficiency, which is
just an amazing difference. The Jacquard Loom became very successful, and these principles
were applied to other forms of automation.
Whenever you have a big breakthrough like that, there are always other things that get
invented as a consequence, and one of the big ones was industrial sabotage. Because
it turned out the weavers didn't like this; he was completely upsetting the way that
they used to live. So they went around and would find Jacquard Looms and destroy them.
That's what they did. Now, there were some people that did identify other uses for this.
For example, Babbage and Lovelace recognized the potential of using punched cards for moving
data in and out of their computing engines. Unfortunately Babbage never finished his machine,
so that's sort of a dead end.
The story picks up with the Hollerith Card Tabulating Machine in 1890. The United States
Constitution requires a census every ten years in which we go and count all the citizens,
and we use that information to determine the composition of the House of Representatives.
The country had gotten so big by the late 19th century that it was becoming more and
more difficult to count how many of us there are. Herman Hollerith invented a machine and
showed it to the Census Bureau and they accepted his bid.
This is how the 1890 census was done. A clerk would take a questionnaire, which you can
see in his left hand, and copy it onto a punch card using this pantograph punch machine.
We'll zoom in on that card. What a card was, in a Hollerith system, was a set of field
sets, each field set containing a number of radio buttons. The radio button would be on
if it had a hole punched in it, and it would be off if it didn't. In that way, he can code
a lot of information on one card, and do it very efficiently.
Then he had his Tabulating Machine, which was in two parts. There was the main part,
which had the dials that did the counting, and then the sorting cabinet. The operator
would take a card, put it in the machine, and push down an array of pins. If the pins
went through a hole they would touch a pool of mercury completing a circuit, which would
then cause one of the dials to advance and would also open up one of the drawers on the
sorting machine, and then you'd drop the card in and get the next card and do it again.
It seems really tedious, but it was a whole lot better than the previous system which
was all done with paper and pen. The census was successful, and then Hollerith went on
with other inventors as well, to apply this to business.
Well into the '70s, IBM and other companies were still operating this kind of equipment.
It's very, very successful. This is an accounting machine. It did everything that the previous
machine did, but it's more automated in that you can put a whole deck of cards into it
and run them all through at once, you didn't have to have someone feeding it one card at
a time. Here's another example of its operation.
This is a card. In fact, this is a card, this is what they looked like. The form factor
of the card was based on the size of the dollar bill at the time. Hollerith chose that because
at the time there were lots of off-the-shelf stacking trays that you could just put these
into, so that reduced some of his engineering cost. In a later phase of the card, the way
information was stored on it was re-organized. Instead of the random set of fieldsets we
saw before, it became a simpler column-and-row situation. There are eighty columns on the
card, each column containing potentially twelve punches.
Has anyone ever heard of the eighty-character limit? We still have that today, and you probably
wondered where that came from. Why eighty, why an eighty limit? This is the limit. You
can't put more than eighty characters on one of these cards. This has been obsolete for
a long, long, long time, but we still have the eighty character limit.
This card shows the Hollerith code. The first ten punches show how you encode a number.
Digit zero is the zero punch, one is the one punch, and so on. To do letters using what
is called the Hollerith code, you take one punch from the top three (that's sometimes
called the zone) and then one punch of the lower nine. Three times nine is twenty-seven,
there are twenty-six letters, so it just fits nicely. There's one code left over, and the
code that was left over was the zero one punch. It was decided not to use that one, because
you had two punches that were next to each other and the machines could sometimes get
a little brutal and tear away that little piece of paper and damage the card.
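A small JavaScript sketch of the letter mapping just described, purely as an illustration (the zone rows are the ones conventionally labeled 12, 11, and 0 in Hollerith's scheme):

    // One zone punch (row 12, 11, or 0) plus one digit punch (1-9) per letter,
    // with the 0-1 combination left unused, exactly as described above.
    function hollerithLetter(ch) {
      const n = ch.toUpperCase().charCodeAt(0) - 65;  // A = 0 ... Z = 25
      if (n < 0 || n > 25) {
        throw new Error("letters only");
      }
      if (n < 9) {
        return { zone: 12, digit: n + 1 };            // A-I -> 12-1 .. 12-9
      }
      if (n < 18) {
        return { zone: 11, digit: n - 8 };            // J-R -> 11-1 .. 11-9
      }
      return { zone: 0, digit: n - 16 };              // S-Z -> 0-2 .. 0-9
    }

    hollerithLetter("A");  // { zone: 12, digit: 1 }
    hollerithLetter("S");  // { zone: 0, digit: 2 } - the 0-1 slot is the one left unused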
The use of these machines to run business was called Unit Record Management, and it
was really, really successful. This is what allowed the modern corporation to evolve.
Without this kind of data processing equipment, corporations just could not have become big;
they only could have become as big as their bookkeepers could have managed. And these cards were used
for everything. You'd have a card representing an account, a customer, an order, an invoice,
a payment, a personnel record, a detail item in any of those things. In some cases they
were sent directly to the people at home. For example, you would get a bill from a company
and they would send along a card, and you would send the card back with payment, and
when it was received an operator would punch how much you paid onto the card and then put
it back into the system.
The problem with these is that they're really fragile, and you're giving this important
data record to your customer or sending it through the mail and then getting it back,
so they very often came with instructions on them such as 'do not fold, spindle, or
mutilate' telling people not to mess these up, because if they did they could make life
extremely difficult for the operators.
This is a keypunch machine, and this is a more modern way of making cards. IBM built
these well into the '70s, I think. In fact, my very first experience in programming was
with one of these, in the basement of a library in San Francisco State University. The way
you managed a program on these things was your program would be a deck of cards, one
card per statement or line of your program, and if you wanted to modify your program you
had to go through the deck, find the card you wanted to change, pull it out, put it
in the machine, dupe it, modify it, replace it, put it back in, and so on. If you wanted
to rearrange lines in your program you actually pulled cards out and reordered them.
It was an extremely fragile way of putting programs together, and over time you learned
some gymnastic tricks to try to make it easier. For example, if you wanted to take a card
and make another one in which you deleted some of the characters, the way you do that…
First, there are two card stations on the machine. There's the read station and the
write station, so you're always punching at the write station. If there's a card in the
read station and you push the dupe button then it reads whatever's at that column and
punches it on the next one and advances both cards. If you want to do a deletion you hold
your thumb down on the card in the read station so it can't advance, and push spacebar a couple
of times to advance the other one. Really nasty stuff, but that's how you did it.
So the punch card was an amazing device. It served the purpose of memory, which eventually
got replaced with RAM and core. It was storage, which eventually got replaced with disk. It
was archive; if you wanted to keep something for a long time you would send a box of cards
to the salt mine and they'd keep it there. Eventually we found better ways of doing that,
but I'm sure that deep underground somewhere you can find lots and lots of punch cards.
It was a network. If you were at the field office and needed to get some records back
to headquarters, you'd take a deck of cards and put them on a train, and they would get
sent back. Eventually we figured out we could use wires to do that, and it got a lot better,
but for a time the way you did that was you'd mail a box of cards.
The last of the functions to finally get replaced was user interface. Cards were used for user
interface long after these other functions went away. You would have thought that would
be the first to go away because that's the thing it does worst, but it actually happened
in the other order.
The counting machines were programmable, and they were programmed in a data flow sort of
way. You'd have a bunch of data sources which could be columns on cards, and then you could
direct them to registers and to calculation units and to sinks like the card punch or
the printer. Your program would be on a punch card or a punch board, and you could replace
boards and that would change the program in the machine. These were invented fairly early
in the 20th century and remained current for a long, long time. For a long time this was
how you did programming. You've heard of spaghetti code? This is where it was invented.
Eventually these Unit Record Machines were replaced by mainframes, by digital computers.
They came online after World War II. There was a lot of research during the war in cryptography
and weapons development, and when the war was over a lot of that stuff spilled out into
the commercial sector. A surprisingly large number of companies started building computers.
It was really obvious that that was the way a lot of things were going to be done going
forward. Even so, these machines started coming online publicly in the late '40s, early '50s,
well into the '60s, and record machines continued to work into the '70s. So just because the
good new technology is available doesn't mean it immediately displaces the crappy old technology.
These computers were based on the Stored Program concept which said that instead of having
a plugboard, or some other external programming source, the program is stored in the same
memory as the data, and there are going to be some really interesting implications for
that. The chief one was that over time, the program may modify itself in order to change
or improve its behavior, and eventually, after a large enough series of modifications, the
program will become intelligent and perhaps even conscious, and eventually become our
masters. And that would be a good thing. So there's a lot of research into artificial
intelligence to try to bring that about. Unfortunately it didn't come about because the way our brains
work is just way harder than we can imagine. You'd think if our brains worked right we
would be able to imagine how we work, but we don't.
Instead we had to program them in a different way, and we came up with assembly language.
First we had to use machine codes, where there would be a digit for each thing that the machine
knew how to do and a bunch of digits for each cell in memory. That was just way too hard
to organize, so the first software tool, the first program to make programming easier,
was the assembler using something called assembly language.
We don't know why it's called assembly language. The word assembly doesn't make any sense there.
From what I've been able to figure out, the early programs did a lot of things. They would
do things that we now call linkers and binders and loaders and other things; those all happened
in one program. Eventually those features got teased out into other applications, and
the word assembly was left for the one thing it did that had nothing to do with assembling,
but we still call it that.
Here we've got a hypothetical machine. In the left column we have statement labels.
We're going to load the accumulator with whatever word is at the INTERX variable. We'll then
subtract from that the variable called COUNT4. We'll then skip if the result was zero. If
we didn't skip we will jump to ABORT27, and if we did skip we will jump to a subroutine
called CALCKHJ. That JSR is probably the most important instruction in the machine. It was
recognized very early on that the set of opcodes that the machine provides is
never going to be adequate for all the things we want to do, so we want to be able to create
our own opcodes, and that's what the jump to subroutine did. It would jump to a piece
of code and when that piece of code was finished, it would then jump back.
And there were lots of different ways that a machine could do that. One was that it would
remember the address of that instruction and put that in a register someplace, so when
we came back we could use that register to find out where the program resumes. Another
way it could be done is that the program modifies itself. It will take a location in memory
and change it to be a jump instruction to the place where we want to resume, and then
when the subroutine is done it will jump to that instruction. Much later, the stack was
discovered in which we have a place in memory where we can keep track of those addresses,
and it's much more convenient. So we see a lot of that in modern machines, but it came
much later.
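A toy JavaScript sketch of that third approach, just to make the idea concrete (the numbers stand in for instruction addresses; none of this comes from the talk):

    // Calling a subroutine pushes the address to resume at; returning pops it,
    // so subroutines can call other subroutines to any depth.
    const returnStack = [];

    function jsr(currentAddress, subroutineAddress) {
      returnStack.push(currentAddress + 1);  // remember where to come back to
      return subroutineAddress;              // jump to the subroutine
    }

    function rts() {
      return returnStack.pop();              // resume at the saved address
    }

    let pc = jsr(4, 100);  // call the subroutine at "address" 100 from instruction 4
    // ... the subroutine runs, perhaps calling further subroutines ...
    pc = rts();            // pc is 5 again: the instruction after the call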
All throughout the mainframe period we saw enormous architectural variety. There were
a lot of really clever people building machines. Often you'd have multiple architectures within
each manufacturing company. I don't know how many different models of computers IBM made,
but during the early years there were a lot of them, and there were lots of smaller competitors
who had at least as many, sometimes more.
They varied on things like word size, number types, would they use signed magnitude or
ones complement or twos complement, how many registers they would have, whether they were
special purpose or general purpose. They might have base registers or index registers; enormous
variety in instruction sets. It was an amazing period, and all machine designers were learning
from each other. There was brilliant, brilliant work done for many, many years, basically
trying to drive price performance. These machines were extremely difficult to make,
so they were trying to figure out the best way of putting them together
so that you could get the most work out of them.
Here's an example of a mainframe. In the front left we've got the disk drives. They probably
couldn't contain as much information as whatever you've got in your pocket right now, but that
was the main online storage in its day. Behind them we've got the punch card equipment:
the punch card reader, the punch card punch. Behind that we've got the tape drives, and
way in the back we've got the memory cabinet. You might have 8k or so in a box about the
size of a refrigerator, and as many of those as you could afford was how much memory your
computer had. That in the middle there is the console, which is where the operator works.
Here's another IBM computer. I mean, check out the console. The console has lots and
lots of lights on it, and lots of switches and buttons and knobs. That was the thing
I was hoping to build. And if you were a programmer, that's really where you wanted to be working
because sitting there you could see the contents of every register, if you could read binary.
They had a light for each bit, and if you could work that out then you could see exactly
what was happening in the machine. You could single step the machine, there's a full debugger
there.
The problem was that they never let programmers in the room because they didn't trust them.
Also, the machine time was just so expensive. You had to justify the cost of this extremely
expensive machine and plant that you just couldn't afford to have the downtime that
a programmer would have, sitting there trying to single step through his program. Also notice
the great Mad Men fashions that were in vogue at the time.
This was maybe my favorite machine of that era: the Control Data 6600. Designed
by Seymour Cray, it was for a time the fastest computer in the world. The thing I liked most
about it was the console. Instead of all the lights and buttons, it had a simple keyboard
and two round CRTs, so it could do real time displays. Really, really nice looking machine.
It was too expensive to let programmers sit down at the console and do work, so the way
most programmers worked was in batch mode, where you would take your job and make it
in the form of a deck of cards. The first card would be the job card which identified
what you were doing. You might have an account card, and then a card to tell the operator
what tapes to mount, and then a card indicating that you wanted the FORTRAN compiler. Then
you'd have your FORTRAN program, and then an end of file card, and then your data, and
then another end of file card, and an end of job card.
You'd take all of that and you'd put it in a tray, and then you'd wait a couple of hours.
Eventually a number of jobs would get put into the tray and an operator would get around
to taking them all out and taking the rubber bands off and putting them in the card reader,
and they'd all get read into disk. Then they would take the jobs one by one, or sometimes
several if it was a multi-processing machine, and run them. The results would go to a line
printer and then the operator would take all the decks and put all the rubber bands back
on and match them with the print outs and put them in a bin. So you come back a couple
of hours later and pull the thing out and find out you're missing a comma. You go OK,
and go back to the keypunch machine and fix that comma and submit it, and then next day
you found out that you missed another comma. It was a really unproductive way of getting
things done. They call the process 'submission', when you would submit a job, and it was submit
in both senses.
There was an ideal way that the analysts thought that this process would work. First the analyst
would write the specifications and draw the flow charts that describe the application.
Then the programmer would code a program, probably into assembly language, based on
the flow charts. He would hand his coding pages to a keypuncher. The keypuncher would
then sit at the keypunch machine and punch them in. Just in case the keypuncher made
a mistake, they would take that deck and the coding forms and give it to a second keypuncher
who's working at a slightly modified keypunch machine called a verifier machine, re-type
everything, and if any character mismatches the card is destroyed, and then it has to
be re-punched. Then assuming it gets through that process, it's given to the operator,
and then the operator will run it. If there's a bug then you call a meeting, because nobody
is in charge of the whole thing. I can't imagine that this ever worked, but this was the official
way that it was documented.
So what's a bug? Bugs, as far as I can tell, were invented by Thomas Edison. He invented
a lot of other stuff, but he also invented the bug, and this is the documentation. The
story is from the Pall Mall Gazette in 1889: "Mr. Edison, I was informed, had been up the
two previous nights discovering a bug in his phonograph.” His phonograph was a device
which would record sound and recover sound from a cylinder and a stylus, so the friction
against the stylus would either create grooves or produce sounds as it followed the groove.
I suspect that Mr. Edison's machine had a chirp in it that sounded something like crickets
or something, and he couldn't figure out where the noise was coming from, so it was bugs.
It was sort of a standing American joke for a long time about the crazy inventor who will
become wealthy once he can get the bugs out of his invention.
A real bug was discovered by Grace Hopper. During World War II she was working on ballistics
tables for the military and one day her calculator stopped working. They opened up panel F and
they found a moth smashed in a relay. She pulled it out and put it in her notebook with
the notation: "first actual case of bug being found.” Her notebook is now in the Smithsonian.
We'll get back to Grace in a little while.
The batch mode was not good for programmers. It was designed specifically to try to optimize
the use of machine time, not to optimize the use of human time. So another mode was developed
called timesharing, in which you'd have lots of users who could use the machine simultaneously.
Each would get a fraction of the resources of the machine, but if the applications are
interactive enough then each person gets the appearance that they've got the use of a whole
machine. That turns out to be much, much better.
The way that you accessed it was through a device that was a whole lot less interesting
than the console, but was good enough, and that was the Teletype machine. This is a model
33 Teletype. It's an uppercase only system, and it can work online and offline. Offline
means that instead of sending characters to a computer, you're sending them to its local
paper tape punch. Sometimes online time was too expensive, so you could type your program
in offline and then when it was ready you would log in and then have it read your paper
tape, and that way reduce your connect time charges.
Now, program preparation on paper tape is even harder than on punch cards because if
you make a mistake you can't throw that card away and replace it, it's one continuous band.
One affordance that it did provide was that there's a backspace button on the punch, and
when you push that the punch would go back one character. You could then push the delete
button, and the code for delete is all ones, so it would go 'jink' and completely punch out whatever
was in that column. The convention was that if the mainframe saw a code which was all
punched out, that meant that you had deleted that code and that it should just ignore it.
So if you're ever wondering why your terminal has both a backspace and a delete button,
this is why.
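A tiny JavaScript sketch of that convention, just as an illustration: the all-ones code is 127, which is exactly ASCII DEL, and the reader simply skips it.

    const DEL = 0x7f;  // every bit punched out

    function stripRubouts(codes) {
      return codes.filter((c) => c !== DEL);
    }

    // "COT", where a mistyped character was backspaced over and rubbed out:
    stripRubouts([0x43, DEL, 0x4f, 0x54]);  // [0x43, 0x4f, 0x54]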
The Teletype was really slow. It printed at ten characters a second, so you had to be
pretty economical in terms of what kind of information you wanted to give to the user.
In terms of accessibility, this is probably the best system we ever had; one that gave
the best parity between sighted people and blind people. You could take a voice synthesizer,
like the Votrax, and put it on the line between the computer and the terminal, and it will
say the name of every character that comes down the wire. So a blind person can be aware
of everything that comes out, and gets exactly the same information that a sighted person
would. In the years since we've made lots of advancements in terms of the way you can use
machines which have all tended to work very badly against blind people. So everything
I'm about to say after this point works against them.
The character set used by the terminal was ASCII. In fact, for the model 33 it was what
I called half-ASCII, because it was only upper case. Eventually machines allowed us to do
lower case, and the ASCII set recognized that. It contained 128 characters, which was just
enough to do English. As a typewriter replacement, it had pretty much all the keys and characters
that a typewriter would have. For people with other languages, though, it was not adequate.
For people in other countries using other languages, they would replace some characters,
and that made it very difficult for doing interoperation between one country and another.
Also, for Asian countries, the seven bit thing didn't work at all, so they had to come up
with double byte character sets, which made things even more difficult.
Finally, that was solved with UNICODE. UNICODE attempted to take all of the national character
sets and combine them into one character set; a really brilliant thing. Then later Thompson
gave us UTF-8, which was an 8 bit encoding which was ideal for devices like Teletypes
and everything else we did. Today we've got UTF-8, which should be the one way that all
characters are transmitted on the network, but just because we have the best possible
one way doesn't mean that everybody's doing that yet.
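As a quick illustration of why UTF-8 fits so naturally on top of the old seven bit world, here is what it does to a short string (a sketch using the standard TextEncoder API; the sample text is my own):

    // ASCII characters stay one byte each; everything else becomes a multi-byte
    // sequence, so plain old seven bit text is already valid UTF-8.
    const bytes = new TextEncoder().encode("A\u00fc\u20ac");  // "A", "ü", "€"
    const hex = Array.from(bytes, (b) => b.toString(16));
    // ["41", "c3", "bc", "e2", "82", "ac"] - one byte, two bytes, three bytes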
One thing that is odd about ASCII is that it has a carriage return character and a line
feed character. This was to model the way that Teletypes actually worked, where the
carriage return character would take the print element and push it over to the left. The
line feed character would take the platen and spin it one line. So most lines are going
to end with going back and rolling the paper, and it took two separate codes to do that.
Most timesharing systems didn't require people to type in both codes; generally they would
allow people to hit the return key, and then they would echo the line feed, just because
there's no reason to make people type both characters. Also, other devices don't work
that way. Most other printers of the time would just take a line of text and print it
and advance; there was no way to separate the carriage return from the line feed function.
So this was a pretty device specific thing.
Most systems that adopted ASCII as their character set chose one or the other. The systems that
tended to be more hardware focused in their orientation tended to pick line feed, and
the systems that tended to be more human focused tended to pick carriage return, and that was
fine until they needed to interoperate. Then you'd have a committee of people, some using
line feed, some using carriage return ó how do you resolve that? You could just pick one.
You could even flip a coin, because it really doesn't matter. But these committees could
not decide. Nobody wanted to be the guy who got it wrong, and nobody wanted to be the
guy who had to change, so they came up with a mutually disagreeable compromise, which
is: We will always require both. So that's the way the internet protocols work. We haven't
been using Teletype machines in I don't know how many years (they're decades obsolete),
but we're still forcing both sets of control codes to be transmitted in HTTP because of
this Teletype heritage.
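That compromise is easy to see if you ever assemble a request by hand; a minimal sketch (the host name is made up):

    const CRLF = "\r\n";  // both control codes, always, as the committees decided
    const request =
      "GET / HTTP/1.1" + CRLF +
      "Host: example.com" + CRLF +
      CRLF;               // a blank line, also CR LF, ends the headers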
I would get into arguments with guys in the basement at the Computer Center. On our campus
we had Teletype machines, so we could do timesharing, and we also had the batch system, and we would
argue about which was better. It's obvious that the timesharing system was better. It
was designed to use human time more effectively, and it was just the right thing to do. In
effect, everything we're doing today looks much more like timesharing than it does like
batch, so history bears us out. But there were people there who were vigorously arguing
that batch mode was the right way to do it, that timesharing was a fad… I'm trying
to think of what their arguments were. They didn't make any sense at all. The main one
came down to discipline, that the discipline that batch mode required, where you had to
think the whole thing through and submit flawless programs to the computer because if it was
buggy you would never, ever get it to work. That call to discipline was the biggest advantage
of batch mode.
It was another example of where you have a technology that was developed by programmers
for programmers, and there were programmers who were rejecting it and thought that they
were well reasoned in their rejection. What it really was, when you scrape it all the
way down, was that while they intellectually understood what timesharing did, they had never tried
it and so they never understood it. They assumed that they were being very successful
in their current endeavors without ever having understood it; therefore it was not important
to understand it; therefore they could reject out of hand any argument that required
that understanding. I continue to see that happening over and over again through pretty
much everything that we do.
One of the big benefits of timesharing was that it provided the first social network.
All of these innovations happened first in timesharing: file sharing, email, distributed
computing, computing as a service, chat, blogs, open source development. That all happened
on the mainframes a long time ago. We think this is all fairly current stuff. In a few
minutes I'll show you why we think it's current stuff, but this stuff all happened back in
the '60s and '70s. We had games on the mainframes, both single player and multi-player games.
It turns out that games are a really important place for technology development. There's
some really good work in terms of user interface design, program construction, algorithm development,
that was all motivated by games.
Then finally, security. Timesharing systems had a huge security problem because they had
lots of people running programs in the same memory, and integrity demanded that they be
able to keep all of that stuff separate and not interfere with each other. So there was
a lot of work in that era to try to figure out how to do that, and then to try to do
the even harder thing after that which was to allow those programs to sometimes cooperate,
because we started to identify the need for collaborative applications. The timesharing
machines were just starting to figure that stuff out when they were destroyed.
One of the other things that happens in timesharing is that you need an editor, and a paper tape
editor doesn't make it. You need to be able to edit online. You can't do what you did
with cards (take a card out, change it and put it back), so they wrote programs which
allowed you to do that, where you could load a file and then go to a particular line in
the file, replace that line, insert some more lines after that line, and so on. So almost
every system had an edit program in it; it might have been called Ed or QED or some
variation on that, but everyone had one. At MIT they called it TECO. They then figured
out how to add keyboard macros to it, and that became EMACS. VI also came out of ed.
These text editors are still in wide use, still very popular, but they are dinosaurs
left over from the timesharing era.
The next step was replacing the Teletype with CRT terminals. CRT terminals were eventually
much cheaper, they use less paper, and eventually they allowed for onscreen editing in which
they could display a page of information and you could cursor around on it. For example,
this terminal had some arrow keys on some of the letters to help in designing software
that would do that. If any of you are VI users and have ever wondered how it could ever have possibly
made sense for H to go that way and L to go that way, this is where it happened. Again,
this is a timesharing era dinosaur which still exists in the current age.
Now, IBM was never able to get timesharing right. Timesharing requires that you be able
to switch from one process to another on a keystroke basis, and their software was just
not adequate to do it. Rather than fix their software architecture, they invented a new
piece of hardware that they called the 3270. At the time they called it an intelligent
terminal, but today we'd probably call it something else. The way the 3270 worked was
you would take a page of data and ship it down from the mainframe into the terminal, and
it would show up on the screen. Some of the screen will be full of characters which are
part of the display, and some of the characters are reserved as fields. So the user can then
type stuff directly into the field locally, then hit the submit button, and then all the
data in those fields gets sent back to the mainframe.
Does that sound at all familiar to anybody? Does that sound like a form application? This
is where that came from. When the World Wide Web came up there were a bunch of dinosaurs
who said oh yeah, I remember that, and that's how we got a lot of what we've got today.
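In today's terms the shape is familiar; here is a rough sketch of the same round trip (the field names are invented for illustration):

    // Fill in the reserved fields locally, with no traffic back to the
    // "mainframe", then send everything in a single submission.
    const fields = { name: "", quantity: "" };
    fields.name = "Ada";      // typed locally at the terminal
    fields.quantity = "3";

    function submit(f) {
      return JSON.stringify(f);  // one transmission carrying all the field data
    }

    submit(fields);  // '{"name":"Ada","quantity":"3"}'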
Now, while all of that was going on one of the smartest guys who's ever lived, Doug Engelbart,
was working at SRI on the Human Augmentation Project. Like me, early on he recognized the
potential of computers, but unlike me he was able to do some really, really important work
on it, which he demonstrated in 1968 at the Joint Computer Conference. It was the most
amazing demo anybody had ever seen. He demonstrated hypertext, he demonstrated onscreen displays,
he demonstrated groupware, he demonstrated video conferencing, to-do lists, outline processing.
It just goes on and on and on, all these things that he wasn't just theorizing, that he was
doing and showing live.
About the only thing anybody paid attention to was the mouse. He also invented the mouse,
and he demonstrated that as part of this demo. He also had a chorded keyboard where he had
five keys that he could play like a piano and do keys very quickly that way. He had
five buttons here, three on the mouse, and he could type ASCII with both hands while
he was moving. His theory was: let me take a couple of hours to train somebody in the
system, and I can allow them to do amazing things and be incredibly effective. The world
decided it didn't want to work that hard, but it's just amazing what he did. His lab
was one of the first two sites on the ARPANET, which eventually grew to consume all of the
networks of the world.
At the time he was doing this, everybody else was on punch cards. You just can't imagine
what a profound shock this was to see him showing the future in San Francisco like that.
I highly recommend you see it ó it's available on YouTube. Go search for Doug Engelbart,
The Mother of All Demos. It's out there, and it's just amazing.
We still have not caught up to all of Engelbart's vision. What a number of people have done
over time is take some little bit of what Engelbart was doing but hasn't been fully
adopted yet and work on that. Some people have gotten rich and famous doing that. And
there's still a lot that Engelbart was doing that we haven't caught up with yet. You can
do that, too. I highly, highly recommend that you check out Engelbart.
OK, next is minicomputers. There were a number of developments that allowed for repackaging
some of the stuff that had been in the mainframes into a much smaller, less expensive form factor,
and these became minicomputers. A whole new class of companies started making these, companies
like Digital Equipment and Data General and Basic Four, a bunch of them, and created many
new markets for computers. In some cases they went into companies which already had mainframes
but there would be operations within them who found that the data processing departments
were not responsive to them.
When they first got the computers it was 'great, now we can do things because we have computers',
and then a surprisingly short time after that it's 'we can't do these things
because we have computers'. So people would try to get around the system by finding some
cheap box that they could put in their own department. In some cases they would end up
in small businesses and in small colleges and places that formerly hadn't been able
to afford computers at all. They started to show up in places that were new. Again, we
saw an explosion in CPU architecture, an amazing amount of creativity in the ways that the designers
and engineers came up with to get work out of these amazing little machines.
The next step was microcomputers. This began in a collaborative project between a memory
startup called Intel and a company that was making intelligent terminals, similar to the
ones we saw earlier, called Datapoint. Datapoint, at that time, was making their terminals completely
out of discrete components, and they were kind of expensive and they also got really
hot, because all of those components created a lot of heat. So they came up with a design
for a little CPU, and they figured if they had that CPU they could reduce the part count
on their terminal significantly, make it a lot cheaper, and make it work better. You'd
run a little program inside that little chip that would look at the keyboard, look to see
if anything was being pressed, and look at the serial port and see if any characters
were coming in, and based on what it was finding it would cause things to happen and then put
them on the screen or send them on the wire.
So Intel developed a device called the 8008, and Datapoint was very successful with that.
They also sold it to the public, and they were also very successful with that. It then
got improved into something called the 8080, and then another startup that spun out of
Intel called Zilog improved it again and called it the Z80.
In addition to that family, Motorola had the 6800, and there was another chip called the
6502 which was kind of based on that design but was much, much cheaper. So we started
seeing an explosion in 8 bit CPU architecture. They went into all kinds of devices, including
into computers. The Apple II had a 6502 in it, and it was the Apple II that put an end
to timesharing because the economies of personal computing were just so overwhelmingly better
than what you could get with a timesharing system. There were some trade-offs in that
you didn't have access to the network anymore, but if all you wanted to do was compute some
models you could do it much cheaper on an Apple II than you could through a timesharing
bureau.
This is the register set for the Z80. It had several 8 bit registers: A, B, C, D, E, H,
and L. It had two 16 bit index registers: IX and IY. It had a stack pointer that was
also 16 bits, and a program counter. All Z80s had this, and it was a very nice way of writing
programs in assembly language. To back up just a little bit… No, I'm not going
to back up, I'll do that later.
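Just to picture it as data, here is that register set written out as a plain JavaScript object; a description of what the talk lists, not an emulator:

    const z80 = {
      a: 0, b: 0, c: 0, d: 0, e: 0, h: 0, l: 0,  // the 8 bit registers
      ix: 0, iy: 0,                              // the two 16 bit index registers
      sp: 0xffff,                                // the 16 bit stack pointer
      pc: 0x0000                                 // the program counter
    };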
There was then the 16 bit generation. Again, a number of companies came up with very interesting
designs. The Motorola 68000, Zilog the Z8000, National Semiconductor the 32000, which I
think was the best of that generation in terms of instruction-set elegance. If you were writing
a code generator, or if you had to write in assembly language (which you shouldn't have had to anymore
by this time, but if you had to), it was clearly the best thought out of all of them.
Intel went in a different direction than the others, though. They came up with an architecture
called the 432, which they called a micro mainframe, where they tried to take a whole
lot of the functionality that you would expect to find in an operating system and push it
all the way down into the silicon, so garbage collection would be happening in the CPU transparently.
They designed it to be programmed exclusively in high level languages, primarily Ada. They
took Ada, which was a language being developed for the Defense Department, and extended it
to make it object-oriented. So they had that support in their CPU. Very forward-looking
design.
It was one of these designs which just went wild with all the things that they could do,
and they never properly accounted for the cost of all the things that they were doing.
So the basic CPU ended up having to be split onto two chips because it was too big to put
on one, and it turned out to be really, really slow. So it was very expensive, very slow,
it turned out that people couldn't figure out how to write programs for it, and it was
a total disaster for Intel. It looked like they were going to miss out on the 16 bit
generation and they were probably going to go out of business completely. So they had
to very quickly figure out: How do we get into the 16 bit race having stumbled so badly
on the 432? They decided to go back to the 8080, which had been, and continued to be,
a big success for them, and try to capture the business of the 8 bits by making a new machine
that was assembly language compatible with the old one.
This is a contrast of the Z80 register set and the 8086 register set. Very, very similar.
They changed some of the names of things, but basically it's very easy to see the Z80
heritage in the 86 instruction set. So they very quickly threw this thing together and
tossed it out onto the market. They didn't design it to be good, they designed it to
be compatible. It turned out that compatibility didn't really matter. The thing that ultimately
sold it was that it was cheap, so it went into devices like this one: the IBM PC.
IBM had looked at what Apple was doing, the effect Apple was having on their mainframes,
and they decided they needed to get into the personal computer business. They built the
machine and called it 'the' personal computer, they sort of took over the space, and they
put Intel's chip in it. Then they went to a company that was best known for its crappy
basic interpreter, a company that knew nothing about operating systems, and got them to make
an operating system for them. That was MS-DOS, and that went into that machine. There were
a lot of other companies who also made similar machines, and most of them failed. The only
ones that succeeded were the ones who made machines that were exactly the same as this
machine, what were called clones. The clones set the new standard for cheap computers ever
since.
That was followed by another generation, the 32-bit generation. There were lots of really
elegant designs out there that were really good. Intel decided, again, to play compatibility,
so for the 386 they put in a mode which simply took each of the existing registers and changed
its size from 16 to 32 bits. This was done again in the 64-bit generation, AMD this
time doing the design. It took each of the Z80 registers and pushed them out to 64 bits.
Without question, the worst CPU architecture we have is the Intel architecture. Intel has
always been very much aware of this and embarrassed by it. It improved a little with each generation
(the 386 is significantly better than the 286), but still at its root there's
an 8080 in there, and there's just a lot of awfulness as a result of that. To manage its
embarrassment, Intel has pursued a lot of other architectures that were actually quite
elegant. There was the 960 which was really good, there was the 860 which was also very
good, and the Itanium. But the market said no, we don't want that, we want the bad stuff,
we want the compatible stuff.
And who is making those decisions? It's programmers. Programmers say no, we don't want the machine
that is best for programmers, we want the crappy one because that's what we're used
to. That's the way we do it. So even though we think we're very knowledgeable about the
work that we do, as a community we are historically quite bad at understanding what we do and
what we need in order to do what we do effectively.
One of the reasons why microprocessors ended up destroying the mainframes and the minicomputers
and eventually became everything was because of a prediction made by Gordon Moore, who
was at Intel. He hypothesized that the complexity for minimum component costs has increased
at roughly a factor of two per year, and he just assumed that that would go on, perhaps
at a slightly slower rate. He thought that it would go on for ten years. It's gone on
for forty years now. It's just amazing that for every two years we get a doubling in the
efficiency of semiconductors.
This prediction was called Moore's Law, and it has held for an amazingly long time and
is likely to continue to go for a while further. It's not really a law, it's a prediction that
became a self-fulfilling prophecy. If you're an engineer at Intel, you're shown a point
on his graph and told this is where you need to be in three years, come back when you can
hit that point. They have to do amazing superhero kinds of stunts in order to accomplish that
level of performance, and when they turn it in it's like yeah, OK, we knew you were going
to do that. It's nothing special. It seems to me pretty thankless to be doing that kind
of engineering at Intel. It can't hold forever. Everybody knows that eventually Moore's Law
is going to fail, but it's still holding. It's got a lot of life in it yet.
The other thing we've seen is an end to CPU innovation. We used to see a lot of really
radical new designs happening all the time, but we don't see that happening anymore. Basically
we've got three architectures that we use for most of our stuff: virtually all the computers
are on Intel, most of the game platforms are on Power PCs, most of the mobile devices are
on ARM, and that's it. Nobody's making new stuff, nothing radical, it's just refinements
of stuff that's been happening for several decades.
We're doing even worse in operating systems. It used to be that every model of every machine
had its own operating system, and that came with a lot of obvious inefficiency, so we've
pushed that down and now we have just two: we've got Unix which was developed in the
'70s, and we've got Windows that was developed in the '80s. Of the two, Unix is obviously
the better one, but there's no innovation happening in operating systems. Basically
we've been rewriting the same systems for 40 years. That's just not where we do innovation.
Where we do innovation is in programming languages, and that's been going on for quite a long
time. In the '50s, everything was assembly language unless it was still punchboards and plugboards,
which were still going on too. There was interest in research in automatic programming, because
the perception was that programming's just way too hard and we need to figure out a way
to make it easier, so we'll make it easier by having the computer do most of the work
for us. We'd already seen with assembly language a start to doing that, and we wanted to go
further so that instead of writing a program you instead tell the computer what the program's
supposed to do and then the computer will write the program for you. Brilliant, that
should be easy. There was a lot of work and experimentation on that, and the result of
that experimentation was called FORTRAN.
You might be looking at FORTRAN and thinking this kind of looks like a program, and in
fact it is. Automatic programming didn't work, because it turns out the description of a
program in sufficient detail to do what you intended to do is still a program. What they
succeeded in doing was raising the level of abstraction. Instead of dealing with memory
cells and opcodes we're now dealing with things which look more like the problem domain, so
you can be much more productive in this language which is a really good thing. But this doesn't
replace programming, it's just another kind of programming. We've seen this happen over
and over. Right now there's a lot of interesting work happening in domain-specific languages.
There are some theorists who think that working in those very specific languages you're not
really programming, but in fact you are. You're just programming at a different level; sometimes
a more appropriate, productive level, which is good.
This is a FORTRAN program. FORTRAN arrived in the late '50s. Here we've got a subroutine.
Subroutines are very similar to modern functions. FORTRAN did not allow for recursion, but in other
ways it's very much like our current functions. The if statement looks a little odd. What
it means is: if N is negative or zero, jump to statement 10; otherwise, if it's positive,
jump to statement 8. Eventually FORTRAN came up with a better way of writing if
statements, but even this if statement looks quite a lot like the C if statement, and that
similarity is not accidental.
It also has a do loop which allows you to do something a certain number of times. In
this case, we will iterate from here to statement 9, each time varying i from 1 to N. That's
how you read that statement. Inside the loop it's taking the ith element of the array. Square brackets
hadn't been introduced into mainframe character sets, so that character wasn't available, so they used
parentheses for subscripting. But they did use the asterisk for multiplication. I don't know
if you ever wondered why we do that, why we don't use an X or a dot or something else
instead. The reason is that the early mainframe character sets didn't have those characters
in them; they were designed for business applications, with character sets that
looked like typewriters, and so FORTRAN established the convention that you use the asterisk to mean
multiplication. And that's still the case in virtually all languages now.
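Translated very loosely into modern terms, the control flow being described looks something like this sketch; the names and the arithmetic are invented, only the shape matches the slide:

    function subroutineLikeTheSlide(data, n) {
      if (n <= 0) {                       // the "negative or zero" branch: statement 10
        return 0;
      }
      let total = 0;                      // the positive branch: statement 8
      for (let i = 1; i <= n; i += 1) {   // the DO loop, varying i from 1 to N
        total = total + data[i - 1] * 2;  // parentheses became brackets; * is still multiply
      }
      return total;                       // the loop ran "to statement 9" in the FORTRAN
    }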
Another language was COBOL. COBOL was developed by Grace Hopper, who you remember earlier
discovered the bug. COBOL was an attempt to make programs look more like English. At first
the hope was that anybody would be able to write business applications, but that turned
out not to be the case. Then there was a secondary hope that at least anybody ought to be able
to read one of these programs to understand what it does. This was particularly hoped
for by management, because management had little trust or understanding of what programmers
were doing, and the thought was that if they could read what they were doing then it would
be a little easier to keep control over the operation. But that didn't really work either,
because there's a lot of subtlety in programming in any language which is not readily apparent
to most people.
BASIC was a slightly later language. It was designed specifically for timesharing. It
was developed at Dartmouth College by Kemeny and Kurtz. They did a really clever thing:
they started with FORTRAN and stripped it down and stripped it down into the simplest
possible language so that anybody could use it without much training at all. It was very
quick to learn. They also came up with a clever way of editing programs. They came up with
the line number, so you give every statement a line number and if you want to change that
statement you simply type the line number again and the new statement that replaces
it. They used the same line number as the destination for jumps, so there's a certain
kind of economy there.
Here we have a Hello World game where at line 20 the program will print the string: 'What's
your name?' and then read from the terminal whatever you typed followed by carriage return.
BASIC, even though it was a really primitive language, had the best string processing and
the best text processing of any language in its generation. It hardly does anything, but
it just does the right things. It's got a way of representing a literal string, a way
of concatenating a few together, a way of teasing them apart, inputting them and printing
them out. That's all you need, and it did that. That's been followed by virtually every
language since then, and in that sense BASIC was incredibly influential.
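Here is a small JavaScript sketch of that editing model (not real BASIC; the statements are just strings): the program is a map from line number to statement, so retyping a number replaces that line, and the same numbers double as jump targets.

    const program = new Map();
    program.set(10, 'PRINT "HELLO"');
    program.set(20, 'INPUT "WHAT IS YOUR NAME"; N$');
    program.set(20, 'INPUT "WHAT IS YOUR NAME?"; N$');  // retyping line 20 replaces it

    // LIST: the statements come back in line-number order.
    for (const [line, statement] of [...program].sort((a, b) => a[0] - b[0])) {
      console.log(line, statement);
    }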
The other thing BASIC did was it sort of crystallized the input/output relationship. Here I want
to interact with the user so I will print and I will input, which means that my program
stops until the specific thing that I asked for is delivered by the user. So it's an extremely
modal thing, in that the operator has to somehow figure out how to convince the program to
get to the place where it's ready to ask for the thing that the user wants to tell it. Later we
discovered that this was a really bad way to write programs, but it took a long time
to figure that out.
BASIC influenced a number of other languages. There was Business BASIC that ran on the small
business minicomputers, which added database functionality to their file systems so you
could store values and retrieve values by keys into files and pull them out and do operations
on them. It was much more pleasant than COBOL for most people, and it was very cheap and
very popular. Microsoft was started on Microsoft BASIC, and that eventually evolved into Visual
BASIC which for a few years was the most popular programming language in the world, although
it has been replaced by another that we'll get to a little bit later.
A really important language which came out in 1960 was ALGOL 60. It is the best design
by committee in the history of programming languages. A bunch of really smart guys got
together and came up with a language for use in expressing algorithms for publication,
but while they were at it they also made it actually work in practice. So it defines a couple of languages, a reference language and a publication language, but it worked, and
it was popular within its sphere. There were a number of machines that were designed specifically
to use ALGOL as their basic language. It introduced the notion of structured programming and blocks.
We have blocks in modern programming languages, most of them use curly braces, and that came
from ALGOL. ALGOL used the words begin and end instead of curly braces because again,
curly braces weren't available at that time, they were invented later. But that's where
we got that stuff. It was an extremely important language, a very influential language, but
unfortunately there were lesser languages which tended to be much more popular.
One big debate that happened partially as a consequence of ALGOL was the structured
programming debate. Dijkstra wrote a famous letter entitled 'Go To Statement Considered Harmful',
and Dijkstra claimed that programs like this are just too hard to follow when they get
complicated. You got the things bouncing here, there, there, there, you can't keep track
of what the program's doing, it doesn't scale sufficiently well to allow us to write programs
of sufficient complexity, and that we would be better off if we simply stopped using GOTO
and used the other features that ALGOL had provided. It turned out he was right. But
at the time this was an extremely contentious idea, that programmers would have an easier
time managing the complexity of their programs if they don't use this feature.
Who was most enraged by this suggestion? Programmers. This debate went on for literally a decade,
for two decades, for a generation, arguing about whether GOTO should be eliminated or
not. Ultimately we got rid of it, and that was the right thing to do. I think it's not
coincidental that it took a generation to do it, because basically we had to come up
and train a whole new set of people who were not stuck in the previous idea. Again, who
better should have understood the value of structuring your programs in such a way that
they could scale better? Programmers, of all people, should have understood the value of that argument, and programmers were the least able to accept it.
From that generation, FORTRAN, COBOL and ALGOL, each of these languages was pretty specialized.
In particular, FORTRAN was intended just for scientific processing, and COBOL was intended
just for business processing, and there was interest in trying to make a common language
that could do both. At that time they didn't recognize that there were other things, as
well, that would actually dwarf both of those applications, but it's still early yet. There
was PL/1 developed at IBM, there was the Combined Programming Language that was developed in
England, ALGOL 68 that was developed in Europe; they all wanted to be the über-language
that would do everything. All these languages had some partial success, but none of them
fulfilled the promise of being the language that would bind us all.
In fact, there was a reaction after that to try to scale it back and come up with simpler
languages. There was a dialect of CPL called BCPL, the Basic Combined Programming Language, which was
a simplification similar to what had happened with BASIC and FORTRAN. Strip it down, strip
it down, make the simplest possible language that works, and then you've got a language
that works. BCPL was very successful within its niche, and we'll see more of that in a
moment.
Then the designers of ALGOL looked at what was happening in ALGOL 68 with horror and said
no, that's not the way. ALGOL actually got it right, and there are some who consider
ALGOL an improvement on most of its successors, which in fact was true.
Taking that approach, Wirth came up with Pascal, which was extremely popular. He designed it
as a teaching language, but a lot of people put it to work as a general programming language.
Unfortunately there were a couple of significant design problems which interfered with its
larger mission. One was that it wasn't modular enough, so it assumed that a whole program
was one unit, and that turned out not to work practically. A bigger problem was its type
system. Types were intended to make programming easier, but in this case they made programming significantly harder, because the dimension of an array, the number of elements in it,
was considered part of its type. So if you wanted to write a function that could deal
with an array, it could only deal with arrays of one fixed size, and that turned out not
to work very well.
BCPL inspired Ken Thompson to make another language called B, which basically took the
good ideas that were in BCPL but gave them FORTRAN syntax, which wasn't necessarily an improvement. But he did, so that's what it got. Then Dennis Ritchie took B, which was mainly a typeless language, took some of the good ideas in Pascal, being more selective about what he borrowed from its type system, and made C. C was incredibly popular, and it's become the most important implementation language of all. Virtually all languages since then are either based on C or are implemented in C. C has been
an extremely successful language.
While all this is going on, there's still assembly language happening. Again, there's the debate: which should we be using, assembly language or high level languages?
There were people arguing on both sides: that the high level languages make you much more productive, that the number of lines of code you can write in a day is pretty much constant, and if you're writing in a high level language those lines get more work done than lines of assembly language. That's the basic argument for the high level languages. The
argument for assembly language was… I don't know. There wasn't a good reason,
but they'd argue about it, and they'd argue on and on and on.
Now, it turned out there was a good reason for assembly language: you get systems like
this. This is the Atari 2600, the VCS Video Computer System. This is the first computer
most people had in their house. It had a 6502 in it, it was really cheap, and it ran games
that you could play on your TV set. It was impossible to program this machine in any
high level language, you had to be working in assembly language, and I'll show you why.
The machine contained a 6502 CPU. Actually, 6507, but it's the same instruction set, which
had a very small number of registers. It had an 8 bit accumulator, two 8 bit index registers,
an 8 bit stack pointer, a flags register, and a 16 bit program counter. That was it,
that's all the registers you'd get.
There's no code generator that knows how to write efficient code with that. Worse than
that, there was no software or firmware built into the console, so the only code there that
could run was what was supplied on the game cartridge. People would go to Sears, buy the cartridge, plug it in, and play. The cartridge had 4K in it, and that's not a misprint. 4000 characters were on the cartridge, so all of your program, all of your static data, all of your visuals, bitmaps, text, everything had to fit in that 4K. Later they came up with an 8K cartridge, but 8K isn't enough more than 4K to make writing this in C worth thinking about.
In addition, the console has some RAM in it. It has 128 bytes of RAM. Again, I'm saying
this very precisely: 128 bytes. You can count them: 1, 2, 3, up to 128, and that's
all you get. That has to include all of the dynamic state of the game including any dynamic
bit of imagery that you're getting ready to hand to the video shift registers, any music
that you're playing. You've got to be keeping track of the note list and the durations and
all that stuff, that's got to be in RAM. Your subroutine stack is in that same RAM. The
guys who could write for the VCS were heroes.
[laughter]
They would do amazing stuff. There were like 30 variations in the tank cartridge. You can
toggle the console button and play 30 variations of this game where these two guys go round
and shoot at each other, and it was all implemented in 4K, and it's in color. When the game's not running it will cycle the colors on the TV so that you don't get burn-in on the phosphor of your set. All of that is happening in the cartridge. 4K. 128 bytes. Amazing.
So you had to do that in assembly language, there was just no way you could do it any
other way. The incentive is you can get your program in millions of homes, so that's a
good thing to do. You can't get them there writing in FORTRAN. Specialized systems like
this just weren't compatible with high level languages, so you had these throwbacks which
kept assembly language useful long after high level languages became dominant.
ALGOL went on and had some other influences. ALGOL was in 1960. In 1967 Simula was developed
in Norway. Simula was the first object oriented language; Simula added classes and objects
to ALGOL. That language had a big influence on Alan Kay, who went to Xerox PARC, and in 1972 he started working on a programming language for kids based on the object-oriented idea.
The name of the language was Smalltalk. Alan and his lab spent a lot of time working on
this language. It went through several generations, a lot of testing, brilliant work, great implementations.
They published it eight years later: Smalltalk 80. A great language, the first truly modern
object oriented language.
Part of the motivation was a system they called the Dynabook, which was going to be a portable personal computer. As part of that work, they took what Engelbart had
been doing in timesharing systems and applied it to personal computers. They adopted some
things that Engelbart did very obviously, things like mice, and his approach to interactive
displays. They took it a little bit further and came up with bitmap displays with overlapping
windows, they invented window systems for that. Basically the modern user interface
was developed at Xerox as part of the Smalltalk project. Also at Xerox at the same time they
came up with local area networking and Ethernet and laser printers and a whole lot of stuff
that we take for granted today. Xerox tried to commercialize all this stuff but never
really understood what their labs had developed for them, so those projects failed.
Smalltalk itself looks a little alien. Here we've got a statement, and we'll set the result to either the string 'greater' or the string 'less or equal', depending on the relation between
A and B. The way they would describe this working is they would send the greater than
message to A, passing B as a parameter, and the result of that will be an object which
is either the true object or the false object. Whichever it is, they will then call that object with the ifTrue: ifFalse: method, and depending on which object it is, it'll evaluate one branch or the other. And they'd use this language where, instead of saying we're invoking a method (that terminology hadn't been invented yet), they'd say we're sending a message.
I don't know why it is, but a lot of programmers just couldn't get used to this syntax. What's
going on there is really a method invocation, but it's not a dot and a name and then some
parentheses, it's these key words with colons and then values in between them. This is actually
more readable if you understand what's going on, because it's self-documenting to the extent that it tells you what each parameter is doing, which is something we don't have in the conventional notation: all you have is a comma, and it doesn't tell you what anything is. So this may be a superior notation, but it was profoundly rejected. By who? By us,
by the programmers, because we couldn't understand it.
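To make the comparison concrete, here's a rough sketch of that same decision in the dot-and-parentheses world; the helper ifTrueIfFalse is a made-up name, there only to mimic the Smalltalk shape.

    // Conventional notation: operators and a ternary, with anonymous,
    // comma-separated values.
    var a = 3, b = 5;
    var result = a > b ? "greater" : "less or equal";

    // Closer to the Smalltalk shape: pass the two outcomes in as functions.
    // The name ifTrueIfFalse is invented for illustration.
    function ifTrueIfFalse(condition, whenTrue, whenFalse) {
        return condition ? whenTrue() : whenFalse();
    }
    var result2 = ifTrueIfFalse(a > b,
        function () { return "greater"; },
        function () { return "less or equal"; });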
As a consequence, Smalltalk never made it commercially. But despite that, Smalltalk
has been extremely effective as an influencer. Under the influence of Smalltalk, we've seen
C evolve into Objective C, C++, and Eiffel. And then C++ inspired Java, which inspired
C#. So basically every language since then has taken ideas from Smalltalk, combined with
the crappy syntax of C, and that's basically the modern world.
All this took a long time. From Simula in 1967 to Java in 1995, and now many years later still, it took a while for object orientation to become just the way you do things. Again, there were
debates: we don't need objects, objects don't make sense, they're just a lot of overhead,
they don't really do anything for you. Who was making those arguments? Programmers were
making those arguments, thinking they knew what objects were but having no experience with them. But eventually we all figured it out and we took that next step
forward.
Software development comes in leaps, and our leaps are much farther apart than the ones the hardware experiences. Moore's Law lets the hardware leap every two years; we leap more like every
twenty years. Again, basically we need a generation to retire before we can get the good new ideas
going, so despite the fact that we're always talking about innovation and how we love innovation
and we're always innovating, we tend to be extremely conservative in the way we adopt
new technology.
Smalltalk had some other influences as well. One of them was a language called Self, which
was also developed at Xerox PARC and eventually moved to Sun Labs, worked on by Ungar and
Smith. Brilliant language. It took the Smalltalk idea and took the classes out, so instead
of having classes which define sets of instances you just have the objects themselves, and
you allow one object to inherit from another object. That greatly simplified the language.
Part of their motivation for doing that was to allow them to go faster; they were trying
to figure out how to make Smalltalk or a Smalltalk-like language run as fast as C.
The thing we know Self best for is the stuff they did in performance. They did amazing
work in garbage collection systems: generational scavenging came out of this language. The
hot-spot technology that made Java acceptable came out of the Self project. The V8 system
that's being used at Google also came out of the Self project. So Self was a big influence
on performance. But also it did a really good job of demonstrating the idea of prototypes,
where you don't need a class, you simply have an object that inherits from an object. That
turns out to be a really powerful idea. It's a more recent idea than classical object orientation, which is why it was rejected out of hand by most programmers: again, it's new and unfamiliar. If I don't know about it and I'm such a hotshot guy, then it can't
be important. It turned out it's very important, and we saw it first in Self.
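JavaScript inherited that idea, so a small sketch of it is easy to write down; the object names here are invented for illustration.

    // One object inheriting directly from another object, Self style.
    // No class anywhere.
    var cat = {
        legs: 4,
        speak: function () { return "meow"; }
    };

    var tiger = Object.create(cat);     // tiger's prototype is cat
    tiger.speak = function () { return "roar"; };

    tiger.legs;      // 4, inherited from cat
    tiger.speak();   // "roar", overridden on tiger itself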
That brings us to the Actor Model. This is another kind of indirect spin off from Smalltalk.
Smalltalk would talk about how you send a message to an object. Carl Hewitt at MIT listened to the way they were describing it and said: well, that's not what you're doing. These
are just invocations, you're not sending a message. But what if you were sending a message?
What if each of these objects was an independent process, let's call it an actor, and the only
way they could communicate with each other is to send messages? The messages will be
asynchronous, you just send the message like you're sending an email, and every actor will
have a queue of incoming messages that it can then process in order. What kind of programming
model would you have?
It turns out you would have a model with really interesting properties. It scales really well
because you can take all these things and put them on one CPU or put them on a million
CPUs and they work exactly the same. It also had really interesting security properties
in that each of these was a separate process that was completely sealed, so nothing could
interfere with it. So they all protect their boundaries. Any actor can only talk to other
actors that it has knowledge of; if it doesn't know their address it can't send them an email.
It beautifully demonstrated the capability principles, and it's really good stuff.
It turns out if we step back from things, a lot of things that we're already familiar
with are already in the actor model. For example, modern desktop applications are all built
around an event loop. That event loop looks very much like the message queue that an actor
would have, so an application is an actor. Looking at the web, a web service is an actor.
It's something you send a message to and it may send you a message. So the actor model
is more familiar than we may realize, but it was still pretty new and again, too radical
for most programmers.
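Here's a toy sketch of that shape in JavaScript; everything in it (makeActor, send, the greeter) is invented for illustration, not a real actor system.

    // A toy actor: a private queue of incoming messages, and send() as
    // the only way to reach it. Delivery is always asynchronous.
    function makeActor(behavior) {
        var queue = [];
        var scheduled = false;
        function drain() {
            scheduled = false;
            while (queue.length > 0) {
                behavior(queue.shift());      // process messages in order
            }
        }
        return {
            send: function (message) {
                queue.push(message);          // like dropping mail in a box
                if (!scheduled) {
                    scheduled = true;
                    setTimeout(drain, 0);
                }
            }
        };
    }

    var greeter = makeActor(function (message) {
        console.log("hello, " + message.name);
    });
    greeter.send({name: "world"});            // no reply, no return value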
But there were a couple of programmers at MIT who wanted to understand it better, so
they took Hewitt's actor model and implemented a part of it in LISP and created another language
that looked a lot like LISP but had slightly different semantics. The thing that they discovered
was the actor dispatch model looked exactly like their function dispatch model, and that
functions and messages were the same thing, which completely surprised them. They weren't
expecting that at all. They kind of refined that idea and came up with the language called
Scheme, which is sort of the perfection of LISP.
LISP was the artificial intelligence language that had been developed at MIT in 1958, and
they got it right. Part of what happened was that they needed tail recursion in order to let you keep calling things that you never expect to return, without running out of memory. It also allowed for lexical closures, so that if a function is nested inside of another function, it gets access to everything the outer function has, even if the outer function has already returned. So there are all these really intricate nested actor patterns that fell out of the work, really brilliant stuff. Scheme then went on to influence a lot of language designers. It also influenced LISP, so Common LISP owes a lot to things
that Scheme figured out.
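That closure behavior is easy to sketch, because it's exactly what JavaScript ended up with; makeCounter is a made-up name.

    // The outer function has returned, but the inner function still has
    // access to its variable. That's a closure.
    function makeCounter() {
        var count = 0;              // survives after makeCounter returns
        return function () {
            count += 1;
            return count;
        };
    }

    var tick = makeCounter();
    tick();   // 1
    tick();   // 2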
The actor model also influenced the design of a data flow language called Joule. Joule
had been designed specifically for security applications. There was then another project
that took Joule, gave it Java syntax, and created a new language called E. E is the
language that demonstrates the object capability model, which turns out to be the savior of
secure systems going forward. There's a lot of work now in trying to make JavaScript into a secure language, and it's all deeply informed by the work that happened in E. You may have heard of Caja; it's something that we're using here at Yahoo! in order to secure applications. Caja was developed at Google based on work that had happened in E.
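The heart of the object capability idea fits in a few lines. This is only a sketch, and every name in it (makeAppendOnly, makeLogger, the file object and its methods) is made up: authority is just a reference, so you limit damage by limiting what you hand out.

    // The logger is handed an append-only wrapper, never the file itself,
    // so it can add lines but it cannot read them back or erase them.
    function makeAppendOnly(file) {          // file: a made-up object with
        return {                             // read() and write() methods
            append: function (line) {
                file.write(file.read() + line + "\n");
            }
        };
    }

    function makeLogger(sink) {              // sink is all the authority it gets
        return {
            log: function (message) {
                sink.append(new Date().toISOString() + " " + message);
            }
        };
    }

    // var logger = makeLogger(makeAppendOnly(someFile));
    // logger can append to the log; it holds no reference that would let
    // it read, rewrite, or delete anything else.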
OK, let's take a little detour now. Xerox had done this brilliant work with Smalltalk
and the Dynabook and was unable to commercialize it. Steve Jobs got a demo of it and immediately
understood the potential of this stuff, so he took it back to Apple and eventually Apple
produced a device called the Macintosh. It had a 68000 processor in it and 128 kilobytes of RAM (so it was 1000 times better than a VCS), but it was still too small, so initially
you couldn't program this machine in anything but assembly language. It had the bitmap displays
and a mouse and a lot of the stuff that had been demonstrated at Xerox, but it was still
hard to program, particularly for programmers who were used to the BASIC model where you input and print. You don't do that on this device, because the program has to be running
all the time and the user has to be able to click anywhere and have it be meaningful.
So the old stop and wait for input model just doesn't work. Apple gave people advice on
how to write their applications, but it was really difficult to get programmers doing
that.
So Bill Atkinson came up with a really interesting application. He had written MacPaint and QuickDraw,
and he came up with this little database tool which he thought was going to make it easy
for people to make applications. They added a little scripting language to it and then
suddenly stuff that had been so difficult about Mac programming became easy enough that
non-programmers could do it. That was called HyperCard. For a while HyperCard was free on all Macintoshes, and it was extremely successful. It was imagined to be the future of software,
that all applications from this point on were going to be HyperCard stacks. HyperCard was
going to be the way everything was going to be made going forward.
What is HyperCard? HyperCard's basically a file format of stuff that can be displayed
visually, and it has a very small set of types in it. There's the stack, which can contain
any number of backgrounds and cards. There's a background which contains an image, and
maybe some buttons and fields that get shared by every card that has that background. You
can have cards which can use one of those backgrounds and can also have buttons and
fields on them. A button is a clickable area that can have text or an image, and a field is a thing with text that you can type into. Many of these things we have on
web pages, but this was an earlier model.
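If you sketched that containment model as JavaScript object literals, it might look something like this; the field names are entirely made up, just to show the shape.

    // A made-up sketch of the HyperCard containment model: a stack holds
    // backgrounds and cards; cards refer to a shared background; both
    // backgrounds and cards can carry buttons and fields.
    var stack = {
        backgrounds: [{
            name: "plain",
            image: "paper.bitmap",
            buttons: [],                          // shared by every card below
            fields: []
        }],
        cards: [{
            background: "plain",                  // which background to use
            buttons: [{name: "Next"}],            // a clickable area
            fields: [{name: "Name", text: ""}]    // a thing you type into
        }]
    };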
The whole thing was a little IDE in that you could type in Command B and that would make
a new button and open up a dialog. Then you could give the button a name, and you could configure it, telling it what kind of button it was going to be. You could also then click on its link, and that would take you to another page in which you could set its script.
What did its script look like? Well, its script could look something like this. You'd say:
'on mouseUp'… Some of you might be going: woah, 'on mouseUp'? That sounds eerily familiar.
This is where all that stuff came from. HyperTalk wanted to look like English. Their motivation
was a little bit different than COBOL's, but similar. They wanted to make the language
easy to teach by making it look familiar. Here we say 'set the location of card button
x to pos', whereas a modern language would probably write something like 'card.buttons[x].location = pos'.
Both would do the same thing, but that's how you write it in HyperTalk. HyperTalk is trying
to look wordy.
One of the disadvantages of HyperTalk is that you can't ever see the whole program, because
all of the handlers are nested inside of their individual components. So you never get the
big overview. But the plus side of that (because everything's a trade-off) is that
if you put a script in a button and then move that button onto a different card or into
a different stack, that button will still work because the script travels with the button.
Also there was a delegation model in that I could put this script in a button and then
if I click on the button then this script will run. Or I could put it in the card that
the button is on and then if the button doesn't handle it, it delegates to the card. Again,
that might seem very familiar to some of you, and we'll see more of that in future evenings.
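It's essentially the way clicks bubble in the browser today; here's a sketch, with made-up element ids.

    // If the button doesn't handle the click itself, the click falls
    // through to the card (here, a containing element). Ids are made up.
    document.getElementById("card").onclick = function () {
        console.log("the card handled the click");
    };

    document.getElementById("button").onclick = function (event) {
        console.log("the button handled the click");
        event.stopPropagation();    // remove this and the card gets it too
    };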
HyperCard had stacks of cards containing buttons, images, text fields. It didn't anticipate
color. It was strictly a one-bit black and white system, so it didn't always look very
good. That may have been because Bill Atkinson was colorblind and didn't see the need for
it. It had things you could click on and then go to something else, go to a different stack
or a different card. But it didn't allow you to put the links inside of the text fields;
that was an obvious thing, but they just never figured out how to express that. Probably
the biggest limitation was that it didn't anticipate networking, so everything was expected
to be distributed on floppy disk. Also it had a terrible security model, because if you loaded someone's stack onto your machine and it came from an evil person, they owned your machine. It didn't protect you from that kind of stuff. Almost overnight HyperCard
just sort of collapsed. It had been the biggest thing anyone had ever seen and then it virtually
disappeared.
Winding back a little bit more, going back to Engelbart's system. Engelbart's system
at SRI was not just a demo, it actually worked, it was a real system. SRI sold his system to a timesharing company called Tymshare. Then Tymshare was sold to McDonnell Douglas,
and then McDonnell Douglas buried it, so unfortunately that stuff didn't go forward. It died inside
that corporation.
Engelbart was a big influence on Ted Nelson, and Ted Nelson came up with an extremely ambitious
hypertext system called Xanadu. In fact, Nelson invented the term 'hypertext'. His system
had bidirectional links in it, and transclusions, and inclusions, and a payment system, and
all kinds of stuff that he considered to be necessary. He had a brilliant team of engineers
building this stuff, but they never finished it. Xanadu had a small influence on HyperCard.
Basically that influence was the name ó the 'hyper' in HyperCard was lifted from the hyper
in hypertext, but that was about the only similarity.
Tim Berners-Lee's World Wide Web was also influenced by Xanadu, except he really didn't know very much about Xanadu, and he knew nothing about Engelbart. But as a result it was really simple, because he had never thought of all the really complicated things he could do, and because it was really simple he was able to implement it. It turns out
that getting the thing done counts more than just about anything else.
The World Wide Web itself was influential. After Sir Tim published his specs, a lot of
people started imitating it. The most famous of those was the Mosaic project at the University
of Illinois at Urbana-Champaign. They developed the Mosaic browser. At that time there were
a handful of protocols that were all contending to be the popular front end of the Internet, and this team couldn't decide which of those was going to win, so they made a program that
could implement all of them. It could do Gopher, and WAIS, and everything. They called it Mosaic
because it was made up of all those different pieces. It turned out that the web component
was the one that people liked, because they added an image tag to it so that even though
it wasn't what everybody wanted, it could be made to look exactly like what everybody
wanted, and that was enough to send it to the moon. So Mosaic and the Web became extremely
popular after that.
Then that team split into two separate start ups, Netscape and Spyglass. Netscape announced
that they were going to destroy Microsoft, so Microsoft bought Spyglass and turned it
into Internet Explorer. Netscape had an idea to take the ideas in HyperCard, particularly
that easy to use program model that was event driven, based on buttons and fields, and put
that into the browser. They hired this guy to do it: that's Brendan Eich. Brilliant
guy. They hired him out of Silicon Graphics, they asked him what he wanted to do, and he said he wanted to write a Scheme interpreter, because he'd been reading about Scheme and thought that it was really cool. So they said great, they hired him, and then said 'but you can't
do Scheme, that's just too weird looking, people won't like that. Make it look more
like Java.' So he designed a language that looked more like Java.
Basically, he took these components. He took the syntax of Java, he took the function model
of Scheme (which was brilliant, one of the best ideas in the history of programming languages), and he took the prototype objects from Self. He put them together in a really interesting
way, really fast; he completed the whole thing in a couple of weeks. It's a shame that he
wasn't given the freedom that Xerox had to spend a decade to get this right. Instead
of ten years it was more like ten days, and that was it. I challenge any language designer
to come up with a brand new design from scratch in ten days and then release it to the world
and call it done and see what happens with that.
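You can see all three ingredients in a few lines of the result; this is only a sketch, with made-up names.

    // C/Java syntax: curly braces, semicolons, dots.
    // Scheme's functions: functions are values and they close over variables.
    // Self's prototypes: objects inherit directly from other objects.
    var animal = {
        describe: function () {
            return this.name + " has " + this.legs + " legs";
        }
    };

    function makeAnimal(name, legs) {        // a function used as a factory
        var it = Object.create(animal);      // prototypal inheritance, no class
        it.name = name;
        it.legs = legs;
        return it;
    }

    var spider = makeAnimal("spider", 8);
    spider.describe();    // "spider has 8 legs"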
One of the consequences of it was that there are parts of it that are just awful. If they'd
had more time they probably would have recognized that and fixed it, but they didn't. Netscape
was not a company that had time to get it right, which is why there's no longer a Netscape.
[laughter]
But despite that, there is absolutely deep profound brilliance in this language, and
this language is succeeding in places where many other languages have failed because of
that brilliance; it's not accidental that JavaScript has become the most popular programming
language in the world.
Many people may not remember that the language of the browser was supposed to be Java. Java
Applets in 1995 were the hottest thing anyone had ever seen, and they were going to rule
the world. They were hotter than HyperCard. It was going to be big. And the Java community
doesn't remember this, but Java failed on its face, hard, total, complete failure. They
managed to find a niche on the server side, so there's good in Java and it survives. Good
for them. But the thing that Java was intended to do, the thing they told the world this
is what it's all about, Java totally failed. In that same venue, JavaScript is succeeding brilliantly. So the argument that it's just luck, that it's only doing so well because it's in the browser, completely ignores history, because Java was in there first and got every break. Just being in the browser was not enough to assure success. We'll be
talking more about what the language got right and wrong in future episodes.
In 1969 Jean Sammet wrote a brilliant book called Programming Languages: History and
Fundamentals, which was basically a survey on all of the work on automatic programming
that had happened in the '50s and '60s. She covered over 100 languages, which she describes
in her book, because that was a time of amazing innovation. I'm very happy that we are, again,
in another of those periods of innovation; we've got a lot of interesting languages now,
including some pretty wild designs like Haskell, Erlang, and Scala, which are all getting attention.
And there are lots of other languages which are also getting attention.
One thing that's different now from the '50s and '60s is that there are a lot of computers out there, and there are a lot of people writing programs now. Even if you have a minor language, it's possible to get a community of people big enough to do useful things, to do a lot of group work. You've got a group large enough to justify writing books, which was something
we didn't have back in the '50s and '60s. So I think this is a great time to be a programmer.
We have lots of choices, and we need to be smart about making those choices and be open
to accepting the new ideas, because there are a lot of new ideas out there that we shouldn't
be rejecting just because they're unfamiliar and we don't see the need for them. There
are actually a lot of good ideas in all of these languages, not least of which is JavaScript,
which will be the subject going forward.
There's much, much more of this history. The history of computers and software and programming
languages is incredibly rich. I was only able to scratch the surface of it in these two
hours, but I highly recommend that you take a deeper look at it. Next time we'll come
back here and do Chapter 2, and we'll look at JavaScript, I promise.
Thank you, and good night.
[applause]