Thursday, March 31, 2011

CS THEORY: Memorable numbers

See, I told you this would be the title of the next article.

I've decided to start with something that is more fundamental to computer science, and less fundamental to computer architecture. I think talking about the concepts behind computer memory from a theoretical, mathematical kind of perspective will be more valuable to more people when starting out than talking about binary mathematics. So for now, I'm not going to talk about binary numbers (although you're welcome to imagine all the numbers I do talk about as such if you like). Instead, I'm going to talk about regular numbers and memory cells.

I kind of struggled with how I wanted to explain computer memory this time. Last time I talked about a lineup of bathtubs filled with binary digit buckets. That was pretty weird, in retrospect. Let's move away from that and talk instead about regular numbers and boxes that can hold those numbers.

Imagine a spreadsheet. Imagine the giant grid that extends on forever to the right and down, with the alphabet along the top naming each of the columns, and the numbers along the side naming the rows. Now, erase columns B-infinity, so we're left only with column A. This is a pretty good way to visualize computer memory. A single, numbered list of boxes that can each hold a number. It doesn't really make a difference whether you imagine this numbered list of boxes as being vertically or horizontally oriented, but for the sake of space efficiency, I'm going to write it out as a horizontal row of boxes of numbers. Here it is:

[_] [_] [_] [_]

Imagine that as 4 boxes, none of which currently has a number in it. I just got tired of typing those out after 4 boxes, but in real computer memory there are billions. Yeah, billions. That's a lot. One of the things about these memory boxes, or "memory cells" as I'm going to call them, is that they *need* to hold numbers. They can't actually be empty. So let's put some numbers in there:

[83] [101] [116] [104]

Pretty good numbers, right? Let's list some things we could do with those numbers. Well, we could add two of them together, we could multiply two of them together, we could do any kind of math on them that we want. Those are pretty predictable uses of numbers. One thing about computers, though, is that the computer doesn't care what we do with those numbers or what we pretend those numbers mean (unless we ask it to do math on them, in which case it will treat them as plain old numbers and do the math correctly).

In fact, those numbers I listed up above are not totally random. There is a way of pretending that numbers are actually letters and symbols called ASCII that says that those numbers I chose up there actually stand for something else. According to ASCII, the 83 stands for 'S', the 101 stands for 'e', the 116 stands for 't' and the 104 stands for 'h'. So [83] [101] [116] [104] can actually stand for "Seth" (that's my name, by the way).
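
If you have Python lying around, you can check this mapping yourself. Python's built-in ord() and chr() functions use the same numbering as ASCII for these characters, so this little snippet (purely illustrative, you don't need to follow along) shows the translation going in both directions:

name = "Seth"
numbers = [ord(letter) for letter in name]        # letters -> numbers
print(numbers)                                    # [83, 101, 116, 104]

letters = "".join(chr(n) for n in [83, 101, 116, 104])   # numbers -> letters
print(letters)                                    # Seth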

Now to relate this to the computer you're reading this on right now. Your computer has billions of memory cells, each with some kind of number in it, and those numbers can mean things that humans can understand, like letters we can read, or colors we can see, or sounds. Since I typed out this word "Seth" and your computer downloaded this webpage so it could be displayed on your screen, somewhere in your computer's memory is the sequence of numbers: [83] [101] [116] [104].*

From today's article I hope you got an idea that there is a series of memory cells in your computer and each cell holds a number in it. These numbers can be meaningful to humans in ways other than just how we understand numbers as numbers. We used the example of the ASCII encoding for the word "Seth" as a way to understand how a series of numbers can really be understood as a series of letters we can read.

Next time we talk about CS THEORY we'll talk about how memory cells are organized.

*Ok, so this webpage was probably encoded with some other number-to-text system than ASCII, but you get the point.

Wednesday, March 30, 2011

Next time, on ...

Hey, everybody. I haven't made any updates in the last week or so. I recently heard from a couple of smart people who have read through my posts so far that I'm not doing a good enough job of explaining everything plainly. Because of this, I'm going to do a "reboot" of this series of articles on the basics of computer science.

In writing this series I was attempting to do the opposite of what is usually done in computer science education. Usually, you start with the basics of creating a working computer program in a computer language like C or Java, and then eventually reveal the details under the hood that allow that program to be run on the computer.

I was trying to do the opposite: talk about how things like binary math and computer memory work under the hood, but without giving them enough context or explanation. In my reboot, I'm still going to work from the bottom up, but I'm going to interleave a few different types of topics as I go. I'm also going to re-write some of the topics I've already covered, trying to give them a better treatment than I did the first time. I'm not going to erase anything that has come before, because I think I wrote some good things there that may be worth revisiting as a reader after I've given more background.

Like I already mentioned, I'm going to group my posts under a few large umbrella topics. At the beginning of each post title I will say in capital letters which of these umbrellas that post falls under. For example, my first post "Of bits" would be named "MATH: Of bits" under this new naming convention.

So that's the plan moving forward. I think the first thing I will talk about with this new plan will be titled "CS THEORY: Memorable numbers." I'm going to put off talking about binary numbers until later, and just focus on the basic concept of what computer memory is and what it does.

I've also received a request to talk about more electricity-oriented topics (digital circuits and whatnot). I am not very well qualified to explain the circuits that make computers happen, but I will take a stab at it after I have read up on the subject a little more. Until next time.

Wednesday, March 23, 2011

Two's complement examples

My friend suggested that I should talk more about one's complement and endianness. I think that endianness is confusing, so I don't want to talk about it, other than to say that it deals with the order of bytes in a multi-byte number in memory (click the link if you want to get more confused). I will mention one's complement, though, because I think it's interesting.

Two's complement is the system where in order to turn a number into its negative you flip all the bits and add one. One's complement is similar, except you don't add one at the end of the number conversion. For example, in one's complement the number -11 would be represented like this:

0b0000 1011 (that's positive 11)
flip all the bits
0b1111 0100 (and that's -11 in one's complement)

This looks a lot easier, right? Well, the downside is that there's an additional step when adding negative numbers with this representation. After you do the normal addition, you have to look at the final carry-out and add that value back to the one's place of your result (this is called the "end-around carry"). That by itself isn't that big of a deal, and adders could be built to do it automatically.

The other bad thing about one's complement is that it's possible to have both positive and negative ZERO. Both 0b0000 0000 and 0b1111 1111 mean "zero" but one of them is positive and one is negative (remember to look at the sign bit to determine whether it's positive or negative). That's weird and it means that we can only represent 255 numbers with a single byte instead of 256 that we can represent using two's complement.

I'm not going to do any examples of one's complement, because it isn't really used in modern computers. Instead, I'm going to do a couple examples of two's complement. First, let's transform positive 56 into negative 56, and back again. I didn't mention it last time, but you can tell the value of a negative two's complement number by repeating the same steps that transformed it from positive to negative in the first place. Like this:

0b0011 1000 (+56)
flip all the bits
0b1100 0111
and add 1
0b1100 1000 (now we have -56)

Now let's turn that negative number back into a positive:

0b1100 1000 (-56)
flip all the bits
0b0011 0111
and add 1
0b0011 1000 (and we're back to +56)

Now let's subtract 15 from 32 (32-15=?).

Here's 32:
0b0010 0000 (+32)

Now let's figure out what -15 is:
0b0000 1111 (+15)
flip all the bits
0b1111 0000
and add 1
0b1111 0001 (-15)

Now we add them together, because 32 + (-15) is the same as 32-15.

+0b0010 0000 (32)
+0b1111 0001 (-15)
=0b0001 0001 (17)

This time we're going to totally ignore the final carry-out. Look at the sign bit, and notice that it's a 0; that means we ended up with a positive number. In fact, the number we ended up with is 17, which is the right answer: 32-15=17.
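
If you'd rather let a machine do the pencil-and-paper work, here's a small Python sketch of the same method. The masking with 0xFF is just my way of pretending we only have a single byte; it's an illustration of the idea above, not a picture of how a real adder is wired:

MASK = 0xFF   # keep everything to 8 bits, like our single-byte examples

def negate(x):
    # flip all the bits, add 1, and throw away anything past 8 bits
    return (~x + 1) & MASK

def as_signed(x):
    # read an 8-bit pattern as a two's complement number
    return x - 256 if x & 0b1000_0000 else x

minus_15 = negate(15)
print(bin(minus_15))                    # 0b11110001
result = (32 + minus_15) & MASK         # add, ignoring the final carry-out
print(bin(result), as_signed(result))   # 0b10001 17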

Just to review, this is how subtraction is accomplished in real computers, and how negative numbers are represented in real computers. We use two's complement because it lets us represent a wide range of positive and negative values (it doesn't have that silly +0/-0 issue one's complement has), and it is terribly, terribly convenient for allowing us to use regular multi-bit adders to perform subtraction.

I don't have a plan already in my head for what to present next, but if you ever have a recommendation, let me know in the comments.

Monday, March 21, 2011

Two's compliment

Two thinks you look very nice today. Oh wait, it's not supposed to say "compliment." I meant to say "two's complement" (with an 'e' in the middle of that second word), as in "those shoes would complement two's outfit very well." That time I used the right word at least, but we're still not supposed to be talking about fashion.

What's really going on is that I'm finally going to talk about negative binary numbers. I kept putting this off because I think it's confusing, and I don't want anyone who reads this to feel confused (other than the confusion I intentionally created with the first paragraph). There are two things about two's complement numbers that are difficult for people (including me) to remember and understand. The first is how to convert a positive binary number into its negative, two's complement form. The second is why we would even want to do that, and what benefit comes from going to the trouble.

It's simple to represent a negative number. Just take the original number and put a '-' in front of it. Unfortunately, with binary numbers in computers we don't have minus symbols. All we have are 0s and 1s. So that means that using only 0s and 1s we somehow have to differentiate between positive and negative numbers. In the two's complement form we do this by kind of pretending that the "most significant bit" is the positive/negative symbol (if it's 0 it's a positive number, and if it's 1 it's a negative number). We call this the "sign bit."

I have to clear up something from that last sentence. The "most significant bit" is the left-most bit when writing out the binary number, and its value has the biggest effect on the final value of the entire binary number. For example, in an 8-bit number, the "128s place" bit is the most significant bit. The 1 in 0b1000 0000 is the most significant bit.

So now we know to look at the most significant bit to see if we're dealing with a positive or negative number, but the trick with two's complement is what's going on with all the rest of the bits. It's not enough to put a 1 for the sign bit and then read the rest of the bits the same as we always did. Two problems with that. First, it would be too easy. Second, it makes addition and subtraction of negative numbers more complicated than I'll show it is in a minute.

With two's complement you have to do the following to turn a positive number into its negative (I've put this off until the last possible moment): First, invert all the bits (turn all the 0s to 1s and vice versa). Second, add 1 to it. I know. It doesn't make sense. It sounds stupid. Sure, maybe the first part, flipping all the bits, can be justified because negative numbers live in Bizarro World, so maybe everything's just opposite of a positive number. But why add 1 after that? I don't have a good explanation of that, other than that the resulting representation of the negative binary number is very convenient.

Let's do a couple of examples using 8-bit binary numbers to clear our brains.

0b0000 0001 (decimal 1)
invert all the bits
0b1111 1110
and add 1
0b1111 1111
So the two's complement representation of -1 is 0b1111 1111. Note that the most significant bit, or sign bit, is 1, meaning this is a negative number.

0b0100 1111 (decimal 79)
invert all the bits
0b1011 0000
and add 1
0b1011 0001
Again, notice that the sign bit is 1, so this is a negative number (it works out to -79).
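
As a quick aside for anyone following along in Python: the language won't show you the two's complement pattern of a negative number directly, but masking it down to one byte (the & 0xFF below) reveals exactly the bit patterns we just worked out by hand:

print(bin(-1 & 0xFF))          # 0b11111111
print(bin(-79 & 0xFF))         # 0b10110001
print(bin((~79 + 1) & 0xFF))   # same thing: flip all the bits, then add 1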

Before I tell you the big payoff we get for going to all this trouble to use two's complement, I'll tell you about the drawback, or rather the perceived drawback of doing this. If we have an 8-bit number, and we're using one of those bits to be the sign bit (in other words, that bit is tied up telling us whether the number is positive or negative, it can't tell us about the magnitude of the number any more), then we only have 7 bits left to tell us about the magnitude of the number. With 7 bits we can only represent positive numbers from 0-127, as opposed to the 0-255 we could represent with 8 bits.

So we have about half the positive numbers we had before, BUT we also have all the negative numbers, so it turns out we can represent just as many numbers with a "signed" binary number, it's just a different range of numbers than an "unsigned" binary number can represent. With a signed 8-bit number we have the positive numbers 1 to 127, the number 0, plus the negative numbers -1 to -128, for a total of 256 different numbers we can represent. It's kind of a weird thing that we can count 1 further into the negative numbers than we can count into the positive numbers when using a signed binary number. That's one of those weird things you just have to remember.

Okay, now we'll finally talk about the payoff. That multi-bit adder we designed many articles ago can be used, no problems, no modifications necessary, on signed two's complement numbers. That's it. That same adder we designed to add positive, unsigned-only, binary numbers can be used to add signed, two's complement numbers also, AND IT TREATS THEM CORRECTLY AS NEGATIVE NUMBERS. Holy crap, it's like magic. Seriously, when I first learned this I was like "yeah, but, there's no way that actually is true." Except it is.

In fact, it's so magical that we can just design one adder, and then we can use it to add any combination of positive or negative numbers together. We can add two positive numbers, two negative numbers, or one of each. Subtraction also becomes very easy. You just convert the number you want to subtract into a negative number (using the exact same process we did above), and then add it to the number you want to subtract from. BAM! Subtraction.

This article has already grown far too long, so I'm going to hold off on doing examples of subtraction until my next entry.

Wednesday, March 16, 2011

Our current picture

We've covered a lot of ground, shared a few laughs, and now I want to review where we are now and how we got here before we move on to the next topic.

First we talked about single bits, how they are "things" (whatever that means) that are always in one of two states (on/off, up/down, 0/1, black/white, whatever). Bytes are series of 8 bits, and can be used to represent binary numbers. Binary numbers use base two (instead of base ten in decimal), so all of their digits are either 0 or 1, and these digits can very conveniently be represented by individual bits.

Once we had some understanding of binary numbers we moved on to performing math on those numbers. First we talked about the single-bit adder, and then how we can perform multi-bit addition using a series of single-bit carry adders.

Next we talked about computer memory. Computer memory is like a big long ordered list of numbers. Each number in the list has an address associated with its place in the giant list, so if you know a memory address you can look to see what number lives there. Loading means to read a number from the memory and storing means to put a number into the memory (destroying forever what was in that memory location before).

We also talked about what kinds of information those numbers in the memory can represent. They can be either instructions or data. Data can have any meaning given to it by the person programming the computer. For example, a programmer could interpret a binary number in the memory as a few things: the number itself, a specific color, a location on a map, a kind of food, or anything really. It just depends on what the programmer wants the data to stand for. It's up to the programmer to be consistent with how he treats the data, because the computer itself doesn't care about that at all.

Instructions are the other kind of information that can be stored in memory. Instructions tell the computer's processor what to do next. There's a special register in the processor called the program counter (PC) that tells the processor where, in the memory, its next instruction will come from. The processor fetches (or loads, in memory terms) that instruction, figures out what it says to do, and then does it. After this it moves on to the next instruction (by increasing the PC) and repeats the process.

Or does it? We also learned about special processor instructions called jumps and branches which directly change the value of the program counter. Jumps unconditionally change the value of the PC to whatever the programmer wants. Branches might change the value of the PC, depending on some condition, like if one variable's value is greater than another variable's value. Once we knew about branching and jumping we could write just about any computer program under the sun, and we looked at an example that shows how we can create a loop using branches and jumps that adds a number to itself several times, essentially performing multiplication.

We also talked about the register file of the processor. Each processor has a number of registers that are temporary storage for whatever's in the memory, and act as the variables in all of the instructions the processor can execute. The registers don't have addresses the same way that memory locations do, but they still do have distinct names, like $r0 and $r17. Long, complicated programs consist of many loads from memory into registers, additions, branches based on the contents of registers, jumps, and stores back into memory. Even the most complicated programs can be broken down into these elemental pieces.

The last thing we talked about was one more operation we can perform on registers: the shift. Shifting can be done in either the right or left direction, and these operations are mathematically similar to division and multiplication. The catch is that a single shift operation can only multiply or divide by powers of two. If you want to multiply by numbers other than powers of two, then you need to get more creative. We'll talk about those ways at some future time.

And now we're here. If you missed any posts go back and start from the beginning. I'm trying to build on all of my previous posts, hopefully forming over time one big picture of computer science that anyone can understand.

Tuesday, March 15, 2011

Shifty eyes

Right shift, left shift. These are not just keys on your keyboard. They are the names of mathematical operations that are enormously convenient and helpful in the world of binary math.

First I'll explain what they are, and then I'll explain why they're so helpful. A shift in either direction basically amounts to picking up a binary number, moving it a few places to either the right or left and then plopping it back down. Remember, those "places" in binary numbers are the "2s place," the "4s place," the "8s place," and so on. If you shift a number to the right and some digits get shifted right off the end, then so be it. Similarly, if you shift a number to the left, you have to invent new digits in the low places of that number. Shifting is often represented with these symbols: ">>" for right shift, and "<<" for left shift (see, it's like a double arrow pointing in the direction of the shift).

Here's an example of a right shift of a single-byte binary number:

0b1100 0101 >> 4 = 0b0000 1100

and here's an example of a left shift of a single-byte binary number:

0b0001 1010 << 3 = 0b1101 0000

Do you see how it's like we just picked up the original number and plopped it down in a new place? The right shifted number ended up 4 places to the right, and the left shifted number ended up 3 places to the left.

Here are another couple of examples using decimal that will give you a hint as to why shifting is so helpful in computers:

4563 >> 2 = 45

23 << 3 = 23000

You'll notice in those decimal examples that shifting right by 2 is the same as dividing by 100 (ignoring the resulting remainder or fractions, we're just talking about integers here). Also notice that shifting left by 3 is the same as multiplying by 1000. So now we're starting to see how this is a really nice tool for a computer to have.

Let's look at single-byte binary numbers again and do 2 more concrete examples:

0b1000 0001 >> 1 = 0b0100 0000 (129 shifted right by 1 = 64)

0b0000 0011 << 4 = 0b0011 0000 (3 shifted left by 4 = 48)

Using that exponent notation again from an earlier post, a right shift is like dividing by 2^x (where x is the amount shifted), and a left shift is like multiplying by 2^x (where x is again the amount shifted). When left shifting (multiplying) you need to be careful because you might shift some 1s clear off the end of the binary number, and the result won't really be the correct answer for multiplication. Also, we can only work in powers of two here. If you want to multiply or divide by 3, for example, you cannot do that with a single shift like you can when multiplying or dividing by 2 or 4 (which are powers of 2).
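
Here's a tiny Python illustration of that equivalence, including the overflow problem I just warned about. The & 0xFF is only there to mimic our single-byte registers:

x = 0b10000001           # 129
print(x >> 1)            # 64, same as 129 // 2

y = 0b00000011           # 3
print(y << 4)            # 48, same as 3 * 16

z = 0b11000000           # 192
print((z << 1) & 0xFF)   # 128, not 384: a 1 fell off the left end of the byte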

The great thing about the shift operation is that it's so fast. It's even faster and easier than addition to do in a processor. In our program loop branching example we multiplied 7 by 8, and it took us 8 trips through that loop, adding 7 to a variable each time around. With shifting we can get that same result in a single operation, like this:

0b0000 0111 << 3 = 0b0011 1000 (shift left by 3 because 8=2^3)

And that's it for shifting. I don't have a plan for my next post, but I'll make another one soon anyway.

Monday, March 14, 2011

File it under "register"

That's the best I could come up with, sorry. I'm talking about the title of this post. Sorry.

If you couldn't guess by that awesome title, this post is about the register file. Go ahead and read the title again. Get it? It was pretty good after all, I guess. Anyway, the register file is a part of a computer processor. The simplest way to describe it is to say that it's where the variables of a program live when they're not sitting in the computer's memory. The register file is the most temporary of all temporary storage. It is made up of several "registers," which are just places in the processor that can store a number. If you have a 32-bit processor, then each of the registers in the processor can hold a 32-bit number. It is common for a processor to have anywhere from 8 to 32 (or more) registers.

Data is moved into the registers from memory, some kind of mathematical operation is done with that data, and then the program moves on, putting new data into the register file (the contents of the register file at any one moment don't usually stay relevant for very long). If we still need the old data (or the result of some mathematical operation), then we can store the contents of a register back to the memory.

Registers are parts of the computer that can store information, just like memory, however the registers do not have addresses like the memory does. Well, they don't have addresses in the same way, at least, but they do still have a way of telling them apart. Each register has its own name, which is commonly something like $r0 or $r3. The '$' is just convention for saying that we're talking about registers, and then the 'r' is just a further reminder. A 32-register processor would have registers with names from $r0-$r31.

If you'll remember when we talked about branch instructions, all of the branches depended on comparing two different variables to each other, and depending on the comparison, the branch is taken or not. Well, in a real computer we can't just talk about variables as abstractly as that. We have to be more concrete. We have to use registers. A real "branch when x is greater than y" instruction would really look more like this:

bgt $r3, $r4

It's up to the programmer to remember that $r3 really means 'x' and that $r4 really means 'y.'

Load and store instructions are the same way. We don't store 'x' into a memory location, and we don't load into another variable 'y.' Loads and stores look more like this:

ld $r3, 4008 (ld is short for "load")
st 4012, $r4 (st is short for "store")
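
To make that a little more concrete, here's a toy Python sketch of a register file and a memory, with load and store written as plain functions. The register names, the addresses 4008 and 4012, and the values in them are all made up for illustration, just like in the examples above; real hardware obviously isn't a Python dictionary:

memory = {4008: 25, 4012: 0}                       # a pretend slice of memory
registers = {"$r" + str(i): 0 for i in range(32)}  # 32 registers, all starting at 0

def ld(reg, address):
    # load: copy a number from a memory location into a register
    registers[reg] = memory[address]

def st(address, reg):
    # store: copy a number from a register back out to memory
    memory[address] = registers[reg]

ld("$r3", 4008)                  # $r3 now holds 25
registers["$r4"] = registers["$r3"] + 10
st(4012, "$r4")                  # memory location 4012 now holds 35
print(memory[4012])              # 35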

Every time that you perform some action in a processor, it will involve registers. Are you adding two variables and saving the result in another variable? You're really talking about registers there.

You might think that 8 or 32 registers isn't enough to store all the variables of a program (especially really huge programs like web browsers or word processors). You're right, it's not enough. Most of the variables of a program sit in the computer's memory most of the time. They are only temporarily brought into the registers when they are needed, and then stored back into the memory when they're done. This is happening millions and billions of times every second in your computer right now.

Next time we'll talk about more mathematical operations we can do with binary numbers, but we still aren't going to talk about negative numbers yet. I need to gear up for that one.

Branching example

I'm going to show you a sample of how branching is used in a real program. In this example, I'm going to have several computer instructions, including their instruction addresses. I'm going to follow this format in future examples, so get used to it. The format will be like this:

4: add y, x, 10

Read this as saying that the address for this instruction is 4 (that is, this instruction gets executed when the program counter is set to 4). The instruction the processor will execute is the add instruction. The next three items are variables or numbers. y and x are variables. At any given moment they will have a real value (a number), but a variable's value can be changed. We can replace its old value with a new number. This specific example says to "add x and 10 together, and save the result of that addition in y."

We'll be following the convention that MIPS uses (look up MIPS on Wikipedia if you want more information at this time), where we list where the result of an instruction will be placed first, and then after that list any other inputs we need to compute the result. In this case, the result will be stored in y, and we need the additional inputs of x and 10 in order to complete the add instruction.

Hopefully that was enough to make this instruction format clear. Let's next go through a basic loop in a program. Imagine in this example that going into this loop, x already has the value of 0, y already has the value of 8, and z has the value of 0. I'll just write it out first, and then explain the parts that need to be explained.

(pretend there are instructions before instruction 104 here)
104: add z, z, 7
108: add x, x, 1
112: beq x, y, 120
116: j 104 (j is shorthand for "jump")
120: (... just pretend more interesting stuff happens after the loop)

Remember in this example, z starts out at 0, x is 0, and y is 8. The add instruction for the z variable says "add z + 7 and store the result in z," and the add instruction for the x variable says "add x + 1 and store the result in x." So each time through our loop we're going to be doing those two things: adding 7 to z, and 1 to x. The branch instruction at 112 says that if x and y are equal to each other, then branch to instruction 120; otherwise, just move on to instruction 116 like normal. If that branch isn't taken, the PC just counts up to 116, and instruction 116 says to immediately jump back to 104, which is the beginning of the loop. This loop continues, each time increasing z by 7 and x by 1, until the branch condition "x equals y" becomes true, which happens when the loop has executed enough times that x has been increased all the way to 8. At this point, the branch is taken and the PC is changed to 120, which is outside the loop.
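
If it helps, here's the same loop written out in Python. The processor never sees anything like this (it only sees the numbered instructions above), but the effect is identical, and the comments show which instruction each line corresponds to:

x, y, z = 0, 8, 0

while True:
    z = z + 7        # 104: add z, z, 7
    x = x + 1        # 108: add x, x, 1
    if x == y:       # 112: beq x, y, 120  (branch taken means we leave the loop)
        break
    # 116: j 104     (otherwise jump back to the top of the loop)

print(x, z)          # 8 56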

One thing you probably noticed is that the instructions are spaced 4 apart instead of just 1 apart. This is because most instructions need more information to be executed than can be stored in just 1 byte. Again, following the MIPS convention, I've made all instructions 4 bytes long. This isn't how all computers work, but it works well for this example.

So what did this loop do? At the end of it, z is equal to 56, because we've added 7 to z 8 times. In other words, we've performed the multiplication of "8 times 7." I hope this example gives you an idea that you can create really complicated computer programs that are just built out of really simple building blocks.

Next time we'll talk about the register file, which will give more context to the structure of processor instructions and give more insight into how computers are physically organized.

Monday, March 7, 2011

A Fork in the Road

A computer processor will just execute instruction after instruction, all in a row as its program counter (PC) increases, unless it encounters a "jump" or "branch" instruction. A jump or branch instruction directly changes the value of the PC, and therefore it changes where the processor's next instruction will come from. Last time we talked about jump instructions, which unconditionally change the PC to whatever value the programmer wants. This allows the programmer to do things like loop back to execute again instructions that the processor has already executed. In this post we'll talk about branch instructions.

Branch instructions are forks in the road. A branch is a jump instruction with a condition attached. When the processor encounters a branch instruction, one of two things will happen: either the "branch condition" will be met (in other words, it will be true), or it won't be. If the branch condition is true, then the branch is "taken," meaning that the jump is performed. In this case the PC gets changed to whatever the branch instruction says it should be. If the branch condition is false, the branch is "not taken," and the PC is just incremented like normal to the next instruction (PC+1).

There are many kinds of branches, but they all involve comparing two things against each other, and then seeing if some property about that comparison is true. Here are some common types of branches, their associated computer instruction shorthand version, and what they mean in English. The shorthand all starts with "b" for "branch," and the remaining letters give you a hint about what kind of comparison the branch instruction is performing. I'm also going to write two variable names, x and y. Treat them the same as you did in algebra. They're just two numbers, ANY two numbers. Their value isn't important (just like algebra).

beq x,y - branch when x is equal to y
bez x - branch when x equals zero
bgt x,y - branch when x is greater than y
bge x,y - branch when x is greater than or equal to y
blt x,y - branch when x is less than y
ble x,y - branch when x is less than or equal to y
bgtz x - branch when x is greater than zero
bltz x - branch when x is less than zero

A lot of those are redundant if you think about it. For example, asking if "x is greater than y" is the same as asking if "y is less than x." There's no real reason why we need both versions, but having both might make reading computer instructions easier if you need to debug the code after writing it.
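
Just to drive home that every branch boils down to a single comparison, here's a little Python sketch. The function name and layout are mine, not anything a real processor uses:

def branch_taken(kind, x, y=0):
    # each branch type is just a yes/no comparison between two numbers
    comparisons = {
        "beq": x == y,
        "bez": x == 0,
        "bgt": x > y,
        "bge": x >= y,
        "blt": x < y,
        "ble": x <= y,
        "bgtz": x > 0,
        "bltz": x < 0,
    }
    return comparisons[kind]

# the redundancy mentioned above: "x greater than y" is the same question
# as "y less than x"
print(branch_taken("bgt", 5, 3))   # True
print(branch_taken("blt", 3, 5))   # True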

The next post will have a brief example of branching in action, showing a loop from a program that I just made up!