Binary, OCTal, HEXadecimal
Counting in Computerese: The Magic of Binary, Octal and Hexadecimal. Computers deal in mystical-sounding numbering systems like Hexadecimal, Octal and Binary. People worry that they sound complex, but they are really quite simple. Once you have read this article, you should have a good understanding of what they are and how they work.
Decimal (Base-10)
Decimal is the numbering system we are taught in school. Basically it means base-10 -- there are 10 unique symbols (0 to 9), and then we roll over. When we roll over, it takes another column (digit) to represent the number we want (just like a car odometer). In other words, it takes two digits (a tens column and a ones column) to represent a number greater than 9. The value in the tens column really means that digit multiplied by ten, and the value in the ones column means that digit multiplied by 1. When we go beyond the tens column (greater than 99), we have to roll over into a third column, the hundreds column, where we multiply that digit by 100, and so on.
In decimal, each column we add is worth ten times more than the column to its right. So each column is worth 10 to the "power" of that column's position, counting from zero at the ones column. ("To the power of" just means multiplied together that many times; the count is called the exponent.) So the tens column is 10 x the digit. The hundreds column is 10 x 10 x the digit (or 100 times the digit, which is 10 multiplied together two times). The thousands column is 10 x 10 x 10 x the digit (or 1,000 times the digit, which is 10 multiplied together three times), and so on. So hopefully you can see the pattern.
The shorter way to represent this is 10ⁿ, with the 'n' representing how many 10s are multiplied together. (Sometimes the '^' symbol is used to represent "to the power of", so 10^2 means the same as 10².)
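The column idea can be sketched in a few lines of Python. This is only an illustration: the function name place_values is made up for this example and is not part of any library. It splits a number into its digits and the weight of each column.

```python
def place_values(number: int, base: int = 10) -> list[tuple[int, int]]:
    """Return (digit, column weight) pairs, leftmost column first.

    Hypothetical helper for illustration only.
    """
    if number < 0 or base < 2:
        raise ValueError("need a non-negative number and a base of at least 2")
    pairs = []
    weight = 1  # base ** 0, the ones column
    while True:
        number, digit = divmod(number, base)  # peel off the rightmost digit
        pairs.append((digit, weight))
        weight *= base  # each column is worth 'base' times the one before
        if number == 0:
            break
    return pairs[::-1]


print(place_values(472))  # [(4, 100), (7, 10), (2, 1)]
```

Adding up digit x weight for every pair gives back the original number: (4 x 100) + (7 x 10) + (2 x 1) = 472.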
Other Bases
Now this may seem a weird way of representing numbers, but it is actually more "pure" to the numbers themselves. This is just what the symbols really mean; you just haven't had to think about it this way. Our decimal numbering system is just an arbitrary grouping in base-10, and we could as easily have chosen a different base. The reason we (Western culture and most of humanity) chose base-10 seems to be that it is how many digits (fingers) we have. (There is a joke about the dumb guy who kept getting arrested for indecency when he would count to 11 in public.) There are many other counting systems, some for computers, some for other cultures. But let's take what we know and apply it to other systems (like those used in computers).
Binary (Base-2)
Binary is just Base-2 (or digital). There are only two separate symbols or states to represent a number (0 and 1; also known as "off" and "on"). If you want to represent a number greater than one, you must go to the next column.
So a conversion table from base-2 to base-10 would look as follows.
- 0000 = 0
- 0001 = 1
- 0010 = 2
- 0011 = 3
- 0100 = 4
- 0101 = 5
- 0110 = 6
- 0111 = 7
- 1000 = 8
- 1001 = 9
- 1010 = 10
- ...
Hopefully, you can see the pattern. Each time we fill up a column (it's a 1) and we go beyond the greatest symbol (in this case a 1), we roll over (carry) to the next column, and clear the columns below it.
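A table like this one can be generated rather than typed by hand. The sketch below uses Python's built-in number formatting; binary_table is just a name chosen for this example.

```python
def binary_table(count: int, width: int = 4) -> list[str]:
    """Return lines like '0101 = 5' for the numbers 0 .. count - 1.

    Hypothetical helper; 'width' is the number of binary columns shown.
    """
    # The 'b' format code prints an integer in base-2, zero-padded to 'width'.
    return [f"{n:0{width}b} = {n}" for n in range(count)]


for line in binary_table(11):
    print(line)
```

Printing more rows (say, binary_table(17, width=5)) shows the same carry pattern continuing into a fifth column.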
Remember how columns mean things in Decimal? Well they mean something in binary. Each column represents a value in binary as well -- they're just different values.
To do binary conversion, we just multiply each column's digit by 2 to the power of its column position (counting from 0 on the right), then add up all the values. For example, converting the binary number 1101 0110 to decimal, we would have 128 + 64 + 16 + 4 + 2, or the decimal value 214.
If we rolled over to the ninth column, that bit (digit) would represent 2^8, or the 256's column. The next column after that would be the 512's, and so on. (Just keep multiplying the value by 2.)
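The multiply-and-add recipe for binary can be written out directly. This is a minimal sketch with a made-up function name (binary_to_decimal); in practice Python's built-in int(text, 2) does the same job.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 1's and 0's (spaces allowed) to an integer.

    Hypothetical helper written out long-hand for illustration.
    """
    bits = bits.replace(" ", "")
    total = 0
    # Walk from the rightmost column (position 0) to the leftmost.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * 2 ** position  # digit times the column weight
    return total


print(binary_to_decimal("1101 0110"))  # 214
print(2 ** 8)                          # 256, the weight of the ninth column
```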
The computer always thinks in "binary" because computers think with little microscopic switches, that are either on or off (zero or one).
Octal (Base-8)
Now binary is not space efficient: it takes 5 columns (10011) to represent a number that takes only 2 columns in decimal (19). Imagine a huge decimal number, like 14,256,324, written in binary. That would be a long run of 1's and 0's that would be hard to look at, and if you misplaced one bit, your number would be wrong.
Well, with decimal, we group digits to help align the columns. Americans use a comma, but some other countries use a "." or a space to separate the groupings for thousands, millions, and so on. We place one of these characters every three columns (to make things less confusing). So people decided to apply the same rule to binary numbers as well, and programmers bundled bits into groups of 3. So a number like 18 decimal is written as 010 010 (binary), and a number like 187 as 010 111 011 (binary). At least that is easier when dealing with a large mess of 1's and 0's.
Then some people figured, if we have them grouped in 3's, why not just change the base as well? With 3 bits, we can represent a value from 0 to 7. This is called base-8 (or Octal), since there are 8 possibilities for each column (remember to count zero as a possibility).
So a conversion table from base-2 to base-8 would look as follows:
- 000 000 = 0
- 000 001 = 1
- 000 010 = 2
- 000 011 = 3
- 000 100 = 4
- 000 101 = 5
- 000 110 = 6
- 000 111 = 7
- 001 000 = 10
- 001 001 = 11
- 001 010 = 12
- ...
Notice that when we roll over beyond our 3 bit grouping (or 8th possibility) we need another column to represent our number (both in binary and in octal). So Binary and Octal align nicely -- you just group binary digits (called bits) into groups of 3 to convert to Octal. To go the other way, you just expand an Octal number into 3 columns of binary. Computers still think in the binary form, but we can express their form in a much more space efficient manner.
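The group-by-three trick works in both directions, and can be sketched as follows. The function names binary_to_octal and octal_to_binary are invented for this example.

```python
def binary_to_octal(bits: str) -> str:
    """Group bits in 3's (from the right) and name each group with one octal digit."""
    bits = bits.replace(" ", "")
    # Pad with leading zeros so the length is a multiple of 3.
    padded = bits.zfill((len(bits) + 2) // 3 * 3)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    return "".join(str(int(group, 2)) for group in groups)


def octal_to_binary(octal: str) -> str:
    """Expand each octal digit into its own 3-bit group."""
    # int(digit, 8) rejects anything that is not 0-7.
    return " ".join(format(int(digit, 8), "03b") for digit in octal)


print(binary_to_octal("010 111 011"))  # 273
print(octal_to_binary("273"))          # 010 111 011
```

No arithmetic on the whole number is needed; each group of 3 bits is translated on its own, which is why the two bases line up so neatly.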
Now to convert to decimal, just remember what each column in octal means; each column is just 8 times greater than the previous column (Base-8).
Multiply each column's digit by 8 to the power of its column position (counting from 0 on the right), then add up all the values. For example, converting the octal number 1234 to decimal, we would have (1 x 512) + (2 x 64) + (3 x 8) + (4 x 1), or the decimal value 668. If we rolled over to the fifth column, that digit would represent 8^4 (8 x 8 x 8 x 8), or the 4,096's column (decimal). The next column after that would be 8 times larger (32,768), and so on. (Just keep multiplying the value by 8.)
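To see the octal column weights spelled out, here is a small sketch that lists each digit next to the weight of its column. The name octal_terms is an assumption made for this example.

```python
def octal_terms(octal: str) -> list[tuple[int, int]]:
    """Return (digit, column weight) pairs for an octal string, leftmost first."""
    width = len(octal)
    # The leftmost digit sits in column width - 1, the rightmost in column 0.
    return [(int(digit, 8), 8 ** (width - 1 - index))
            for index, digit in enumerate(octal)]


terms = octal_terms("1234")
print(" + ".join(f"({digit} x {weight})" for digit, weight in terms),
      "=", sum(digit * weight for digit, weight in terms))
# (1 x 512) + (2 x 64) + (3 x 8) + (4 x 1) = 668
```

The weights run 1, 8, 64, 512 from right to left, so a fifth column would be worth 8 ** 4 = 4096.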
Hexadecimal (Base-16)
Computers think in bits (binary), but they group bits into larger chunks, like 4 bits (a nibble), 8 (called a Byte), 16, or 32 bits, and so on. The computers really work with that many bits all at once; this enables them to work with larger numbers that are more relevant to problems (than just zero and one).
The size of those groups defines how "big" a computer is, and is why we say things like a computer is a "32 bit" computer -- it works with 32 bits at a time. (Notice that we always choose nice even powers of 2, which makes perfect sense to computers.)
Also notice that in our previous construct (Octal), we grouped bits into groups of 3. There is a problem there: 3-bit groupings don't match up very well with 8 or 16 bit computers. To write 8 bits (256 possibilities) in octal takes 3 columns (377 octal = x11 111 111 binary, where x is a bit position the byte doesn't have), with the largest column only allowed to go from 0-3 and not 0-7. So while Octal is more efficient than binary, it doesn't fit 8, 16 or 32 bit computers very well. There has to be a better way -- and there is -- Hexadecimal.
Why not group bits by four instead of three? That is hexadecimal (hex for short), and it lines up with 4 bits, 8 bits, 16 bits, 32 bits, and so on. The perfect match.
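The alignment argument comes down to simple remainders. As a rough sketch (leftover_bits is a name invented here), we can count how many bits are stranded in a partly filled top group when a word is cut into groups of 3 or 4.

```python
def leftover_bits(word_size: int, group_size: int) -> int:
    """Bits left over in a partly filled top group (0 means a perfect fit)."""
    return word_size % group_size


for word_size in (8, 16, 32):
    print(word_size, "bits:",
          "octal leftover =", leftover_bits(word_size, 3),
          "| hex leftover =", leftover_bits(word_size, 4))

# The largest 8-bit value in each notation:
print(format(0b11111111, "o"))  # 377
print(format(0b11111111, "X"))  # FF
```

Groups of 3 always leave 1 or 2 stray bits for these word sizes, while groups of 4 leave none.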
There is one problem, though. Hexadecimal has 16 unique values (2^4, or 2 x 2 x 2 x 2) for each column (not 2 like binary, 8 like octal, or 10 like decimal). We only have enough symbols for 10 unique values (0, 1, 2, 3, 4, 5, 6, 7, 8, 9). So where do we get the other 6 symbols? We borrow from our alphabet of course. When we run out of number characters, we just use letters as numbers, and we get the following series (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F) -- then we roll over to the next column.
So a conversion table from base-2 (binary) to base-16 (Hex) would look as follows.
- 0000 = 0
- 0001 = 1
- 0010 = 2
- 0011 = 3
- 0100 = 4
- 0101 = 5
- 0110 = 6
- 0111 = 7
- 1000 = 8
- 1001 = 9
- 1010 = A
- 1011 = B
- 1100 = C
- 1101 = D
- 1110 = E
- 1111 = F
When we roll over beyond our 4-bit grouping, we need another column to represent our number (both in binary and in hex).
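Because each hex digit covers exactly 4 bits, a byte splits cleanly into two hex digits. The sketch below does the split with a shift and a mask; byte_to_hex and HEX_DIGITS are names chosen for this example only.

```python
HEX_DIGITS = "0123456789ABCDEF"  # symbol for each 4-bit value, 0 through 15


def byte_to_hex(value: int) -> str:
    """Write an 8-bit value as two hex digits, one per 4-bit half."""
    if not 0 <= value <= 255:
        raise ValueError("expected a value that fits in 8 bits")
    high = value >> 4    # the upper 4 bits
    low = value & 0x0F   # the lower 4 bits
    return HEX_DIGITS[high] + HEX_DIGITS[low]


print(byte_to_hex(0b11010110))  # D6
print(byte_to_hex(255))         # FF
```

The upper half 1101 is D and the lower half 0110 is 6, matching the table row by row.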
Each column represents a value 16 times greater than the previous column (Base-16). To do Hex conversion (to decimal), we just multiply each column's digit by 16 to the power of its column position (counting from 0 on the right), then add up the values. For example, converting the hex number F3D9 to decimal, we would have (15 x 4096) + (3 x 256) + (13 x 16) + (9 x 1), or the decimal value 62,425. Notice that once we go beyond 9, the letters just represent values: A=10 (decimal), B=11, C=12, D=13, E=14, and F=15. After that, we carry to the next column, and things proceed as normal. If we rolled over to the fifth column, that digit would represent 16^4, or the 65,536's column (decimal). The next column after that would be 16 times larger (1,048,576), and so on. (Just keep multiplying the value by 16.)
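The hex-to-decimal recipe is the same multiply-and-add idea, with one extra step: looking up the value of each letter. A minimal sketch, assuming the made-up names hex_to_decimal and HEX_DIGITS (Python's built-in int(text, 16) gives the same result):

```python
HEX_DIGITS = "0123456789ABCDEF"  # a symbol's position in this string is its value


def hex_to_decimal(text: str) -> int:
    """Convert a hex string such as 'F3D9' to an integer, long-hand."""
    total = 0
    # Walk from the rightmost column (position 0) to the leftmost.
    for position, symbol in enumerate(reversed(text.upper())):
        digit = HEX_DIGITS.index(symbol)  # raises ValueError for a non-hex symbol
        total += digit * 16 ** position   # digit times the column weight
    return total


print(hex_to_decimal("F3D9"))  # 62425
print(16 ** 4)                 # 65536, the weight of the fifth column
```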
With Hex everything is lined up nicely (in 4 bit chunks), and is far better than those weird left overs in Octal. It is more space efficient than even decimal for presenting a value. A large decimal number like 65,535 is a nice grouping of Hex characters [FFFF] -- and that is far easier to read than a stream of 1's and 0's. Because of alignment and space issues, Hexadecimal has replaced Octal almost everywhere (about 20 years ago). Computers still deal in binary (or binary groupings), but humans represent most of those values in hex.
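To compare how much room the same value takes in each base, Python's built-in format function can print all four forms side by side. The helper name representations is an assumption for this sketch.

```python
def representations(value: int) -> dict[str, str]:
    """Return the same number written in binary, octal, decimal and hex."""
    return {
        "binary": format(value, "b"),
        "octal": format(value, "o"),
        "decimal": format(value, "d"),
        "hex": format(value, "X"),
    }


for name, text in representations(65535).items():
    print(f"{name:>7}: {text} ({len(text)} columns)")
```

For 65,535 this gives 16 columns in binary, 6 in octal, 5 in decimal and 4 in hex.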
Conclusion
So I hope this math lesson wasn't too boring, and gave you some insights into all the mystical numbering systems of computers. Nowadays, it is usually low-level programmers (people who write device drivers, operating systems, etc.) who need to worry about Hexadecimal or Binary, and Octal has pretty much gone the way of the dinosaur. But it is a good reference for all programmers, and it certainly doesn't hurt users to know and understand it either. So there is nothing hard about these numbering systems; it just takes a little time to do conversions (for humans). Hopefully you enjoyed learning about it. It will give you a great foundation for understanding memory counting, storage sizes, and other things related to computers as well (just remember, computers like powers of 2). It is also an amusing math lesson to remind ourselves of our arbitrary counting system (decimal), and enlightening to realize that there are other ways to group and count numbers.
2002.04.03