Notes on bits and bytes

Synopsis

A bit (binary digit) is the smallest unit of data in a computer. A bit has a single binary value, either 0 or 1.

A byte is a collection of bits. In most computer systems, there are eight bits in a byte.

When data is stored in more than one byte, the additional bytes are written to the left of the first byte.

The structure of an 8-bit Byte

A single 8-bit byte can therefore hold eight 0s or 1s:

A byte:

Bit:   1       2       3       4       5       6       7       8
Value: 0 or 1  0 or 1  0 or 1  0 or 1  0 or 1  0 or 1  0 or 1  0 or 1

Each of the eight bits in a byte can hold one of two values, 0 or 1, so an 8-bit byte can represent 2^8 = 256 unique combinations.

Representing data in binary format

Unsigned integers

(We’ll start with shorts for simplicity, although exactly the same principles apply to regular and long unsigned integers.)

A short unsigned integer is usually 16 bits (2 bytes) wide. Each bit position represents a decimal value:

Byte 2:
Bit:    1      2      3     4     5     6     7    8
Column: 32768  16384  8192  4096  2048  1024  512  256

Byte 1:
Bit:    9    10  11  12  13  14  15  16
Column: 128  64  32  16  8   4   2   1

So to represent the number 164 you would put a 1 in the 128's column, a 1 in the 32's column and a 1 in the 4's column, and a 0 in all the others.

unsigned short foo = 164;

Byte 2:
Bit:    1      2      3     4     5     6     7    8
Column: 32768  16384  8192  4096  2048  1024  512  256
Value:  0      0      0     0     0     0     0    0

Byte 1:
Bit:    9    10  11  12  13  14  15  16
Column: 128  64  32  16  8   4   2   1
Value:  1    0   1   0   0   1   0   0

A byte is made of 8 bits, so the number of combinations for a byte is 2^8 (256).

However, the maximum (or highest) number that can be represented is (2^8) - 1 (255), because the first value is 0, not 1.

Storing larger numbers

Exactly the same principle applies to regular integers as it does to shorts. Regular integers are usually 32 bits (4 bytes) wide, which allows larger numbers to be stored. Here are the column values for each of the 32 bits, and what 30,287 looks like in binary format:

unsigned int foo = 30287;

Byte 4:
Bit:    1           2           3          4          5          6         7         8
Column: 2147483648  1073741824  536870912  268435456  134217728  67108864  33554432  16777216
Value:  0           0           0          0          0          0         0         0

Byte 3:
Bit:    9        10       11       12       13      14      15      16
Column: 8388608  4194304  2097152  1048576  524288  262144  131072  65536
Value:  0        0        0        0        0       0       0       0

Byte 2:
Bit:    17     18     19    20    21    22    23   24
Column: 32768  16384  8192  4096  2048  1024  512  256
Value:  0      1      1     1     0     1     1    0

Byte 1:
Bit:    25   26  27  28  29  30  31  32
Column: 128  64  32  16  8   4   2   1
Value:  0    1   0   0   1   1   1   1

Signed integers

In modern computing, signed integers are almost always stored using two's complement (although other methods exist). This is because two's complement does away with the need for a separate negative zero.

Floating point numbers

TODO

Chars

The ASCII Example

The extended American Standard Code for Information Interchange (ASCII) is a character-encoding scheme, originally based on the English alphabet, that encodes 256 unique characters.*

The computer uses a single (8-bit) byte to store each character in memory. So for example here is how the characters “A” and “$” are stored:

Binary    Symbol
01000001  A
00100100  $

This, in essence, is how computers store char values: each character is mapped to a binary number.

As well as being represented in base 2, these values can also be written in base 10 (decimal), base 8 (octal) and base 16 (hexadecimal).

Symbol  Decimal  Octal  Hexadecimal  Binary
$       36       44     24           100100
j       106      152    6A           1101010

So, you can represent the $ symbol in any of the following ways:

char charSymbol = '$';
printf("Value of charSymbol is: %c \n", charSymbol); // outputs $

char charDec = 36;
printf("Value of charDec is: %c \n", charDec); // outputs $

char charOct = 044;
printf("Value of charOct is: %c \n", charOct); // outputs $

char charHex = 0x24;
printf("Value of charHex is: %c \n", charHex); // outputs $

char charBin = 0b100100; /* 0b prefix: a GCC/Clang extension, standardized in C23 */
printf("Value of charBin is: %c \n", charBin); // outputs $

*Some of these characters are rarely used today, and 33 are non-printing control characters (many now obsolete). Although there are 256 characters in the extended set, usually only the first 128 (standard ASCII, codes 0 to 127) are used. The full table can be found here: [ascii-code.com](http://www.ascii-code.com)