If you get into the binary system, there are certain terms that you will inevitably encounter: Bit, Byte, and, to a lesser extent, Char and Word.
All of them refer to amounts of information. They are easy to understand, although they are often mixed up. So let’s briefly explain each one so that we can use them properly.
Bits
The Bit (binary digit) is the smallest unit of information in the binary system. The term “bit” is attributed to the mathematician John Tukey, who used it in a Bell Labs memo in January 1947.
As its name implies, it is each of the digits that are part of a binary number. Each bit can have one of two possible values, 0 or 1, and is used to represent a single piece of data or information.
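To make the idea of a bit as a binary digit concrete, here is a minimal Python sketch. The number 6 and the variable names are only illustrative choices, not something from the text.

```python
# Write a number in binary and look at its individual bits.
value = 6

# format(..., "b") returns the binary digits as text, without the "0b" prefix.
binary_text = format(value, "b")
print(binary_text)  # 110

# Each character of that text is one bit, and each bit is either 0 or 1.
bits = [int(digit) for digit in binary_text]
print(bits)  # [1, 1, 0]

# Going the other way: read a string of bits as a base-2 number.
print(int("110", 2))  # 6
```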
In digital electronics, devices rely on the presence or absence of voltage (in practice, a low or a high voltage level) to indicate whether a bit is 0 or 1.
Bytes
A Byte is a sequence of bits that the computer can manage “at once”. The term originated in the 1950s at IBM, where “byte” referred to the amount of information that a computer could “bite” at one time.
Currently, in most machines, a Byte has a length of 8 bits. But note that this is not always the case. There have been and still are machines where the Byte is 4, 6, 7, or 16 bits.
Choosing 8 bits as the standard was closely related to the need to encode characters, as we will see next. It was also an important factor that 8 is a power of 2 (it is 2³), so it was a convenient number.
Nowadays, when we say Byte we almost always mean 8 bits grouped together. With them, we can represent 2⁸ = 256 different values, for example any whole number between 0 and 255 (2⁸ - 1).
Strictly speaking, it would be more correct to call a set of exactly 8 bits an Octet, to avoid confusion with machines that use other Byte sizes. In fact, this is done, for example, in networking and communications standards.
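The range of an 8-bit Byte can be checked with a few lines of Python. This is only a sketch that assumes the usual 8-bit Byte (an octet), and the constant name `BITS_PER_BYTE` is made up for the example.

```python
BITS_PER_BYTE = 8  # assumption: the usual 8-bit byte (an octet)

# n bits can take 2**n different combinations.
combinations = 2 ** BITS_PER_BYTE
largest = combinations - 1
print(combinations)  # 256
print(largest)       # 255

# Python's bytes type stores exactly such values: integers from 0 to 255.
octet = bytes([largest])
print(format(octet[0], "08b"))  # 11111111

# A value that does not fit in 8 bits is rejected.
try:
    bytes([256])
except ValueError as error:
    print("rejected:", error)
```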
Char
A Char (from English “character”) is a set of bits that represents a character. It is not very well known, but the concepts of Char and Byte have always been closely related.
Encoding and processing text has always been a requirement of computers and, especially in the beginning, it was not so simple. The existence of machines with Bytes of 4, 6, 7, or 8 bits is directly related to the number of bits needed to store a character. (Curious, isn’t it?)
In any case, in most systems and “almost always”, the length of a Char is the same as that of a Byte.
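As a rough illustration of a character stored in one Byte, the sketch below encodes the letter "A" as ASCII. The last line is an addition of mine to show why “almost always” is a fair caveat: it assumes UTF-8, an encoding the text does not discuss, where some characters take more than one Byte.

```python
letter = "A"

# ord() gives the numeric code of a character.
code = ord(letter)
print(code)                 # 65
print(format(code, "08b"))  # 01000001

# Encoded as ASCII, the character occupies exactly one byte.
encoded = letter.encode("ascii")
print(len(encoded))  # 1

# Assumption beyond the text: in UTF-8, characters outside ASCII
# (here "\u00f1", the letter n with tilde) need more than one byte.
print(len("\u00f1".encode("utf-8")))  # 2
```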
Words
A Word is a set of Bytes that the machine uses internally to work “in blocks”. It was named that way because a “Word” is a set of Chars, and we have already said that Char and Byte are almost the same.
Modern machines do not work internally with just one Byte at a time. They are designed to handle larger blocks made up of several Bytes at once.
The size of the word is defined by the architecture of the processor and varies from machine to machine. In most modern computers, a word consists of 32 bits or 64 bits.
For example, a 32-bit machine works with blocks of 4 Bytes at once, whether to perform calculations or to store memory addresses. A 64-bit machine works with blocks of 8 Bytes.
However, unless we are doing things at a very low level, in general, we will not have to worry too much about it (but it doesn’t hurt to know the term).
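If you are curious, the arithmetic behind word sizes can be sketched in Python. It assumes 8-bit Bytes, and the final lines use `struct.calcsize("P")`, which reports the size of a native pointer for the interpreter that runs the script. This is one reasonable proxy for the word size, not a definition of it.

```python
import struct

BITS_PER_BYTE = 8  # assumption: 8-bit bytes

for word_bits in (32, 64):
    word_bytes = word_bits // BITS_PER_BYTE
    largest = 2 ** word_bits - 1  # largest unsigned value that fits in one word
    print(f"{word_bits}-bit word = {word_bytes} bytes, largest unsigned value {largest}")

# Size of a memory address (native pointer) for the running interpreter.
# Typically 32 or 64, depending on the machine and the Python build.
pointer_bits = struct.calcsize("P") * BITS_PER_BYTE
print(pointer_bits)
```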
MSB and LSB
Two terms you will sometimes encounter are MSB and LSB. They are common acronyms wherever binary data is represented, for example in digital electronics and computing.
Here, “significant” refers to the “weight” of a bit within the number. The bits further to the left have greater “weight” because they correspond to higher powers of 2.
- MSB (Most Significant Bit): This term refers to the most significant bit in a binary number. The most significant bit is the first bit on the left.
- LSB (Least Significant Bit): This term refers to the least significant bit in a binary number. The least significant bit is the last bit on the right.
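The idea of “weight” can be sketched in a few lines of Python. The 4-bit number 1011 is just an illustrative value: each bit is multiplied by a power of 2 that grows from the LSB (position 0) towards the MSB.

```python
bits = "1011"  # written MSB first, the usual way

total = 0
# reversed() lets us start at the LSB, which has position 0 and weight 2**0 = 1.
for position, digit in enumerate(reversed(bits)):
    weight = 2 ** position
    if digit == "1":
        total += weight
    print(f"position {position}: bit {digit}, weight {weight}")

print(total)  # 11  (8 + 0 + 2 + 1)
```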
For example, in the 8-bit binary number 11010110, the most significant bit (MSB) is the 1 on the far left, while the least significant bit (LSB) is the 0 on the far right.
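The same example can be checked in code. This is a minimal sketch that assumes we treat the value as exactly 8 bits wide, and the names `width`, `msb` and `lsb` are only illustrative.

```python
value = 0b11010110
width = 8  # assumption: we treat the value as an 8-bit number

# LSB: keep only the rightmost bit with a mask.
lsb = value & 1

# MSB: shift the leftmost bit down to position 0, then mask it.
msb = (value >> (width - 1)) & 1

print(msb, lsb)  # 1 0
```

Note that the MSB depends on the chosen width: the same value seen as a 16-bit number would have an MSB of 0.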
You will often find the terms MSB and LSB in the literature. They are more precise and rigorous than referring to the “bit on the left” or the “bit on the right”. They also protect us from the possibility that (who knows why) someone had the brilliant idea of storing a binary number “backwards”.
However, in this course, I will continue saying “bit on the left” and “bit on the right” because, in my opinion, it is easier to understand (but if you encounter the terms, now you know what they mean).