One of the ways to handle text strings in C++ is through the so-called null-terminated strings.
This model was inherited directly from the C language. In C++, however, std::string is a more powerful, more convenient, and generally safer alternative.
Whenever possible, we should prefer std::string over a null-terminated string.
However, null-terminated strings are still used in low-level programming, embedded systems, and libraries, so we need to learn how to use them 👇.
What is a Null-Terminated String
A null-terminated string is simply an array of characters that holds our text followed by a null character. The array must be large enough for both.
The null character (\0) is a special character that serves as a marker to indicate the end of the string, allowing functions that operate on strings to know where to stop.
For example, the string "Hello" in memory would be represented as:
['H', 'e', 'l', 'l', 'o', '\0']
Characteristics:
- Array of characters: Defined using the char[] type in C++.
- Null character at the end: The character '\0' is mandatory (and must be handled carefully).
- Real size: The real size of the string must include the null character (i.e., it will be characters + 1).
Declaration of Null-Terminated Strings
We can declare null-terminated strings in the following ways:
char greeting[] = "Hello";
- This creates an array of 6 characters ('H', 'e', 'l', 'l', 'o', and '\0').
- The inclusion of the null character is automatic.
char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
Here we manually specify the null character and the size of the array. If we omit '\0', the array will not be a valid string:
char greeting[] = {'H', 'e', 'l', 'l', 'o'}; // Incorrect as a null-terminated string
Why would someone want to do this? I don’t know, you tell me 😆
char empty[] = "";
This creates an array of a single character ('\0').
Basic Operations with Null-Terminated Strings
Handling null-terminated strings requires care, as we have to manage memory and the null character manually.
To copy strings, we use the strcpy function from the <cstring> header:
char source[] = "Hello";
char destination[10];
strcpy(destination, source);
strcpy copies source to destination, including the null character.
You must ensure that destination has enough space to avoid overflows.
To concatenate strings, we use strcat:
char greeting[20] = "Hello, ";
char name[] = "World";
strcat(greeting, name); // greeting now holds "Hello, World"
strcat appends name to the end of greeting.
The array greeting must have enough space to store the complete result, including the null character.
The strlen function returns the length of the string, excluding the null character:
char text[] = "Hello";
std::cout << "Length: " << strlen(text) << std::endl;
Output:
Length: 5
To compare strings, we use strcmp:
char string1[] = "Hello";
char string2[] = "Hello";
if (strcmp(string1, string2) == 0) {
    std::cout << "The strings are equal." << std::endl;
} else {
    std::cout << "The strings are different." << std::endl;
}
- It returns 0 if the strings are equal, a negative value if the first string comes before the second (comparing character by character), and a positive value otherwise.
Any incorrect operation may cause overflows or unexpected results (and your program may close in an ungraceful manner) 💥
Safe Handling of Null-Terminated Strings
Null-terminated strings are very error-prone, especially when it comes to memory management, and mistakes can easily corrupt data.
Instead of strcpy, use strncpy to limit the number of characters copied:
strncpy(destination, source, sizeof(destination) - 1);
destination[sizeof(destination) - 1] = '\0'; // Ensures the null character
Before concatenating strings, check that the buffer size is sufficient:
if (strlen(greeting) + strlen(name) + 1 <= sizeof(greeting)) { // +1 for the null character
    strcat(greeting, name);
} else {
    std::cerr << "Error: insufficient size to concatenate." << std::endl;
}
Make sure not to access indices outside the range of the array:
char string[10];
string[10] = 'x'; // Error: out of range, valid indices are 0 to 9
Disadvantages of Null-Terminated Strings
Although they can be efficient, null-terminated strings have several limitations:
- Manual memory management: The programmer must ensure that buffers are large enough.
- Lack of safety: Errors such as buffer overflows are common if not handled correctly.
- Complex operations: Manipulating long strings or performing advanced operations requires a lot of additional code.
In summary, they are a pain to use 🤷. That said, they do have some advantages.
The main one is that we don’t need to use the C++ Standard Library (or any other library), which can be a bit large for certain hardware (especially embedded systems).