One of the ways to handle strings in C++ is through what are called null-terminated strings.
This model was inherited directly from the C language, although C++ offers std::string as a more powerful, convenient, and safer alternative.
Whenever possible, we should prefer std::string over a null-terminated string.
However, null-terminated strings are still used in low-level programming, embedded systems, and libraries, so we need to learn how to use them 👇.
What is a Null-Terminated String
A null-terminated string is simply a fixed-size array of characters, large enough to hold our text, in which the text is followed by a null character.
This null character ('\0') is a special character that marks the end of the string, so that functions operating on strings know where to stop.
For example, the string "Hello" in memory would be represented like this:
['H', 'e', 'l', 'l', 'o', '\0']
Features:
- Array of characters: Defined using the char[] type in C++.
- Null character at the end: The '\0' character is mandatory (and must be handled with care).
- Actual size: The actual size of the array must include the null character (i.e., it will be the number of characters + 1).
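To make the role of the terminator concrete, here is a minimal sketch of how a function can find the length of a string by walking the array until it reaches '\0'. The name my_strlen is made up for this example; in real code we would simply call std::strlen, which behaves the same way.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, for illustration only: counts characters until the
// terminating '\0' is found. The '\0' itself is not counted.
std::size_t my_strlen(const char* text) {
    assert(text != nullptr);  // a null pointer is not a valid string
    std::size_t length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}
```

With char greeting[] = "Hello"; the call my_strlen(greeting) gives 5, while sizeof(greeting) gives 6, because sizeof also counts the '\0'. If the terminator were missing, the loop would keep reading past the end of the array.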
Declaration of Null-Terminated Strings
We can declare null-terminated strings in the following ways:
char greeting[] = "Hello";
- This creates an array of 6 characters ('H', 'e', 'l', 'l', 'o', and '\0').
- The inclusion of the null character is automatic.
char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
Here we manually specify the null character and the size of the array. If we omit '\0', the array will not be a valid string:
char greeting[] = {'H', 'e', 'l', 'l', 'o'}; // Incorrect as a null-terminated string
Why would someone want to do this? I don’t know, you tell me 😆
char empty[] = "";
This creates an array of a single character ('\0').
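Since array sizes are known at compile time, we can let the compiler confirm the sizes described in this section. This is just a small sketch; the names explicit_greeting, no_terminator, and empty_text are invented for the example.

```cpp
// Four ways of declaring a character array, and the size the compiler gives each one.
char greeting[] = "Hello";                                    // '\0' added automatically
char explicit_greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};  // '\0' written by hand
char no_terminator[] = {'H', 'e', 'l', 'l', 'o'};             // NOT a valid string
char empty_text[] = "";                                       // only the '\0'

static_assert(sizeof(greeting) == 6, "5 characters plus the terminator");
static_assert(sizeof(explicit_greeting) == 6, "size given explicitly");
static_assert(sizeof(no_terminator) == 5, "no room was reserved for a terminator");
static_assert(sizeof(empty_text) == 1, "an empty string still stores '\\0'");
```

Note that no_terminator is a perfectly legal array of char. It just must never be passed to functions such as strlen or strcpy, which expect to find a '\0'.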
Basic Operations with Null-Terminated Strings
Handling null-terminated strings requires care, as we have to manage memory and the null character manually.
To copy strings, we use the strcpy function from the <cstring> header:
char source[] = "Hello";
char destination[10];
strcpy(destination, source);
strcpy copies source into destination, including the null character.
strcpy does not check the size of destination, so you must ensure that it has enough space to avoid overflows.
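One way to enforce that rule is to wrap strcpy in a small function that checks the size first. The following is only a sketch: copy_if_fits is a hypothetical name, and the capacity has to be passed in by the caller, because a function that receives a char* cannot find out how big the array behind it is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest only when it fits, counting the
// terminating '\0'. Returns false and leaves dest untouched otherwise.
bool copy_if_fits(char* dest, std::size_t capacity, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (std::strlen(src) + 1 > capacity) {
        return false;  // copying would write past the end of dest
    }
    std::strcpy(dest, src);
    return true;
}
```

For a char destination[10], calling copy_if_fits(destination, sizeof(destination), "Hello") succeeds, while the same call with a 5-character array fails, because "Hello" needs 6 bytes.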
To concatenate strings, we use strcat:
char greeting[20] = "Hello, ";
char name[] = "World";
strcat(greeting, name); // greeting now holds "Hello, World"
strcat appends name to the end of greeting.
The greeting array must have enough space to store the complete result, including the null character.
The strlen function returns the length of the string, excluding the null character:
char text[] = "Hello";
std::cout << "Length: " << strlen(text) << std::endl;
Output:
Length: 5
To compare strings, we use strcmp:
char string1[] = "Hello";
char string2[] = "Hello";
if (strcmp(string1, string2) == 0) {
std::cout << "The strings are equal." << std::endl;
} else {
std::cout << "The strings are different." << std::endl;
}
- It returns 0 if the strings are equal, a negative value if the first comes before the second in lexicographical order, and a positive value otherwise.
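Only the sign of the value returned by strcmp is meaningful; the exact number can differ between implementations. As an illustration, this sketch reduces the result to -1, 0, or 1. The function compare_sign is a hypothetical helper, not part of <cstring>.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reduces the result of strcmp to -1, 0 or 1,
// since only the sign of the returned value is meaningful.
int compare_sign(const char* first, const char* second) {
    assert(first != nullptr && second != nullptr);
    const int result = std::strcmp(first, second);
    return (result > 0) - (result < 0);
}
```

For example, compare_sign("Hello", "Hello") is 0, and compare_sign("Hello", "Hello!") is -1, because the first string ends earlier. Keep in mind that writing string1 == string2 with two char arrays compares their addresses, not their contents, which is why strcmp is needed.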
Any incorrect operation can cause overflows or unexpected results (and your program crashing in an ungraceful way) 💥
Safe Handling of Null-Terminated Strings
Null-terminated strings are very susceptible to errors, especially memory-management mistakes that end up corrupting data.
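Instead of strcpy, use strncpy to limit the number of characters copied. Note that strncpy does not add the null character when the source is too long, so we set it ourselves: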
strncpy(destination, source, sizeof(destination) - 1);
destination[sizeof(destination) - 1] = '\0'; // Ensures the null character
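The two lines above can be packed into a reusable function. This is a sketch under some assumptions: copy_truncated is a made-up name, the capacity is passed explicitly, and the return value (true when the text had to be cut) is a design choice of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most capacity - 1 characters and always
// terminates dest. Returns true when src did not fit completely.
bool copy_truncated(char* dest, std::size_t capacity, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (capacity == 0) {
        return true;  // not even the '\0' can be stored
    }
    std::strncpy(dest, src, capacity - 1);
    dest[capacity - 1] = '\0';  // strncpy does not terminate on truncation
    return std::strlen(src) >= capacity;
}
```

Copying "Hello" into a char buffer[4] this way leaves "Hel" in the buffer and returns true, so the caller can decide whether a truncated text is acceptable.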
Before concatenating strings, check that the buffer size is sufficient:
if (strlen(greeting) + strlen(name) + 1 <= sizeof(greeting)) {
strcat(greeting, name);
} else {
std::cerr << "Error: insufficient size to concatenate." << std::endl;
}
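A check like this only works while greeting is the array itself. Once the array is passed to a function it becomes a char*, and sizeof would give the size of the pointer instead. The sketch below therefore receives the capacity as a parameter; append_if_fits is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to dest only when the result, including
// the terminating '\0', fits in capacity bytes. Returns false otherwise.
bool append_if_fits(char* dest, std::size_t capacity, const char* src) {
    assert(dest != nullptr && src != nullptr);
    const std::size_t needed = std::strlen(dest) + std::strlen(src) + 1;
    if (needed > capacity) {
        return false;  // dest is left unchanged
    }
    std::strcat(dest, src);
    return true;
}
```

With char greeting[20] = "Hello, "; the call append_if_fits(greeting, sizeof(greeting), "World") succeeds, since the result needs 13 bytes. With an 8-byte array it returns false and the original text is preserved.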
Make sure not to access indices outside the array range:
char string[10];
string[10] = 'x'; // Error: out of range, valid indices are 0 to 9
Disadvantages of Null-Terminated Strings
Although they can be efficient, null-terminated strings have several limitations:
- Manual memory management: The programmer must ensure that buffers are large enough.
- Lack of safety: Errors such as buffer overflows are common if not handled properly.
- Complex operations: Manipulating long strings or performing advanced operations requires a lot of additional code.
In summary, they are a pain to use 🤷, although they also have some advantages.
The main one is that we don’t need the string facilities of the C++ standard library (or of any other library), which can be a bit large for certain hardware (especially embedded systems).