📅 Thu, Nov 21, 2024 ⏲️ ~ 4 minutes Advanced

Null-terminated string in C++

One of the ways to handle text strings in C++ is through the so-called null-terminated strings.

This model was inherited directly from the C language. In C++, however, std::string is a more powerful, convenient, and generally safer alternative.

Whenever possible, we should prefer std::string over a null-terminated string.

However, null-terminated strings are still used in low-level programming, embedded systems, and libraries. So we need to learn how to use them 👇.

What is a Null-Terminated String

A null-terminated string is simply an array of characters that holds our text and ends with a null character.

This null character (\0) is a special character that serves as a marker to indicate the end of the string, allowing functions that operate on strings to know where to stop.

For example, the string "Hello" in memory would be represented as:

['H', 'e', 'l', 'l', 'o', '\0']

Characteristics:

Array of characters: Defined using the char[] type in C++.
Null character at the end: The character '\0' is mandatory (and must be handled carefully).
Real size: The real size of the string must include the null character (i.e., it will be characters + 1).
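
To make the "characters + 1" rule concrete, here is a minimal sketch. The array hello and the helper count_until_null are hypothetical names used only for illustration. The helper walks the array until it reaches '\0', which is conceptually what strlen does.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative array: 5 visible characters plus the automatic '\0'.
const char hello[] = "Hello";

// sizeof counts the null character; std::strlen does not.
static_assert(sizeof(hello) == 6, "5 characters + 1 null character");

// Hypothetical helper: counts characters until it finds '\0'.
// Conceptually the same scan that std::strlen performs.
std::size_t count_until_null(const char* text) {
    std::size_t length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}
```

Calling count_until_null(hello) gives the same value as std::strlen(hello), one less than sizeof(hello).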

Declaration of Null-Terminated Strings

We can declare null-terminated strings in the following ways:

Direct Initialization

char greeting[] = "Hello";

This creates an array of 6 characters ('H', 'e', 'l', 'l', 'o', and '\0').
The inclusion of the null character is automatic.

Manual Initialization

char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

Here we manually specify the null character and the size of the array. If we omit '\0', the array will not be a valid string:

char greeting[] = {'H', 'e', 'l', 'l', 'o'}; // Incorrect as a null-terminated string

Why would someone want to do this? I don’t know, you tell me 😆
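
Passing an array like that to strlen or std::cout is undefined behavior, because those functions keep reading past the end of the array looking for '\0'. If we are unsure whether an array is terminated, we can check it without leaving its bounds. This is only a sketch, and is_terminated is a hypothetical helper name, not a standard function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if the array contains a '\0'
// somewhere inside its own bounds. std::memchr never reads more
// than N bytes, so this is safe even for unterminated arrays.
template <std::size_t N>
bool is_terminated(const char (&buffer)[N]) {
    return std::memchr(buffer, '\0', N) != nullptr;
}
```

Because the parameter is a reference to an array, the size N is deduced by the compiler. This only works with real arrays, not with pointers.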

Empty Strings

char empty[] = "";

This creates an array of a single character ('\0').

Basic Operations with Null-Terminated Strings

Handling null-terminated strings requires care, as we have to manage memory and the null character manually.

Copying a String

To copy strings, we use the strcpy function, declared in the <cstring> header:

char source[] = "Hello";
char destination[10];

strcpy(destination, source);

strcpy copies source to destination, including the null character.

You must ensure that destination has enough space to avoid overflows.
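
One possible way to enforce that rule is to check the size before copying. The following is a minimal sketch, and copy_if_fits is a hypothetical helper (it is not part of <cstring>). It assumes the caller passes the real size of the destination buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies source into destination only if the text
// plus its null character fits. Returns false and leaves destination
// untouched otherwise.
bool copy_if_fits(char* destination, std::size_t destination_size,
                  const char* source) {
    // +1 accounts for the null character that strcpy also copies.
    if (std::strlen(source) + 1 > destination_size) {
        return false;
    }
    std::strcpy(destination, source);
    return true;
}
```

For example, with char destination[10], the call copy_if_fits(destination, sizeof(destination), "Hello") succeeds, while the same call with a 3-character buffer is rejected.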

Concatenating Strings

To concatenate strings, we use strcat:

char greeting[20] = "Hello, ";
char name[] = "World";

strcat(greeting, name); // greeting now holds "Hello, World"

strcat appends name to the end of greeting.

The array greeting must have enough space to store the complete result, including the null character.

Calculating Length

The strlen function returns the length of the string, excluding the null character:

char text[] = "Hello";

std::cout << "Length: " << strlen(text) << std::endl;

Output:

Length: 5

Comparing Strings

To compare strings, we use strcmp:

char string1[] = "Hello";
char string2[] = "Hello";

if (strcmp(string1, string2) == 0) {
	std::cout << "The strings are equal." << std::endl;
} else {
	std::cout << "The strings are different." << std::endl;
}

strcmp returns 0 if the strings are equal, a negative value if the first sorts before the second (comparing character by character), and a positive value if it sorts after.
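
Keep in mind that comparing two arrays with == compares their addresses, not their contents, so strcmp is the way to go. Only the sign of its result is guaranteed, not the exact number. As a small sketch, the hypothetical helper ordering below collapses the result to -1, 0, or 1.

```cpp
#include <cstring>

// Hypothetical helper: reduces the result of std::strcmp to -1, 0, or 1.
// The standard only guarantees the sign of the result, not its magnitude.
int ordering(const char* first, const char* second) {
    const int result = std::strcmp(first, second);
    if (result < 0) {
        return -1; // first sorts before second
    }
    if (result > 0) {
        return 1;  // first sorts after second
    }
    return 0;      // same contents
}
```

The comparison uses character codes, so on ASCII systems uppercase letters sort before lowercase ones: ordering("Zebra", "apple") is -1.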

Any incorrect operation may cause overflows or unexpected results (and your program may close in an ungraceful manner) 💥

Safe Handling of Null-Terminated Strings

Null-terminated strings are very susceptible to errors, especially buffer overflows and data corruption.

Use Safe Functions

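Instead of strcpy, use strncpy to limit the number of characters copied. Note that strncpy does not write the null character when source does not fit within the limit, so we add it ourselves: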

strncpy(destination, source, sizeof(destination) - 1);
destination[sizeof(destination) - 1] = '\0'; // Ensures the null character
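
The same two lines can be wrapped in a function that also reports whether the text was cut short. This is only a sketch under the assumption that the caller knows the real buffer size, and copy_truncated is a hypothetical name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies as much of source as fits and always
// terminates the result. Returns true if the text had to be cut short.
bool copy_truncated(char* destination, std::size_t destination_size,
                    const char* source) {
    if (destination_size == 0) {
        return true; // no room even for the null character
    }
    // Copy at most destination_size - 1 characters...
    std::strncpy(destination, source, destination_size - 1);
    // ...and write the terminator ourselves, since strncpy may not.
    destination[destination_size - 1] = '\0';
    return std::strlen(source) >= destination_size;
}
```

With a 4-character buffer, copying "Hello" leaves "Hel" in the buffer and the function returns true.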

Validate Sizes

Before concatenating strings, check that the buffer size is sufficient:

if (strlen(greeting) + strlen(name) + 1 <= sizeof(greeting)) {
    strcat(greeting, name);
} else {
    std::cerr << "Error: insufficient size to concatenate." << std::endl;
}

Avoid Out of Bounds Access

Make sure not to access indices outside the range of the array:

char string[10];
string[10] = 'x'; // Error: valid indices are 0 to 9
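
If the index comes from a calculation or from user input, we can check it before writing. A minimal sketch follows, where set_char is a hypothetical helper that only works with real arrays (not pointers), because it relies on the compiler deducing the size.

```cpp
#include <cstddef>

// Hypothetical helper: writes value at index only if the index is
// inside the array. Returns false instead of writing out of bounds.
template <std::size_t N>
bool set_char(char (&buffer)[N], std::size_t index, char value) {
    if (index >= N) {
        return false; // valid indices are 0 to N - 1
    }
    buffer[index] = value;
    return true;
}
```

For a char string[10], set_char(string, 9, 'x') succeeds and set_char(string, 10, 'x') is rejected.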

Disadvantages of Null-Terminated Strings

Although they can be efficient, null-terminated strings have several limitations:

Manual memory management: The programmer must ensure that buffers are large enough.
Lack of safety: Errors such as buffer overflows are common if not handled correctly.
Complex operations: Manipulating long strings or performing advanced operations requires a lot of additional code.

In summary, they are a pain to use 🤷. Still, they do have some advantages.

The main one is that we don’t need to use the C++ standard library (or another library), which can be a bit large for certain hardware (especially embedded systems).