Null-terminated string in C++

One of the ways to handle strings in C++ is through what are called null-terminated strings.

This model was inherited directly from the C language. In C++, however, std::string is a more powerful, more convenient, and safer alternative.

Whenever possible, we should prefer std::string over a null-terminated string.

However, it is still used in low-level programming, embedded systems, and libraries. So we need to learn how to use it 👇.

What is a Null-Terminated String

A null-terminated string is simply an array of characters whose text ends with a null character. The array must be large enough to hold our text plus that terminator.

This null character (\0) is a special character that serves as a marker to indicate the end of the string, allowing functions that operate on strings to know where to stop.

For example, the string "Hello" in memory would be represented like this:

['H', 'e', 'l', 'l', 'o', '\0']
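To make the role of the terminator concrete, here is a minimal sketch of how a function can find the end of a string. The name count_chars is a hypothetical helper invented for this illustration; in real code we would call strlen from <cstring>, which does the same job.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the standard library): counts the
// characters that come before the terminating '\0'.
std::size_t count_chars(const char* text) {
    assert(text != nullptr);        // a null pointer is not a valid string
    std::size_t length = 0;
    while (text[length] != '\0') {  // the null character tells us where to stop
        ++length;
    }
    return length;
}
```

For "Hello" the loop visits 'H', 'e', 'l', 'l', 'o', stops at '\0', and returns 5.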

Features:

  • Array of characters: Defined using the char[] type in C++.
  • Null character at the end: The '\0' character is mandatory (and must be handled with care).
  • Actual size: The actual size of the string must include the null character (i.e., it will be characters + 1).
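The "characters + 1" rule can be written down as a small check. This is only a sketch: fits_in_buffer is a hypothetical helper name, and it assumes that text is already a valid null-terminated string.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if `text` plus its '\0' terminator
// fits in a buffer of `capacity` bytes.
bool fits_in_buffer(const char* text, std::size_t capacity) {
    return std::strlen(text) + 1 <= capacity;  // characters + 1 for '\0'
}
```

With this helper, "Hello" fits in a buffer of 6 bytes but not in one of 5.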

Declaration of Null-Terminated Strings

We can declare null-terminated strings in the following ways:

char greeting[] = "Hello";
  • This creates an array of 6 characters ('H', 'e', 'l', 'l', 'o', and '\0').
  • The inclusion of the null character is automatic.
char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

Here we manually specify the null character and the size of the array. If we omit '\0', the array will not be a valid string:

char greeting[] = {'H', 'e', 'l', 'l', 'o'}; // Incorrect as a null-terminated string

Why would someone want to do this? I don’t know, you tell me 😆

char empty[] = "";

This creates an array of a single character ('\0').
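If in doubt about how many bytes each declaration reserves, we can ask the compiler. The following sketch repeats the declarations above under different, made-up variable names (so they can coexist in one file) and checks their sizes at compile time with static_assert.

```cpp
// Same declarations as above, renamed so they can live side by side.
char literal_init[] = "Hello";                              // '\0' added automatically
char manual_init[6] = {'H', 'e', 'l', 'l', 'o', '\0'};      // '\0' written by hand
char no_terminator[] = {'H', 'e', 'l', 'l', 'o'};           // no '\0': not a valid string
char empty_text[] = "";                                     // only the '\0'

static_assert(sizeof(literal_init) == 6, "5 characters plus the null character");
static_assert(sizeof(manual_init) == 6, "size given explicitly");
static_assert(sizeof(no_terminator) == 5, "no room was reserved for a null character");
static_assert(sizeof(empty_text) == 1, "just the null character");
```

Note that no_terminator compiles without complaint. The problem only appears when it is passed to a function that expects a terminator.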

Basic Operations with Null-Terminated Strings

Handling null-terminated strings requires care, as we have to manage memory and the null character manually.

To copy strings, we use the strcpy function from the <cstring> library:

char source[] = "Hello";
char destination[10];

strcpy(destination, source);
  • strcpy copies source into destination, including the null character.

You must ensure that destination has enough space to avoid overflows.

To concatenate strings, we use strcat:

char greeting[20] = "Hello, ";
char name[] = "World";

strcat(greeting, name);
  • strcat appends name to the end of greeting.

The greeting array must have enough space to store the complete result, including the null character.

The strlen function returns the length of the string, excluding the null character:

char text[] = "Hello";

std::cout << "Length: " << strlen(text) << std::endl;

Output:

Length: 5

To compare strings, we use strcmp:

char string1[] = "Hello";
char string2[] = "Hello";

if (strcmp(string1, string2) == 0) {
	std::cout << "The strings are equal." << std::endl;
} else {
	std::cout << "The strings are different." << std::endl;
}
  • It returns 0 if the strings are equal, a negative value if the first string sorts before the second (comparing character by character), and a positive value otherwise.
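Only the sign of the value returned by strcmp is guaranteed, not its magnitude, so it should not be compared against -1 or 1 directly. As an illustration, the hypothetical helper compare_sign below reduces the result to -1, 0, or 1.

```cpp
#include <cstring>

// Hypothetical helper: reduces the result of strcmp to -1, 0 or 1.
// strcmp itself only promises a negative, zero or positive value.
int compare_sign(const char* left, const char* right) {
    const int result = std::strcmp(left, right);
    return (result > 0) - (result < 0);
}
```

For example, compare_sign("Apple", "Banana") gives -1, and compare_sign("Hell", "Hello") also gives -1, because the shorter string ends first.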

Any incorrect operation can cause overflows or unexpected results (and make your program crash in an ungraceful way) 💥

Safe Handling of Null-Terminated Strings

Null-terminated strings are very susceptible to errors, especially buffer overflows and memory corruption.

Instead of strcpy, use strncpy to limit the number of characters copied:

strncpy(destination, source, sizeof(destination) - 1);
destination[sizeof(destination) - 1] = '\0'; // strncpy adds no '\0' when source is too long, so we add it ourselves
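The same pattern can be wrapped in a function. This is a sketch under assumptions: copy_truncated is a hypothetical name, and the caller must pass the real capacity of the buffer, because sizeof no longer reports the array size once the array has been passed as a pointer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most capacity - 1 characters and always
// terminates the result. Returns true if the whole source fit, false if
// it had to be truncated.
bool copy_truncated(char* destination, std::size_t capacity, const char* source) {
    if (capacity == 0) {
        return false;  // no room even for the null character
    }
    std::strncpy(destination, source, capacity - 1);
    destination[capacity - 1] = '\0';  // strncpy does not guarantee this
    return std::strlen(source) < capacity;
}
```

Copying "Hello" into a 4-byte buffer leaves "Hel" in it and returns false, so the caller can tell that text was lost.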

Before concatenating strings, check that the buffer size is sufficient:

if (strlen(greeting) + strlen(name) + 1 <= sizeof(greeting)) {
    strcat(greeting, name);
} else {
    std::cerr << "Error: insufficient size to concatenate." << std::endl;
}
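As a sketch, the size check can live in a reusable function. The name append_if_fits is hypothetical, and the function assumes that destination already holds a valid null-terminated string and that capacity is the real size of its buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `extra` to `destination` only if the result,
// including the null character, fits in `capacity` bytes.
bool append_if_fits(char* destination, std::size_t capacity, const char* extra) {
    const std::size_t used = std::strlen(destination);
    const std::size_t added = std::strlen(extra);
    if (used + added + 1 > capacity) {
        return false;  // not enough room; destination is left unchanged
    }
    std::strcat(destination, extra);
    return true;
}
```

With a 20-byte buffer holding "Hello, " (7 characters), appending "World" (5 characters) needs 7 + 5 + 1 = 13 bytes, so the call succeeds. With a 12-byte buffer it is rejected.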

Make sure not to access indices outside the array range:

char string[10];
string[10] = 'x'; // Error: valid indices are 0 to 9, so this is undefined behavior

Disadvantages of Null-Terminated Strings

Although they can be efficient, null-terminated strings have several limitations:

  • Manual memory management: The programmer must ensure that buffers are large enough.
  • Lack of safety: Errors such as buffer overflows are common if not handled properly.
  • Complex operations: Manipulating long strings or performing advanced operations requires a lot of additional code.
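For contrast, here is roughly what the concatenation example looks like with std::string, the alternative recommended at the start of the article. The function make_greeting is a hypothetical name used only for this sketch.

```cpp
#include <string>

// Hypothetical helper: builds a greeting with std::string, which stores
// its own length and grows its buffer as needed.
std::string make_greeting(const std::string& name) {
    std::string greeting = "Hello, ";
    greeting += name;  // no manual size check required
    return greeting;
}
```

There is no buffer size to calculate and no terminator to remember. When a C API needs a null-terminated string, greeting.c_str() provides one.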

In summary, they are a pain to use 🤷. That said, they also have some advantages.