We continue to delve deeper into the advanced use of the serial port on processors like Arduino. In this post, we will see how to add frame delimiters and control characters to our transmission systems to make them more robust.
Previously, we have seen how to send bytes over the serial port as a convenient and “professional” way to perform communication. In the previous post, we saw that we will often use one or more structures defining a message we want to send or receive.
Now we want to expand the frame (the bytes we send in the communication) by surrounding the data bytes with a series of elements that increase the “quality” of the communication. One example is adding a checksum function to verify data integrity, something we will see in the next post.
Another example, which is what we will see in this post, is adding frame delimiters. That is, a specific “signal” or mark that identifies the start and end of the communication. While we’re at it, we would also like to be able to add certain control characters that have a special meaning.
As usual, all of this has already been invented, and these marks are precisely what are called control characters. In fact, we have been using them since the first communication post, every time we use ‘\n’ (line feed) or ‘\r’ (carriage return).
Here is a list of some of the available control characters with their hexadecimal value and their meaning.
| Code | Hex | Escape | Meaning |
|---|---|---|---|
| NUL | 00 | \0 | Null |
| SOH | 01 | | Start of Heading |
| STX | 02 | | Start of Text |
| ETX | 03 | | End of Text |
| EOT | 04 | | End of Transmission |
| ENQ | 05 | | Enquiry |
| ACK | 06 | | Acknowledge |
| BEL | 07 | \a | Bell |
| BS | 08 | \b | Backspace |
| HT | 09 | \t | Horizontal Tabulation |
| LF | 0A | \n | Line Feed |
| VT | 0B | \v | Vertical Tabulation |
| FF | 0C | \f | Form Feed |
| CR | 0D | \r | Carriage Return |
| SO | 0E | | Shift Out |
| SI | 0F | | Shift In |
| DLE | 10 | | Data Link Escape |
| DC1 | 11 | | Device Control One (XON) |
| DC2 | 12 | | Device Control Two |
| DC3 | 13 | | Device Control Three (XOFF) |
| DC4 | 14 | | Device Control Four |
| NAK | 15 | | Negative Acknowledge |
| SYN | 16 | | Synchronous Idle |
| ETB | 17 | | End of Transmission Block |
| CAN | 18 | | Cancel |
| EM | 19 | | End of Medium |
| SUB | 1A | | Substitute |
| ESC | 1B | | Escape |
| FS | 1C | | File Separator |
| GS | 1D | | Group Separator |
| RS | 1E | | Record Separator |
| US | 1F | | Unit Separator |
| SP | 20 | | Space |
| DEL | 7F | | Delete |
In particular, the control characters conventionally used to mark the start and end of a frame are 0x02 (STX) and 0x03 (ETX), respectively. Of course, we are not forced to use these characters. In fact, you will sometimes see code on the Internet that uses ‘H’ (for Header) to mark the start of a frame. There is no rule preventing this but, given that control characters exist, it is logical (and cleaner) to use the standard.
The operation is simple. We begin each frame by sending the STX character and finish it by sending ETX. The frame grows by two bytes, in exchange for better communication quality. The relative overhead shrinks as the payload grows: for n data bytes, the delimiters account for 2/(n + 2) of the frame.
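The framing step itself can be sketched independently of the serial port. The helper below is only an illustration under stated assumptions: `buildFrame` and its parameters are hypothetical names that do not appear in the original code, and the frame layout is simply STX, then the payload, then ETX.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const uint8_t STX = 0x02;  // start-of-frame delimiter
const uint8_t ETX = 0x03;  // end-of-frame delimiter

// Hypothetical helper: writes STX | payload | ETX into out.
// Returns the total frame length, or 0 if out cannot hold the frame.
size_t buildFrame(const uint8_t* payload, size_t payloadLength,
                  uint8_t* out, size_t outCapacity)
{
  // The two delimiters add a fixed overhead of two bytes
  if (outCapacity < 2 || payloadLength > outCapacity - 2) return 0;

  out[0] = STX;
  memcpy(out + 1, payload, payloadLength);
  out[payloadLength + 1] = ETX;
  return payloadLength + 2;
}
```

On an Arduino, the resulting buffer could then be sent with a single `Serial.write(buffer, length)` call instead of three separate writes.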
Here is an example of sending a data array with frame delimiters.

    const char STX = '\x02';
    const char ETX = '\x03';

    const int data[] = {0, 50, 100, 150, 200, 250};
    const size_t dataLength = sizeof(data) / sizeof(data[0]);
    const size_t bytesLength = dataLength * sizeof(data[0]);

    void setup()
    {
      Serial.begin(9600);

      Serial.write(STX);                             // start-of-frame delimiter
      Serial.write((const byte*)data, bytesLength);  // payload: length in bytes, not in elements
      Serial.write(ETX);                             // end-of-frame delimiter
    }

    void loop()
    {
    }

An example of a receiver would be the following:
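
    const char STX = '\x02';
    const char ETX = '\x03';

    const int dataLength = 6;  // must match the number of elements the sender transmits
    int data[dataLength];
    const int bytesLength = dataLength * sizeof(data[0]);
    const int frameLength = bytesLength + 2;  // payload plus STX and ETX

    void setup()
    {
      Serial.begin(9600);
    }

    void loop()
    {
      // Wait until a whole frame (STX + payload + ETX) is in the buffer
      if (Serial.available() >= frameLength)
      {
        // A byte that is not STX is discarded, so a misaligned stream is skipped one byte per pass
        if (Serial.read() == STX)
        {
          Serial.readBytes((byte*)data, bytesLength);
          if (Serial.read() == ETX)
          {
            //performAction();
          }
        }
      }
    }
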
However, control characters are just bytes. How safe are these delimiters? That is, is it possible that we confuse them with a data byte containing 0x02 or 0x03? Is it possible that, even after losing bytes, we mistakenly interpret a data byte as a delimiter?
Indeed, both things can happen: no system is completely robust. Adding frame delimiters improves the system, but does not make it infallible. In fact, we are not even checking data integrity; we are only trying to see whether we maintain a certain degree of “synchronization.”
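To make the first question concrete, here is a minimal sketch that counts how many bytes of a payload happen to have the value of a delimiter. The function name `countDelimiterBytes` is hypothetical and only serves the illustration.

```cpp
#include <stddef.h>
#include <stdint.h>

const uint8_t STX = 0x02;
const uint8_t ETX = 0x03;

// Hypothetical helper: counts payload bytes that a receiver
// could confuse with a frame delimiter.
size_t countDelimiterBytes(const uint8_t* payload, size_t length)
{
  size_t count = 0;
  for (size_t i = 0; i < length; ++i)
  {
    if (payload[i] == STX || payload[i] == ETX) ++count;
  }
  return count;
}
```

For instance, the 16-bit value 515 is 0x0203, so it is transmitted as the bytes 0x03 and 0x02 (in little-endian order), both of which look like delimiters to a receiver that has lost its place in the stream.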
For the delimiters to fail, a coincidence is required: after losing one or more bytes, the byte that arrives at the position where a delimiter is expected must happen to have the delimiter's value. If we are working in an environment with many transmission errors, delimiters alone will not be enough to filter out every corrupted frame.
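It may seem unlikely but, if we assume that data bytes are uniformly random, the probability of an arbitrary byte being mistaken for a given control code is 1/256. The combined probability of misinterpreting both the start and the end of the message in the same frame, assuming the two bytes are independent, is 1/65,536. With real data, which is rarely uniform, the actual figure may be higher or lower.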
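As a worked version of these figures, under the assumption that each byte takes any of its 256 values with equal probability and that the two delimiter positions are independent:

$$
P(\text{spurious STX}) = \frac{1}{256}, \qquad
P(\text{spurious STX and ETX}) = \frac{1}{256} \cdot \frac{1}{256} = \frac{1}{65\,536} \approx 1.5 \times 10^{-5}
$$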
Nevertheless, the real advantage is that delimiters provide a certain capacity for “resynchronization.” In a “normal” environment, if a few bytes are occasionally lost, the system can detect the failure and eventually recover synchronization.
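One way to picture this resynchronization is a small state machine that processes the stream one byte at a time. This is a minimal sketch, not the code from this post: `FrameParser`, `feedByte`, and the fixed `PAYLOAD_LENGTH` of 4 bytes are all assumptions made for the example.

```cpp
#include <stddef.h>
#include <stdint.h>

const uint8_t STX = 0x02;
const uint8_t ETX = 0x03;
const size_t PAYLOAD_LENGTH = 4;  // assumed fixed payload size

struct FrameParser
{
  uint8_t payload[PAYLOAD_LENGTH];
  size_t count = 0;      // payload bytes stored so far
  bool inFrame = false;  // true once an STX has been seen
};

// Feeds one received byte to the parser.
// Returns true when a complete, well-delimited frame is in parser.payload.
bool feedByte(FrameParser& parser, uint8_t value)
{
  if (!parser.inFrame)
  {
    // Hunting state: discard everything until a byte that looks like STX
    if (value == STX)
    {
      parser.inFrame = true;
      parser.count = 0;
    }
    return false;
  }

  if (parser.count < PAYLOAD_LENGTH)
  {
    parser.payload[parser.count++] = value;
    return false;
  }

  // This is the position where ETX must be; either way, go back to hunting
  parser.inFrame = false;
  return value == ETX;
}
```

In a sketch, `feedByte` would be called with each byte returned by `Serial.read()`. If a frame is truncated, the byte found at the ETX position will usually not match, the frame is rejected, and the parser resumes hunting for the next STX. Note that a damaged frame can also cause the frame that follows it to be discarded before synchronization is recovered.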
Of course, we can greatly improve the transmission process by adding a timeout or a checksum. We will see all this in the upcoming posts.
Download the Code
All the code from this post is available for download on GitHub.

