Literals and Special Characters in Regex

To start working with Regex, we begin with the two simplest kinds of patterns: literals and special characters.

Let’s take a look at them 👇

What are literals in Regex

Literals are simply exact patterns that we are looking for in a string of text (i.e., the same characters, in the same order).

When we write a literal in a regular expression, we are indicating that we want to find that exact text in the text we are evaluating.

For example, if we want to find the word "hello" in a string of text, we can write:

hello

This will match every exact occurrence of the word "hello" in the text. Let’s see it in action:

hello, how are you?

Hello, world!

hello123 and hello world

In the example, we can observe that:

  • In the first case, the literal "hello" matches once, at the start of the string.
  • In the second case, there is no match, because the literal "hello" does not match the uppercase "H" in "Hello".
  • In the third case, it matches twice: first within "hello123" and then in "hello world".
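
To reproduce these results in code, here is a minimal sketch using Python's standard re module. The choice of Python and the variable names (lines, pattern, match_counts) are our own assumptions for illustration; any regex engine should behave the same way for a plain literal.

```python
import re

# The three sample lines from the example above.
lines = [
    "hello, how are you?",
    "Hello, world!",
    "hello123 and hello world",
]

# The literal pattern: the characters h, e, l, l, o, in that order.
pattern = re.compile(r"hello")

# Count the non-overlapping matches of the literal on each line.
match_counts = [len(pattern.findall(line)) for line in lines]

for line, count in zip(lines, match_counts):
    print(f"{count} match(es) in {line!r}")
```

The second line reports zero matches because the search is case-sensitive by default.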

If we wanted to ignore case, we would use a modifier (we will see this in its own article).

Literals are the simplest searches. They start to get interesting when mixed with special characters and quantifiers.

What are special characters in Regex

Special characters are those that can match any one character from a whole set (for example, any letter, or any digit).

These characters allow us to create more complex patterns than simply searching for literals.

Here are some of the most commonly used special characters in Regex:

Symbol    Matches
.         Any character (by default, except a line break)
\w        Any alphanumeric character (or underscore)
\W        Any character that is not alphanumeric (or underscore)
\d        Any digit
\D        Any character that is not a digit

Alphanumeric means a letter or a digit, i.e., a-z, A-Z, or 0-9. In most Regex engines, \w also matches the underscore (_).
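
Two details of these symbols are easy to check in code. The sketch below assumes Python's re module with its default settings; the test string and the variable names are made up for illustration, and other engines may differ slightly in how they treat line breaks.

```python
import re

# A made-up test string: a letter, a digit, an underscore, a line break, a symbol.
sample = "a1_\n!"

# "." matches any single character except a line break (Python's default).
dot_matches = re.findall(r".", sample)

# "\w" matches letters and digits, and also the underscore.
word_matches = re.findall(r"\w", sample)

print(dot_matches)   # ['a', '1', '_', '!']
print(word_matches)  # ['a', '1', '_']
```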

Let’s see it with an example:

abc123 123!@#

Date: 2023-09-15

There are 25 people here

  • \w matches every alphanumeric character, while \W matches every character that is not alphanumeric (spaces, punctuation, and symbols).
  • \d matches every digit, while \D matches every character that is not a digit.

Try each of \w, \W, \d, and \D against the previous example to see what it matches.
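
One way to try the four symbols against the sample text is the short sketch below. It assumes Python's re module; the variable names are ours, and the inputs are the sample lines from the example.

```python
import re

text = "abc123 123!@#"

# Each call returns a list with one entry per matched character.
word_chars = re.findall(r"\w", text)      # letters and digits
non_word_chars = re.findall(r"\W", text)  # everything else
digits = re.findall(r"\d", text)          # 0-9 only
non_digits = re.findall(r"\D", text)      # everything that is not a digit

print("".join(word_chars))      # abc123123
print(non_word_chars)           # [' ', '!', '@', '#']
print("".join(digits))          # 123123
print("".join(non_digits))      # abc !@#

# The digits found in the other two sample lines.
date_digits = "".join(re.findall(r"\d", "Date: 2023-09-15"))
people_digits = "".join(re.findall(r"\d", "There are 25 people here"))
print(date_digits, people_digits)  # 20230915 25
```

Note that every character lands in exactly one of \w or \W, and in exactly one of \d or \D, since each uppercase symbol is the complement of its lowercase counterpart.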

Escaping special characters

If we want to use a special character as a literal, we need to escape it using a backslash (\).

Escaping means treating a special character as if it were a normal literal, removing its special meaning.

For example, if we wanted to search for a literal period (.) in a string of text, we cannot just write . because in Regex a period means “any character”.

So we must escape it, like this:

\.

Let’s see it in an example:

file.txt

No period here

Version 1.0.3

In this example:

  • In the first line, it matches only the "." that appears in "file.txt".
  • In the second line, there is no period in the text, so there are no matches.
  • In the third line, the pattern matches both periods in "1.0.3".
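
The difference between the escaped and the unescaped period can be sketched as follows, again assuming Python's re module. The variable names are our own; re.escape is a standard-library helper that adds the backslashes for us, shown here only as an optional convenience.

```python
import re

# The three sample lines from the example above.
lines = ["file.txt", "No period here", "Version 1.0.3"]

# Escaped: "\." matches only a literal period.
escaped_counts = [len(re.findall(r"\.", line)) for line in lines]

# Unescaped: "." matches every character, so each count equals the line length.
unescaped_counts = [len(re.findall(r".", line)) for line in lines]

print(escaped_counts)    # [1, 0, 2]
print(unescaped_counts)  # [8, 14, 13]

# re.escape builds the escaped form of a literal string automatically.
escaped_version = re.escape("1.0.3")
print(escaped_version)   # 1\.0\.3
```

Escaping by hand is fine for a single character; a helper like re.escape becomes useful when the literal text comes from user input and may contain several special characters.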