📅 Mon, Apr 22, 2024 ⏲️ ~ 6 minutes Intermediate

Value types and reference types

We have already seen what REFERENCES are: elements that contain a link to another element. If you haven’t seen that yet, it’s worth reading What is a reference first.

Now let’s see what implications references have in the field of computer science and programming (in particular, in the management of our variables).

In fact, most languages divide their variables into two types:

Value type
Reference type

The concept of REFERENCE is very “core” to your computer. It plays a crucial role in memory management and its internal functioning.

These two types, value and reference, have distinct behaviors within a program, affecting many aspects (such as speed, mutability, dynamic memory).

So it’s important to understand them so you don’t mess up (and for general knowledge, which isn’t bad either). So let’s get to it 👇.

What are Value Types and Reference Types

Value Types

Value type variables are those in which the data is stored directly in the variable.

That is, your variable has a “little space” to hold its value.

In general, value types are limited to the simplest (primitive) types of each language. They usually include only:

Integers, and floating-point numbers
Characters (not strings)
Booleans
Groupings of simple variables (for example, structs in languages that offer them)

This means that when we assign one variable to another, an independent copy of the data is made.

Reference Types

Reference type variables are those that contain a reference to the data instead of the data itself.

That is, your variable contains a link to some data that is “somewhere else”.

Strictly speaking, the REFERENCE does have a value of its own. That value is simply the way to get to the data (its address).

Reference types include:

Collections (arrays, lists, dictionaries…)
Objects
Complex data structures

That is, basically everything is a REFERENCE type, except for numbers, characters, and a few other simple types.

Differences between Value Types and Reference Types

The main difference between a value type and a reference type is how they behave when copied.

Let’s see it with an example. We’ll do the same experiment, with these steps:

We create a variable A and assign it a value
We create another variable B and set it equal to A
We modify B
We see what happened to A

In summary, we want to know if modifying B also modifies A.

Value Type

Let’s start with a value type. For example, an integer.

int A = 10;  // we create a variable A, and set its value to 10
int B = A;   // we create a variable B, and set its value to A

B = 20;      // we change B. Did A change?

What is the value of A at the end of the process? In this case, A is 10 and B is 20.

With value type: Modifying B -> NO ❌ has not modified A

This is because A and B are value types, so they are independent. Each has its own value.

When copying the value of A into B, we have copied its value, but they remain separate variables. Any subsequent change does not affect the other because there is no link between them.
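If you want to run the experiment yourself, here is a minimal sketch in Java. The class name ValueTypeDemo and the method run are made up for this illustration; only the three steps in the middle come from the example above.

```java
// Hypothetical demo class, not part of the original article.
class ValueTypeDemo {
    // Runs the experiment and returns {A, B} so the result can be inspected.
    static int[] run() {
        int a = 10;  // A holds its own value
        int b = a;   // B receives a copy of that value
        b = 20;      // changing B touches only B's copy
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("A = " + result[0]); // prints "A = 10"
        System.out.println("B = " + result[1]); // prints "B = 20"
    }
}
```

The result is packed into an array only so that both values can be returned at once. The experiment itself uses nothing but plain int variables.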

Reference Type

Now let’s do the same with a reference type. For example, let’s take an array (we will only work with position 0 of the array, enough to show what we want to see).

So we perform the same experiment:

int[] A = {10, 0, 0};  // we create a variable A
int[] B = A;           // we create a variable B, and set its value to A

B[0] = 20;             // we change B. Did A change?

What is the value of A[0]? In this case, A[0] is 20.

With reference type: Modifying B -> YES ✔️ has modified A

This happened because we are working with a reference type. When we created the variable A, it held a reference to an array. When we created the new variable B, we set it equal to A. Therefore, we had two references pointing to the same array.

Thus, if we modify the data through B, we are also modifying what A sees, because both variables point to the same data.
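The same experiment can be sketched as a small runnable Java program. The class ReferenceTypeDemo and its methods are hypothetical names chosen for this example. It also uses the == operator, which for Java arrays compares references rather than contents, to check that A and B really are the same array.

```java
// Hypothetical demo class, not part of the original article.
class ReferenceTypeDemo {
    // Returns A[0] after writing through B.
    static int run() {
        int[] a = {10, 0, 0};
        int[] b = a;   // copies the reference, not the elements
        b[0] = 20;     // writes into the one shared array
        return a[0];
    }

    // True when both variables refer to the same array object.
    static boolean sameArray() {
        int[] a = {10, 0, 0};
        int[] b = a;
        return a == b; // == on arrays compares references, not contents
    }

    public static void main(String[] args) {
        System.out.println("A[0] = " + run());          // prints "A[0] = 20"
        System.out.println("same array: " + sameArray()); // prints "same array: true"
    }
}
```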

If we wanted to make them independent again, we would have to assign B a different array.

int[] A = {10, 0, 0};     // we create a variable A
int[] B = A;              // A and B point to the same data

B = new int[] {10, 0, 0}; // now B points to another array
B[0] = 20;                // if we change B, it no longer affects A
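The snippet above builds the second array by typing its values again. In practice you would more likely ask the language for a copy. Here is a minimal sketch in Java using the array's built-in clone() method. The class name IndependentCopyDemo is invented for this example.

```java
// Hypothetical demo class, not part of the original article.
class IndependentCopyDemo {
    // Returns {A[0], B[0]} after changing only the copy.
    static int[] run() {
        int[] a = {10, 0, 0};
        int[] b = a.clone();  // new array holding the same elements
        b[0] = 20;            // affects only the copy
        return new int[] {a[0], b[0]};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("A[0] = " + result[0]); // prints "A[0] = 10"
        System.out.println("B[0] = " + result[1]); // prints "B[0] = 20"
    }
}
```

Keep in mind that clone() copies only one level. That is enough here because the elements are plain integers, but an array of objects would still share those objects between both arrays.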

How it works internally (Advanced)

The first thing you need to know is that memory is organized into cells. Each of these cells is identified with a number.

When we create a variable like A or B, the program treats the name as an alias for a memory address. Internally, the computer sees neither A nor B. It translates them to, for example, 0x17 and 0x21 (or whatever applies), and works only with those addresses.

When working with value type variables, these memory cells will hold the value of your variable. In our example, 10. But they are two independent cells. If I put 20 in one, the other is not affected.

When you have reference type variables, you also have two cells for your variables. But instead of having the value, this time they hold the memory address where the “real” data is.

In our example, let’s say we have an array. The computer has created this array at position 0x30. Our variables A and B have this position as their value.

So both A and B are modifying the same data, which is actually at memory position 0x30.

On the other hand, your programming language knows how to handle both types. When accessing:

value type: It knows that the value is stored right at the variable’s own address
reference type: It knows that it is a memory address and that it has to “jump” to get to the data

That operation of “jumping” that it has to do with reference types to reach the real data is called indirection.
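To make the “jump” visible, here is a toy model in Java. This is only a sketch: memory is simulated with a plain int array, and the addresses 0x17, 0x21 and 0x30 are the illustrative ones used above, not real addresses. Java does not expose real memory addresses, so the class ToyMemoryDemo and everything in it is an assumption made for teaching purposes.

```java
// Hypothetical toy model, not how a real runtime is implemented.
class ToyMemoryDemo {
    static final int ADDR_A = 0x17;     // cell used by variable A
    static final int ADDR_B = 0x21;     // cell used by variable B
    static final int ADDR_DATA = 0x30;  // cell where the "real" data lives

    // Value types: each cell holds the data itself.
    static int valueCase() {
        int[] memory = new int[64];
        memory[ADDR_A] = 10;
        memory[ADDR_B] = memory[ADDR_A]; // copy the value into B's cell
        memory[ADDR_B] = 20;             // only B's cell changes
        return memory[ADDR_A];
    }

    // Reference types: each cell holds the address of the data.
    static int referenceCase() {
        int[] memory = new int[64];
        memory[ADDR_DATA] = 10;          // the shared data
        memory[ADDR_A] = ADDR_DATA;      // A stores the address 0x30
        memory[ADDR_B] = memory[ADDR_A]; // B gets a copy of that address
        memory[memory[ADDR_B]] = 20;     // jump through B, then write
        return memory[memory[ADDR_A]];   // jump through A, then read
    }

    public static void main(String[] args) {
        System.out.println(valueCase());     // prints "10"
        System.out.println(referenceCase()); // prints "20"
    }
}
```

The double lookup memory[memory[...]] is the indirection: first read the address stored in the variable’s cell, then go to that address to find the data.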

In principle, this is the reason why accessing a reference type variable is a little slower than accessing a value type. But it is so common and so optimized that the difference is usually minimal or even nonexistent. But it’s there.

On the other hand, reference types are much faster to copy or move. This is because you are not actually copying the data; you are only copying the memory address, which generally takes up just a few bytes, while the actual data can be… as large as your computer’s memory allows.