Let’s continue talking about value types and reference types by looking at how another part of the program (a function) behaves when we pass it a variable.
If you haven’t done so already, I highly recommend reading the two previous articles, What is a reference and Value types and reference types (otherwise this is going to be tough for you).
So, with the homework done, I’m going to start the house from the roof! What interests us most is to know if a function that receives a variable can modify it, or cannot (and what modifications it can make).
This is mainly what all this fuss is about. Let’s see the classic answer:
- If the function receives a parameter of value type, it cannot modify its value
- If it receives it by reference, it can modify its value
Well, that was easy, right? We are done, let’s go home! Eeeeh … obviously not… (that is precisely the normal answer I want to avoid, because it doesn’t clarify anything).
The real answer is:
A function always receives a copy of the parameters
“Well, I’ve been told that if it’s a reference type and I pass it by reference, I don’t know… and… there’s a lunar eclipse 🌒 … and… and… “. Well, what can I say, you’ve been lied to 😜.
Therefore, the function that receives a parameter can never modify it, because it only has a copy. This has been the case since practically the beginning, and it is inherent to the way our current processor architecture works.
But of course, the fact that your functions cannot modify anything… takes away a lot of the fun. Well, what we can do is one thing:
- We pass a copy of the data to the function and it modifies its copy
- It gives me another copy with the result
- I discard the data I had and replace them with the new copy
But this would be super slow! I would have to keep moving data back and forth all the time. That’s not going to work; it isn’t practical at all.
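To make the "copy in, copy out" idea concrete, here is a minimal sketch in C++ (my choice of language for the examples, since the article's code is pseudocode). The `Data` type and the function names are hypothetical, made up only for illustration.

```cpp
#include <array>
#include <cassert>

// Hypothetical data type for illustration: 1000 integers stored inline.
using Data = std::array<int, 1000>;

// Receives its own copy of the data, modifies that copy and returns it.
Data add_one_to_all(Data copy)
{
    for (int& item : copy) {
        item += 1;
    }
    return copy;
}

void copy_in_copy_out_example()
{
    Data data{};                 // every element starts at 0
    data[0] = 10;

    // Copy in, copy out, then replace what we had with the result.
    data = add_one_to_all(data);

    assert(data[0] == 11);
    assert(data[999] == 1);
}
```

It works, but conceptually the whole block of data travels to the function and back on every call, which is exactly the cost we want to avoid.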
The very intelligent people who designed the processors already knew this. They had to invent a way for functions, which can only receive copies of data, to exchange data without having to copy all of it.
So let’s see what they invented to fix it, which is logically related to why there are value types and, above all, reference types.
How to make a function able to modify a variable
Imagine you live in a shared apartment with a roommate. To make it difficult, this roommate doesn’t speak your language and works at night, so you can never see him in person. You communicate through photos via mobile.
You need your roommate to sign the rental contract. But you cannot send him the contract via mobile, you can only send him a photograph of the contract.
You could try sending him the photograph over and over again, but obviously, your roommate cannot sign it. At most, he could print it and sign HIS copy of the contract. But not the real contract.
So you come up with a different idea. You leave the contract in a drawer, and you send him a photo of the drawer where you have kept it.
So your roommate goes at night, takes the contract, signs it. And the next day, you retrieve the already signed contract. Everyone’s happy 😊.
That is, instead of sending him your data, you send him a reference saying where the data is stored. This way both of you can modify the same data.
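The roommate story translates almost literally into code. This is only a sketch in C++, where I am assuming a pointer plays the role of the "photo of the drawer"; the `Contract` type and both function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical type for illustration: the contract from the story.
struct Contract {
    std::string signature; // empty until somebody signs
};

// Receives a photo of the contract (a copy): signing it changes only the copy.
void sign_photo(Contract photo)
{
    photo.signature = "roommate";
}

// Receives the location of the drawer (a pointer): signs the real contract.
void sign_in_drawer(Contract* drawer)
{
    drawer->signature = "roommate";
}

void roommate_example()
{
    Contract contract;

    sign_photo(contract);
    assert(contract.signature.empty());       // the real contract is still unsigned

    sign_in_drawer(&contract);
    assert(contract.signature == "roommate"); // the real contract is signed
}
```

Note that in both calls the function receives a copy. The difference is what gets copied: the contract itself, or the place where it is kept.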
Passing parameters to functions
Let’s recap. We have just seen that REFERENCES allow different parts of the program to modify a variable. This is because they do not contain data, but links to the data.
On the other hand, functions always receive copies of the data, not the data itself. But the behavior differs depending on whether what they receive is a value type or a reference type (in particular, whether or not the function can modify the parameters it receives).
To make everyone’s day better, many languages allow passing variables to a function by value or passing them by reference. Which complicates things a bit more (we’ll see that shortly).
So, in the end, we have four possible combinations between “pass by value or by reference” and passing a “value type or reference type”.
Let’s look at each of the four cases, and what can or cannot be modified in each of them.
- Pass by value, value type: If the parameter is modified inside the function, NO ❌ it is not modified for the rest of the program
- Pass by value, reference type: If the parameter is modified inside the function, YES ✔️ it is modified for the rest of the program
- Pass by reference, value type: It is identical to the previous one, YES ✔️ it is modified.
- Pass by reference, reference type: The function YES ✔️ can modify the parameter it receives. Moreover, YES ✔️ it can even return me a different reference.
And if we put it in a summary table, it looks like this:
Type | By value | By reference |
---|---|---|
Value | ❌/❌ | ✔️/❌ |
Reference | ✔️/❌ | ✔️/✔️ |
Where, in each cell, the combination 🅰️/🅱️ represents:
- 🅰️: Can the function modify the variable?
- 🅱️: Can the function return me a different variable?
Internal functioning Advanced
If you want to understand the behavior of each of the four combinations, keep reading as we are going to analyze them.
It is important that you understand two things well:
- That the function always receives a copy of the parameters
- How a REFERENCE works
And with that in mind, let’s get to it 👇
1. Parameters by value
First assumption, passing parameters by value. This is the “normal” way to pass parameters to a function.
1.1 - Passing by value, a value type
Case 1, let’s pass a variable of value type. For the example, I’m going to use an integer, a simple int. But any value type would work.
// function that changes a variable
function change_variable(int received_variable)
{
    received_variable = 20; // only the function's own copy changes
}

int my_variable = 10;         // create a value type variable
change_variable(my_variable); // we pass the variable by value
// what is my_variable worth now? 10
Let’s see what happened:
- We create a variable A that is worth 10
- The function change_variable receives a copy A-2, which has its own 10
- Inside the function, A-2 changes its value to 20
- When the function finishes, A outside still has a value of 10
That is, each variable has its own value. Both variables are independent, and changing one does not modify the other.
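If you want to try case 1 in a real compiler, this is a minimal sketch in C++ (my choice of language, not the article's). The function names are hypothetical; the comments mark which variable is A and which is A-2.

```cpp
#include <cassert>

// Case 1: the function receives A-2, a copy of the integer.
void change_by_value(int received_variable)
{
    received_variable = 20; // only the copy (A-2) changes
}

void case_1_example()
{
    int my_variable = 10;         // A
    change_by_value(my_variable); // passed by value
    assert(my_variable == 10);    // A still has its own 10
}
```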
1.2 - Passing by value, a reference type
Case 2, now we do the same but with a reference type variable. Let’s use an array for the example, but we could use any other reference type.
// function that changes a variable
function change_variable(int[] received_variable)
{
    received_variable[0] = 20; // changes the data the reference points to
}

int[] my_variable = {10, 0, 0}; // create a reference type variable
change_variable(my_variable);   // we pass the variable by value (the reference is copied)
// what is my_variable[0] worth now? 20
Let’s see what happened:
- We create a reference type variable A
- The function change_variable receives a copy of the reference, A-2
- Both references point to the same data
- Inside the function, we change the data that the reference points to
- When the function finishes, my_variable has been modified
That is, both variables now change the same data because they are two equal References. They are copies, two different references, but they point to the same data.
Therefore, whether we access the data through one or the other, we are modifying the same data. That is why the variable changes.
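Here is case 2 as a sketch in C++. C++ has no built-in "reference type" in the sense used here, so I am assuming a plain pointer plays that role. The function names are hypothetical. The extra line that sets the copy to `nullptr` is there to show the ❌ half of this case: the function can change the data, but not where my reference points.

```cpp
#include <cassert>

// Case 2: the function receives A-2, a copy of the pointer.
// Both copies hold the same address, so they reach the same data.
void change_first_element(int* received_variable)
{
    received_variable[0] = 20;   // writes to the shared data
    received_variable = nullptr; // re-points only the copy (A-2)
}

void case_2_example()
{
    int data[3] = {10, 0, 0};
    int* my_variable = data;           // A: points to the data

    change_first_element(my_variable); // the pointer itself is passed by value

    assert(my_variable == data);       // A still points to the same place
    assert(my_variable[0] == 20);      // but the data it points to has changed
}
```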
2. Passing parameters by reference
Passing by reference is “syntactic sugar” to indicate to the program to automatically create a reference for us. That is, it is a way the language offers to make something simpler or easier to read.
In this case, what it makes easier is creating a reference, without us having to do it ourselves or even see it. This has advantages and disadvantages (see Tips below).
2.1 - Passing by reference, a value type
Case 3 is basically the same as case 2, passing a reference by value. The only difference is that, instead of us working with a reference directly, the language creates one for us.
This reference we never see (and that’s why I named it ...), but it is the same case as before. That is, the function can modify the value, and it is modified for the rest of the program.
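A minimal sketch of case 3 in C++, where `int&` asks the language to pass by reference. Next to it I have written what is roughly the same thing done by hand with an explicit pointer, to show the "syntactic sugar" idea; how a given compiler actually implements `int&` is its own business, so take the equivalence as conceptual. All names are hypothetical.

```cpp
#include <cassert>

// Case 3: the language creates the hidden reference for us.
void change_by_reference(int& received_variable)
{
    received_variable = 20; // writes to the caller's variable
}

// Roughly the same thing written by hand, with an explicit pointer.
void change_through_pointer(int* received_variable)
{
    *received_variable = 20;
}

void case_3_example()
{
    int a = 10;
    change_by_reference(a);     // nothing at the call site hints at the reference
    assert(a == 20);

    int b = 10;
    change_through_pointer(&b); // here we build the reference ourselves
    assert(b == 20);
}
```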
2.2 - Passing by reference, a reference type
Case 4, the one that generates the most confusion: passing a reference type by reference. Spoiler for the Tips below: avoid doing this.
But let’s analyze it, starting with the basics. If the function that receives the parameter modifies it, it is modified for the rest of the program, just like in the other two cases that involve references (2 and 3). Only case 1 (value type passed by value) is exempt.
So what is different about this case 4? It is the only one that can modify the reference we pass to point to another location.
The two “automatic” references point to the same reference. That is, we can change the value to which A points, for example, to point to new data.
// function that changes a variable
function change_variable(ref int[] received_variable)
{
    received_variable = {20, 0, 0}; // we change where the reference points
}

int[] my_variable = {10, 0, 0}; // create a reference type variable
change_variable(my_variable);   // we pass the variable by reference
// what is my_variable worth now? {20, 0, 0}
The rest of the cases (1 to 3) that we have seen cannot change the reference to point to something different. Cases 2 and 3 can modify the data to which my reference points, but not the reference itself.
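Case 4 can be sketched in C++ as a pointer passed by reference (`int*&`), again assuming the pointer stands in for the reference type. The `new_data` array and the function names are hypothetical; `new_data` lives at file scope only so that it still exists after the function returns.

```cpp
#include <cassert>

// Hypothetical replacement data, with static storage so it outlives the call.
static int new_data[3] = {20, 0, 0};

// Case 4: the function receives a reference to the caller's pointer.
void repoint(int*& received_variable)
{
    received_variable = new_data; // changes where the caller's pointer points
}

void case_4_example()
{
    int old_data[3] = {10, 0, 0};
    int* my_variable = old_data;     // A: points to the old data

    repoint(my_variable);            // the pointer is passed by reference

    assert(my_variable == new_data); // A now points somewhere else
    assert(my_variable[0] == 20);
    assert(old_data[0] == 10);       // the old data itself was never touched
}
```

Notice the last assertion: nothing was written to the old data. What changed is the reference itself, which is the one thing cases 1 to 3 cannot do.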
Best practices Tips
As much as possible, try to always avoid passing parameters by reference. It is very atypical and leads to weird effects that are hard to detect.
If you need to pass something by reference, it is much better to explicitly create a type that wraps what you want.
For example, if you have a football_player object and need a function to modify it, simply pass the player by value.
function doSomething(football_player player)
{
}
If you now need a function that manipulates several football_player objects, create the object you need, for example, football_team.
class football_team {
    football_player[] players;
}

function doSomething(football_team team)
{
}
There is no need to pass parameters by reference if your objects are well-designed. Only in very rare cases (read: never) is it necessary to pass parameters by reference.
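A rough sketch of the wrapping idea in C++. One caveat that I am assuming on purpose: in C++ a plain struct passed by value would be copied whole, so to imitate a reference type I use `std::shared_ptr`, which is itself passed by value. The `FootballPlayer` and `FootballTeam` types and the `add_player` function are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical types for illustration.
struct FootballPlayer {
    std::string name;
};

struct FootballTeam {
    std::vector<FootballPlayer> players;
};

// The team handle is passed by value: the function gets its own copy of the
// handle, but both copies refer to the same team.
void add_player(std::shared_ptr<FootballTeam> team, std::string name)
{
    team->players.push_back(FootballPlayer{name});
}

void team_example()
{
    auto team = std::make_shared<FootballTeam>();

    add_player(team, "Ana");
    add_player(team, "Luis");

    assert(team->players.size() == 2);
    assert(team->players[1].name == "Luis");
}
```

The function changes the list of players without any parameter being passed by reference, because the team object is what holds the list.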
On the other hand, in addition to the considerations we have seen about whether it is possible to modify a variable from a function, choosing between value types and reference types also has implications in terms of speed. In principle:
- Value type variables are slightly faster to access
- Reference type variables are much faster to copy
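To get a feeling for the two points above, here is a small C++ sketch. `BigValue` and the function names are hypothetical. It does not measure time; it only shows how much has to be copied in each case, and that going through the pointer adds one extra step to reach the data.

```cpp
#include <array>
#include <cassert>

// Hypothetical value type for illustration: 1000 integers stored inline.
using BigValue = std::array<int, 1000>;

// Passing by value: conceptually, every element is copied for the call.
int first_by_value(BigValue data)
{
    return data[0]; // direct access to the function's own copy
}

// Passing a pointer: only the address is copied, whatever the size of the data.
int first_through_pointer(const BigValue* data)
{
    return (*data)[0]; // one extra hop through the pointer
}

static_assert(sizeof(BigValue) >= 1000 * sizeof(int), "the value carries all its data");
static_assert(sizeof(BigValue*) < sizeof(BigValue), "the pointer is much smaller to copy");

void copy_cost_example()
{
    BigValue value{};
    value[0] = 7;
    assert(first_by_value(value) == 7);
    assert(first_through_pointer(&value) == 7);
}
```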
However, do not give this too much importance, at least at first. Often, the compiler will do things for us, like switching from value type to reference or vice versa if it deems it more efficient for execution.
Simply use the type that is really appropriate, except in cases where efficiency is really very important. And in those cases, test it well, because sometimes you may be surprised to find that your “improvement” actually worsens performance.