Pointers vs References in C++
Overview
A reference in C++ is an alias for another variable. There are multiple types of references such as lvalue references, rvalue references, and constant references. References can also be used as the return type of functions or in the parameter type of functions. Although references are safer and easier to use, they are less powerful than pointers.
Scope
- This article defines what a reference variable in C++ stands for and how it is used.
- This article also highlights the differences between a reference and a pointer in C++.
Introduction
In the real world, we use many synonymous words interchangeably. One might refer to a "car" as an "automobile" or use the word "residence" instead of "house". Similarly, a reference variable in C++ is just an alternative name for another variable of the same type. After the declaration of the reference, the previous and the new variable name can be used interchangeably and they would access/modify the same data and the same location in memory.
A pointer to a variable in C++ stores the memory location where the variable is kept. A reference in C++ is not a "new" variable, so conceptually no separate object is created for it. Since it accesses the same memory location as the variable it refers to, compilers typically implement a reference internally as a pointer. However, a reference is not the same as a pointer; there are many differences between them, which we will see in this article.
Creating Reference in C++
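We use the & symbol in a declaration to create a reference (it is the same character as the address-of operator, but here it is part of the declaration). The general syntax for creating a reference is as follows:
data_type &reference_name = existing_variable;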
Note: In the syntax above, the data type of the variable on the right side of the assignment operator should be the same as that of the reference variable. The following code shows how a reference can be created for an integer type variable. The same code can be extended to other data types as well.
Output:
In the output shown above, we can see that the initial value of x is 10. Then we declare y as a reference to x and use y to change the value of x. Using the second cout statement, we can verify that the value of x was indeed changed by y.
Types of References
References in C++ can be categorized by the kind of expression they can bind to: lvalue references and rvalue references. Additionally, we can have constant references in C++, which provide immutability. Let's explore these types in further detail.
Lvalue References
Lvalue references in C++ are used to reference existing objects (and not temporary objects and literals). The syntax for declaring an lvalue reference is:
data_type &reference_name = existing_object;
Lvalue references are mostly used to create an alias for another variable. They are also used when an object has to be passed by reference to a function.
As per the syntax shown above, the right-hand side names an existing object. In other words, we need to provide an lvalue on the right-hand side. Thus, a non-const lvalue reference can only bind to an object that already exists in memory.
If we want to bind an rvalue (such as a temporary object or a literal) to an lvalue reference, we need to use the const keyword as follows:
const data_type &reference_name = rvalue;
The following code demonstrates the usage of lvalue references in C++ :
In the above code, if we uncomment line 9, we'll get a compilation error because lref2 is a non-const lvalue reference and we are trying to initialize it with an rvalue.
Similarly, if we uncomment line 12, we'll get a compilation error because lref3 is a reference to const and cannot be used to modify the value it refers to.
Rvalue References
Rvalue references in C++ are used to reference temporary objects and literals. The benefit of using rvalue references is that we can utilize the resources of temporary objects and prolong their life in the program. The syntax for declaring an rvalue reference is:
data_type &&reference_name = rvalue;
While using rvalue references in C++, we can even modify the rvalue that is stored in the reference variable (which is not possible using a constant lvalue reference). The following code illustrates this:
Constant References
Constant references in C++ are used in function arguments when we want to prevent accidental changes to the value of the parameter. A constant reference in C++ behaves exactly as a constant variable would: it doesn't allow modifications to its value. Both lvalue and rvalue references can be made constant as shown in the code below:
Output:
Explanation:
The output above shows that we cannot modify the constant references a and b inside func1() and func2() respectively. The phrase "read-only" here means that these variables cannot be modified and only their data can be accessed.
Properties of References
- References in C++ must be initialized at the time of their declaration. A declaration such as int &ref; with no initializer does not compile.
- After a reference has been initialized, it cannot be reseated to refer to another variable. An assignment such as ref = other; copies the value of other into the variable that ref already refers to.
- Function parameters can be passed as references. This is especially helpful when we want to pass large objects as parameters, because it avoids the overhead of copying the large object into the parameter.
It is also useful when we want the changes made to the parameters in the function to be reflected in the variables in the calling function.
The following code shows the usage of references as function parameters :
Output:
Explanation:
As we can see in the above code, passing obj as a reference saves us the overhead of copying the two members x and y. Also, the swap performed in the swapValue() function is reflected in main(), as is evident from the output.
- In case of nested data structures, we can use references as a shortcut to access the innermost data members. The following shows how this is done :
Output:
Explanation:
In the above code, we can now use y as an alias for obj_c.obj_b.obj_a.x which makes the program more readable and clean.
- While using range-based for loops (for-each loops), we must declare the loop variable as a reference if we want to modify the objects being accessed. The following code demonstrates this :
Output:
Explanation:
As we can see in the output above, when we try to modify objects in a for-each loop without using a reference, only copies of the elements are modified (and not the actual objects). However, when the loop variable is a reference, the values in the vector change because now we are modifying the actual objects themselves.
Rules for Using Reference in Functions
- While passing objects by reference, we need to remind ourselves that changes made to the parameters in the function will be reflected in the calling function.
- To avoid accidental changes to the parameters, we can use constant references as parameters.
- While returning references from a function, we must ensure that the lifetime of the variable whose reference is being returned extends beyond the function call. For example, it must not be a non-static local variable of that function.
Reference Collapsing
Since references in C++ are not separate objects, it is not allowed to create a reference to a reference in C++ with the usual syntax declaration. However, we can make use of type manipulations in typedef to create a reference to reference. In such cases, reference collapsing rules are followed. So, we can have four possible combinations :
- Lvalue reference to lvalue reference
- Lvalue reference to rvalue reference
- Rvalue reference to lvalue reference
- Rvalue reference to rvalue reference
Except for the last type above (rvalue reference to rvalue reference), all the other combinations are treated as lvalue references. Rvalue reference to rvalue reference is treated as an rvalue reference. The following code shows the four combinations :
Returning Values by Reference
There are two important subjects related to references in C++: References as Parameters and References as Return Value. Passing references as function parameters has been covered in the "Properties of References" section. Functions can also return values by reference in C++. By doing so, we can use a function call as the left side of an assignment. While returning a reference from a function, we must also ensure that the lifetime of the variable being referenced extends beyond the function call. The following code makes this clear :
Output:
Explanation:
The func() function shown in the code above returns a reference to a static variable y. Because y is static, its lifetime lasts until the end of the program, so the reference remains valid in main() even though the name y is visible only inside func(). When the first function call is made, i.e. func() = 200; the current values of x and y are printed and a reference to y is returned by the function. Then the value of y is updated to 200. In other words, this has the same effect as writing y = 200; inside func(). When the second function call is made, the current values of x and y are printed again and we can see that the value of y has been modified to 200 and the value of x remains the same as before.
References vs Pointers
References | Pointers |
---|---|
A reference must refer to an object of some type; there is no reference to void | Pointers can be of type void* |
Reference cannot be reassigned | Value of a pointer can be changed to another address |
Reference is declared using & in the declaration | Pointer is declared using * in the declaration |
There cannot be multiple indirection levels in references (except for references to references formed through typedef manipulation, which collapse to a single reference) | Pointers can have multiple levels of indirection (e.g. double pointer, triple pointer, etc.) |
References have to be initialized at the point of declaration. They cannot be NULL | Pointers can be NULL |
Why are References Less Powerful Than Pointers?
- References cannot be NULL, unlike pointers which can be NULL to denote that they are not pointing to anything.
- References cannot be reassigned to another variable after they have been created.
- References must be initialized at the time of declaration.
Owing to such limitations, references are not as powerful as pointers. This is why pointers are preferred in the implementation of many popular data structures like Linked List, Tree, etc. Java, on the other hand, uses references to implement these data structures because Java references can be null and can be reassigned. This is why Java does not need explicit pointers.
References are Safer and Easier to Use
- Since references are initialized at the time of declaration, we don't need to worry about situations like wild pointers (uninitialized pointers which point to arbitrary memory locations).
- References don't have to be dereferenced like pointers and can be used like normal variables (as an alias for some other variable).
- It is necessary to pass the object by reference in a copy constructor. If it were passed by value, the copy constructor would have to call itself to create a copy of the argument, which in turn would need another copy, and so on without end. For this reason C++ rejects a copy constructor that takes its parameter by value.
- While overloading operators like ++, references are used to ensure that the original variable is being returned and not a new copy.
Conclusion
- References in C++ are aliases for other existing variables.
- References must be initialized at the time of their declaration and cannot be reassigned.
- There are different types of references such as lvalue references, rvalue references, and constant references.
- Function parameters can be passed as references. Functions can return references as well.
- References are less powerful than pointers but are safer and easier to use.