C and C++ Pointer Tutorial
Introduction
This is the first in a tutorial series on C and C++ pointers. It is written for beginning programmers to clarify pointer concepts, but is a great refresher for all programmers. It includes code snippets, diagrams, and discussion text. For a more advanced, under-the-hood deep dive on arrays, pointers, and references, see this post. Let’s dive right in.
What are Pointers in C and C++?
Pointers in C and C++ are variables that point to other variables. They can also point to unnamed objects or regions in computer memory.
Foundations: What is a Variable?
In his book Programming: Principles and Practice Using C++, Edition 2, Bjarne Stroustrup defines a variable this way:
An object is a region of memory with a type that specifies what kind of information can be placed in it. A named object is called a variable.
He further explains:
You can think of an object as a “box” into which you can put a value of the object’s type…
So, we can visualize a variable as a box with a type (like char, int, string, etc.) that also has a name by which we can refer to it, and which contains a value that is stored inside the box (in computer memory):
char Variables
In the above variable definition, c is the name of the variable, and char is the type. The type of a variable defines both the size of the variable in memory (for char variables, one byte, the smallest addressable area in memory) and how it is to be handled by the compiler (it can be assigned a letter like 'a' or a small integer; where char is signed, as on most desktop compilers, its values typically go from -128 to 127).
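As a minimal sketch of the kind of definition described here: the helper name show_char is hypothetical and not from the original listing. It defines a char variable and reports its size and range using the standard limits.h constants, since whether plain char is signed depends on the platform.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: defines a char variable and reports its size and range. */
size_t show_char(void)
{
    char c = 'a';   /* name: c, type: char, value: the letter 'a' */

    /* sizeof(char) is 1 by definition; CHAR_MIN and CHAR_MAX depend on
       whether plain char is signed on this platform. */
    printf("c = %c, size = %zu byte, range = %d to %d\n",
           c, sizeof c, CHAR_MIN, CHAR_MAX);
    return sizeof c;
}
```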
int Variables
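Above, we have defined a variable named i of type int. On most current desktop systems it is an object that takes up four bytes in memory, and is treated by the compiler as a signed integer variable whose values can go from -2,147,483,648 to 2,147,483,647. (The language standards only guarantee that an int can hold at least -32767 to 32767, which is the range of a two byte int.)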
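Since the size and range of int vary between platforms, it can help to ask the compiler directly. The sketch below assumes nothing beyond the standard limits.h header; the helper name show_int and the value 42 are made up for illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: defines an int variable and reports its size and range. */
int show_int(void)
{
    int i = 42;   /* 42 is an arbitrary example value */

    /* INT_MIN and INT_MAX are the actual limits of int on this platform. */
    printf("i = %d, size = %zu bytes, range = %d to %d\n",
           i, sizeof i, INT_MIN, INT_MAX);
    return i;
}
```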
Note that the variable names are arbitrary, as long as they are in the proper format and do not collide with other names, such as keywords or names in libraries or other linked-in code.
Variables in Memory
When the program is compiled and run, these variables will be placed at a certain location in memory.
For instance, if we have 16 bytes of memory, with the first byte at address zero and the last byte at address 15 (zero to 15 inclusive is 16), the variable c could be placed at address 2 (the third byte from the bottom), and i at address 4 (the fifth byte from the bottom):
Note that in C and C++, addresses and indexes (such as an index into an array - look for next post) are zero based. That is, they start at zero.
So, the address (or index) of the 1st byte in memory is 0, the 2nd is 1, …, the nth byte is n - 1.
Pointers Explained
A pointer is exactly what its name implies: a pointer to something, often another variable, which is what we deal with in this tutorial.
A pointer is a variable, just like a char or int variable, but its type is pointer to type, where you specify the type:
Defining Pointers
Above, the * written after the type (char and int) is what indicates to the compiler you are defining a pointer to that type.
The type of pc is pointer to char, and the type of pi is pointer to int.
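A sketch of what such definitions can look like follows. The helper name define_pointers is hypothetical, and initializing both pointers to NULL (meaning they point to nothing yet) is my own choice for the sketch rather than something the original definitions necessarily did.

```c
#include <stddef.h>

/* Hypothetical helper: defines two pointers that do not point at anything yet.
   Returns 1 if both are null pointers. */
int define_pointers(void)
{
    char* pc = NULL;   /* type of pc: pointer to char */
    int*  pi = NULL;   /* type of pi: pointer to int  */

    return pc == NULL && pi == NULL;
}
```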
Assigning Values to Pointers
Since pointers are variables, like char and int, you must assign them a value, and the value you put in them is the address of the variable you want the pointer to point to (you can also assign them the value NULL in C and C++, or nullptr in C++11 and later and in C as of C23, which indicates they point to nothing):
Above, the & prefix operator gets the address of a variable, so &c returns the address of the variable c, and here we assign it as the value for the pointer pc. Since the address of c is 2, that is the value that gets put into pc.
Similarly, the value 4 gets put in pi.
Notice that in the printf statement, we use the %p format specifier to format pointer (address) printouts, in this case as 0 padded hexadecimal numbers (see below). The exact text that %p produces varies between compilers and platforms.
You can see above clearly that the value assigned to the pointer (the number put into the pointer) is the address of the variable it points to.
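The assignment described in this section can be sketched as follows. The helper name assign_pointers and the initial values are assumptions for illustration, and on a real machine the printed addresses will be large numbers rather than 2 and 4.

```c
#include <stdio.h>

/* Hypothetical helper: points pc at c and pi at i.
   Returns 1 if each pointer holds the address of its variable. */
int assign_pointers(void)
{
    char c = 'a';
    int  i = 42;

    char* pc = &c;   /* pc now holds the address of c */
    int*  pi = &i;   /* pi now holds the address of i */

    /* %p expects a void*, hence the casts. The output format is
       implementation-defined, so no particular text is assumed here. */
    printf("pc = %p, pi = %p\n", (void*)pc, (void*)pi);

    return pc == &c && pi == &i;
}
```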
Pointers are Just Variables in Memory Too
Pointers are variables. In the diagram above, I put them to the side to make it easier to see the point being made: that they point to a variable and that the value stored in the pointer variable is the memory address of the variable they point to. However, pointers are objects just like any other variable, so they also reside in computer memory:
As you can see, I have changed the figure to have the pointers placed in memory like any other variable, and have placed them at their very own respective addresses. Note that for a 16 byte total memory system, a single byte is sufficient to hold all the possible addresses, so that is how I did it in the above diagram to save space:
In the printf statement, I use the %p format specifier to print four addresses as 0 padded hexadecimal numbers (in our “make believe” system of 16 bytes of memory). There are sixteen hex digits which include the 10 from decimal (0-9) plus six more: A (10), B (11), C (12), D (13), E (14), and F (15). So, the 0C for the address of pc is just 12, and the 09 for the address of pi is just 9, both of which match the addresses where pc and pi are located in the diagram above.
The first two are the addresses of the pointer variables pc and pi. I use the & operator to get the address of pc as the first number to be printed out. This is not what the pointer pc points to, but the address in memory of the pointer variable pc itself (hence &pc).
A quick look at the diagram above, and we see the pointer variable pc is indeed located at the address 12. Similarly, &pi prints out as the hex value 09, and looking at the diagram above, we see I have located the pointer variable pi at address 9.
The next two 0 padded hex numbers show the contents of, or value stored in, the pointer variables. These are the addresses of the char variable c (&c) and the int variable i (&i), which are 02 and 04, respectively. You can see from the diagram above that indeed I placed c at address 2 and i at address 4.
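The difference between where a pointer lives and what it holds can be sketched like this. The helper name pointer_addresses is hypothetical, and the printed values on a real system will not be the 0C, 09, 02, and 04 of the 16 byte example.

```c
#include <stdio.h>

/* Hypothetical helper: prints the address of each pointer variable,
   then the address stored inside each pointer.
   Returns 1 if pc's own address differs from the address it holds. */
int pointer_addresses(void)
{
    char c = 'a';
    int  i = 42;
    char* pc = &c;
    int*  pi = &i;

    /* &pc and &pi: where the pointers themselves reside.
       pc and pi:   the addresses of c and i that they contain. */
    printf("&pc = %p, &pi = %p, pc = %p, pi = %p\n",
           (void*)&pc, (void*)&pi, (void*)pc, (void*)pi);

    /* pc and c are separate objects, so these two addresses differ. */
    return (void*)&pc != (void*)pc;
}
```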
Remember, this is our “make believe” system that only has 16 bytes of memory, pointers only occupy one byte in memory, and I have determined myself the location of all the variables in memory for illustration purposes. Below I discuss what a typical real system would look like, but the “make believe” system is conceptually correct and keeps the examples simple and compact.
Using Pointers
Once one has pointed a pointer to another variable, one uses the * prefix operator to retrieve the value that is inside the variable that the pointer points to. This is called dereferencing (a pointer refers to another variable, so de-reference-ing it is to change it from a reference to the variable to the actual value of the variable it references, just as if one directly used the referenced variable itself):
Note, above I have used the shortcut notation that assigns an initial value right in the definitions of each variable. First, I define a char variable named c, simultaneously assigning it the value 'a'. Then, I define a pointer to char variable pc, simultaneously assigning it the address of the variable c.
Using printf, I first print out the value of c (a character), directly using the variable c itself.
I then print out the value of c again, but this time indirectly through the pointer to it (pc). I use the * prefix operator to dereference pc (hence, *pc), which returns the value of the variable to which it points, in this case the value of c (‘a’). So, we end up just printing ‘a’ twice.
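The steps just described can be sketched as follows; the helper name read_through_pointer is an assumption of mine, while c and pc follow the text.

```c
#include <stdio.h>

/* Hypothetical helper: reads c directly and then through pc.
   Returns the value obtained by dereferencing pc. */
char read_through_pointer(void)
{
    char  c  = 'a';   /* define c and give it an initial value */
    char* pc = &c;    /* define pc and point it at c */

    printf("%c\n", c);     /* direct access: prints a */
    printf("%c\n", *pc);   /* indirect access through pc: prints a */

    return *pc;
}
```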
Changing Pointers
Pointers can be changed at any time to point to something else, and you can assign the contents of one pointer to another:
After defining and assigning values to the two char variables c and d, I define and assign the address of c to pc, then the address of d to pd.
The first printf shows what one would expect: c holds 'a', d holds 'f'. &pc and &pd are the addresses where pc and pd reside in memory. pc holds the address of c, pd holds the address of d. In the printout, dereferencing pc (*pc) returns the value of c, and dereferencing pd (*pd) returns the value of d.
Since pointers are just variables that hold values which happen to be the addresses of the variables they point to, I can assign the value (“contents of”) one to another, as long as they are pointers of the same type. I do this (pd = pc), then the following printf shows that pd now points to c, just like pc points to c. We can verify this because the address held in pc and the address held in pd print out the same (the address of c, which is 2). Also, when dereferenced, they both print out the same value: the value of c (which is ‘a’). Note that the values in c and d remain unchanged, as you can see from the first two values in the printout.
Finally, I can directly put the address of d in pc (pc = &d) without affecting pd. As the following printf shows, we have ended up switching what pc and pd point to: pc points to d and pd points to c. Comparing the addresses stored in pc and pd, they are switched from the first printf to the last, and printing *pc and *pd prints out the values ‘f’ then ‘a’, switched in order from the first printout of ‘a’ then ‘f’.
Again, the values of c and d printed using those variable names directly have not changed.
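The sequence of reassignments described above can be sketched as below. The helper name retarget_pointers and the exact printf layout are assumptions; only the dereferenced values are printed here, since real addresses differ from run to run.

```c
#include <stdio.h>

/* Hypothetical helper: repoints pd and pc as described in the text.
   Returns 1 if pc ends up pointing at d, pd at c, and c and d are unchanged. */
int retarget_pointers(void)
{
    char c = 'a';
    char d = 'f';
    char* pc = &c;
    char* pd = &d;
    printf("c=%c d=%c *pc=%c *pd=%c\n", c, d, *pc, *pd);   /* a f a f */

    pd = pc;    /* copy the contents of pc: both now point to c */
    printf("c=%c d=%c *pc=%c *pd=%c\n", c, d, *pc, *pd);   /* a f a a */

    pc = &d;    /* repoint pc at d; pd is unaffected */
    printf("c=%c d=%c *pc=%c *pd=%c\n", c, d, *pc, *pd);   /* a f f a */

    return pc == &d && pd == &c && c == 'a' && d == 'f';
}
```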
Assigning Values Using Pointers
Finally, we show below how to change the value of the variable a pointer points to:
Just dereference (using the * prefix operator) the pointer in the assignment, and the variable the pointer points to will be updated.
Of course, the type of the value you are assigning must be the same type as the variable being pointed to (and not an address): in this case, a char type, not a pointer to char (not a char*) type.
When we assign to a dereferenced pointer (*pc), the address of the pointer itself (&pc) and the value stored in pc (the address of c) do not change, as you can see from the address printouts from both printf calls. Just the value of the variable c that pc points to changes. So, *pc = 'f' changes the value in c from ‘a’ to ‘f’.
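A minimal sketch of assigning through a pointer follows; the helper name write_through_pointer is hypothetical, while c and pc follow the text.

```c
#include <stdio.h>

/* Hypothetical helper: changes c by assigning through pc.
   Returns the final value of c. */
char write_through_pointer(void)
{
    char  c  = 'a';
    char* pc = &c;
    printf("before: c=%c\n", c);   /* before: c=a */

    *pc = 'f';   /* writes into c; pc itself still holds the address of c */
    printf("after:  c=%c\n", c);   /* after:  c=f */

    return c;
}
```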
Real Memory Spaces
In a typical desktop or server system, the program might run in a 32 bit environment (for instance x86), so the size of the pointer would need to be 4 bytes (the same size as an int on such systems). More commonly today, it could run in a 64 bit environment (for instance x64), so the size would need to be 8 bytes (the size of a long long).
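To see which of these cases applies on a given machine, one can ask the compiler for the pointer sizes. This is a sketch under the assumption of a hosted C compiler; the helper name pointer_sizes_match is made up, and the printed numbers depend on the build (typically 4 for a 32 bit build and 8 for a 64 bit build).

```c
#include <stdio.h>

/* Hypothetical helper: prints the size of several pointer types.
   Returns 1 if char* and void* have the same size, which the C
   standard requires. */
int pointer_sizes_match(void)
{
    printf("sizeof(char*) = %zu, sizeof(int*) = %zu, sizeof(void*) = %zu\n",
           sizeof(char*), sizeof(int*), sizeof(void*));

    return sizeof(char*) == sizeof(void*);
}
```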
The maximum theoretically addressable memory space in a 32 bit system is 2^32, or 4,294,967,296 bytes (4 GB). In Windows, typically only half of this is available to a user process. The maximum theoretically addressable memory space in a 64 bit system is 2^64, or around 16 exabytes. Only a small portion of this is used by Windows systems. Here is a diagram that illustrates details of an x64 paged virtual address space.
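The arithmetic behind these figures is just a power of two, which can be sketched with a shift. The helper name addressable_bytes is hypothetical, and the sketch only handles address widths below 64 bits because 2^64 itself does not fit in a 64 bit integer.

```c
#include <stdint.h>

/* Hypothetical helper: number of distinct byte addresses that an address
   of the given width can express, that is, 2 to the power of bits.
   Only valid for bits < 64. */
uint64_t addressable_bytes(unsigned bits)
{
    return UINT64_C(1) << bits;
}
```

With this, addressable_bytes(4) is 16, matching the 16 byte example system, and addressable_bytes(32) is 4,294,967,296.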
The diagram below shows how the variables are stored in a process (program), illustrating an actual run of the code snippet immediately below it. Scale and proportion have been sacrificed to keep the diagram a workable size, but are good enough to get the point across:
The 0x in front of each 0 padded hex number is the way to specify a hex number literal when writing code in C or C++. In the diagram, as well as in the code below, the simplistic small number addresses have been changed to the actual 32 bit (4 byte) addresses printed out when I ran the instrumentation code. The diagram reflects that these addresses are larger than a char (c) and the same size as an int (i).
The pointers’ own addresses are shown to the right side in the diagram, and the contents of the pointers are the addresses of the variables they point to, also shown to the right.
Note also that the arrangement of the variables is opposite from all the other diagrams: the first defined variable is at the top, the last at the bottom. This reflects what happened in this actual run: since the associated code snippet below was instrumented in a function, each local variable was placed on the stack in the order defined, and the stack grows from top to bottom (from higher addresses to lower ones). Other compilers or build settings may lay out local variables differently.
The above code snippet is identical to the code above under the heading Pointers are Just Variables in Memory Too, except that instead of showing the addresses from my 16 byte example system in the printf outputs as I have done everywhere else above, I show the real printouts from my sample code running as a 32 bit process which displays 32 bit (4 byte) addresses.
This section is not critical for an introductory understanding of pointers, but is here for completeness and to show in real terms how my 16 byte address space with 1 byte pointers really is just for pedagogical purposes. If it seems a little too complex at this point, don’t worry about it - just try to understand the gist of the section.
Conclusion
I hope I have accomplished my goal of providing an approachable and insightful tutorial on C and C++ pointers. Next up in the series is investigating the close relationship between arrays and pointers along with pointer arithmetic.
Thanks to Pexels for the free image