PHP references tutorial

PHP’s references are a powerful but often misunderstood feature of the language. Tutorials and guides usually compare PHP’s references to C’s pointers, but this leads to misconceptions about their use. This page explains how variables and references really work in PHP, and ends with notes on some technical and security issues.

How variables really work in PHP

Chapter 12 of the PHP manual, Variables, conflates the concepts of a variable and its value, and proceeds to describe references as pointers to variables. Later in the manual the unset function is described as ‘destroying’ a variable, creating further confusion about the separation of a variable from its value.

PHP internally implements variable values through a structure known as a _zval_struct, generally referred to simply as a zval. In addition to storing the value and information about its type, the zval also holds a refcount. The refcount counts the number of references to the value and is essential to the operation of the garbage collector, allowing memory to be freed when it is no longer in use.

Creating and unsetting variables

Suppose a variable is created and assigned a value:

// create a variable
$example = 'something';

When this happens, PHP creates a new zval containing the value ‘something’ and information about its type (in this case, that it’s a string). PHP then creates a new entry in the symbol table (a structure mapping variable names onto their values) specifying the zval corresponding to the variable, and increments the refcount of the zval.

Suppose the variable is then ‘destroyed’ by using the unset function:

// unset the variable
unset($example);

When this happens, PHP looks in the symbol table to find the zval corresponding to this variable, decrements the refcount, and removes the variable from the symbol table. Because the refcount is now zero, the garbage collector knows that there is no way of accessing this zval, and can free the memory it occupies.
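The effect on the symbol table can be observed from PHP code. The following is a minimal sketch, assuming it runs in a single scope; it uses get_defined_vars(), which returns the variables defined in the current scope, as a stand-in for looking at the symbol table. The refcount itself is not visible this way, and the names $definedBefore and $definedAfter are introduced purely for illustration.

```php
<?php
// Create a variable: a new name appears in the current symbol table.
$example = 'something';
$definedBefore = array_key_exists('example', get_defined_vars());

// Unset it: the name is removed from the symbol table.
unset($example);
$definedAfter = array_key_exists('example', get_defined_vars());

var_dump($definedBefore); // bool(true)
var_dump($definedAfter);  // bool(false)
```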

How references work in PHP

A reference in PHP is simply a variable corresponding to the same zval as another variable. References can be explicitly created using a special form of the assignment operator with an ampersand after the equals sign. For example:

// create two variables referencing the same value
$example1 = 'something';
$example2 =& $example1;

In this case, $example1 and $example2 refer to the same zval, and the refcount of the zval is incremented. Changing the value of either variable affects the same zval and hence affects the other variable as well. An important thing to realise is that $example1 does not have a special status because it was declared with a value — both variables refer to the same zval and are treated equally by PHP.
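The symmetry between the two names can be checked directly. This is a small sketch rather than anything taken from the PHP source; the variables $seenByExample1 and $seenByExample2 are illustrative names that record what each side observes after the other is written to.

```php
<?php
$example1 = 'something';
$example2 =& $example1;

// Writing through the second name changes what the first name sees.
$example2 = 'changed via example2';
$seenByExample1 = $example1;

// Writing through the first name changes what the second name sees.
$example1 = 'changed via example1';
$seenByExample2 = $example2;

echo $seenByExample1, "\n"; // changed via example2
echo $seenByExample2, "\n"; // changed via example1
```

Neither name behaves as the ‘original’: a write through either one is visible through the other.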

Unsetting and reassigning references

Suppose one of the variables is now unset:

// unset one of the variables
unset($example1);

As above, the refcount is decremented and the variable is removed from the symbol table. However, as the refcount is non-zero the zval is not garbage collected, and $example2 can still be used to refer to the value. In other words, the unset function doesn’t ‘destroy’ values but just removes references to them — only the garbage collector destroys values.
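A short sketch of this behaviour, with $firstNameExists and $remainingValue as illustrative names: after one name is unset, the value remains reachable through the other.

```php
<?php
$example1 = 'something';
$example2 =& $example1;

// Remove one name; the value is still reachable through the other.
unset($example1);

$firstNameExists = isset($example1); // the name is gone
$remainingValue  = $example2;        // the value is not

var_dump($firstNameExists); // bool(false)
echo $remainingValue, "\n"; // something
```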

Suppose a variable is reassigned to a new reference:

// create two variables
$example1 = 'something';
$example2 = 'something2';

// create a variable referencing one of the values
$example3 =& $example1;

// change the variable to reference the other value
$example3 =& $example2;

In this case the reassignment of $example3 doesn’t affect the value of $example1. This is because reference assignment doesn’t change the values contained in zvals, but only alters the refcount of the zval and the variable’s entry in the symbol table.
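To see that rebinding a reference leaves the previously referenced value alone, the example can be extended with one ordinary assignment. This is a sketch; the final write through $example3 is an addition for illustration.

```php
<?php
$example1 = 'something';
$example2 = 'something2';

$example3 =& $example1;
$example3 =& $example2; // rebinds the name; $example1's value is untouched

// A write through $example3 now reaches $example2 only.
$example3 = 'changed';

echo $example1, "\n"; // something
echo $example2, "\n"; // changed
```

Note the difference between the two operators: =& changes which zval a name refers to, whereas a plain = changes the value stored in the zval the name already refers to.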

Notes on technical and security issues

The following are some technical and security issues concerning the use of references of which programmers should be aware.

References created by the global keyword

The global keyword allows variables from outside the scope of a function to be used within it. The keyword actually creates a reference to an entry in the $GLOBALS array. The following are therefore equivalent:

// use the global keyword to create a reference to a global variable
global $example;

// directly create a reference to a global variable
$example =& $GLOBALS['example'];
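The equivalence can be sketched with two functions that modify the same global, one through each form. The function names and the $counter variable are hypothetical. The third function illustrates a consequence of the reference model described above: unsetting a variable imported with global removes only the function’s local reference, not the global itself.

```php
<?php
// Define the global via $GLOBALS so the sketch behaves the same
// even if it is pasted inside a function.
$GLOBALS['counter'] = 0;

function incrementWithKeyword()
{
    global $counter; // local name bound to the global's value
    $counter++;
}

function incrementWithGlobalsArray()
{
    $counter =& $GLOBALS['counter']; // same binding, written explicitly
    $counter++;
}

function unsetLocalReference()
{
    global $counter;
    unset($counter); // removes only this function's reference
}

incrementWithKeyword();
incrementWithGlobalsArray();
unsetLocalReference();

echo $GLOBALS['counter'], "\n"; // 2
```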

Reference counts in debug_zval_dump

The function debug_zval_dump, available since PHP 4.2.0, outputs information on the zval corresponding to a variable, including the value of the refcount. However, there are some subtleties in its behaviour caused by optimisations within the PHP Zend engine. For further details consult the entry for debug_zval_dump in the PHP manual.
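As a starting point for experimenting, the following sketch calls debug_zval_dump on a variable that has a second name bound to it. The output is captured with output buffering (the $dump variable is an illustrative name) so that it can be inspected as a string. No expected output is shown, because the reported refcount and the exact format depend on the PHP version.

```php
<?php
$example1 = 'something';
$example2 =& $example1;

// Capture the dump so it can be inspected as a string as well as printed.
ob_start();
debug_zval_dump($example1);
$dump = ob_get_clean();

// The refcount shown here varies between PHP versions, so treat it
// as something to explore rather than a fixed result.
echo $dump;
```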

Security risks due to refcount overflow

In PHP 4, the refcount is stored as a 16-bit value. This means that the refcount returns to zero once 65,536 references have been created. If 65,537 references are created and then one reference is unset, the garbage collector frees the memory occupied by the zval as it believes no references remain. The memory can then be reassigned, despite the existence of 65,536 references that can be used to write to that memory. For further details, see the article MOPB-01-2007: PHP 4 Userland ZVAL Reference Counter Overflow Vulnerability by the Hardened-PHP Project.

Note that this issue does not affect PHP 5, as in that version the refcount is stored as a 32-bit value. Repeating the scenario above would require 4,294,967,297 references to be created, and such a PHP script would almost certainly encounter memory or time limits before reaching that point.
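The arithmetic behind the PHP 4 overflow can be modelled in a few lines. This is only a simulation of a 16-bit counter using bit masking, not PHP’s internal code; the function incrementRefcount16 and the variable names are hypothetical.

```php
<?php
// Model a 16-bit reference counter by keeping only the low 16 bits.
function incrementRefcount16($refcount)
{
    return ($refcount + 1) & 0xFFFF;
}

// Create 65,537 references.
$refcount = 0;
for ($i = 0; $i < 65537; $i++) {
    $refcount = incrementRefcount16($refcount);
}

// The counter wrapped to zero at 65,536, so it now reads 1.
$afterCreation = $refcount;

// Unsetting a single reference takes the stored count to zero,
// although 65,536 references still exist.
$afterUnset = ($refcount - 1) & 0xFFFF;

echo $afterCreation, "\n"; // 1
echo $afterUnset, "\n";    // 0
```

With a 32-bit counter the mask would be 0xFFFFFFFF, and the same wrap would only occur after 4,294,967,296 increments.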
