Databases and PHP

Database support

PHP includes support for several databases: Different database types are historically accessed using the various sets of function calls corresponding to the particular database you are using.

With the release of PHP 4.x, and the PEAR library system (similar to Perl's CPAN), there is a database abstraction class. Rather than use database-specific calls one may use the unified class to expose database functionality. The class determines what function-level calls to make based on the connection string you use. This allows you to use the same database code regardless of the DB platform you choose, much the same as Perl's DBI modules.

A simple example

Keeping things simple, let's take a look at an example of getting data from mySQL. In English, we want to:
  1. Connect to the database
  2. Query the database
  3. Get the results
  4. Traverse and display the results
In PHP:

<?

  $dbUser = "username";
  $dbPasswd = "password";
  $dbServer = "localhost";
  $dbPort = "3306";
  $dbDatabase = "mytest";
  
  $link = mysql_connect("$dbServer:$dbPort", $dbUser, $dbPasswd);
   
  if (!$link) {
    // there was a problem connecting
    die("Could not connect to database:" . mysql_error());
  }
  
  $rc = mysql_select_db($dbDatabase, $link);
  
  if (!rc) {
    // problem selecting database
    die("Could not select database:" . mysql_error());
  }
  
  $query = "SELECT name, phone FROM contacts";
  
  $result = mysql_query($query, $link);
  
  if (!$result) {
    // there was a problem executing the query
    die("Could not execute query:" . mysql_error());
  }
  
  while ($row = mysql_fetch_array($result)) {
    ?>
    
    Name: <? echo $row[name]; ?>, Phone: <? echo $row[phone]; ?> <br />
    
    <?
  }

  mysql_close($link);
  
?>
    

Let's take a look at what we've done:

mysql_connect() establishes a connection to the specified database server, authenticating with a username and password. Specifying the port is only necessary if we are running the server on a port other than the default (3306 for MySQL). If the call to mysql_connect() is successful it returns a positive link identifier ($link) that we can utilize later in the script to use this connection.

Next we check to make sure the call was successful. If mysql_connect() failed for some reason, $link would not be a valid (positive) identifier. We test $link, and if it is not valid, we abort the script.

Please note that to make things easy to understand we have chosen to expand the various "test for success" portions of our script. It would be preferable to code these like so:

  $link = mysql_connect("$dbServer:$dbPort", $dbUser, $dbPasswd)
    or die("Could not connect:" . mysql_error());

Once we have established a connection, we need to tell MySQL what database we are going to use with mysql_select_db(). We also pass the link identifier to use. As with most of PHP's MySQL functions, if we do not pass a link identifier it assumes the last opened link. If you are accessing multiple database servers it's important to keep them straight!

Now that we have a connection, and we know what database we are using, we create a SQL query and query the database using mysql_query(). This returns a result identifier (if it works!). If the call to mysql_query fails, it will return a FALSE value. While we will use our result identifier ($result) to traverse the result rows of our query, it is important to note that a query may be successful yet not return any rows (for example, an UPDATE or INSERT query).

Using our result identifier, $result, we now walk through the rows using mysql_fetch_array(), which returns an associative array containing a single row from our results. When there are no more rows, mysql_fetch_array() will return FALSE and our while loop will end.

You will note that I have broken out of PHP and back into HTML to display the results. While it would be easier to use echo or print statements to display our single line, in the real world you're likely to be outputting all manner of HTML constructs such as table rows, cells, divs, spans, etc. Having to escape quotes can become rather tiresome and can often limit the readability of your code, so instead we "escape in" to PHP only when we need to print out a variable.

You can also do this:

  print <<<EOF
   User: $row[user], Phone: $row[phone]
   ... (more HTML) ...
EOF;
... which allows you to mix HTML and PHP very easily. Perl users - Note the extra <; without the extra sign it is interpreted as a bitwise operator. The problem (I have) with this is that the EOF; must not be indented or you will have parsing errors. As an indentation fiend I find this unacceptable (in Perl too! ;) ).

Once we are finished we close the database connection with mysql_close(). This is actually unnecessary, because PHP will close any open connections when the script finishes executing (unless you are using persistent connections).

Persistent connections

For the majority of websites database access means: Connect, grab a small amount of data, display, disconnect. In these cases, most of the script's overhead is connecting to the database. To save on resources, PHP will allow you to use persistent connections.

Persistent connections do not close when your script finishes executing. When you request a persistent connection to be opened, PHP checks to see if a similar connection is already open and uses it if it is available. This is a lot faster than opening a new connection. Persistent connections do not give you any added features, and don't require any changes to your scripts other than removing calls to mysql_close() and using mysql_pconnect() instead of mysql_connect().

You can turn on and configure persistent connections in php.ini using the following options:

mysql.allow_persistent boolean
Whether to allow persistent MySQL connections. 

mysql.default_host string
The default server host to use when connecting to the database server if no 
other host is specified. 

mysql.default_user string
The default user name to use when connecting to the database server if no 
other name is specified. 

mysql.default_password string
The default password to use when connecting to the database server if no 
other password is specified. 

mysql.max_persistent integer
The maximum number of persistent MySQL connections per process. 

mysql.max_links integer
The maximum number of MySQL connections per process, including persistent 
connections. 

A sample application

To demonstrate some other functions, let's take a look at a sample application: a page to search a database of newsgroup postings.

Our database is populated regularly with news posts by a perl script, and we have created a FULLTEXT index on the subjects and bodies of the posts. This allows us to search the full text of the archive very quickly.

Please note that this particular indexing feature is specific to MySQL. Other DBMSes may have different ways of indexing large amounts of text data. Also note that indexing large amounts of text drastically increases the amount of storage space required by the database, and increases the amount of time necessary on write operations to the database.

While our backend has many different tables, we are really only interested in two; posts and newsgroups:

posts table newsgroups table

Let's take a look at the application before we dive into the code.