Re: Query meltdown: caching results

From: Gordon <gordon(dot)mcvey(at)ntlworld(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Query meltdown: caching results
Date: 2008-02-27 16:45:32
Message-ID: 5f232343-5d5e-4724-b486-ca193142e491@60g2000hsy.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

(Sorry for the repost but I thought this would be appropriate to both
groups. I did tell Google to delete my first post but odds are some
guys got that copy already anyway)

After a lot of hairpulling, I finally found a mechanism in PHP for
doing what I wanted. I just had to know 2 things:

1) How to get the PDO engine to use my customized prepared statement
class instead of PDOStatement
2) Extending PDOStatement to transparently add results caching is too
difficult and complex in the timeframe required

Once I knew these things I made a PDOStatement extension class that
instead of trying to transparently add caching to the existing methods
added a couple of new ones instead. Code below:

<?php

class Statement extends PDOStatement
{
private $resultCache = array ();
private $database = NULL;
public $hits = 0;
public $misses = 0;

public function ask (array $params = array ())
// Executes a prepared statement on the database that fetches
data
{
$hash = md5 (implode (',', $params));
if (!$this -> resultCache [$hash])
{
$this -> misses++;
// Execute the query
if ($this -> execute ($params))
{
// Cache the results
$this -> resultCache [$hash] = $this -
> fetchAll
(PDO::FETCH_ASSOC);
}
}
else
{
$this -> hits++;
}
return ($this -> resultCache [$hash]);
}
public function tell (array $params = array ())
// Execute a prepared statement that causes the database to be
modified
{
// Execute the query
if ($this -> execute ($params))
{
$rowCount = $this -> rowCount ();
if ($rowCount)
{
// Tell the parent Database object to
clear statement caches
$this -> database -> clearResults ();
}
return ($rowCount);
}
}
public function clearResults ()
// Clear cache
{
$this -> resultCache = array ();
}
private function __construct ($db)
// Class constructor
{
$this -> database = $db;
//print_r ($this);
}

}

class Database extends PDO
// Adds some extra functionality to the built in PDO class
{
private $statementCache = array ();
private $txCount = 0;
private $txErr = false;

// Prepared statement cacheing
public function prepare ($statement, array $options = array
())
{
$hash = md5 ($statement);
if ((!isset ($this -> statementCache [$hash]))
|| (!is_object ($this -> statementCache [$hash])))
{
//echo ('Preparing statement "'.
$statement .'"<br>');
$this -> statementCache [$hash] =
parent::prepare ($statement,
$options);
}
else
{
//echo ('Statement "' . $statement . '"
already prepared<br>');
}
return ($this -> statementCache [$hash]);
}
public function clearResults ()
// Clear the results cache of all associated prepared
statements
{
foreach ($this -> statementCache as $thisStatement)
{
$thisStatement -> clearResults ();
}
}
// Add support for transaction nesting
public function beginTransaction ()
{
if (($this -> txCount == 0) && (!$this -> txErr))
{
$result = parent::beginTransaction ();
}
$this -> txCount ++;
if (DEBUG_TX)
{
echo ('begin: ' . $this -> txCount . '
transaction(s)<br />');
}
return ($result);
}
public function commit ()
{
$this -> txCount --;
if ($this -> txCount <= 0)
{
$this -> txErr? $result =
parent::rollback (): $result =
parent::commit ();
$this -> txErr = false;
}
if (DEBUG_TX)
{
echo ('commit: ' . $this -> txCount . '
transaction(s)<br />');
}
return ($result);
}
public function rollback ()
{
$this -> txErr = true;
$this -> txCount --;
if ($this -> txCount <= 0)
{
$result = parent::rollback ();
$this -> txErr = false;
}
if (DEBUG_TX)
{
echo ('rollback: ' . $this -> txCount . '
transaction(s)<br />');
}
return ($result);
}
// Housekeeping
private function removeExpiredLocks ()
{
$query = 'DELETE FROM cms_locks
WHERE lck_timestamp +
lck_duration < NOW();';
$preparedQuery = $this -> prepare ($query);
if ($preparedQuery -> execute ())
{
return ($preparedQuery -> rowCount ());
}
}
// Class constructor
public function __construct ()
{
parent::__construct ( 'pgsql:host=' . CFG_DB_HOST
.'
port=' . CFG_DB_PORT
.'
dbname=' . CFG_DB_DBNAME,

CFG_DB_USERNAME,

CFG_DB_PASSWORD);
/*
The documentation for the following line on php.net is
really bad!

What is does is set what class will be used to handle
prepared
statements. By default
the PDO -> prepare() command returns a prepared
statement as a
PDOStatement class object.
We want to extend PDOStatements to provide some extra
functionality,
so when we run the
prepare () method, we want to return something other
than a
PDOStatement.

This line tells the database to use our Statement
class for prepared
statements instead
of the PDOStatement class. It also passes a reference
to the
database object that
spawned it to the constructor as an argument. We'll
use this
reference in the Statement
class to send messages back to the Database class
*/
$this -> setAttribute (PDO::ATTR_STATEMENT_CLASS,
array
('Statement', array ($this)));
$this -> setAttribute (PDO::ATTR_ERRMODE,
PDO::ERRMODE_WARNING);
$this -> query ('SET search_path = ' . CFG_DB_PATH .
';');
$this -> removeExpiredLocks ();
}

}

The system works like this: I use my Database class where I would
have used PDO to create a database connection. When I prepare() a
statement I either get a new prepared statement back, or the
previously created one is returned. The prepared statement contains
an array that caches the results produced with various parameters. My
prepared statement class also adds 2 database querying methods, ask ()
and tell (), that I use where I normally would have used an execute
(). If I want to retrieve information from the database I ask () it
to return the data matching my parameters. Whenever I want to make a
change I tell () it the data I want saved to the database. The ask ()
method checks the statement's local cache of results and if it finds
one matching the passed parameters, it returns it. If not, then it
execute()s the query to retrieve them.

When I tell () the database to make a change the statement that
received the message to do so sends a message to its parent Database
object. The object goes through its list of cached prepared
statements and sends them messages telling them to dump the contents
of their caches.

It's not an ideal solution, if two different queries produce identical
result sets or ones that overlap in some way then they get cached
twice, and the cache invalidation mechanism is extremely primitive (if
any change is made, dump everything). Nonetheless this little caching
system has had quite a dramatic effect on performance. It's still not
as fast as I would like though. :)

I reposted the code here in the hope that somebody finds it useful,
and/or in case somebody has suggestions for improvement. While the
code itself was quite simple in the end, finding the information
needed to make it work wasn't. The ATTR_STATEMENT_CLASS parameter is
not well documented. Hopefully others can learn from my difficulties
here and learn from them.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2008-02-27 16:47:08 Re: Regarding interval conversion functions and a seeming lack of usefulness
Previous Message akshay bhat 2008-02-27 16:33:40 help for loading a psql file