Re: Pros / cons for indexing a small table

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: Rikard Bosnjakovic <rikard(dot)bosnjakovic(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Pros / cons for indexing a small table
Date: 2009-11-10 19:41:08
Message-ID: 4ec1cf760911101141t6ebdb6a6jb040afd018964fe0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Tue, Nov 10, 2009 at 6:25 AM, Rikard Bosnjakovic
<rikard(dot)bosnjakovic(at)gmail(dot)com> wrote:
> I'm having a simple table for managing hit counts on WWW. The table
> was setup some year ago using this:
>
> CREATE TABLE hit_counts (
>  page varchar(255),
>  hits integer,
>  last_visit timestamp,
>  id serial primary key
> );
>
> When updating a hit, I use this PHP-code:
>
>  $q = "SELECT hits FROM hit_counters WHERE page='$this_uri'";
>  $res = pg_query($db, $q);
>  if (pg_num_rows($res) == 0) {
>    $q = "INSERT INTO hit_counters (page) VALUES ('$this_uri')";
>  } else {
>    $row = pg_fetch_object($res);
>    $hits = $row->hits;
>    $hits++;
>    $q = "UPDATE hit_counters SET last_visit = (SELECT
> localtimestamp(0)), hits = $hits WHERE page='$this_uri'";
>  }
>  pg_query($db, $q);
>
[snip]
>
> 1. Since the id-column is never used, I will probably drop it to save
> some space. Or, is there any point in keeping it?

Drop it, and make column "page" your PRIMARY KEY. The way you have
that table declared now, you're implicitly using "page" as your
primary key anyway (i.e. lookups are done against "page", and your
code assumes page is unique), but you're not actually enforcing a
uniqueness constraint. In case other tables have FK references to
"hit_counts"."id", google for "keyvil" to understand the pitfalls of
using this surrogate key.

>
> 2. From what I've read in the manual and what other people have told
> me, creating an index can speed up table accessing. I tried by adding
> an index to the page-column in the above table. Then by using ANALYZE
> and EXPLAIN, the cost got reduced from 259 to 6 which I guess is a
> pretty big save although I'm not sure what the value really is. Is
> creating an index on 'page' the best way to speed up this particular
> table?
>

A unique index or primary key on page will help enforce database
integrity and probably speed up your queries as well.

> 3. The page-column is queried using full length strings (no regexps or
> wildcards). Will there be any performance issues if I have the
> page-column indexed and then run wildcard-queries on it? What I mean
> is, if I run wildcard-queries on this indexed column, will the index
> be of any use at all?
>

See here, this should answer your question:
http://www.postgresql.org/docs/8.3/interactive/indexes-types.html

Some additional comments on your schema and application code:

First, you have a race condition in your code, assuming you are using
the default READ COMMITTED isolation level. What if another client
inserts a row in between when you check for the row's existence, and
insert if the row doesn't exist? SELECT FOR UPDATE is probably what
you want to use.

Second, I really hope that $this_uri is being properly escaped before
your queries are executed...

Third, I recommend using "timestamp with time zone" instead of your
timestamp type for last_visit -- and then just use CURRENT_TIMESTAMP
instead of calling localtimestamp(0). Your table right now looks like
it'll have problems during daylight savings switches.

Josh

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Kris Kewley 2009-11-11 00:26:29 Re: Pros / cons for indexing a small table
Previous Message Rikard Bosnjakovic 2009-11-10 11:25:41 Pros / cons for indexing a small table