Why is my (empty) partial index query slow?

From: Richard Frith-Macdonald <richard(at)tiptree(dot)demon(dot)co(dot)uk>
To: pgsql-performance(at)postgresql(dot)org
Subject: Why is my (empty) partial index query slow?
Date: 2006-06-16 07:17:14
Message-ID: 5510ED72-8266-438B-A4DF-9591ED26BA38@tiptree.demon.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have a producer/consumer setup where various producer processes
insert new records into a table and consumer processes mark those
records as having been handled when they have dealt with them, but
leave the records in the table so that we can generate reports later.

The records are added with a char(1) field specifying their state and
a timestamp, and a varchar(40) saying which class of consumer may
handle them.

There is a partial index on consumer and timestamp where the state
field says the record is new ... so there are no records in this
index except when a producer has just added them and no consumer has
yet handled them.

Each consumer polls the database with a query to select a batch of
unhandled records (ordered by timestamp) ... the idea being that,
even though the table has a huge number of historical records used
for reporting, the partial index based query should be tiny/quick as
there are usually few/no unhandled records.

Problem 1 ... why does this polling query take 200-300 milliseconds
when the partial index is empty, and what can be done about it?
This is on a fast modern machine and various other queries take under
a millisecond.

I guess that the fact that records are constantly (and rapidly) added
to and removed from the index may have caused the index to become
inefficient somehow ...
If that's the case, dropping it and creating a new one might
temporarily fix the issue... but for how long?
As the actual table is huge (44 million records) and reading all the
records to create a new index would take a long time (simply doing a
'select count(*)' on the table takes some minutes) and lock the table
while it's happening, I can't really experiment, though I could
schedule/agree downtime for the system in the middle of the night at
some point, and try rebuilding the index then.

Problem 2 ... tentative (not readily reproducible and haven't managed
to rule out the possibility of a bug in my code yet) ... a long
running consumer process (which establishes a connection to the
database using libpq, and keeps the connection open indefinitely) was
reporting that the query in question was taking 4.5 seconds, but
starting up the psql command-line tool and running the same query
reported a 200-300 millisecond duration. Could this be a problem in
the database server process handling the connection? or in the libpq
code handling it? The disparity between the times taken for queries
as logged in psql and within the consumer application only seemed to
occur for this polling query (which gets executed very often), not
for other queries the consumer program did on other tables.
Restarting the consumer process 'cured' this ... now I'm waiting to
see if this behavior returns.
Anyone seen anything like this or know what might cause it?

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gourish Singbal 2006-06-16 07:22:49 Re: Delete operation VERY slow...
Previous Message A. Kretschmer 2006-06-16 07:17:07 Re: Delete operation VERY slow...