Skip site navigation (1) Skip section navigation (2)

Why is my (empty) partial index query slow?

From: Richard Frith-Macdonald <richard(at)tiptree(dot)demon(dot)co(dot)uk>
To: pgsql-performance(at)postgresql(dot)org
Subject: Why is my (empty) partial index query slow?
Date: 2006-06-16 07:17:14
Message-ID: 5510ED72-8266-438B-A4DF-9591ED26BA38@tiptree.demon.co.uk (view raw or flat)
Thread:
Lists: pgsql-performance
I have a producer/consumer setup where various producer processes  
insert new records into a table and consumer processes mark those  
records as having been handled when they have dealt with them, but  
leave the records in the table so that we can generate reports later.

The records are added with a char(1) field specifying their state and  
a timestamp, and a varchar(40) saying which class of consumer may  
handle them.

There is a partial index on consumer and timestamp where the state  
field says the record is new ... so there are no records in this  
index except when a producer has just added them and no consumer has  
yet handled them.

Each consumer polls the database with a query to select a batch of  
unhandled records (ordered by timestamp) ... the idea being that,  
even though the table has a huge number of historical records used  
for reporting, the partial index based query should be tiny/quick as  
there are usually few/no unhandled records.

Problem 1 ... why does this polling query take 200-300 milliseconds  
when the partial index is empty, and what can be done about it?
This is on a fast modern machine and various other queries take under  
a millisecond.

I guess that the fact that records are constantly (and rapidly) added  
to and removed from the index may have caused the index to become  
inefficient somehow ...
If that's the case, dropping it and creating a new one might  
temporarily fix the issue... but for how long?
As the actual table is huge (44 million records) and reading all the  
records to create a new index would take a long time (simply doing a  
'select count(*)' on the table takes some minutes) and lock the table  
while it's happening, I can't really experiment, though I could  
schedule/agree downtime for the system in the middle of the night at  
some point, and try rebuilding the index then.

Problem 2 ... tentative (not readily reproducible and haven't managed  
to rule out the possibility of a bug in my code yet) ... a long  
running consumer process (which establishes a connection to the  
database using libpq, and keeps the connection open indefinitely) was  
reporting that the query in question was taking 4.5 seconds, but  
starting up the psql command-line tool and running the same query  
reported a 200-300 millisecond duration.  Could this be a problem in  
the database server process handling the connection? or in the libpq  
code handling it?   The disparity between the times taken for queries  
as logged in psql and within the consumer application only seemed to  
occur for this polling query (which gets executed very often), not  
for other queries the consumer program did on other tables.
Restarting the consumer process 'cured' this ... now I'm waiting to  
see if this behavior returns.
Anyone seen anything like this or know what might cause it?


Responses

pgsql-performance by date

Next:From: Gourish SingbalDate: 2006-06-16 07:22:49
Subject: Re: Delete operation VERY slow...
Previous:From: A. KretschmerDate: 2006-06-16 07:17:07
Subject: Re: Delete operation VERY slow...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group