
Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Date: 2008-03-10 14:48:39
Message-ID: 47D54A47.3030608@postnewspapers.com.au
Lists: pgsql-patches, pgsql-performance

Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> For 8.4, it would be nice to improve that. I tested that on my laptop 
>> with a similarly-sized table, inserting each row in a pl/pgsql function 
>> with an exception handler, and I got very similar run times. According 
>> to oprofile, all the time is spent in TransactionIdIsInProgress. I think 
>> it would be pretty straightforward to store the committed subtransaction 
>> ids in a sorted array, instead of a linked list, and binary search.
> 
> I think the OP is not complaining about the time to run the transaction
> that has all the subtransactions; he's complaining about the time to
> scan the table that it emitted.

Yes, but only in succeeding statements in the same transaction as the 
procedure that creates all the subtransactions. Table scan times return 
to normal after that transaction commits.

> Presumably, each row in the table has a
> different (sub)transaction ID and so we are thrashing the clog lookup
> mechanism.  It only happens once because after that the XMIN_COMMITTED
> hint bits are set.

It seems to happen with every statement run in the same transaction as, 
and after, the procedure with all the subtransactions. As soon as a 
COMMIT is executed, operations return to normal speed. There's no 
significant delay on the first statement after COMMIT as compared to 
subsequent statements, nor do successive statements before the COMMIT 
get faster.

In other words, if I repeatedly run one of the statements I used in 
testing for my initial post, like:

EXPLAIN ANALYZE SELECT * FROM booking;

... after running the problem stored procedure, the second, third, and 
subsequent runs take just as long as the first.

As soon as I commit the transaction, the exact same statement returns to 
running in less than a second, and doesn't significantly change in 
runtime for subsequent executions.
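
For what it's worth, it should be possible to check directly whether the 
XMIN_COMMITTED hint bits Tom mentions are being set. A sketch, assuming 
the pageinspect contrib module is installed (256 = 0x0100 = 
HEAP_XMIN_COMMITTED in t_infomask):

-- Inspect the first page of the table; xmin_committed should read
-- false for rows whose inserting (sub)transaction hasn't committed.
SELECT t_xmin, (t_infomask & 256) <> 0 AS xmin_committed
FROM heap_page_items(get_raw_page('booking', 0));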

I'll bang out a couple of examples at work tomorrow to see what I land 
up with, since this is clearly something that can benefit from a neat 
test case.
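
In the meantime, here's roughly the shape I have in mind. It's only a 
sketch; the table definition and row count are illustrative, not the 
actual schema from my tests:

CREATE TABLE booking (id integer, note text);

-- The EXCEPTION clause forces each iteration into its own
-- subtransaction, even if no exception is ever raised.
CREATE OR REPLACE FUNCTION populate_booking(n integer) RETURNS void AS $$
BEGIN
    FOR i IN 1 .. n LOOP
        BEGIN
            INSERT INTO booking (id, note) VALUES (i, 'row ' || i);
        EXCEPTION
            WHEN unique_violation THEN
                NULL;  -- skip duplicates and carry on
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

BEGIN;
SELECT populate_booking(100000);
EXPLAIN ANALYZE SELECT * FROM booking;  -- crawls at a few tuples/second
EXPLAIN ANALYZE SELECT * FROM booking;  -- still crawls
COMMIT;
EXPLAIN ANALYZE SELECT * FROM booking;  -- back to normal speed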

In any case, avoiding the use of an exception block per record generated 
worked around the performance issues, so it's clearly something to do 
with the vast numbers of subtransactions - as Heikki Linnakangas 
suggested and tested.
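
For illustration, one way to avoid a per-row exception block is to hoist 
it out of the loop so the whole loop shares a single subtransaction (a 
sketch only, not necessarily how the real procedure ended up):

CREATE OR REPLACE FUNCTION populate_booking_fast(n integer) RETURNS void AS $$
BEGIN
    -- One exception block around the whole loop: one subtransaction
    -- per call instead of one per row.
    BEGIN
        FOR i IN 1 .. n LOOP
            INSERT INTO booking (id, note) VALUES (i, 'row ' || i);
        END LOOP;
    EXCEPTION
        WHEN unique_violation THEN
            RAISE NOTICE 'duplicate row, aborting load';
    END;
END;
$$ LANGUAGE plpgsql;

The semantics do differ, of course: an error now rolls back the whole 
load rather than just the offending row, so this only works where that's 
acceptable.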

--
Craig Ringer
