
Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Craig Ringer" <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Date: 2008-03-12 17:14:04
Message-ID: 47D80F5C.2060500@enterprisedb.com
Lists: pgsql-patches, pgsql-performance
Pavan Deolasee wrote:
> On Wed, Mar 12, 2008 at 9:27 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 
>>  I didn't like it; it seemed overly complicated (consider dealing with
>>  XID wraparound),
> 
> We are talking about subtransactions here. I don't think we support
> subtransaction wrap-around, do we?

Imagine that you start a transaction just before transaction 
wrap-around, so that the top level XID is 2^31-10. Then you start 20 
subtransactions. What XIDs will they get? Now how would you map those to 
a bitmap?

It's certainly possible; you could index the bitmap by each 
subtransaction's offset from the top transaction XID, for example. But 
it does get a bit complicated.

>> and it would have problems with a slow transaction
>>  generating a sparse set of subtransaction XIDs.
> 
> I agree that's the worst case. But is that common? That's what I
> was thinking when I proposed the alternate solution. I thought that can
> happen only if most of the subtransactions abort, which again I thought
> is not a normal case. But frankly I am not sure if my assumption is correct.

It's not that common to have hundreds of thousands of subtransactions to 
begin with.

>> I think getting rid of
>>  the linear search will be enough to fix the performance problem.
> 
> I wonder if a skewed binary search would help more? For example,
> if we know that the range of xids stored in the array is 1 to 1000 and
> if we are searching for a number closer to 1000, we can break the
> array into <large,small> parts instead of equal parts and then
> search.

Possibly, but I doubt it's worth the trouble. The simple binary search 
solved the performance problem well enough. In the test case of the OP, 
with 300000 subtransactions, with the patch, there was no longer any 
measurable difference whether you ran the "SELECT COUNT(*)" in the same 
transaction as the INSERTs or after a COMMIT.

> Well, maybe I am making simple things complicated ;-)

I think so :-).

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


pgsql-performance by date

Next: From: Tom Lane, Date: 2008-03-12 17:22:13
  Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Previous: From: Pavan Deolasee, Date: 2008-03-12 17:02:37
  Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit

