Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search

From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search
Date: 2008-04-21 16:29:00
Message-ID: 200804211229.00871.xzilla@users.sourceforge.net
Lists: pgsql-committers, pgsql-hackers, pgsql-performance

On Thursday 27 March 2008 17:11, Tom Lane wrote:
> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> > On Sunday 16 March 2008 22:18, Tom Lane wrote:
> > > > > Fix TransactionIdIsCurrentTransactionId() to use binary search instead
> > > > > of linear search when checking child-transaction XIDs.
> >
> > > > Are there any plans to backpatch this into REL8_3_STABLE?
> > >
> > >  No.
> > >
> > > > It looks like I am
> > > > hitting a pretty serious performance regression on 8.3 with a stored
> > > > procedure that grabs a pretty big recordset, and loops through doing
> > > > insert....update on unique failures.  The procedure get progressivly
> > > > slower the more records involved... and dbx shows me stuck in
> > > > TransactionIdIsCurrentTransactionId().
> > >
> > > If you can convince me it's a regression I might reconsider, but I
> > > rather doubt that 8.2 was better,
> >
> > Well, I can't speak for 8.2, but I have a second system crunching the
> > same data using the same function on 8.1 (on lesser hardware in fact),
> > and it doesn't have these type of issues.
>
> If you can condense it to a test case that is worse on 8.3 than 8.1,
> I'm willing to listen...

I spent some time trying to come up with a test case, but had no luck.  DTrace 
showed that the running process was calling this function excessively; 
a 30-second sample profile looked like this: 

FUNCTION                                                COUNT   PCNT
<snip>
postgres`LockBuffer                                        10   0.0%
postgres`slot_deform_tuple                                 11   0.0%
postgres`ExecEvalScalarVar                                 11   0.0%
postgres`ExecMakeFunctionResultNoSets                      13   0.0%
postgres`IndexNext                                         14   0.0%
postgres`slot_getattr                                      15   0.0%
postgres`LWLockRelease                                     20   0.0%
postgres`index_getnext                                     55   0.1%
postgres`TransactionIdIsCurrentTransactionId            40074  99.4%

But I saw similar percentages on the 8.1 machine, so I am not convinced this 
is where the problem is.  Unfortunately (in some respects) the problem went 
away until this morning, so I haven't been looking at it since the above 
exchange.  I'm still open to the idea that something inside 
TransactionIdIsCurrentTransactionId could have changed to make things worse. 
In addition to CPU, the process consumes a significant amount of 
memory; prstat shows:

 PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 3844 postgres 1118M 1094M cpu3    50    0   6:25:48  12% postgres/1

I do wonder if the number of rows being worked on is significant in some 
way. Looking in the job log for the running procedure (we use 
autonomous logging in this function), I can see that it has a much larger 
number of rows to process, so perhaps there is simply a tipping point 
beyond which it stops performing. Still, it would be 
curious that I never saw this behavior on 8.1.

= current job
     elapsed     |                         status
-----------------+--------------------------------------------------------
 00:00:00.042895 | OK/starting with 2008-04-21 03:20:03
 00:00:00.892663 | OK/processing 487291 hits up until 2008-04-21 05:20:03
 05:19:26.595508 | ??/Processed 70000 aggregated rows so far
(3 rows)

= yesterday's run
     elapsed     |                         status
-----------------+--------------------------------------------------------
 00:00:00.680222 | OK/starting with 2008-04-20 04:20:02
 00:00:00.409331 | OK/processing 242142 hits up until 2008-04-20 05:20:04
 00:25:02.306736 | OK/Processed 35936 aggregated rows
 00:00:00.141179 | OK/
(4 rows)
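
To put the two job logs side by side, the throughput of the aggregation step can be worked out from the figures above. This is only a rough sketch: it assumes the `elapsed` column is the time spent in that step, and the helper names `parse_elapsed` and `rows_per_second` are made up for illustration.

```python
def parse_elapsed(text):
    """Convert an HH:MM:SS.ffffff interval string into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def rows_per_second(rows, elapsed):
    """Aggregated rows processed per second of elapsed time."""
    return rows / parse_elapsed(elapsed)


# Figures copied from the job logs above.
yesterday = rows_per_second(35936, "00:25:02.306736")
current = rows_per_second(70000, "05:19:26.595508")

print(f"yesterday: {yesterday:.1f} rows/s")
print(f"current:   {current:.1f} rows/s")
print(f"slowdown:  {yesterday / current:.1f}x")
```

Under that assumption, yesterday's run handled roughly 24 aggregated rows per second, while the current run is below 4 per second and not yet finished, so the per-row cost has gone up rather than staying flat as the input grew.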

Unfortunately I don't have the 8.1 system to bang on anymore for this (though, 
anecdotally speaking, I never saw this behavior in 8.1); however, I do now have 
a parallel 8.3 system crunching the data, and it is showing the same symptom 
(yes, two 8.3 servers crunching the same data, both bogged down now), so I do 
feel this is something specific to 8.3.  
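
On the row-count question: the commit quoted at the top describes replacing a linear search over child-transaction XIDs with a binary search. The toy model below is not PostgreSQL code. It assumes, purely for illustration, that every processed row adds one child XID to a sorted list and triggers one lookup that misses, and it counts comparisons for both search strategies. All function names and XID values are hypothetical.

```python
def linear_contains(xids, target):
    """Scan the list front to back; return (found, comparisons)."""
    comparisons = 0
    for xid in xids:
        comparisons += 1
        if xid == target:
            return True, comparisons
    return False, comparisons


def binary_contains(sorted_xids, target):
    """Binary-search a sorted list; return (found, comparisons)."""
    lo, hi = 0, len(sorted_xids)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_xids[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_xids) and sorted_xids[lo] == target
    return found, comparisons


def total_comparisons(n_rows, contains):
    """Toy model: each row adds one child XID, then does one failed lookup."""
    xids = []
    total = 0
    for i in range(n_rows):
        xids.append(1000 + i)          # ascending, so the list stays sorted
        _, count = contains(xids, 0)   # XID 0 is never in the list (worst case)
        total += count
    return total


for n in (1000, 2000, 4000):
    print(n, total_comparisons(n, linear_contains),
          total_comparisons(n, binary_contains))
```

In this model, doubling the row count roughly quadruples the linear-search total, while the binary-search total little more than doubles. That would be consistent with a job that slows down progressively as the row count grows, but it does not by itself explain a difference between 8.1 and 8.3.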

I am mostly wondering if anyone else has encountered behavior like this on 8.3 
(large sets of insert....update exception block in plpgsql bogging down), or 
if anyone has any thoughts on which direction I should poke at it from here. 
TIA.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
