
Re: 8.4 open item: copy performance regression?

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alan Li <ali(at)truviso(dot)com>
Subject: Re: 8.4 open item: copy performance regression?
Date: 2009-06-20 11:15:48
Lists: pgsql-hackers

Greg Smith wrote:
> On Fri, 19 Jun 2009, Stefan Kaltenbrunner wrote:
>> In my case both the CPU (an Intel E5530 Nehalem) and the IO subsystem 
>> (8GB Fiberchannel connected NetApp with 4GB cache) are pretty fast.
> The server Alan identified as "Solaris 10 8/07 s10x_u4wos_12b X86" has a 
> Xeon E5320 (1.86GHz) and a single boring SAS drive in it (Sun X4150). 
> The filesystem involved in that particular case is UFS, which I am 
> suspicious of as being part of why the problem is so pronounced 
> there--the default UFS tuning is pretty lightweight in terms of how much 
> caching it does.  Not sure if Alan ran any tests against the big ZFS 
> volume on the other server, I think all the results he posted were from 
> the UFS boot drive there too.
>> so 4096 * 1024 / BLCKSZ seems to be the sweet spot and also results in 
>> more or less the same performance that 8.3 had.
> It looks like it's a touch higher on our 8/07 system, it levels out at 
> 8192 * (haven't checked the other one yet).  I'm seeing this, using 
> Alan's original test set size (to make sure I was running exactly the 
> same test) and just grabbing the low/high from a set of 3 runs:
> 8.3.7:  0m39.266s   0m43.269s (alan:  36.2 - 39.2)
> 256:    0m50.435s   0m51.944s (alan:  48.1 - 50.6)
> 1024:   0m47.299s   0m49.346s
> 4096:   0m43.725s   0m46.116s
> 8192:   0m40.715s   0m42.480s
> 16384:  0m41.318s   0m42.118s
> 65536:  0m41.675s   0m42.955s

hmm interesting - I just did a bunch of runs using the lineitem table 
from the DBT3 tests (loading 60M rows in each run) and the same config 
Alan used.

8.4 (post-patch; not RC1, but that one seems to behave exactly the same way):

N (ring = N * 1024 / BLCKSZ)   run 1       run 2
256                            9m38s
512                            9m20s
1024                           7m44.667s   7m45.342s
2048                           7m15.500s   7m17.910s
4096                           7m11.424s   7m13.276s
8192                           6m43.203s   6m48.293s
16384                          6m24.980s   6m24.116s
32768                          6m20.753s   6m22.083s
65536                          6m22.913s   6m22.449s
1048576                        6m23.765s   6m24.645s

so on this workload the sweet spot seems to be much higher than on the 
one with the narrower rows.
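
To make the comparison easier to read, here is a small Python sketch that converts the timings above to seconds and shows how much time each setting saves relative to the 256 run. It assumes the first column is the N in "N * 1024 / BLCKSZ" discussed earlier in the thread and that BLCKSZ is the default 8192 bytes; neither is stated explicitly for these runs. The helper names (to_seconds, ring_buffers) are made up for illustration, and only the first run of each pair is used.

```python
import re

BLCKSZ = 8192  # assumption: PostgreSQL's default block size in bytes

# First reported run per setting, written in the "XmY.Zs" form.
TIMINGS = {
    256: "9m38s",
    512: "9m20s",
    1024: "7m44.667s",
    2048: "7m15.500s",
    4096: "7m11.424s",
    8192: "6m43.203s",
    16384: "6m24.980s",
    32768: "6m20.753s",
    65536: "6m22.913s",
    1048576: "6m23.765s",
}


def to_seconds(text):
    """Convert a duration such as '7m44.667s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def ring_buffers(multiplier, blcksz=BLCKSZ):
    """Ring size in buffers, assuming the ring is multiplier * 1024 / BLCKSZ."""
    return multiplier * 1024 // blcksz


baseline = to_seconds(TIMINGS[256])
for multiplier, text in TIMINGS.items():
    seconds = to_seconds(text)
    saved = 100.0 * (baseline - seconds) / baseline
    print(f"{multiplier:>8}  {ring_buffers(multiplier):>7} buffers  "
          f"{seconds:8.3f}s  {saved:5.1f}% less time than 256")
```

Under those assumptions a setting of 8192 corresponds to a ring of 1024 buffers, and the printed percentages flatten out from roughly 16384 upwards, which matches the reading of the table above.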

> That's actually doing less I/O per capita, which is why it's also got 
> less waiting for I/O%, but it's completing the most work.  This makes me 
> wonder if in addition to the ring buffering issue, there isn't just 
> plain more writing per average completed transaction in 8.4 with this 
> type of COPY. This might explain why even with the expanded ring buffer, 
> both Stefan's and my test runs still showed a bit of a regression against 
> 8.3.  I'm guessing we have a second, smaller shooter here involved as well.

well yes, I also suspect that there is some secondary effect at play here, 
and I believe I have seen the "more IO with 8.4" thing here too, but I 
have not actually paid enough attention yet to be sure.

> In any case, a bump of the ring multiplier to either 4096 or 8192 
> eliminates the worst of the regression here, good improvement so far.

yeah with the above numbers I would say that 8192 should remove most if 
not all of the regression. However it seems that we might have to make 
this more dynamic in the future since the behaviour seems to depend on a 
number of variables...

