Re: 8.4 open item: copy performance regression?

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.4 open item: copy performance regression?
Date: 2009-06-19 19:06:27
Message-ID: 4A3BE1B3.3060107@kaltenbrunner.cc
Lists: pgsql-hackers

Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> ok after a bit of bisecting I'm happy to announce the winner of the contest:
>> http://archives.postgresql.org/pgsql-committers/2008-11/msg00054.php
>
>> this patch causes a 25-30% performance regression for WAL logged copy,
>> however in the WAL bypass case (maybe that was what got tested?) it
>> results in a 20% performance increase.
>
> Hmm. What that patch actually changes is that it prevents a bulk insert
> (ie COPY in) from trashing the entire shared-buffers arena. I think the
> reason for the WAL correlation is that once it's filled the ring buffer,
> creating new pages requires writing out old ones, and the
> WAL-before-data rule means that the copy process has to block waiting
> for WAL to go down to disk before it can write. When it's allowed to
> use the whole arena there is more chance for some of that writing to be
> done by the walwriter or bgwriter. But the details are going to depend
> on the platform's CPU vs I/O balance, which no doubt explains why some
> of us don't see it.

Hmm - in my case both the CPU (an Intel E5530 Nehalem) and the I/O
subsystem (8GB Fibre Channel connected NetApp with 4GB cache) are pretty
fast, and even with, say, fsync=off, 8.4RC1 is only slightly faster than
8.3 with the same config and fsync=on, so maybe there is a secondary
effect at play too.

>
> I don't think we want to revert that patch --- not trashing the whole
> buffer arena seems like a Good Thing from a system-wide point of view,
> even if it makes individual COPY operations go slower. However, we
> could maybe play around with the tradeoffs a bit. In particular it
> seems like it would be useful to experiment with different ring buffer
> sizes. Could you try increasing the ring size allowed in
> src/backend/storage/buffer/freelist.c for the BULKWRITE case
>
> ***************
> *** 384,389 ****
> --- 384,392 ----
>           case BAS_BULKREAD:
>               ring_size = 256 * 1024 / BLCKSZ;
>               break;
> +         case BAS_BULKWRITE:
> +             ring_size = 256 * 1024 / BLCKSZ;
> +             break;
>           case BAS_VACUUM:
>               ring_size = 256 * 1024 / BLCKSZ;
>               break;
>
>
> and see if maybe we can buy back most of the loss with not too much
> of a ring size increase?

I already started testing that once I found the offending commit.
Timings per ring size:

256 * 1024 / BLCKSZ
4min10s / 4min19s / 4min12s

512 * 1024 / BLCKSZ
3min27s / 3min32s

1024 * 1024 / BLCKSZ
3min14s / 3min12s

2048 * 1024 / BLCKSZ
3min02s / 3min02s

4096 * 1024 / BLCKSZ
2min59s / 2min58s

8192 * 1024 / BLCKSZ
2min59s / 2min59s

So 4096 * 1024 / BLCKSZ seems to be the sweet spot; it also results in
more or less the same performance that 8.3 had.
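
For reference, a minimal sketch of the arithmetic behind these numbers.
It assumes the default BLCKSZ of 8192 bytes (the block size used in the
test is not stated here) and takes the fastest run for each ring size.
The helper names are made up for illustration and are not part of
freelist.c.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: default PostgreSQL block size; not stated in the message. */
#define BLCKSZ 8192

/* Ring size in buffers for a ring of ring_kb kilobytes, mirroring the
 * "N * 1024 / BLCKSZ" expressions in the patch. */
static int ring_buffers(int ring_kb)
{
    return ring_kb * 1024 / BLCKSZ;
}

/* Convert a "XminYYs" timing to seconds. */
static int to_seconds(int min, int sec)
{
    return min * 60 + sec;
}

/* Runtime reduction relative to a baseline, in whole percent (truncated). */
static int percent_faster(int baseline_s, int run_s)
{
    return (baseline_s - run_s) * 100 / baseline_s;
}

#ifdef RING_DEMO_MAIN
int main(void)
{
    /* Fastest run per ring size, taken from the timings above. */
    static const struct { int ring_kb, min, sec; } runs[] = {
        {256, 4, 10}, {512, 3, 27}, {1024, 3, 12},
        {2048, 3, 2}, {4096, 2, 58}, {8192, 2, 59},
    };
    int n = (int) (sizeof(runs) / sizeof(runs[0]));
    int base = to_seconds(runs[0].min, runs[0].sec);

    for (int i = 0; i < n; i++)
    {
        int s = to_seconds(runs[i].min, runs[i].sec);

        printf("%5d kB ring = %4d buffers: %3d s (%d%% less than 256 kB)\n",
               runs[i].ring_kb, ring_buffers(runs[i].ring_kb),
               s, percent_faster(base, s));
    }
    return 0;
}
#endif
```

Compiling with -DRING_DEMO_MAIN prints one line per ring size. Under
the 8192-byte assumption, the 4096 kB ring corresponds to 512 buffers.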

Stefan
