Re: 8.4 open item: copy performance regression?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.4 open item: copy performance regression?
Date: 2009-06-19 18:11:14
Message-ID: 12228.1245435074@sss.pgh.pa.us
Lists: pgsql-hackers

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> ok after a bit of bisecting I'm happy to announce the winner of the contest:
> http://archives.postgresql.org/pgsql-committers/2008-11/msg00054.php

> this patch causes a 25-30% performance regression for WAL logged copy,
> however in the WAL bypass case (maybe that was what got tested?) it
> results in a 20% performance increase.

Hmm. What that patch actually changes is that it prevents a bulk insert
(i.e., COPY in) from trashing the entire shared-buffers arena. I think the
reason for the WAL correlation is that once it's filled the ring buffer,
creating new pages requires writing out old ones, and the
WAL-before-data rule means that the copy process has to block waiting
for WAL to go down to disk before it can write. When it's allowed to
use the whole arena there is more chance for some of that writing to be
done by the walwriter or bgwriter. But the details are going to depend
on the platform's CPU vs I/O balance, which no doubt explains why some
of us don't see it.
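
To make the mechanism above concrete, here is a toy model in C. It is a sketch, not PostgreSQL code: the function name count_blocking_wal_flushes and the LSN numbering are invented for illustration, and it assumes that no walwriter, bgwriter, or checkpoint ever helps, so every required WAL flush is done by the loading backend itself.

```c
#include <assert.h>

/*
 * Toy model, not PostgreSQL code: count how often a bulk-loading backend
 * must flush WAL itself before it can recycle a ring buffer.
 *
 * Assumptions: filling page i emits one WAL record with LSN i + 1; nothing
 * else (walwriter, bgwriter, checkpoints) ever flushes WAL or writes pages;
 * a flush makes all WAL emitted so far durable.
 */
static long
count_blocking_wal_flushes(long npages, long ring_size)
{
    long flushed_lsn = 0;   /* WAL is durable up to this LSN */
    long flushes = 0;

    assert(npages >= 0 && ring_size > 0);

    for (long i = 0; i < npages; i++)
    {
        if (i >= ring_size)
        {
            /* Ring is full: reuse the slot holding page i - ring_size. */
            long victim_lsn = (i - ring_size) + 1;

            /* WAL-before-data: the victim's WAL must be on disk first. */
            if (flushed_lsn < victim_lsn)
            {
                flushed_lsn = i;    /* pages 0..i-1 have emitted LSNs 1..i */
                flushes++;
            }
        }
        /* Page i is now filled and its WAL record (LSN i + 1) emitted. */
    }
    return flushes;
}
```

In this model, loading 1000 pages through a 32-buffer ring forces 31 backend-side WAL flushes, while a 256-buffer ring forces 3. That only illustrates the direction of the effect; real behavior depends on how much flushing the background processes manage to do, which is exactly the platform-dependent part.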

I don't think we want to revert that patch --- not trashing the whole
buffer arena seems like a Good Thing from a system-wide point of view,
even if it makes individual COPY operations go slower. However, we
could maybe play around with the tradeoffs a bit. In particular it
seems like it would be useful to experiment with different ring buffer
sizes. Could you try increasing the ring size allowed in
src/backend/storage/buffer/freelist.c for the BULKWRITE case

***************
*** 384,389 ****
--- 384,392 ----
          case BAS_BULKREAD:
              ring_size = 256 * 1024 / BLCKSZ;
              break;
+         case BAS_BULKWRITE:
+             ring_size = 256 * 1024 / BLCKSZ;
+             break;
          case BAS_VACUUM:
              ring_size = 256 * 1024 / BLCKSZ;
              break;

and see if maybe we can buy back most of the loss with not too much
of a ring size increase?
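
For anyone trying different values, the expression in the patch converts a ring size in bytes into a number of buffers. A small sketch of that arithmetic follows; ring_size_in_buffers is a hypothetical helper, not something in freelist.c, and the 8192-byte block size used in the example values is the usual default but is a compile-time option.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the "N * 1024 / BLCKSZ" expression in the
 * patch: number of buffers in a ring of ring_kb kilobytes, given the
 * block size in bytes.
 */
static size_t
ring_size_in_buffers(size_t ring_kb, size_t blcksz)
{
    assert(blcksz > 0);
    return ring_kb * 1024 / blcksz;
}
```

With 8192-byte blocks, ring_size_in_buffers(256, 8192) gives 32 buffers, which is the value in the patch as posted, and ring_size_in_buffers(16384, 8192) gives 2048 buffers for a 16MB ring.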

regards, tom lane
