Re: BufferAccessStrategy for bulk insert

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BufferAccessStrategy for bulk insert
Date: 2008-10-30 02:11:44
Message-ID: 1225332704.3971.342.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Wed, 2008-10-29 at 21:58 -0400, Robert Haas wrote:
> > If you say it's a loss you should publish timings to support that. Using
> > a BAS for VACUUM was a performance gain, not a loss.
>
> Well, I can dig up and publish the timings from my laptop, but I'm not
> sure where that will get us. Trust me, the numbers were higher with
> BAS, otherwise I wouldn't be worrying about this. But I pretty much
> doubt anyone cares how my laptop runs PostgreSQL anyway, which is why
> I think someone should test this on good hardware and see what happens
> there. The only change I made to disable the BAS was a one-line
> change in GetBulkInsertState to replace BAS_BULKWRITE with BAS_NORMAL,
> so it should be easy for someone to try it both ways.
>
> Not at any point in the development of this patch was I able to match
> the 15-17% copy speedup, 20% CTAS speedup that you cited with your
> original email. I did get speedups, but they were considerably
> smaller. So either my testing methodology is different, or my
> hardware is different, or there is something wrong with my patch. I
> don't think we're going to find out which it is until someone other
> than me looks at this.
>
> In any event, VACUUM is a read-write workload, and specifically, it
> tends to write pages that have been written by other writers, and are
> therefore potentially already in shared buffers. COPY and CTAS are
> basically write-only workloads, though with COPY on an existing table
> the FSM might guide you to free space on a page already in shared
> buffers, or you might find an index page you need there. Still, if
> you are doing a large bulk data load, those effects are probably
> pretty small. So, the profile is somewhat different.
>
> I'm not really trying to argue that the BAS is a bad idea, but it is
> certainly true that I do not have the data to prove that it is a good
> idea.

You should try profiling the patch. You can count the invocations of the
buffer access routines to check it's all working in the right ratios.

Whatever timings you have are worth publishing.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
