Re: BufferAccessStrategy for bulk insert

From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Postgres Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BufferAccessStrategy for bulk insert
Date: 2008-10-30 01:58:34
Message-ID: 603c8f070810291858j11dce56co4bcd79fb6c9b3917@mail.gmail.com
Lists: pgsql-hackers

> If you say it's a loss you should publish timings to support that. Using
> a BAS for VACUUM was a performance gain, not a loss.

Well, I can dig up and publish the timings from my laptop, but I'm not
sure where that will get us. Trust me, the elapsed times were higher
with BAS, otherwise I wouldn't be worrying about this. But I pretty much
doubt anyone cares how my laptop runs PostgreSQL anyway, which is why
I think someone should test this on good hardware and see what happens
there. The only change I made to disable the BAS was a one-line
change in GetBulkInsertState to replace BAS_BULKWRITE with BAS_NORMAL,
so it should be easy for someone to try it both ways.
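
For anyone who wants to try it both ways, here is a minimal,
self-contained sketch of the kind of toggle described above. It is not
the patch itself: the types are stand-ins for the server's own, the
GetAccessStrategy() here is a mock that only mimics what I understand
the real one to do (hand back a NULL "default" strategy for BAS_NORMAL),
and the BULK_INSERT_STRATEGY macro is a hypothetical name used for
illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the server's strategy enum (illustrative only). */
typedef enum BufferAccessStrategyType
{
    BAS_NORMAL,
    BAS_BULKREAD,
    BAS_VACUUM,
    BAS_BULKWRITE
} BufferAccessStrategyType;

/* Stand-in strategy object; the real one also carries the buffer ring. */
typedef struct BufferAccessStrategyData
{
    BufferAccessStrategyType btype;
} BufferAccessStrategyData;
typedef BufferAccessStrategyData *BufferAccessStrategy;

/* Stand-in bulk insert state; the real one holds more than this. */
typedef struct BulkInsertStateData
{
    BufferAccessStrategy strategy;
} BulkInsertStateData;
typedef BulkInsertStateData *BulkInsertState;

/* Hypothetical switch: build with -DBULK_INSERT_STRATEGY=BAS_NORMAL
 * to disable the ring and fall back to ordinary buffer replacement. */
#ifndef BULK_INSERT_STRATEGY
#define BULK_INSERT_STRATEGY BAS_BULKWRITE
#endif

/* Mock: BAS_NORMAL means "no special strategy", represented as NULL. */
BufferAccessStrategy
GetAccessStrategy(BufferAccessStrategyType btype)
{
    BufferAccessStrategy strategy;

    if (btype == BAS_NORMAL)
        return NULL;

    strategy = malloc(sizeof(*strategy));
    assert(strategy != NULL);
    strategy->btype = btype;
    return strategy;
}

/* The one line to flip is the GetAccessStrategy() argument. */
BulkInsertState
GetBulkInsertState(void)
{
    BulkInsertState bistate = malloc(sizeof(*bistate));

    assert(bistate != NULL);
    bistate->strategy = GetAccessStrategy(BULK_INSERT_STRATEGY);
    return bistate;
}

/* Release the state and its strategy; free(NULL) is a no-op. */
void
FreeBulkInsertState(BulkInsertState bistate)
{
    assert(bistate != NULL);
    free(bistate->strategy);
    free(bistate);
}
```

With the default build the state carries a BAS_BULKWRITE strategy; with
the macro set to BAS_NORMAL the strategy pointer is NULL, which is the
"no BAS" case to time against.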

At no point in the development of this patch was I able to match the
15-17% COPY speedup and 20% CTAS speedup that you cited in your
original email. I did get speedups, but they were considerably
smaller. So either my testing methodology is different, or my
hardware is different, or there is something wrong with my patch. I
don't think we're going to find out which it is until someone other
than me looks at this.

In any event, VACUUM is a read-write workload, and specifically, it
tends to write pages that have been written by other writers, and are
therefore potentially already in shared buffers. COPY and CTAS are
basically write-only workloads, though with COPY on an existing table
the FSM might guide you to free space on a page already in shared
buffers, or you might find an index page you need there. Still, if
you are doing a large bulk data load, those effects are probably
pretty small. So, the profile is somewhat different.

I'm not really trying to argue that the BAS is a bad idea, but it is
certainly true that I do not have the data to prove that it is a good
idea.

...Robert
