Re: BufferAccessStrategy for bulk insert

From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Postgres Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BufferAccessStrategy for bulk insert
Date: 2008-10-29 03:45:30
Message-ID: 603c8f070810282045g7fa6bdf9p53f03114d49643d7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

And here's the patch, which based on comments thus far does the following:

- Replaces the use_wal, use_fsm arguments in various places with a
single options argument.
- Creates a BAS_BULKWRITE buffer access strategy.
- Creates a BulkInsertState object so that COPY and CTAS can use
BAS_BULKWRITE and also keep the most recent page pinned.

Note that the original purpose of this exercise was to implement the
optimization that COPY and CTAS would keep the most recent page pinned
to avoid repeated pin/unpin cycles. This change shows a small but
measurable performance improvement on short rows. The remaining items
were added based on reviewer comments.

One concern that I have about this approach is that the situation in
which people are probably most concerned about COPY performance is
restoring a dump. In that case, the COPY will be the only thing
running, and using a BufferAccessStrategy is an anti-optimization. I
don't think it's a very big effect (any testing anyone can do on real
hardware rather than what I have would be appreciated) but I'm sort of
unsold of optimizing for what I believe to be the less-common use
case. If the consensus is to reverse course on this point I'm happy
to rip those changes back out and resubmit; they are a relatively
small proportion of the patch.

...Robert

On Sun, Oct 26, 2008 at 8:37 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Seems sane to me. I don't see the point of the HEAP_INSERT_BULK flag
>> bit --- providing or not providing bistate would cover that, and if
>> you have a bit as well then you have to define what the inconsistent
>> combinations mean. I concur with making all-zeroes be the typical
>> state of the flag bits, too.
>
> Thanks for the design review. I had thought to make the inconsistent
> combinations fail an assertion, but I'm just as happy to leave it out
> altogether.
>
>> FWIW, we generally declare bitmask flag variables as int, unless
>> there's some really good reason to do otherwise.
>
> OK, thanks for the tip.
>
> ...Robert
>

Attachment Content-Type Size
bulk_insert.patch text/x-diff 29.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2008-10-29 04:08:50 Re: array_agg and array_accum (patch)
Previous Message Tom Lane 2008-10-29 03:20:32 Re: Checking stack depth