Re: Bulk Inserts

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bulk Inserts
Date: 2009-09-16 03:04:54
Message-ID: f67928030909152004i103d7c4by3353826ac21aaa5d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2009/9/15 Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>

> Does that heuristic change the timings much? If not, it seems like it
>> would
>> better to keep it simple and always do the same thing, like log the tuples
>> (if it is done under one WALInsertLock, which I am assuming it is..)
>>
>
> It is the logging of whole pages that makes it faster.
> If you fill a page with tuples in one operation (while holding exclusive
> lock) and then insert WAL records for each tuple, there is no speed gain.
>

> Inserting a full page WAL record (since you just filled the page
> completely) :
>
> - only takes WalInsertLock once instead of once per tuple
>

OK, that makes sense. I thought you had hacked either XLogInsert or the
heap WAL replay code so that you could just accumulate tuples in the rdata
chain and then submit them all under the cover of a single WALInsertLock.
If you haven't done that, then of course doing the bulk insert doesn't help
much if you still to tuple-by-tuple XLogInsert. So in the case that it is
under the limit, you first run through the tuples putting them into the
block, then run through the tuples again doing the XLogInserts?

> - reduces wal traffic
> - is about 2x faster in my benchmark
>
> And inserting a "clear new page" record (if the page was previously
> new/empty and relation is fsync'd at the end) :
>
> - only takes WalInsertLock once instead of once per tuple
> - reduces wal traffic a lot
> - is about 4x faster in my benchmark

Do you have an IO bottleneck even in the absence of fsyncs? My experience
on multi-core machines with decent IO systems has been that the amount of
WAL traffic (by volume) matters rather little, as opposed to the number
WALInsertLocks taken, which matter quite a bit. Of course this depends
quite a bit on your OS and hardware.

...

Another angle of attack would be to make wal-writing more efficient...
>

If you mean to do this without changing the xlog interfaces, I'm not
optimistic.

If you have to call XLogInsert once per row that is copied (or
insert...select), then my experiments show that simply taking the
WALInsertLock and immediately releasing it, doing absolutely no real work
while it is held, is already a substanial multi-core scalibility
bottleneck. Once we accept that this must be done, the next existing
bottleneck is the memcpy of the first byte from the rdata chain into the
shared wal_buffers, presumably because this copy involves fighting the cache
line away from other cores. Once you've copied the first byte, the rest of
them seem to be almost free. (Again, this is probably hardware and
situation dependent).

I've seen some suggestions that the wal_buffer block initation work be moved
from being done by AdvanceXLInsert to instead be done by XLogWrite.
However, I've not seen any indication that AdvanceXLInsert is a meaningful
bottlneck in the first place. Except when wal_buffers is too small: then
AdvanceXLInsert is a bottleneck, but only because XLogWrite is getting
called from within it, in which case moving work from one to the other is
probably not going to make things better.

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dan Colish 2009-09-16 03:21:35 Re: [PATCH] pgbench: new feature allowing to launch shell commands
Previous Message Robert Haas 2009-09-16 03:00:27 Re: updated join removal patch