Re: Using multi-row technique with COPY

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using multi-row technique with COPY
Date: 2005-11-29 21:10:45
Message-ID: 1133298645.2906.426.camel@localhost.localdomain
Lists: pgsql-hackers

On Mon, 2005-11-28 at 09:26 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > I don't see why couldn't have an additional index access method entry
> > point to insert multiple rows on one call.
>
> I think Simon was mainly on about the idea of inserting multiple *heap*
> entries with one call, ie, only one cycle of locking a heap buffer.
> It's not apparent to me that there'd be any noticeable win from batching
> index insertions, because even if the heap entries are all on one page,
> it's unlikely their index entries would be.

It might be a valid idea for monotonically increasing key indexes, but
I'll leave that idea alone for now. Tom's right: I was discussing heap
blocks only (for now).

> I'm a bit dubious about the whole thing from a concurrency standpoint,
> too: holding locks longer is not a great thing in general. The recent
> patch to avoid repeated locking cycles during a read seems OK, because
> (1) checking visibility of a tuple is ordinarily pretty cheap, and
> (2) it's only a shared lock so other people can be doing the same thing
> concurrently. However, heap insertion can make neither of those claims.
> You're talking about an exclusive lock, and you're talking about holding
> it while copying data around and then making multiple WAL entries.

I wasn't talking about holding locks for any longer than normal.

Here's the rough code:

heap_bulk_insert(tuple, bool needsToasting)
{
    /* if the tuple needs toasting, unpick the buffer:
     * write out everything buffered so far row-by-row,
     * then insert the toasted tuple itself
     */
    if (needsToasting)
    {
        for (i = 0; i < nTuplesBuffered; i++)
            heap_insert(tupleBuffer[i]);
        heap_insert(tuple);
        nTuplesBuffered = 0;
        sizeTupleBuffer = 0;
        return;
    }

    /* if the buffer would overflow a block, write the whole
     * tupleBuffer in one go to a freshly allocated
     * shared_buffer/table block, so that we do only one buffer
     * manager lock pair per block rather than one per row
     */
    tuple_length = length(tuple);
    if ((tuple_length + sizeTupleBuffer) > MAXDATAINBLOCK)
    {
        RelationGetBufferForTupleArray(nTuplesBuffered,
                                       tupleBuffer);
        tupleBuffer[0] = tuple;
        nTuplesBuffered = 1;
        sizeTupleBuffer = tuple_length;
    }
    else
    {
        /* nTuplesBuffered is the index of the next free slot */
        tupleBuffer[nTuplesBuffered++] = tuple;
        sizeTupleBuffer += tuple_length;
    }

    return;
}
in heapam.c

(needsToasting would be set for a tuple during parsing by COPY, which is
fine since it does actually know how long columns are. Actual decision
to toast or not can be wrapped in a modular call to heapam.c)

RelationGetBufferForTupleArray() in hio.c would be very similar to
RelationGetBufferForTuple(), but less complex because it never needs to
handle the update case.
If we do choose to write WAL records for these inserts, then we can
simply log the whole block, rather than making repeated individual
insertions into the WAL buffers, so we save on WALInsertLock overhead
too. That's no more than we do after a checkpoint anyway, so no problems.

If the transaction fails, then we lose the buffer: so what? Slightly
better than writing them to the table then failing.

Most importantly, we don't actually allocate a new data block until we
decide to flush the tupleBuffer. So the exclusive lock is only held
momentarily while we insert all the rows from the tupleBuffer into the
fresh block.

There's no modularity weirdness, but we do need some code in COPY to
decide whether it should use fast mode or not.

Best Regards, Simon Riggs
