Re: Inserting heap tuples in bulk in COPY

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inserting heap tuples in bulk in COPY
Date: 2012-08-08 02:44:47
Message-ID: CAMkU=1wqoupOnZdYT5iuHUrVTU6JuHR1q=qDGWqc3NuU8c2_wQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 7 August 2012 20:58, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> Hi Heikki,
>>
>> Is the bulk index insert still an active area for you?
>>
>> If not, is there some kind of summary of design or analysis work
>> already done, which would be useful for someone else interested in
>> picking it up?
>
> The main cost comes from repeated re-seeking down the index tree to
> find the insertion point, but repeated lock and pin operations on
> index buffers are also high. That is optimisable if the index inserts
> are grouped, as they will be with monotonic indexes (e.g. SERIAL), or
> with partially monotonic ones (i.e. with a Parent/Child relationship,
> many inserts on the Child table will be of the form (x,1), (x,2),
> (x,3), etc., e.g. Order/OrderLine).
>
> All we need do is buffer the inserts in memory before inserting
> them into the main index. As long as we flush the buffer before end of
> transaction, we're good.
>
> Incidentally, we can also optimise repeated inserts within a normal
> transaction using this method, by implementing deferred unique
> constraints. At present we say that unique constraints aren't
> deferrable, but there's no reason they can't be, if we allow buffering
> in the index. (Implementing a deferred constraint that won't fail if
> we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo,
> which is probably a bad plan, plus you'd need to sort the inputs, so
> that particular nut is another issue altogether, AFAICS).
>
> Suggested implementation is to buffer index tuples at the generic
> index API, then implement a bulk-insert index API call that can then
> be implemented for each AM separately. Suggested buffer size is
> work_mem. Note we must extract index tuples from the heap tuples -
> refetching heap blocks to get the rows is too costly.
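The buffering scheme quoted above can be illustrated with a toy model (all names here - IndexBuffer, flush, descents - are hypothetical illustrations, not PostgreSQL's actual index AM API): accumulate index tuples up to a threshold, sort the batch so neighbouring keys land together, and count how often a fresh descent of the tree would be needed.

```python
# Illustrative sketch only: models buffering index tuples and flushing
# them sorted in one batch. The class and method names are made up for
# this example; they are not PostgreSQL's real AM interface.
import bisect

class IndexBuffer:
    def __init__(self, flush_threshold=4):
        self.pending = []            # buffered index tuples: (key, heap_tid)
        self.index = []              # stand-in for the on-disk btree (kept sorted)
        self.flush_threshold = flush_threshold
        self.descents = 0            # counts "re-seeks down the index tree"

    def insert(self, key, heap_tid):
        self.pending.append((key, heap_tid))
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sorting groups equal/adjacent keys, so one descent can service
        # a whole run of inserts landing in the same leaf region.
        self.pending.sort()
        prev_key = object()          # sentinel: never equals a real key
        for key, tid in self.pending:
            if key != prev_key:      # crude model: one descent per distinct key run
                self.descents += 1
            bisect.insort(self.index, (key, tid))
            prev_key = key
        self.pending.clear()

buf = IndexBuffer()
for tid, key in enumerate([3, 1, 3, 1, 2, 2]):   # partial-monotonic pattern
    buf.insert(key, tid)
buf.flush()   # flush before end of transaction, as the post notes
```

Without buffering, the six inserts above would each descend the tree; with sort-and-flush, runs of equal keys share a descent.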

OK, thanks for the summary. It looks like your plans are to improve
the case where index maintenance is already CPU limited. I was more
interested in cases where the regions of the index(es) undergoing
active insertion do not fit into usable RAM, so the limit is the
random IO needed to fetch the leaf pages in order to update them, or
to write them out once dirtied. Here too buffering is probably the
answer, but the size of the buffer needed to turn that IO from
effectively random to effectively sequential is probably much larger
than the size of the buffer you are contemplating.
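This IO-bound argument can be gut-checked with a back-of-envelope simulation (pure assumption, not a benchmark; the page and buffer counts are invented for illustration): with uniformly scattered keys, each flush touches roughly one distinct leaf page per insert until the buffer is comparable in size to the active region of the index.

```python
# Toy model of the random-IO point: count distinct leaf pages fetched
# per flush for a small vs. a region-sized buffer. Numbers are
# illustrative assumptions, not measurements of PostgreSQL.
import random

def leaf_pages_touched(n_inserts, buffer_size, n_leaf_pages, seed=0):
    rng = random.Random(seed)
    touches = 0
    for start in range(0, n_inserts, buffer_size):
        batch = [rng.randrange(n_leaf_pages)
                 for _ in range(min(buffer_size, n_inserts - start))]
        touches += len(set(batch))   # one fetch per distinct page per flush
    return touches

n = 100_000       # inserts
pages = 10_000    # leaf pages in the active region
small = leaf_pages_touched(n, 1_000, pages)    # work_mem-sized buffer
large = leaf_pages_touched(n, 100_000, pages)  # buffer spanning the region
# A small buffer touches nearly one page per insert (mostly random IO);
# a buffer covering the whole active region touches each page about once.
```

The gap between `small` and `large` is the factor Jeff is pointing at: only a buffer on the order of the active index region collapses the leaf-page fetches into one pass.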

> I think we need to implement buffering both to end of statement or end
> of transaction, not just one or the other.

With the planner deciding which would be better, or explicit user action?

Thanks,

Jeff
