Re: COPY FROM WHEN condition

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: COPY FROM WHEN condition
Date: 2019-01-29 21:53:24
Message-ID: 20190129215324.boi5jkuspopumwpg@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-01-30 10:33:30 +1300, David Rowley wrote:
> On Wed, 30 Jan 2019 at 10:12, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > On 2019-01-30 10:05:35 +1300, David Rowley wrote:
> > > On Wed, 30 Jan 2019 at 04:22, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > I think I might have a patch addressing the problem incidentally. For pluggable storage I slotified copy.c, which also removes the first heap_form_tuple. Quite possible that nothing more is needed. I've removed the batch context altogether in yesterday's rebase, there was no need anymore.
> > >
> > > In your patch, where do the batched tuples get stored before the heap
> > > insert is done?
> >
> > There's one slot for each batched tuple (they are reused). Before
> > materialization the tuples solely exist in tts_isnull/values into which
> > NextCopyFrom() directly parses the values. Tuples never get extracted
> > from the slot in copy.c itself anymore, table_multi_insert() accepts
> > slots. Not quite sure whether I've answered your question?
>
> I think so. I imagine that should also speed up COPY WHERE too as
> it'll no longer form a tuple before possibly discarding it.

Right.
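
To make the point above concrete, here is a toy sketch of the idea, not PostgreSQL code. `Slot`, `parse_into`, `materialize` and `copy_where` are hypothetical names invented for illustration; they only model "parse straight into reusable slots, evaluate the filter on the slot, and form a tuple only for rows that survive".

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical stand-in for a tuple slot (values/isnull arrays)."""
    values: list = field(default_factory=list)
    isnull: list = field(default_factory=list)
    materialized: int = 0  # how many times a tuple was formed from this slot

    def parse_into(self, line):
        # Parse fields directly into the slot's arrays; no tuple is built here.
        fields = line.split(",")
        self.isnull = [f == "" for f in fields]
        self.values = [None if f == "" else int(f) for f in fields]

    def materialize(self):
        # Only here is an actual tuple formed.
        self.materialized += 1
        return tuple(self.values)


def copy_where(lines, predicate, batch_size=2):
    """Toy COPY loop: one reusable slot per batched row, filter before forming."""
    slots = [Slot() for _ in range(batch_size)]
    inserted = []
    used = 0

    def flush():
        nonlocal used
        for slot in slots[:used]:
            inserted.append(slot.materialize())
        used = 0

    for line in lines:
        slot = slots[used]
        slot.parse_into(line)
        if not predicate(slot.values):
            continue  # row discarded; the same slot is reused for the next line
        used += 1
        if used == batch_size:
            flush()
    flush()
    return inserted, sum(s.materialized for s in slots)
```

In this model a rejected row costs only the parse; the number of formed tuples equals the number of rows that pass the filter, not the number of input lines.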

I found some issues in my patch (stupid implementation of copying from
one slot to the other), but after fixing that I get:

master:
Time: 16013.509 ms (00:16.014)
Time: 16836.110 ms (00:16.836)
Time: 16636.796 ms (00:16.637)

pluggable storage:
Time: 15974.243 ms (00:15.974)
Time: 16183.442 ms (00:16.183)
Time: 16055.192 ms (00:16.055)

(with a truncate between each run)
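
For reference, the averages of the three runs above can be worked out with a short script. This is only arithmetic on the timings quoted in this mail; the function name `summarize` is made up for the example, and three runs per side are too few to say much about variance.

```python
from statistics import mean

# Timings in milliseconds, copied from the runs listed above.
master_ms = [16013.509, 16836.110, 16636.796]
pluggable_ms = [15974.243, 16183.442, 16055.192]


def summarize(baseline, candidate):
    """Return (baseline mean, candidate mean, absolute diff, percent diff)."""
    base = mean(baseline)
    cand = mean(candidate)
    diff = base - cand
    return base, cand, diff, 100.0 * diff / base


base, cand, diff, pct = summarize(master_ms, pluggable_ms)
print(f"master mean:    {base:.3f} ms")
print(f"pluggable mean: {cand:.3f} ms")
print(f"difference:     {diff:.3f} ms ({pct:.2f}%)")
```

That works out to a mean difference of roughly 425 ms, or about 2.6%, in favour of the pluggable storage branch.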

So that seems a bit better, albeit at the cost of having a few empty
slots, created on demand, for each encountered partition.

I'm pretty sure we can optimize that further...

Greetings,

Andres Freund
