Re: Custom table AMs need to include heapam.h because of BulkInsertState

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Custom table AMs need to include heapam.h because of BulkInsertState
Date: 2019-06-30 05:54:29
Message-ID: CAKJS1f8+gBJs_3qVy2=ckfT0nOBHRXxy=kG5O-pE8jNC0JoWCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 24 Jun 2019 at 23:12, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 24 Jun 2019 at 22:16, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > Don't take me bad, but I find the solution of defining and using a new
> > callback to call the table AM callback not really elegant, and keeping
> > all table AM callbacks called at a higher level than the executor
> > makes the code easier to follow. Shouldn't we try to keep any calls
> > to table_finish_bulk_insert() within copy.c for each partition
> > instead?
>
> I'm not quite sure if I follow you since the call to
> table_finish_bulk_insert() is within copy.c still.
>
> The problem was that PartitionTupleRouting is private to
> execPartition.c, and we need a way to determine which of the
> partitions we routed tuples to. It seems inefficient to flush all of
> them if only a small number had tuples inserted into them and to me,
> it seems inefficient to add some additional tracking in CopyFrom(),
> like a hash table to store partition Oids that we inserted into. Using
> PartitionTupleRouting makes sense. It's just a question of how to
> access it, which is not so easy due to it being private.
>
> I did suggest a few other ways that we could solve this. I'm not so
> clear on which one of those you're suggesting or if you're thinking of
> something new.

Any further thoughts on this Michael?

Or Andres? Do you have a preference to which of the approaches
(mentioned upthread) I use for the fix?

If I don't hear anything I'll probably just push the first fix. The
inefficiency does not affect heap, so likely the people with the most
interest in improving that will be authors of other table AMs that
actually do something during table_finish_bulk_insert() for
partitions. We could revisit this in PG13 if someone comes up with a
need to improve things here.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Goel, Dhruv 2019-06-30 07:30:01 Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY
Previous Message Pavel Stehule 2019-06-30 03:10:08 Re: [HACKERS] proposal: schema variables