Re: POC: postgres_fdw insert batching

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: postgres_fdw insert batching
Date: 2021-01-20 02:05:32
Message-ID: CA+HiwqHTeQKqRpCA3TGZSCYTVOcUoUdXeSrRfjRRcmT1CfM1oA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 20, 2021 at 1:01 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> On 1/19/21 7:23 AM, Amit Langote wrote:
> > On Tue, Jan 19, 2021 at 2:06 PM tsunakawa(dot)takay(at)fujitsu(dot)com
> >> Actually, I tried to do it (adding the GetModifyBatchSize() call after BeginForeignModify()), but it failed. Because postgresfdwGetModifyBatchSize() wants to know if RETURNING is specified, and ResultRelInfo->projectReturning is created after the above part. Considering the context where GetModifyBatchSize() implementations may want to know the environment, I placed the call as late as possible in the initialization phase. As for the future(?) multi-target DML statements, I think we can change this together with other many(?) parts that assume a single target table.
> >
> > Okay, sometime later then.
> >
> > I wasn't sure if bringing it up here would be appropriate, but there's
> > a patch by me to refactor ModfiyTable result relation allocation that
> > will have to remember to move this code along to an appropriate place
> > [1]. Thanks for the tip about the dependency on how RETURNING is
> > handled. I will remember it when rebasing my patch over this.
> >
>
> Thanks. The last version (v12) should be addressing all the comments and
> seems fine to me, so barring objections I'll get that pushed shortly.

+1

> One thing that seems a bit annoying is that with the partitioned table
> the explain (verbose) looks like this:
>
> QUERY PLAN
> -----------------------------------------------------
> Insert on public.batch_table
> -> Function Scan on pg_catalog.generate_series i
> Output: i.i
> Function Call: generate_series(1, 66)
> (4 rows)
>
> That is, there's no information about the batch size :-( But AFAICS
> that's due to how explain shows (or rather does not) partitions in this
> type of plan.

Yeah. Partition result relations are always lazily allocated for
INSERT, so EXPLAIN (without ANALYZE) has no idea what to show for
them, nor does it know which partitions will be used in the first
place. With ANALYZE however, you could get them from
es_tuple_routing_result_relations and maybe list them if you want, but
that sounds like a project on its own.

--
Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuro Yamada 2021-01-20 02:05:54 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Tom Lane 2021-01-20 01:57:22 Re: Odd, intermittent failure in contrib/pageinspect