Re: Allow batched insert during cross-partition updates

From: gkokolatos(at)pm(dot)me
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Subject: Re: Allow batched insert during cross-partition updates
Date: 2021-03-16 08:11:57
Message-ID: NB-VVc_lc77vRhkSv4QA3Nzew1g-qiLD1zw6arRgwnDnpNKkKE77z123nmotxPU1e0UQV5v9DXXgMKHLXoxLiC9EFuHXlPO7U3vyJ2gb82o=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 16, 2021 6:13 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:

> Hi Georgios,
>
> On Fri, Mar 12, 2021 at 7:59 PM gkokolatos(at)pm(dot)me wrote:
>
> > On Friday, March 12, 2021 3:45 AM, Amit Langote amitlangote09(at)gmail(dot)com wrote:
> >
> > > On Thu, Mar 11, 2021 at 8:36 PM gkokolatos(at)pm(dot)me wrote:
> > >
> > > > On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09(at)gmail(dot)com wrote:
> > > >
> > > > > What we do support however is moving rows from a local partition to a
> > > > > remote partition and that involves performing an INSERT on the latter.
> > > > > This patch is for teaching those INSERTs to use batched mode if
> > > > > allowed, which is currently prohibited. So with this patch, if an
> > > > > UPDATE moves 10 rows from a local partition to a remote partition,
> > > > > then they will be inserted with a single INSERT command containing all
> > > > > 10 rows, instead of 10 separate INSERT commands.
> > > >
> > > > So, if I understand correctly then in my previously attached repro I
> > > > should have written instead:
> > > >
> > > > CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
> > > > CREATE TABLE
> > > > local_root_local_partition_1
> > > > PARTITION OF
> > > > local_root_remote_partitions FOR VALUES IN (1);
> > > >
> > > > CREATE FOREIGN TABLE
> > > > local_root_remote_partition_2
> > > > PARTITION OF
> > > > local_root_remote_partitions FOR VALUES IN (2)
> > > > SERVER
> > > > remote_server
> > > > OPTIONS (
> > > > table_name 'remote_partition_2',
> > > > batch_size '10'
> > > > );
> > > >
> > > > INSERT INTO local_root_remote_partitions VALUES (1), (1);
> > > > -- Everything should be on local_root_local_partition_1 and on the same transaction
> > > > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> > > >
> > > > UPDATE local_root_remote_partitions SET a = 2;
> > > > -- Everything should be on remote_partition_2 and on the same transaction
> > > > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> > > >
> > > >
> > > > I am guessing that I am still wrong because the UPDATE operation above will
> > > > fail due to the restrictions imposed in postgresBeginForeignInsert regarding
> > > > UPDATES.
> > >
> > > Yeah, for the move to work without hitting the restriction you
> > > mention, you will need to write the UPDATE query such that
> > > local_root_remote_partition_2 is not updated. For example, as
> > > follows:
> > > UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
> > > With this query, the remote partition is not one of the result
> > > relations to be updated, so is able to escape that restriction.
> >
> > Excellent. Thank you for the explanation and patience.
> >
> > > > Would it be too much to ask for the addition of a test case that will
> > > > demonstrate the change of behaviour found in patch.
> > >
> > > Hmm, I don't think there's a way to display whether the INSERT done on
> > > the remote partition as a part of an (tuple-moving) UPDATE used
> > > batching or not. That's because that INSERT's state is hidden from
> > > EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
> > > INSERT's state (especially its batch size) under the original UPDATE
> > > node, but I am not sure.
> >
> > Yeah, there does not seem to be a way for explain to do show that information
> > with the current code.
> >
> > > By the way, the test case added by commit 927f453a94106 does exercise
> > > the code added by this patch, but as I said in the last paragraph, we
> > > can't verify that by inspecting EXPLAIN output.
> >
> > I never doubted that. However, there is a difference. The current patch
> > changes the query to be executed in the remote from:
> > INSERT INTO <snip> VALUES ($1);
> > to:
> > INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
> > When this patch gets in, it would be very helpful to know that subsequent
> > code changes will not cause regressions. So I was wondering if there is
> > a way to craft a test case that would break for the code in 927f453a94106
> > yet succeed with the current patch.
>
> The test case "works" both before and after the patch, with the
> difference being in the form of the remote query. It seems to me
> though that you do get that.
>
> > I attach version 2 of my small reproduction. I am under the impression that
> > in this version, examining the value of cmin in the remote table should
> > give an indication of whether the remote received a multiple insert queries
> > with a single value, or a single insert query with multiple values.
> > Or is this a wrong assumption of mine?
>
> No, I think you have a good idea here.

Thank you.

>
> I've adjusted that test case to confirm that the batching indeed works
> by checking cmin of the moved rows, as you suggest. Please check the
> attached updated patch.

Excellent. The patch in the current version with the added test seems
ready to me.

I would still vote to have accessor functions instead of exposing the
whole PartitionTupleRouting struct, but I am not going to hold a too
strong stance about it.

If you agree with me, please provide an updated version of the patch.
Otherwise let it be known and I will flag the patch as RfC in the
commitfest app.

Cheers,
//Georgios

>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Amit Langote
> EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-03-16 08:17:08 Re: shared-memory based stats collector
Previous Message Fujii Masao 2021-03-16 08:10:28 Re: fdatasync performance problem with large number of DB files