Re: Allow batched insert during cross-partition updates

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: gkokolatos(at)pm(dot)me
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Subject: Re: Allow batched insert during cross-partition updates
Date: 2021-03-12 02:45:50
Message-ID: CA+HiwqGs0svOtSB-pT21vx3vxr_DnUMvhfQBbQrjeE5B8ABhXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 11, 2021 at 8:36 PM <gkokolatos(at)pm(dot)me> wrote:
> On Thursday, March 11, 2021 9:42 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Wed, Mar 10, 2021 at 9:30 PM Georgios gkokolatos(at)protonmail(dot)com wrote:
> >
> > > I continued looking a bit at the patch, yet I am either failing to see fix or I am
> > > looking at the wrong thing. Please find attached a small repro of what my expectetions
> > > were.
> > > As you can see in the repro, I would expect the
> > > UPDATE local_root_remote_partitions SET a = 2;
> > > to move the tuples to remote_partition_2 on the same transaction.
> > > However this is not the case, with or without the patch.
> > > Is my expectation of this patch wrong?
> >
> > I think yes. We currently don't have the feature you are looking for
> > -- moving tuples from one remote partition to another remote
> > partition. This patch is not for adding that feature.
>
> Thank you for correcting me.
> >
> > What we do support however is moving rows from a local partition to a
> > remote partition and that involves performing an INSERT on the latter.
> > This patch is for teaching those INSERTs to use batched mode if
> > allowed, which is currently prohibited. So with this patch, if an
> > UPDATE moves 10 rows from a local partition to a remote partition,
> > then they will be inserted with a single INSERT command containing all
> > 10 rows, instead of 10 separate INSERT commands.
>
> So, if I understand correctly then in my previously attached repro I
> should have written instead:
>
> CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
> CREATE TABLE
> local_root_local_partition_1
> PARTITION OF
> local_root_remote_partitions FOR VALUES IN (1);
>
> CREATE FOREIGN TABLE
> local_root_remote_partition_2
> PARTITION OF
> local_root_remote_partitions FOR VALUES IN (2)
> SERVER
> remote_server
> OPTIONS (
> table_name 'remote_partition_2',
> batch_size '10'
> );
>
> INSERT INTO local_root_remote_partitions VALUES (1), (1);
> -- Everything should be on local_root_local_partition_1 and on the same transaction
> SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
>
> UPDATE local_root_remote_partitions SET a = 2;
> -- Everything should be on remote_partition_2 and on the same transaction
> SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
>
>
> I am guessing that I am still wrong because the UPDATE operation above will
> fail due to the restrictions imposed in postgresBeginForeignInsert regarding
> UPDATES.

Yeah, for the move to work without hitting the restriction you
mention, you will need to write the UPDATE query such that
local_root_remote_partition_2 is not updated. For example, as
follows:

UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;

With this query, the remote partition is not one of the result
relations to be updated, so is able to escape that restriction.

> Would it be too much to ask for the addition of a test case that will
> demonstrate the change of behaviour found in patch.

Hmm, I don't think there's a way to display whether the INSERT done on
the remote partition as a part of an (tuple-moving) UPDATE used
batching or not. That's because that INSERT's state is hidden from
EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
INSERT's state (especially its batch size) under the original UPDATE
node, but I am not sure.

By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.

--
Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-03-12 03:01:33 Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Previous Message Hao Wu 2021-03-12 02:14:42 HotStandbyActive() issue in postgres