From: | gkokolatos(at)pm(dot)me |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
Subject: | Re: Allow batched insert during cross-partition updates |
Date: | 2021-03-12 10:59:19 |
Message-ID: | los9wb7ge4GSExGQxQmLYOhfIFxRUCOTVM9RbFhXnScVtBWWrheAauC9oVo9mpYfeV7wFOn2a-ccPG1ktOMb42sp_tWzTey912lPPGu3-EU=@pm.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, March 12, 2021 3:45 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Thu, Mar 11, 2021 at 8:36 PM gkokolatos(at)pm(dot)me wrote:
>
> > On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09(at)gmail(dot)com wrote:
> >
> > > On Wed, Mar 10, 2021 at 9:30 PM Georgios gkokolatos(at)protonmail(dot)com wrote:
> > >
> > > > I continued looking a bit at the patch, yet I am either failing to see fix or I am
> > > > looking at the wrong thing. Please find attached a small repro of what my expectetions
> > > > were.
> > > > As you can see in the repro, I would expect the
> > > > UPDATE local_root_remote_partitions SET a = 2;
> > > > to move the tuples to remote_partition_2 on the same transaction.
> > > > However this is not the case, with or without the patch.
> > > > Is my expectation of this patch wrong?
> > >
> > > I think yes. We currently don't have the feature you are looking for
> > > -- moving tuples from one remote partition to another remote
> > > partition. This patch is not for adding that feature.
> >
> > Thank you for correcting me.
> >
> > > What we do support however is moving rows from a local partition to a
> > > remote partition and that involves performing an INSERT on the latter.
> > > This patch is for teaching those INSERTs to use batched mode if
> > > allowed, which is currently prohibited. So with this patch, if an
> > > UPDATE moves 10 rows from a local partition to a remote partition,
> > > then they will be inserted with a single INSERT command containing all
> > > 10 rows, instead of 10 separate INSERT commands.
> >
> > So, if I understand correctly then in my previously attached repro I
> > should have written instead:
> >
> > CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
> > CREATE TABLE
> > local_root_local_partition_1
> > PARTITION OF
> > local_root_remote_partitions FOR VALUES IN (1);
> >
> > CREATE FOREIGN TABLE
> > local_root_remote_partition_2
> > PARTITION OF
> > local_root_remote_partitions FOR VALUES IN (2)
> > SERVER
> > remote_server
> > OPTIONS (
> > table_name 'remote_partition_2',
> > batch_size '10'
> > );
> >
> > INSERT INTO local_root_remote_partitions VALUES (1), (1);
> > -- Everything should be on local_root_local_partition_1 and on the same transaction
> > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> >
> > UPDATE local_root_remote_partitions SET a = 2;
> > -- Everything should be on remote_partition_2 and on the same transaction
> > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> >
> >
> > I am guessing that I am still wrong because the UPDATE operation above will
> > fail due to the restrictions imposed in postgresBeginForeignInsert regarding
> > UPDATES.
>
> Yeah, for the move to work without hitting the restriction you
> mention, you will need to write the UPDATE query such that
> local_root_remote_partition_2 is not updated. For example, as
> follows:
>
> UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
>
> With this query, the remote partition is not one of the result
> relations to be updated, so is able to escape that restriction.
Excellent. Thank you for the explanation and patience.
>
> > Would it be too much to ask for the addition of a test case that will
> > demonstrate the change of behaviour found in patch.
>
> Hmm, I don't think there's a way to display whether the INSERT done on
> the remote partition as a part of an (tuple-moving) UPDATE used
> batching or not. That's because that INSERT's state is hidden from
> EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
> INSERT's state (especially its batch size) under the original UPDATE
> node, but I am not sure.
Yeah, there does not seem to be a way for explain to do show that information
with the current code.
>
> By the way, the test case added by commit 927f453a94106 does exercise
> the code added by this patch, but as I said in the last paragraph, we
> can't verify that by inspecting EXPLAIN output.
I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:
INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.
I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.
Or is this a wrong assumption of mine?
Cheers,
//Georgios
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Amit Langote
> EDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v2-repro.sql | application/sql | 1.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2021-03-12 11:11:03 | Re: unrecognized configuration parameter "plpgsql.check_asserts" |
Previous Message | Walker | 2021-03-12 10:53:26 | unrecognized configuration parameter "plpgsql.check_asserts" |