ON CONFLICT DO UPDATE for partitioned tables

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: ON CONFLICT DO UPDATE for partitioned tables
Date: 2018-02-28 00:46:02
Message-ID: 20180228004602.cwdyralmg5ejdqkq@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I updated Amit Langote's patch for INSERT ON CONFLICT DO UPDATE[1].
Following the lead of edd44738bc88 ("Be lazier about partition tuple
routing.") this incarnation only does the necessary push-ups for the
specific partition that needs it, at execution time. As far as I can
tell, it works as intended.

I chose to refuse the case where the DO UPDATE clause causes the tuple
to move to another partition (i.e. you're updating the partition key of
the tuple). While it's probably possible to implement that, it doesn't
seem a very productive use of time.

However, there is a shortcoming in the design: it fails if there are
multiple levels of partitioning, because there is no easy (efficient)
way to map the index OID more than one level. I had already mentioned
this shortcoming to Amit's patch. So this case (which I added in the
regression tests) fails unexpectedly:

-- multiple-layered partitioning
create table parted_conflict_test (a int primary key, b text) partition by range (a);
create table parted_conflict_test_1 partition of parted_conflict_test
for values from (0) to (10000) partition by range (a);
create table parted_conflict_test_1_1 partition of parted_conflict_test_1
for values from (0) to (100);
insert into parted_conflict_test values ('10', 'ten');
insert into parted_conflict_test values ('10', 'ten two')
on conflict (a) do update set b = excluded.b;
ERROR: invalid ON CONFLICT DO UPDATE specification
DETAIL: An inferred index was not found in partition "parted_conflict_test_1_1".

So the problem here is that my MapPartitionIndexList() implementation is
too stupid. I think it would be smarter to use the ResultRelInfo
instead of bare Relation, for one. But that still doesn't answer how to
find a "path" from root to leaf partition, which is what I'd need to
verify that there are valid pg_inherits relationships for the partition
indexes. I'm probably missing something about the partitionDesc or
maybe the partitioned_rels lists that helps me do it efficiently, but I
hope figure out soon.

One idea I was toying with is to add RelationData->rd_children as a list
of OIDs of children relations. So it should be possible to walk the
list from the root to the desired descendant, without having to scan
pg_inherits repeatedly ... although that would probably require doing
relation_open() for every index, which sounds undesirable.

(ISTM that having RelationData->rd_children might be a good idea in
general anyway -- I mean to speed up some code that currently scans
pg_inherits via find_inheritance_children. However, since the partition
descriptor itself is already in relcache, maybe this doesn't matter too
much.)

Another idea is to abandon the notion that we need to find a path from
parent index to descendant index, and just run the inference algorithm
again on the partition. I'm not sure how much I like this idea, yet.

Anyway, while this is still WIP, I think it works correctly for the case
where there is a single partition level.

[1] https://postgr.es/m/c1651d5b-7bd6-b7e7-e1cc-16ecfe2c0da5@lab.ntt.co.jp

--
Álvaro Herrer

Attachment Content-Type Size
v1-0001-fix-ON-CONFLICT-DO-UPDATE-for-partitioned-tables.patch text/plain 19.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-02-28 00:57:32 Re: Let's remove DSM_INPL_NONE.
Previous Message Thomas Munro 2018-02-28 00:21:03 Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)