Re: [HACKERS] path toward faster partition pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-01-10 04:18:58
Message-ID: CAKJS1f_UBRUtmX29r-3ayysGAT3PGcs-z9DQDqGhuat87dbFrg@mail.gmail.com
Lists: pgsql-hackers

On 9 January 2018 at 21:40, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Sorry about the absence in the last few days. I will post a new version
> addressing various review comments by the end of this week.

I'm sorry for the flood of emails today.

Beena's tests on the run-time partition pruning patch also indirectly
exposed a problem with this patch.

Basically, the changes to add_paths_to_append_rel() are causing
duplicate entries in partitioned_rels.

A test case is:

create table part (a int, b int) partition by list(a);
create table part1 partition of part for values in(1) partition by list (b);
create table part2 partition of part1 for values in(1);

select * from part;

partitioned_rels ends up with 3 items in the list, but there are only 2
partitioned tables here (part and part1). The reason is that planning
calls add_paths_to_append_rel recursively: the list for part ends up
with itself and part1 in it, and part1's own list already contains
part1, per set_append_rel_size's "rel->live_partitioned_rels =
list_make1_int(rti);". Concatenating the two puts part1 in the list twice.
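
To make the duplication concrete, here is a small standalone model of
the accumulation described above. It does not use PostgreSQL's List API:
IntList, the helper functions, and the RT index values 1 (part) and 2
(part1) are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELS 8

/* Hypothetical stand-in for an integer List of RT indexes. */
typedef struct IntList
{
    int     items[MAX_RELS];
    size_t  len;
} IntList;

static void
intlist_append(IntList *l, int rti)
{
    assert(l->len < MAX_RELS);
    l->items[l->len++] = rti;
}

/* Append every element of src to dst, in the spirit of list_concat(). */
static void
intlist_concat(IntList *dst, const IntList *src)
{
    for (size_t i = 0; i < src->len; i++)
        intlist_append(dst, src->items[i]);
}

/*
 * Model of the test case: RT index 1 is "part", 2 is "part1".
 * Fills *result and returns the number of entries it ends up with.
 */
static size_t
build_partitioned_rels(IntList *result)
{
    IntList part1_live = {{0}, 0};

    /* Like set_append_rel_size: part1's list starts out holding itself. */
    intlist_append(&part1_live, 2);

    /* The list for part already holds itself and part1 ... */
    result->len = 0;
    intlist_append(result, 1);
    intlist_append(result, 2);

    /* ... and then part1's own list is concatenated on top of that. */
    intlist_concat(result, &part1_live);

    return result->len;
}
```

Calling build_partitioned_rels() on an empty IntList leaves three
entries in it, with RT index 2 appearing twice.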

It would be nicer if you could use a Relids for this, but you'd also
need some way to store the target partitioned relation separately, since
nodeModifyTable.c does:

/* The root table RT index is at the head of the partitioned_rels list */
if (node->partitioned_rels)
{
    Index       root_rti;
    Oid         root_oid;

    root_rti = linitial_int(node->partitioned_rels);
    root_oid = getrelid(root_rti, estate->es_range_table);
    rel = heap_open(root_oid, NoLock);  /* locked by InitPlan */
}
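
A rough model of that trade-off: a set cannot hold duplicates, but it
also has no notion of a "first" member, so the root RT index would have
to be carried separately. This is only a sketch. PartitionedRelSet and
its functions are hypothetical names, and a plain unsigned bitmask
stands in for a real Relids.

```c
#include <assert.h>

/* Hypothetical: a set of RT indexes plus the root remembered on the side. */
typedef struct PartitionedRelSet
{
    unsigned int members;   /* bit N set => RT index N is a member */
    int          root_rti;  /* what linitial_int() supplies today */
} PartitionedRelSet;

static void
relset_init(PartitionedRelSet *s, int root_rti)
{
    assert(root_rti > 0 && root_rti < 32);
    s->members = 1u << root_rti;
    s->root_rti = root_rti;
}

static void
relset_add(PartitionedRelSet *s, int rti)
{
    assert(rti > 0 && rti < 32);
    s->members |= 1u << rti;    /* adding the same index twice is a no-op */
}

/* Count the members by clearing the lowest set bit until none remain. */
static int
relset_count(const PartitionedRelSet *s)
{
    int n = 0;

    for (unsigned int m = s->members; m != 0; m &= m - 1)
        n++;
    return n;
}
```

With this shape, adding part1 twice still leaves two members, and the
root stays available through root_rti rather than through list order.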

Alternatively, you could fix it so that instead of doing:

/*
 * Accumulate the live partitioned children of this child, if it's
 * itself partitioned rel.
 */
if (childrel->part_scheme)
    partitioned_rels = list_concat(partitioned_rels,
                                   childrel->live_partitioned_rels);

do something along the lines of:

if (childrel->part_scheme)
{
    ListCell   *lc;
    ListCell   *start = lnext(list_head(childrel->live_partitioned_rels));

    for_each_cell(lc, start)
        partitioned_rels = lappend_int(partitioned_rels,
                                       lfirst_int(lc));
}

That seems pretty fragile, though. It would probably be better to find
a nicer way of handling all this.
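
For illustration only, the skip-the-head idea looks like this in a
standalone model. As before, IntList and the helpers are invented for
the sketch and are not PostgreSQL's List API. It only works because the
head of the child's list is assumed to always be the child's own RT
index, which is exactly the fragile part.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELS 8

/* Hypothetical stand-in for an integer List of RT indexes. */
typedef struct IntList
{
    int     items[MAX_RELS];
    size_t  len;
} IntList;

static void
intlist_append(IntList *l, int rti)
{
    assert(l->len < MAX_RELS);
    l->items[l->len++] = rti;
}

/*
 * Append all of src except its first element.  The first element is
 * assumed to be the child itself, which the parent has already added.
 */
static void
intlist_concat_skip_head(IntList *dst, const IntList *src)
{
    for (size_t i = 1; i < src->len; i++)
        intlist_append(dst, src->items[i]);
}
```

With a parent list of {1, 2} and a child list of {2}, nothing is
appended and the parent list keeps its two entries.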

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
