Re: paths in partitions of a dummy partitioned table

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: paths in partitions of a dummy partitioned table
Date: 2017-07-10 11:35:39
Message-ID: CAFjFpReamHkGzo=xpeuN8kAO5J6=hYa2H9VetX0a6WVDeAXGXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 10, 2017 at 2:59 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello,

Thanks for the review.

>> If a partitioned table is proven dummy, set_rel_pathlist() doesn't mark the
>> partition relations dummy and thus doesn't set any (dummy) paths in the
>> partition relations. The lack of paths in the partitions means that we can
>
> A parent won't be proven dummy directly but be a dummy rel (means
> IS_DUMMY_REL(rel) returns true) if no child is attached as the
> result of constarint exlusion.

In a query like SELECT * FROM parent WHERE 1 = 2; the parent is marked
dummy in set_rel_size()
319 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
320 Index rti, RangeTblEntry *rte)
321 {
322 if (rel->reloptkind == RELOPT_BASEREL &&
323 relation_excluded_by_constraints(root, rel, rte))
324 {
325 /*
326 * We proved we don't need to scan the rel via constraint
exclusion,
327 * so set up a single dummy path for it. Here we only
check this for
328 * regular baserels; if it's an otherrel, CE was already checked in
329 * set_append_rel_size().
330 *
331 * In this case, we go ahead and set up the relation's
path right away
332 * instead of leaving it for set_rel_pathlist to do.
This is because
333 * we don't have a convention for marking a rel as dummy except by
334 * assigning a dummy path to it.
335 */
336 set_dummy_rel_pathlist(rel);
337 }

It's in such cases, that a parent is proven dummy without looking at
the child relations.

>
>> not use partition-wise join while joining this table with some other similarly
>> partitioned table as the partitions do not have any paths that child-joins can
>> use. This means that we can not create partition relations for such a join and
>> thus can not consider the join to be partitioned. This doesn't matter much when
>> the dummy relation is the outer relation, since the resultant join is also
>> dummy. But when the dummy relation is inner relation, the resultant join is not
>> dummy and can be considered to be partitioned with same partitioning scheme as
>> the outer relation to be joined with other similarly partitioned table. Not
>> having paths in the partitions deprives us of this future optimization.
>> Without partition-wise join, not setting dummy paths in the partition relations
>> makes sense, since those do not have any use. But with partition-wise join that
>> changes.
>
> Of course for inner-joins and even for outer-joins, there's no
> point in considering the children of a dummy parent as a side of
> a join. For what reason, or in what case does partition-wise join
> need to consider children of a proven-dummy parent?

Consider a join A LEFT JOIN B LEFT JOIN C where B is proven dummy and
A, B and C are all partitioned with the same partitioning scheme. The
result of A LEFT JOIN B is same as appending Ak LEFT JOIN Bk where Ak
and Bk are matching partitions of A and B resp. When we join this with
C, the result would be same as appending Ak LEFT JOIN Bk LEFT JOIN Ck
where Ck is partition matching Ak and Bk. So, even though B is proven
dummy, we can execute A LEFT JOIN B LEFT JOIN C as a partition-wise
join, if we can execute A LEFT JOIN B as a partition-wise join.

Does that answer your question?

>
>> If we always mark the partitions dummy, that effort may get wasted if the
>> partitioned table doesn't participate in the partition-wise join. A possible
>> solution would be to set the partitions dummy when only the partitioned table
>> participates in the join, but that happens during make_join_rel(), much after
>> we have set up the simple relations. So, I am not sure, whether that's a good
>> option. But I am open to suggestions.
>>
>> What if we always mark the partition relations of a dummy partitioned table
>> dummy? I tried attached path on a thousand partition table, the planning time
>> increased by 3-5%. That doesn't look that bad for a thousand partitions.
>>
>> Any other options/comments?
>
> Since I don't understand the meaning of the patch, the comments
> below are just from a micro-viewpoint.
>
> - if (IS_DUMMY_REL(rel))
> + if (IS_DUMMY_REL(rel) && !rte->inh)
>
> In set_rel_pathlist, if want to exlude the inh==true case from
> the first if, just inverting the first and second if caluses
> would be enough, like this.
>
> | if (rte->inh)
> | set_append_rel_pathlist(root, rel, rti, rte);
> | else if (IS_DUMMY_REL(rel))
> | /* We already proved the relation empty, so nothing more to do *

Right. Done in the attached patch.

>
> + if (IS_DUMMY_REL(rel))
> + set_dummy_rel_pathlist(childrel);
>
> This is equivalent of just returning before looking over
> append_rel_list when rel is a dummy. It doesn't seem to me to add
> marked-as-dummy children to a dummy parent. (I understand that is
> the objective of this patch.)

Not really. It sets the dummy path list in child, which won't be done
if we return without traversing append_rel_list, esp. when the parent
is marked dummy without marking any of its children dummy as explained
above.

But we could avoid marking a child dummy twice, in case the parent was
marked dummy because all of its children were marked dummy. I have
fixed this case in the attached patch.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_dummy_part_table.patch text/x-patch 1.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message fcs1 2017-07-10 11:39:17 BUG #14738: ALTER SERVER for foregin servers not working
Previous Message Fabien COELHO 2017-07-10 10:54:10 Re: [WIP] Zipfian distribution in pgbench