Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-10-06 11:39:51
Message-ID: CAFjFpReEiCDi46PaoLpX_Wf6=+4VGbC6B+Hu_r=hXLNKWOVszQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 5, 2017 at 12:24 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> We need to reparameterize any path which contains further paths and/or
>> contains expressions that point to the parent relation. For a given
>> path we need to reparameterize any paths that it contains and
>> translate any expressions that are specific to that path. Expressions
>> common across the paths are translated after the switch case. I have
>> added this rule to the comment just above the switch case
>> /*
>> * Copy of the given path. Reparameterize any paths referenced by the given
>> * path. Replace parent Vars in path specific expressions by corresponding
>> * child Vars.
>> */
>> Does that look fine or we want to add explanation for every node handled here.
>
> No, I don't think we want something for every node, just a general
> explanation at the top of the function. Maybe something like this:
>
> Most fields from the original path can simply be flat-copied, but any
> expressions must be adjusted to refer to the correct varnos, and any
> paths must be recursively reparameterized. Other fields that refer to
> specific relids also need adjustment.

Done.

>
>>> I don't see much point in the T_SubqueryScanPath and T_ResultPath
>>> cases in reparameterize_path_by_child(). It's just falling through to
>>> the default case.
>>
>> I added those cases separately to explain why we should not see those
>> cases in that switch case. I think that explanation is important
>> (esp. considering your comment above) and associating those comment
>> with "case" statement looks better. Are you suggesting that we should
>> add that explanation in default case?
>
> Or leave the explanation out altogether.

Ok. Removed the explanation and the cases.

>
>>> I wonder if reparameterize_path_by_child() ought to default to
>>> returning NULL rather than throwing an error; the caller would then
>>> have to be prepared for that and skip building the path. But that
>>> would be more like what reparameterize_path() does, and it would make
>>> failure to include some relevant path type here a corner-case
>>> performance bug rather than a correctness issue. It seems like
>>> someone adding a new path type could quite easily fail to realize that
>>> it might need to be added here, or might be unsure whether it's
>>> necessary to add it here.
>>
>> I am OK with that. However reparameterize_path_by_child() and
>> reparameterize_paths_by_child() are callers of
>> reparameterize_path_by_child() so they will need to deal with NULL
>> return. I am fine with that too, but making sure that we are on the
>> same page. If we do that, we could simply assert that the switch case
>> doesn't see T_SubqueryScanPath and T_ResultPath.
>
> Or do nothing at all about those cases.
>
> I noticed today that the version of the patchset I have here says in
> the header comments for reparameterize_path_by_child() that it returns
> NULL if it can't reparameterize, but that's not what it actually does.
> If you make this change, the existing comment will become correct.
>
> The problem with the NULL return convention is that it's not very
> convenient when this function is recursing. Maybe we should change
> this function's signature to be bool
> reparameterize_path_by_child(PlannerInfo *root, RelOptInfo *child_rel,
> Path **path); then you could do, e.g. if
> (!reparameterize_path_by_child(root, child_rel, &bhpath->bitmapqual))
> return;
>
> But I don't really like that approach; it's still quite long-winded.
> Instead, I suggest Stupid Macro Tricks:
>
> #define ADJUST_CHILD_ATTRS(val) \
> val = (List *) adjust_appendrel_attrs_multilevel((Node *) val,
> child_rel->relids, child_rel->top_parent_relids);

It so happens that every node we subject to
adjust_appendrel_attrs_multilevel is List, so this is ok. In case we
need to adjust some other type of node in future, we will pass node
type too. For now, I have used the macro with (List *) hardcoded
there. Do we write the whole macro on the same line even if it
overflows? I see that being done for CONSIDER_PATH_STARTUP_COST
defined in the same file and you also seem to suggest the same. But
macros at other places are indented. For now, I have indented the
macro on multiple lines.

>
> #define REPARAMETERIZE_CHILD_PATH(val) \
> val = reparameterize_path_by_child(root, val, child_rel); \
> if (val == NULL) \
> return NULL;
>
> #define REPARAMETERIZE_CHILD_PATH_LIST(val) \
> if (val != NIL) \
> { \
> val = reparameterize_pathlist_by_child(root, val, child_rel); \
> if (val == NIL) \
> return NULL; \
> }

I added do { } while (0) around these code blocks like other places.
Please feel free to remove it if you don't think that's not needed.

>
> With that, a complicated case like T_NestPath becomes just:
>
> JoinPath *jpath;
>
> FLAT_COPY_PATH(jpath, path, NestPath);
> REPARAMETERIZE_CHILD_PATH(jpath->outerjoinpath);
> REPARAMETERIZE_CHILD_PATH(jpath->innerjoinpath);
> ADJUST_CHILD_ATTRS(jpath->joinrestrictinfo);
> new_path = (Path *) jpath;
>
> Now, I admit that hiding stuff inside the macro definitions like that
> is ugly. But I think it's still better than repeating boilerplate
> code with finnicky internal bits lots of times.

I too do not like hiding stuff under macros since that make debugging
hard, but with these macros code looks really elegant. Thanks for the
suggestion.

Also fixed some lines overflowing character limit.

>
>> Yes, I too am thinking about the same. The only reason I have EXPLAIN
>> output there is to check whether partition-wise join is being used or
>> not. The testcase is not interested in the actual shape. It doesn't
>> make sense to just test the output if partition-wise join is not used.
>> May be a function examining the plan tree would help. The function
>> will have to handle Result/Sort nodes on top and make sure that Append
>> has join children. Do you have any other idea to check the shape of
>> the plan tree without the details? Any EXPLAIN switch, existing
>> functions etc.?
>
> No, not really. We may just need to be prepared to fix whatever breaks.

Sure.

>
> Instead of "multi-leveled partitions" it might read better to say
> "multiple levels of partitioning".

Done.

Here's updated set of patches, rebased on top of the latest head.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_dp_join_patches_v36.tar.gz application/x-gzip 122.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-10-06 11:47:41 Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message Amit Kapila 2017-10-06 11:08:41 Re: parallelize queries containing initplans