Re: Properly pathify the union planner

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Properly pathify the union planner
Date: 2024-02-06 22:29:28
Message-ID: 87o7cti48f.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hi,

> * I think we should update truncate_useless_pathkeys() to account for
> the ordering requested by the query's set operation;

Nice catch.

> I'm thinking that maybe it'd be better to move the work of sorting the
> subquery's paths to the outer query level, specifically within the
> build_setop_child_paths() function, just before we stick SubqueryScanPath
> on top of the subquery's paths. I think this is better because:
>
> 1. This minimizes the impact on subquery planning and reduces the
> footprint within the grouping_planner() function as much as possible.
>
> 2. This can help avoid the aforementioned add_path() issue because the
> two involved paths will be structured as:
>
> cheapest_path -> subqueryscan
> and
> cheapest_path -> sort -> subqueryscan
>
> If the two paths cost fuzzily the same and add_path() decides to keep
> the second one due to it having better pathkeys and pfree the first one,
> it would not be a problem.

This is a smart idea, it works because you create a two different
subqueryscan for the cheapest_input_path.

FWIW, I found we didn't create_sort_path during building a merge join
path, instead it just cost the sort and add it to the cost of mergejoin
path only and note this path needs a presorted data. At last during the
create_mergejoin_*plan*, it create the sort_plan really. As for the
mergeappend case, could we use the similar strategy? with this way, we
might simpliy the code to use MergeAppend node since the caller just
need to say I want to try MergeAppend with the given pathkeys without
really creating the sort by themselves.

(Have a quick glance of initial_cost_mergejoin and
create_mergejoin_plan, looks incremental sort doesn't work with mergejoin?)

>
> To assist the discussion I've attached a diff file that includes all the
> changes above.

+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+ int n_common_pathkeys;
+
+ if (root->setop_pathkeys == NIL)
+ return 0; /* no special setop ordering requested */
+
+ if (pathkeys == NIL)
+ return 0; /* unordered path */
+
+ (void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+ &n_common_pathkeys);
+
+ return n_common_pathkeys;
+}

The two if-clauses looks unnecessary, it should be handled by
pathkeys_count_contained_in already. The same issue exists in
pathkeys_useful_for_ordering as well. Attached patch fix it in master.

--
Best Regards
Andy Fan

Attachment Content-Type Size
v1-0001-Remove-the-unnecessary-checks-for-pathkey-routine.patch text/x-diff 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-02-06 23:33:36 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Andres Freund 2024-02-06 22:24:45 Re: confusing / inefficient "need_transcoding" handling in copy