Re: [PATCH] Teach planner to further optimize sort in distinct

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Teach planner to further optimize sort in distinct
Date: 2023-01-19 13:19:11
Message-ID: CAApHDvqf1HFTfdGQL5Hu61H2j+7S-DRwsw=3orcsiiW90eeuPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 18 Jan 2023 at 08:27, Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> wrote:
> There is bit confusion in wording here:
>
> "returns a List of pathkeys
> which are in keys1 but not in keys2 and NIL if keys2 has a pathkey
> that does not exist as a pathkey in keys1."
>
> You mean extract common keys without ordering right?

I think you should write a function like:

bool pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
List **reorderedkeys, int *n_common)

which works very similarly to pathkeys_count_contained_in, but
populates *reorderedkeys so it contains all of the keys in keys1, but
put the matching ones in the same order as they are in keys2. If you
find a keys2 that does not exist in keys1 then just add the additional
unmatched keys1 keys to *reorderedkeys. Set *n_common to the number
of common keys excluding any that come after a key2 key that does not
exist as a key1 key.

You can just switch to using that function in
create_final_distinct_paths(). You'll need to consider if the query is
a DISTINCT ON query and not try the unordered version of the function
in that case.

I also just noticed that in build_index_paths() we'll leave the index
path's pathkeys empty if we deem the pathkeys as useless. I'm not
sure what the repercussions of setting those to the return value of
build_index_pathkeys() if useful_pathkeys is otherwise empty. It's
possible that truncate_useless_pathkeys() needs to be modified to
check if the pathkeys might be useful for DISTINCT, but now that I see
we don't populate the IndexPath's pathkeys when we deem them not
useful makes me wonder if this entire patch is a good idea. When I
thought about it I assumed that we always set IndexPath's pathkeys to
whatever (if any) sort order that the index provides.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tushar 2023-01-19 13:20:33 Re: almost-super-user problems that we haven't fixed yet
Previous Message Amit Kapila 2023-01-19 13:17:22 Re: Time delayed LR (WAS Re: logical replication restrictions)