Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-16 06:08:45
Message-ID: 2dfc2b6f-d05f-a11d-7732-563712c454b5@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 16/01/23 09:48, David Rowley wrote:
> I don't think we should touch this. It could significantly increase
> the number of indexes that we consider when generating paths on base
> relations and therefore *significantly* increase the number of paths
> we consider during the join search. What I had in mind was just
> making better use of existing paths to see if we can find a cheaper
> way to perform the DISTINCT. That'll only possibly increase the
> number of paths for the distinct upper relation which really only
> increases the number of paths which are considered in
> create_ordered_paths(). That's unlikely to cause much of a slowdown in
> the planner.

Okay, I see the issue. Makes sense

> I'm seeing these two things as separate patches. I don't think there's
> any need to add further complexity to the patch that tries to reduce
> the number of sorts for WindowAggs. I think you'd better start a new
> thread for this.

Will be starting new thread for this and separate patch.

> As far as I see it, you shouldn't be touching the distinct_pathkeys.
> Those are set in such a way as to minimise the likelihood of an
> additional sort for the ORDER BY. If you've fiddled with that, then I
> imagine this is why the plan below has an additional Incremental Sort
> that didn't exist before.

> I've not looked at your patch, but all I imagine you need to do for it
> is to invent a function in pathkeys.c which is along the lines of what
> pathkeys_count_contained_in() does, but returns a List of pathkeys
> which are in keys1 but not in keys2 and NIL if keys2 has a pathkey
> that does not exist as a pathkey in keys1. In
> create_final_distinct_paths(), you can then perform an incremental
> sort on any input_path which has a non-empty return list and in
> create_incremental_sort_path(), you'll pass presorted_keys as the
> number of pathkeys in the path, and the required pathkeys the
> input_path->pathkeys + the pathkeys returned from the new function.

Okay, this should be straightforward. Let me try this.

> As an optimization, you might want to consider that the
> distinct_pathkeys list might be long and that the new function, if you
> code the lookup as a nested loop, might be slow. You might want to
> consider hashing the distinct_pathkeys once in
> create_final_distinct_paths(), then for each input_path, perform a
> series of hash lookups to see which of the input_path->pathkeys are in
> the hash table. That might require adding two functions to pathkeys.c,
> one to build the hash table and then another to probe it and return
> the remaining pathkeys list. I'd go and make sure it all works as we
> expect before going to the trouble of trying to optimize this. A
> simple nested loop lookup will allow us to review that this works as
> we expect.

Okay make sense, will start with nested loop while it is in review and then

optimal version once it is all good to go.

Thanks,

Ankit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2023-01-16 06:09:51 Re: On login trigger: take three
Previous Message Michael Paquier 2023-01-16 05:22:27 Re: [EXTERNAL] Re: [PATCH] Support using "all" for the db user in pg_ident.conf