Quick Links

Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From:	Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject:	Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date:	2023-01-16 06:08:45
Message-ID:	2dfc2b6f-d05f-a11d-7732-563712c454b5@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> On 16/01/23 09:48, David Rowley wrote:
> I don't think we should touch this. It could significantly increase
> the number of indexes that we consider when generating paths on base
> relations and therefore *significantly* increase the number of paths
> we consider during the join search. What I had in mind was just
> making better use of existing paths to see if we can find a cheaper
> way to perform the DISTINCT. That'll only possibly increase the
> number of paths for the distinct upper relation which really only
> increases the number of paths which are considered in
> create_ordered_paths(). That's unlikely to cause much of a slowdown in
> the planner.

Okay, I see the issue. Makes sense

> I'm seeing these two things as separate patches. I don't think there's
> any need to add further complexity to the patch that tries to reduce
> the number of sorts for WindowAggs. I think you'd better start a new
> thread for this.

Will be starting new thread for this and separate patch.

> As far as I see it, you shouldn't be touching the distinct_pathkeys.
> Those are set in such a way as to minimise the likelihood of an
> additional sort for the ORDER BY. If you've fiddled with that, then I
> imagine this is why the plan below has an additional Incremental Sort
> that didn't exist before.

> I've not looked at your patch, but all I imagine you need to do for it
> is to invent a function in pathkeys.c which is along the lines of what
> pathkeys_count_contained_in() does, but returns a List of pathkeys
> which are in keys1 but not in keys2 and NIL if keys2 has a pathkey
> that does not exist as a pathkey in keys1. In
> create_final_distinct_paths(), you can then perform an incremental
> sort on any input_path which has a non-empty return list and in
> create_incremental_sort_path(), you'll pass presorted_keys as the
> number of pathkeys in the path, and the required pathkeys the
> input_path->pathkeys + the pathkeys returned from the new function.

Okay, this should be straightforward. Let me try this.

> As an optimization, you might want to consider that the
> distinct_pathkeys list might be long and that the new function, if you
> code the lookup as a nested loop, might be slow. You might want to
> consider hashing the distinct_pathkeys once in
> create_final_distinct_paths(), then for each input_path, perform a
> series of hash lookups to see which of the input_path->pathkeys are in
> the hash table. That might require adding two functions to pathkeys.c,
> one to build the hash table and then another to probe it and return
> the remaining pathkeys list. I'd go and make sure it all works as we
> expect before going to the trouble of trying to optimize this. A
> simple nested loop lookup will allow us to review that this works as
> we expect.

Okay make sense, will start with nested loop while it is in review and then

optimal version once it is all good to go.

Thanks,

Ankit

In response to

Re: Todo: Teach planner to evaluate multiple windows in the optimal order at 2023-01-16 04:18:32 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Pavel Stehule	2023-01-16 06:09:51	Re: On login trigger: take three
Previous Message	Michael Paquier	2023-01-16 05:22:27	Re: [EXTERNAL] Re: [PATCH] Support using "all" for the db user in pg_ident.conf