Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-13 05:36:19
Message-ID: 8a0f0cbc-802f-56b1-5733-d2eed116b4e8@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 13/01/23 07:48, David Rowley wrote:

> I don't think you can claim that one so easily. The two should have
> quite different scaling characteristics which will be more evident
> with a larger number of input rows. Also, Hash Aggregate makes use of
> work_mem * hash_mem_multiplier, whereas sort uses work_mem. Consider
> a hash_mem_multiplier less than 1.0.

In this case, it would make sense to do push down. I will do testing
around large data to see it myself.

> We could consider adjusting the create_distinct_paths() so that it
> uses some newly invented and less strict pathkey comparison where the
> order of the pathkeys does not matter. It would just care if the
> pathkeys were present and return a list of pathkeys not contained so
> that an incremental sort could be done only on the returned list and a
> Unique on an empty returned list. Something like that might be able
> to apply in more cases, for example:

> select distinct b,a from ab where a < 10;

> the distinct pathkeys would be b,a but if there's an index on (a),
> then we might have a path with pathkeys containing "a".

This would be a very good improvement.

> even if they're not costed quite as
> accurately as we might have liked

This is very exciting piece actually. Once current set of optimizations
gets ahead,

I will be giving this a shot. We need to look at cost models for sorting.

> We might also want to also consider if Pathkey.pk_strategy and
> pk_nulls_first need to be compared too. That makes the check a bit
> more expensive as Pathkeys are canonical and if those fields vary then
> we need to perform more than just a comparison by the memory address
> of the pathkey.

> This very much seems like a separate effort than the
> WindowClause sort reduction work. I think it gives us everything we've
> talked about extra we might want out of reducing WindowClause sorts
> and more.

I will work on this as separate patch (against HEAD). It makes much more

sense to look this as distinct sort related optimizations (which window
sort optimization

can take benefit). We may take a call to combine them or apply in series.

For unit of work perspective, I would prefer later.

Anyways, the forthcoming patch will contain the following:

1. Modify create_distinct_paths with newly invented and less strict
pathkey comparison where the

order of the pathkeys does not matter.

2. Handle Pathkey.pk_strategy and pk_nulls comparison.

3. Test cases

Thanks

Regards,

Ankit Kumar Pandey

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-01-13 05:43:21 Re: Perform streaming logical transactions by background workers and parallel apply
Previous Message Justin Pryzby 2023-01-13 05:23:05 Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)