Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-19 05:22:00
Message-ID: a1c1f93a-16f7-4f8a-4562-4b89dd2e371c@gmail.com
Lists: pgsql-hackers


> On 19/01/23 08:58, David Rowley wrote:

> The problem is that the next item looked at is 1 and the value 2 is skipped.

Okay, I see the issue.

> I think you are misinterpreting the results, but the main point
> remains - it's slower. The explain analyze timing shows the time
> between outputting the first row and the last row. For sort, there's a
> lot of work that needs to be done before you output the first row.

Okay, got it.

> I looked into this a bit further and using the same table as you, and
> the attached set of hacks that adjust the ORDER BY path generation to
> split a Sort into a Sort and Incremental Sort when the number of
> pathkeys to sort by is > 2.

Okay, so this is an issue with the strict sort itself.

I will play around with the current patch, but it should be fine for the
current work and would account for the issue in the strict sort.
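To make sure I understand the split, here is a rough sketch (Python, purely illustrative, not the actual planner code) of the two plan shapes: one full Sort on all pathkeys versus a Sort on a key prefix followed by an Incremental Sort on the remaining keys within each equal-prefix group:

```python
from itertools import groupby

def full_sort(rows, keys):
    """Plain Sort: one sort on all pathkeys at once."""
    return sorted(rows, key=lambda r: tuple(r[k] for k in keys))

def sort_plus_incremental(rows, prefix, rest):
    """Sort on `prefix`, then an Incremental-Sort-style sort on `rest`
    within each group of rows sharing the same prefix values."""
    presorted = sorted(rows, key=lambda r: tuple(r[k] for k in prefix))
    out = []
    for _, group in groupby(presorted,
                            key=lambda r: tuple(r[k] for k in prefix)):
        out.extend(sorted(group, key=lambda r: tuple(r[k] for k in rest)))
    return out

rows = [{"a": a, "b": b, "c": c}
        for a in (2, 1) for b in (1, 0) for c in (3, 1, 2)]

# Both strategies produce the same final ordering.
assert full_sort(rows, ["a", "b", "c"]) == \
       sort_plus_incremental(rows, ["a", "b"], ["c"])
```

Both paths yield the same output order; the question in this thread is only which one is cheaper to execute.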

> I suspect this might be to do with having to perform more swaps in the
> array elements that we're sorting by with the full sort.  When work_mem is
> large this array no longer fits in CPU cache and I suspect those swaps become
> more costly when there are more cache lines having to be fetched from RAM.

Looks like a possible reason.

> I think we really should fix tuplesort.c so that it batches sorts into
> about L3 CPU cache-sized chunks in RAM rather than trying to sort much
> larger arrays.

I assume we are doing the same thing for incremental sort, and that is
why it performs better here?

Or is it due to the shorter tuple width, which lets the sort fit into
memory more easily?
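For my own understanding, the batching idea might look something like this sketch (Python, hypothetical, nothing like the real tuplesort.c internals): sort fixed-size runs small enough to stay cache-resident, then merge the sorted runs, instead of one big sort over the whole array:

```python
import heapq

def batched_sort(items, run_size):
    """Sort `items` by sorting cache-sized runs and merging them.

    `run_size` stands in for "number of elements that fit in L3";
    heapq.merge does a streaming k-way merge of the sorted runs.
    """
    runs = [sorted(items[i:i + run_size])
            for i in range(0, len(items), run_size)]
    return list(heapq.merge(*runs))

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
assert batched_sort(data, run_size=4) == sorted(data)
```

The merge step adds comparisons, but each run's sort works on data that stays hot in cache, which is the trade-off being discussed.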

> On the
> other hand, we already sort WindowClauses with the most strict sort
> order first which results in a full Sort and no additional sort for
> subsequent WindowClauses that can make use of that sort order. It
> would be bizarre to reverse the final few lines of common_prefix_cmp
> so that we sort the least strict order first so we end up injecting
> Incremental Sorts into the plan to make it faster!

I would see this as beneficial when the tuple size is too large for the
strict sort. Basically:

cost(sort(a,b,c,d,e)) > cost(sort(a) + sort(b) + sort(c) + sort(d,e))

I don't know whether a cost-based decision is realistic in the current
state of things.
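As a toy experiment (Python, not a claim about the real cost model), one can count key comparisons for the two strategies. Comparison counts alone tend to favour the single full sort, so any benefit from the split presumably comes from the per-comparison cost (tuple width, cache behaviour) rather than from doing fewer comparisons:

```python
from functools import cmp_to_key
from itertools import groupby
from random import randrange, seed

calls = 0
def counting_cmp(x, y):
    """Comparator that tallies how many comparisons the sort makes."""
    global calls
    calls += 1
    return (x > y) - (x < y)

seed(1)
rows = [(randrange(5), randrange(100)) for _ in range(1000)]

# Strategy 1: one full sort on (a, b).
calls = 0
full = sorted(rows, key=cmp_to_key(counting_cmp))
full_cost = calls

# Strategy 2: sort on a, then sort each equal-a group on b.
calls = 0
pre = sorted(rows, key=cmp_to_key(lambda x, y: counting_cmp(x[0], y[0])))
split = []
for _, g in groupby(pre, key=lambda r: r[0]):
    split.extend(sorted(g, key=cmp_to_key(lambda x, y:
                                          counting_cmp(x[1], y[1]))))
split_cost = calls

assert split == full          # both strategies give the same order
print(full_cost, split_cost)  # raw comparison counts for each strategy
```

So a cost comparison would need to weight comparisons by tuple width and memory behaviour, not just count them.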

> I think more benchmarking is required
> so we can figure out if this is a corner case or a common case

I will do a few more benchmarks. I expect the existing scaling issues
with the strict sort to remain as they are, but that no further issues
come up due to this change itself.

> I'm just unsure if we should write this off as the expected behaviour
> of Sort and continue with the patch or delay the whole thing until we
> make some improvements to sort.

Unless we find more issues, we might be good with the former, but you
can make a better call on this. As much as I would love my first patch
to get merged, I would want it to be right.

Thanks,

Ankit
