Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-30 05:31:47
Message-ID: CAFBsxsEW-eRv5GephFO-Y=vQs_MjaEKn-1wpf2=6V7EH+SpUjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 29, 2023 at 7:05 PM Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
wrote:
>
>
> > On 26/01/23 07:40, David Rowley wrote:

> > Sorting in smaller batches that better fit into CPU cache.
>
> More reading yielded that we are looking for cache-oblivious
> sorting algorithm.

Since David referred to L3 size as the starting point of a possible
configuration parameter, that's actually cache-conscious.

> One the papers[1] mentions that in quick sort, whenever we reach size
which can fit in cache,
> instead of partitioning it further, we can do insertion sort there itself.

> > Memory-tuned quicksort uses insertion sort to sort small subsets while
they are in
> > the cache, instead of partitioning further. This increases the
instruction count,
> > but reduces the cache misses
>
> If this looks step in right direction, I can give it a try and do more
reading and experimentation.

I'm not close enough to this thread to guess at the right direction
(although I hope related work will help), but I have a couple general
remarks:

1. In my experience, academic papers like to test sorting with
register-sized numbers or strings. Our sorttuples are bigger, have
complex comparators, and can fall back to fields in the full tuple.
2. That paper is over 20 years old. If it demonstrated something genuinely
useful, some of those concepts would likely be implemented in the
real-world somewhere. Looking for evidence of that might be a good exercise.
3. 20 year-old results may not carry over to modern hardware.
4. Open source software is more widespread in the academic world now than
20 years ago, so papers with code (maybe even the author's github) are much
more useful to us in my view.
5. It's actually not terribly hard to make sorting faster for some specific
cases -- the hard part is keeping other inputs of interest from regressing.
6. The bigger the change, the bigger the risk of regressing somewhere.

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-01-30 05:36:50 Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Previous Message wangw.fnst@fujitsu.com 2023-01-30 05:06:48 RE: Logical replication timeout problem