Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-31 23:09:04
Message-ID: CAAaqYe-5V0yu45krUGPipMXaLhiChJD1Xib5=dZVP8uMCGbOcw@mail.gmail.com

On Tue, Mar 31, 2020 at 6:54 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On Tue, Mar 31, 2020 at 06:35:32PM -0400, Tom Lane wrote:
> >Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> >> In general, I think it'd be naive to assume that we can make planner smarter with
> >> no extra overhead spent on planning, and we can never accept patches
> >> adding even tiny overhead. With that approach we'd probably end up with
> >> a trivial planner that generates just a single query plan, because
> >> that's going to be the fastest planner. A realistic approach needs to
> >> consider both the planning and execution phase, and benefits of this
> >> patch seem to be clear - if you have queries that do benefit from it.
> >
> >I think that's kind of attacking a straw man, though. The thing that
> >people push back on, or should push back on IMO, is when a proposed
> >patch adds significant slowdown to queries that it has no or very little
> >hope of improving. The trick is to do expensive stuff only when
> >there's a good chance of getting a better plan out of it.
> >
>
> Yeah, I agree with that. I think the main issue is that we don't really
> know what the "expensive stuff" is in this case, so it's not really
> clear how to be smarter :-(

To add to this: I agree that ideally you'd do a cheap check to find out
whether you're in a situation that might benefit, and only then do more
work. But here the question is always going to be simply "would we
benefit from an ordering, and, if so, do we have it already partially
sorted". It's hard to imagine reducing that much conceptually, so we're
left with optimizing the check itself.
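
As a rough illustration of that check (not code from the patch), here is a
minimal sketch that assumes pathkeys can be modelled as plain lists of
column names. The function names count_presorted_prefix and classify_sort
are made up for this example:

```python
def count_presorted_prefix(required_keys, available_keys):
    """Count how many leading required sort keys the input ordering already provides.

    Both arguments are hypothetical stand-ins for pathkey lists,
    modelled here as plain lists of column names.
    """
    n = 0
    for want, have in zip(required_keys, available_keys):
        if want != have:
            break
        n += 1
    return n


def classify_sort(required_keys, available_keys):
    """Answer the two questions: do we need an ordering, and is it partly there?"""
    if not required_keys:
        return "no sort needed"
    n = count_presorted_prefix(required_keys, available_keys)
    if n == len(required_keys):
        return "already sorted"
    if n > 0:
        return "incremental sort candidate"
    return "full sort"
```

The point of the sketch is that the check is a single prefix comparison, so
there is little left to simplify beyond making that comparison cheaper or
skipping it when it cannot matter.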

> One possibility is that it's just one of those regressions due to change
> in binary layout, but I'm not sure how to verify that.

If we are testing with a case that can't actually add more paths
(because the GUC is checked before building them), doesn't that
effectively leave one of these two options:
1. Binary layout/cache/other untraceable change, or
2. Changes due to refactored function calls.

There's nothing obvious in point (2) that would be a big cost,
but there are definitely changes there. I was surprised that just
eliminating the loop through the pathkeys on the query and the index
was enough to save us ~4%.

Tomas: Earlier you'd wondered whether we should try to shortcut the
changes in costing. I was skeptical of that originally, but maybe
it's worth looking into? I'm going to try backing that out and see
what the numbers look like.

James
