Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2019-09-13 14:51:53
Message-ID: 20190913145153.xthm7dcbt6vgp67x@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 12, 2019 at 12:49:29PM -0300, Alvaro Herrera wrote:
>On 2019-Jul-30, Tomas Vondra wrote:
>
>> On Sun, Jul 21, 2019 at 01:34:22PM +0200, Tomas Vondra wrote:
>> >
>> > I wonder if we're approaching this wrong. Maybe we should not reverse
>> > engineer queries for the various places, but just start with a set of
>> > queries that we want to optimize, and then identify which places in the
>> > planner need to be modified.
>
>[...]
>
>> I've decided to do a couple of experiments, trying to make my mind about
>> which modified places matter to diffrent queries. But instead of trying
>> to reverse engineer the queries, I've taken a different approach - I've
>> compiled a list of queries that I think are sensible and relevant, and
>> then planned them with incremental sort enabled in different places.
>
>[...]
>
>> The list of queries (synthetic, but hopefully sufficiently realistic)
>> and a couple of scripts to collect the plans is in this repository:
>>
>> https://github.com/tvondra/incremental-sort-tests-2
>>
>> There's also a spreadsheet with a summary of results, with a visual
>> representation of which GUCs affect which queries.
>
>OK, so we have that now. I suppose this spreadsheet now tells us which
>places are useful and which aren't, at least for the queries that you've
>tested. Dowe that mean that we want to get the patch to consider adding
>paths only the places that your spreadsheet says are useful? I'm not
>sure what the next steps are for this patch.
>

Yes. I think the spreadsheet call help us with answering two things:

1) places actually affecting the plan (all but three do)

2) redundant places (there are some cases where two GUCs produce the
same plan in the end)

Of course, this does assume the query set makes sense and is somewhat
realistic, but I've tried to construct queries where that is true. We
may extend it over time, of course.

I think we've agreed to add incremental sort paths different places in
separate patches, to make review easier. So this may be a useful way to
decide which places to address first. I'd probably do it in this order:

- create_ordered_paths
- create_ordered_paths (parallel part)
- add_paths_to_grouping_rel
- ... not sure ...

but that's just a proposal. It'd give us most of the benefits, I think,
and we could also focus on the rest of the patch.

Also, regarding the three GUCs that don't affect any of the queries, we
can't really add them as we wouldn't be able to test them. If we manage
to construct a query that'd benefit from them, we can revisit this.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-09-13 14:54:55 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Tom Lane 2019-09-13 14:43:18 Modest proposal for making bpchar less inconsistent