Re: Group by reordering optimization

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Group by reordering optimization
Date: 2020-09-16 14:20:39
Message-ID: 20200916142039.uetla6kgyphkccit@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Tue, Sep 01, 2020 at 11:08:54PM +0200, Tomas Vondra wrote:
> On Tue, Sep 01, 2020 at 01:15:31PM +0200, Dmitry Dolgov wrote:
> > Hi,
> >
> > Better late than never, to follow up on the original thread [1] I would like to
> > continue the discussion with the another version of the patch for group by
> > reordering optimization. To remind, it's about reordering of group by clauses
> > to do sorting more efficiently. The patch is rebased and modified to address
> > (at least partially) the suggestions about making it consider new additional
> > paths instead of changing original ones. It is still pretty much
> > proof-of-concept version though with many blind spots, but I wanted to start
> > kicking it and post at least something, otherwise it will never happen. An
> > incremental approach so to say.
> >
> > In many ways it still contains the original code from Teodor. Changes and notes:
> >
> > * Instead of changing the order directly, now patch creates another patch with
> > modifier order of clauses. It does so for the normal sort as well as for
> > incremental sort. The whole thing is done in two steps: first it finds a
> > potentially better ordering taking into account number of groups, widths and
> > comparison costs; afterwards this information is used to produce a cost
> > estimation. This is implemented via a separate create_reordered_sort_path to
> > not introduce too many changes, I couldn't find any better place.
> >
>
> I haven't tested the patch with any queries, but I agree this seems like
> the right approach in general.
>
> I'm a bit worried about how complex the code in planner.c is getting -
> the incremental sort patch already made it a bit too complex, and this
> is just another step in that direction. I suppose we should refactor
> add_paths_to_grouping_rel() by breaking it into smaller / more readable
> pieces ...

Yes, that was my impression as well. I'll try to make such refactoring
either as a separate patch or a part of the main one.

> > * Function get_func_cost was removed at some point, but unfortunately this
> > patch was implemented before that, so it's still present there.
> >
> > * For simplicity I've removed support in create_partial_grouping_paths, since
> > they were not covered by the existing tests anyway.
> >
>
> Hmmm, OK. I think that's something we'll need to address for the final
> patch, but I agree we can add it after improving the costing etc.

Sure, I plan to return it in time. IIUC in the original patch series
this code wasn't covered with tests, so I've decided to minimize the
changes.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-09-16 14:21:56 Re: PostgreSQL 13 RC 1 release announcement draft
Previous Message James McManus 2020-09-16 14:18:58 ST_AsMVTGeom AND table name as parameter in pl/psql