Re: Removing redundant grouping columns

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Removing redundant grouping columns
Date: 2022-12-30 08:05:55
Message-ID: CAMbWs48G84f9RNg+Ueq-0pytCg-YHMP95t+hQnZ8RaL-_GLBMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 28, 2022 at 6:18 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> This patch is aimed at being smarter about cases where we have
> redundant GROUP BY entries, for example
>
> SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;
>
> It's clearly not necessary to perform grouping using both columns.
> Grouping by either one alone would produce the same results,
> assuming compatible equality semantics. I'm not sure how often
> such cases arise in the wild; but we have about ten of them in our
> regression tests, which makes me think it's worth the trouble to
> de-duplicate as long as it doesn't cost too much. And it doesn't,
> because PathKey construction already detects exactly this sort of
> redundancy. We need only do something with the knowledge.

While we are here, I wonder if we can do the same trick for
distinctClause, to cope with cases like

select distinct a.x, b.y from a, b where a.x = b.y;

And there is case from regression test 'select_distinct.sql' that can
benefit from this optimization.

--
-- Check mentioning same column more than once
--

EXPLAIN (VERBOSE, COSTS OFF)
SELECT count(*) FROM
(SELECT DISTINCT two, four, two FROM tenk1) ss;

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message adherent postgres 2022-12-30 08:27:57 Re: Add 64-bit XIDs into PostgreSQL 15
Previous Message Michael Paquier 2022-12-30 05:21:22 Re: Getting rid of SQLValueFunction