Re: POC: GROUP BY optimization

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-02-02 04:06:42
Message-ID: CAMbWs4-NKLa+Ss+X=WR6h0x=T07YBJoAz70ZGHzc-2zcHUHb0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

> On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> One of the test cases added by this commit has not been very
>> stable in the buildfarm. Latest example is here:
>>
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
>>
>> and I've seen similar failures intermittently on other machines.
>>
>> I'd suggest building this test atop a table that is more stable
>> than pg_class. You're just waving a red flag in front of a bull
>> if you expect stable statistics from that during a regression run.
>> Nor do I see any particular reason for pg_class to be especially
>> suited to the test.
>
>
> Yeah, it's not a good practice to use pg_class in this place. While
> looking through the test cases added by this commit, I noticed some
> other minor issues that are not great. Such as
>
> * The table 'btg' is inserted with 10000 tuples, which seems a bit
> expensive for a test. I don't think we need such a big table to test
> what we want.
>
> * I don't see why we need to manipulate GUC max_parallel_workers and
> max_parallel_workers_per_gather.
>
> * I think we'd better write the tests with the keywords being all upper
> or all lower. A mixed use of upper and lower is not great. Such as in
>
> explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
>
> * Some comments for the test queries are not easy to read.
>
> * For this statement
>
> CREATE INDEX idx_y_x_z ON btg(y,x,w);
>
> I think the index name would cause confusion. It creates an index on
> columns y, x and w, but the name indicates an index on y, x and z.
>
> I'd like to write a draft patch for the fixes.
>

Here is the draft patch that fixes the issues I complained about in
upthread.

Thanks
Richard

Attachment Content-Type Size
v1-0001-Multiple-revises-for-the-GROUP-BY-reordering-tests.patch application/octet-stream 15.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-02-02 04:20:17 Re: Synchronizing slots from primary to standby
Previous Message Andrei Lepikhov 2024-02-02 03:42:45 Re: POC: GROUP BY optimization