Re: POC: GROUP BY optimization

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: GROUP BY optimization
Date: 2021-07-20 23:08:40
Message-ID: ba2eccf2-d623-c12c-f7c2-a58ac4917c83@enterprisedb.com
Lists: pgsql-hackers

On 7/20/21 3:12 PM, Tomas Vondra wrote:
> ...
>
> The other comments from the review still apply - I'm particularly
> concerned about the (1) point, i.e. plan changes in postgres_fdw. Those
> seem to be rather strange (LIMIT not being pushed down in queries
> without any grouping). I'd bet this is due to changes in sort costing
> and does not seem very desirable.
>

I did look at this a bit closer, and yes - this very much seems like a
costing issue in the patch. The first query in postgres_fdw whose plan
changes switches from a plan with the LIMIT pushed to the remote side

QUERY PLAN
------------------------------------------------------------------------
Foreign Scan (cost=196.29..196.52 rows=10 width=14)
Output: t1.c1, t2.c1, t1.c3
Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1
INNER JOIN "S 1"."T 1" r2 ON ... LIMIT ... OFFSET ...
(4 rows)

to a plan with the LIMIT executed locally:

QUERY PLAN
-------------------------------------------------------------------------
Limit (cost=241.72..241.94 rows=10 width=14)
Output: t1.c1, t2.c1, t1.c3
-> Foreign Scan (cost=239.47..261.97 rows=1000 width=14)
Output: t1.c1, t2.c1, t1.c3
Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1"
r1 INNER JOIN ...
(6 rows)

The FDW code runs EXPLAIN on both queries - with and without the LIMIT
pushed to the remote side - to get estimates. Without the patch it gets this:

QUERY PLAN
------------------------------------------------------------------------------
Sort (cost=106.97..109.47 rows=1000 width=14)
Sort Key: r1.c3, r1."C 1"
-> Hash Join (cost=33.50..57.14 rows=1000 width=14)
Hash Cond: (r1."C 1" = r2."C 1")
-> Seq Scan on "T 1" r1 (cost=0.00..21.00 rows=1000 width=10)
-> Hash (cost=21.00..21.00 rows=1000 width=4)
-> Seq Scan on "T 1" r2 (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

QUERY PLAN
------------------------------------------------------------------------------------
Limit (cost=96.29..96.32 rows=10 width=14)
-> Sort (cost=96.04..98.54 rows=1000 width=14)
Sort Key: r1.c3, r1."C 1"
-> Hash Join (cost=33.50..57.14 rows=1000 width=14)
Hash Cond: (r1."C 1" = r2."C 1")
-> Seq Scan on "T 1" r1 (cost=0.00..21.00 rows=1000
width=10)
-> Hash (cost=21.00..21.00 rows=1000 width=4)
-> Seq Scan on "T 1" r2 (cost=0.00..21.00
rows=1000 width=4)
(8 rows)

while with the patch it gets this:

QUERY PLAN
------------------------------------------------------------------------------
Sort (cost=139.47..141.97 rows=1000 width=14)
Sort Key: r1.c3, r1."C 1"
-> Hash Join (cost=33.50..57.14 rows=1000 width=14)
Hash Cond: (r1."C 1" = r2."C 1")
-> Seq Scan on "T 1" r1 (cost=0.00..21.00 rows=1000 width=10)
-> Hash (cost=21.00..21.00 rows=1000 width=4)
-> Seq Scan on "T 1" r2 (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

QUERY PLAN
------------------------------------------------------------------------------------
Limit (cost=145.75..145.77 rows=10 width=14)
-> Sort (cost=145.50..148.00 rows=1000 width=14)
Sort Key: r1.c3, r1."C 1"
-> Hash Join (cost=33.50..57.14 rows=1000 width=14)
Hash Cond: (r1."C 1" = r2."C 1")
-> Seq Scan on "T 1" r1 (cost=0.00..21.00 rows=1000
width=10)
-> Hash (cost=21.00..21.00 rows=1000 width=4)
-> Seq Scan on "T 1" r2 (cost=0.00..21.00
rows=1000 width=4)
(8 rows)
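As a cross-check on where the Limit numbers above come from: the planner
costs a Limit node by linear interpolation over the run cost of its input
path. A minimal sketch of that arithmetic (Python, just for the numbers;
the OFFSET 100 / LIMIT 10 values are my assumption based on the
postgres_fdw regression query this plan appears to come from):

```python
# Rough reconstruction of how the Limit node's costs are derived from
# the Sort underneath it: a linear interpolation over the input's run
# cost. OFFSET 100 / LIMIT 10 over 1000 rows is an assumption here.

def limit_cost(startup, total, rows, offset, limit):
    run = total - startup
    lim_startup = startup + (offset / rows) * run   # rows skipped by OFFSET
    lim_total = lim_startup + (limit / rows) * run  # rows actually returned
    return lim_startup, lim_total

# Master: Sort (cost=96.04..98.54 rows=1000) feeds the Limit
print(limit_cost(96.04, 98.54, 1000, 100, 10))    # ~ (96.29, 96.32)

# Patched: Sort (cost=145.50..148.00 rows=1000) feeds the Limit
print(limit_cost(145.50, 148.00, 1000, 100, 10))  # ~ (145.75, 145.78)
```

Both pairs line up with the EXPLAIN output (to rounding), so the Limit
costing itself is consistent - the difference is entirely in the Sort
costs underneath.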

Notice that the costs get "inverted":

             master            patched
  ----------------------------------------------
  no limit   106.97..109.47    139.47..141.97
  limit       96.29..96.32     145.75..145.77

so the limit looks a bit more expensive - just enough so that it seems
cheaper to transfer more rows and execute the limit locally.
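Spelling out the decision that follows from those totals (a simplified
view - the real comparison also includes row-transfer costs on the FDW
side, but the inversion alone is enough to flip it):

```python
# Total costs reported by the two remote EXPLAIN runs, per plan shape.
master  = {"no_limit": 109.47, "limit": 96.32}
patched = {"no_limit": 141.97, "limit": 145.77}

# On master the pushed-down LIMIT is the cheaper remote query, so the
# FDW ships the LIMIT (and OFFSET) to the remote side.
assert master["limit"] < master["no_limit"]

# With the patch the pushed-down LIMIT comes out *more* expensive, so
# the planner fetches all 1000 rows and applies the LIMIT locally.
assert patched["limit"] > patched["no_limit"]
```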

IMO this looks rather suspicious (or outright wrong), and
compute_cpu_sort_cost may need some fixes. I wonder why it differs from
the old code so much ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
