Re: Our trial to TPC-DS but optimizer made unreasonable plan

From: Qingqing Zhou <zhouqq(dot)postgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Our trial to TPC-DS but optimizer made unreasonable plan
Date: 2015-08-27 20:01:46
Message-ID: CAJjS0u0JaSrADhyYDB50=0oY1cijJMTMUPxHrGVg-XP-7+81SQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 26, 2015 at 5:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> After looking at the code a bit, IMO the most reasonable thing to do is to
> include this transformation in inline_set_returning_functions(), perhaps
> renaming it to something like inline_srfs_and_ctes().
>

This is essentially the same as my current implementation (revised
patch attached):
1. There are two call sites of inline_set_returning_functions(), and
one place is guarded with Assert(subquery->cteList == NIL). This means
transformation in subquery_planner() is effective.
2. A problem with revised patch is that we can't get rid of non-used
CTEs show up in EXPLAIN.

IMHO, here the problem is not "multiple levels" but "multiple
references". "levels" is handled well by recursion but references are
not: set returning function seems does not have the this issue because
you don't define a function along the query.

Regards,
Qingqing

---

Two testing queries results with revised patch:
1. Extra CTE q and p prints in EXPLAIN:
postgres=# explain WITH q AS (
postgres(# WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1
on p.i>=p1.i)
postgres-# SELECT * FROM q WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=1443.59..7423.16 rows=133333 width=8)
CTE q
-> Nested Loop (cost=1443.29..91700762.00 rows=3333333333 width=8)
CTE p
-> Seq Scan on a a_2 (cost=0.00..1443.00 rows=100000 width=8)
-> Seq Scan on a a_3 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_4 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a_3.i)
CTE p
-> Seq Scan on a a_5 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Scan using ai on a (cost=0.29..8.36 rows=4 width=8)
Index Cond: (i <= 5)
-> Index Only Scan using ai on a a_1 (cost=0.29..1159.62
rows=33333 width=4)
Index Cond: (i <= a.i)
(14 rows)

2. Extra m1 show up and same problem still there:
postgres=# explain WITH q AS (
postgres(# WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on
postgres(# p.i>=p1.i), m as (select * from q), m1 as (select * from m)
postgres-# SELECT * FROM m1 WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
CTE Scan on m (cost=225034095.32..300034095.31 rows=1111111111 width=8)
Filter: (i <= 5)
CTE q
-> Nested Loop (cost=1443.29..91700762.00 rows=3333333333 width=8)
CTE p
-> Seq Scan on a (cost=0.00..1443.00 rows=100000 width=8)
-> Seq Scan on a a_1 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_2 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a_1.i)
CTE m
-> CTE Scan on q (cost=0.00..66666666.66 rows=3333333333 width=8)
CTE m1
-> CTE Scan on m m_1 (cost=0.00..66666666.66 rows=3333333333 width=8)
(13 rows)

Attachment Content-Type Size
ctes2.diff text/plain 5.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-08-27 21:11:48 Re: Raising our compiler requirements for 9.6
Previous Message Dean Rasheed 2015-08-27 18:37:51 Re: missing locking in at least INSERT INTO view WITH CHECK