Re: BUG #15007: LIMIT not respected in sub-queries

From: Will Storey <will(at)summercat(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15007: LIMIT not respected in sub-queries
Date: 2018-01-19 01:55:06
Message-ID: 20180119015506.wdohq7r7h4rj2jfq@dev.null
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu 2018-01-18 00:08:31 +0100, Tomas Vondra wrote:

<snip>

> I went through the test case, and I think I know what's going on. The
> script did not reproduce the issue for me, but I think I've been able to
> tweak the query to use a plan producing more issues.
>
> The key is to force a particular join order, and disable operations that
> would materialize intermediate results of the t1 scan. That is, we want
> something like this:
>
> -> nestloop
> -> nestloop t2 + t3
> -> seq scan t1
>
> Firstly, I've replaced the dynamic random() condition with a static one:
>
> random() <= 0.5
>
> this is not strictly necessary, but it simplifies the plan. We need the
> random() call though, as I'll explain later.
>
> Then I've disabled a bunch of plan nodes that would materialize results
> of the t1 seq scan:
>
> set enable_material = off;
> set enable_sort = off;
> set enable_hashjoin = off;
>
> And finally, I've disabled join reordering by setting
>
> set join_collapse_limit = 1;
>
> Now, if you rewrite the query like this (which essentially just forces a
> particular join order, when combined with join_collapse_limit=1, nothing
> else):
>
> explain analyze SELECT * FROM
> (t3 JOIN t2 USING (t2_id)) JOIN
> (SELECT * FROM t1
> WHERE t1_id IS NOT NULL AND
> t1_id < 100 AND
> t1_val LIKE 'h%' AND
> random() <= 0.5
> LIMIT 5
> ) AS t1
> ON t3.t1_id = t1.t1_id
> WHERE t2.t2_val LIKE 'he%';
>
> you will get plans like this:
>
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Nested Loop (cost=0.00..31.41 rows=5 width=21)
> (actual time=0.052..1.825 rows=7 loops=1)
> Join Filter: (t3.t1_id = t1.t1_id)
> Rows Removed by Join Filter: 73
> -> Nested Loop (cost=0.00..16.10 rows=16 width=14)
> (actual time=0.030..0.917 rows=16 loops=1)
> Join Filter: (t3.t2_id = t2.t2_id)
> Rows Removed by Join Filter: 160
> -> Seq Scan on t2 (cost=0.00..1.14 rows=11 width=10)
> (actual time=0.015..0.041 rows=11 loops=1)
> Filter: ((t2_val)::text ~~ 'he%'::text)
> -> Seq Scan on t3 (cost=0.00..1.16 rows=16 width=8)
> (actual time=0.003..0.037 rows=16 loops=11)
> -> Limit (cost=0.00..0.84 rows=5 width=7)
> (actual time=0.008..0.037 rows=5 loops=16)
> -> Seq Scan on t1 (cost=0.00..1.52 rows=9 width=7)
> (actual time=0.004..0.016 rows=5 loops=16)
> Filter: ((t1_id IS NOT NULL) AND (t1_id < 100) AND
> ((t1_val)::text ~~ 'h%'::text) AND
> (random() <= '0.5'::double precision))
> Rows Removed by Filter: 5
> Planning time: 0.625 ms
> Execution time: 1.911 ms
> (15 rows)
>
>
> This should be equivalent to the original query, and should produce the
> same results (modulo random() of course).
>
> But notice it actually does produce 7 rows!
>
> Nested Loop (cost=0.00..31.41 rows=5 width=21)
> (actual time=0.052..1.825 rows=7 loops=1)
> ^
>
> The problem is that it ends up executing the sequential scan on t1
> repeatedly (because it's the inner relation in a nested loop), and
> because random() is volatile, the results of the scan are likely different.
>
> Each rescan individually still respects the LIMIT, but when combined
> result may be larger - there may be more unique IDs, matching additional
> rows from the other tables.
>
> >
> > I believe a better way to write this query would be to use a CTE.
> >
>
> Yes, that will stabilize the output of the random() function,
> eliminating the nondeterminism during rescans.

Wow, thank you for the great explanation of what is going on!

It sounds like this is not really a bug then and is just something to be
expected when using LIMIT with such volatile joins.

I suppose I expected that if there is a LIMIT then that would be the
maximum number of rows the subquery would ever provide. The planner would
have to force materializing/stabilizing in such cases it sounds like. Maybe
that is not possible or not a good idea. It does seem like a pretty edge
case.

Thanks again!

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Sandeep Thakkar 2018-01-19 04:12:03 Re: BUG #15015: Zilib1.dll library missing if only command line tools installed
Previous Message Kyotaro HORIGUCHI 2018-01-19 01:54:53 Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?