Re: Doubts about pushing LIMIT to MergeAppendPath

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Doubts about pushing LIMIT to MergeAppendPath
Date: 2018-11-01 20:02:46
Message-ID: 1a0dca9f-f288-dc11-9506-97f34c1bb3dc@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/01/2018 08:51 PM, Antonin Houska wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> On 11/01/2018 12:48 PM, Antonin Houska wrote:
>>> Review of [1] made me think of this optimization, currently used only in
>>> create_merge_append_path():
>>>
>>> /*
>>> * Apply query-wide LIMIT if known and path is for sole base relation.
>>> * (Handling this at this low level is a bit klugy.)
>>> */
>>> if (bms_equal(rel->relids, root->all_baserels))
>>> pathnode->limit_tuples = root->limit_tuples;
>>> else
>>> pathnode->limit_tuples = -1.0;
>>>
>>> Currently it's not a problem because the output of MergeAppend plan is not
>>> likely to be re-sorted, but I don't think data correctness should depend on
>>> cost evaluation. Instead, -1 should be set here if there's any chance that the
>>> output will be sorted again.
>>>
>>
>> So you're saying we might end up with a plan like this:
>>
>> Limit
>> -> Sort
>> -> MergeAppend
>> -> SeqScan on t
>>
>> In which case we'd pass the wrong limit_tuples to the MergeAppend?
>>
>> Hmmm, that would depend on us building MergeAppend node that does not
>> match the expected pathkeys, and pick it instead of plain Append node.
>> I'm not sure that's actually possible, but maybe it is ...
>
> Yes, if there should be Sort node above MergeAppend, then it'll be probably
> cheaper to use Append. So the problem should not happen in practice, but in
> theory it seems to be possible.
>

Actually no - see my other response. We don't build child paths with
pathkeys that are not useful (do not match the expected ordering), and
we only build MergeAppend for pathkeys from child paths.

So I believe we currently won't build a MergeAppend that would then
require additional Sort node, irrespectedly of the costing.

>>
>>> I tried to reproduce the issue by applying the "Incremental sort" [2] patch
>>> and running the following example:
>>>
>>> CREATE TABLE t(i int, j int);
>>> CREATE TABLE t1() INHERITS (t);
>>> CREATE INDEX ON t1(i, j);
>>> INSERT INTO t1(i, j) VALUES (1, 0), (1, 1);
>>> CREATE TABLE t2() INHERITS (t);
>>> CREATE INDEX ON t2(i, j);
>>> INSERT INTO t2(i, j) VALUES (1, 0), (1, 1);
>>>
>>> ANALYZE;
>>>
>>> SELECT * FROM t ORDER BY i, j DESC LIMIT 1;
>>>
>>
>> So, what query plan did this use?
>
> Limit
> -> Incremental Sort
> Sort Key: t.i, t.j DESC
> Presorted Key: t.i
> -> Merge Append
> Sort Key: t.i
> -> Sort
> Sort Key: t.i
> -> Seq Scan on t
> -> Index Only Scan using t1_i_j_idx on t1
> -> Index Only Scan using t2_i_j_idx on t2
>

Interesting. That might be broken, not sure if the LIMIT optimization
triggered or not.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Finnerty 2018-11-01 20:23:06 Re: Hash Joins vs. Bloom Filters / take 2
Previous Message Peter Eisentraut 2018-11-01 20:00:45 Re: Remove obsolete pg_attrdef.adsrc column