Re: FETCH FIRST clause PERCENT option

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: surafel3000(at)gmail(dot)com, vik(dot)fearing(at)2ndquadrant(dot)com, hornschnorter(at)gmail(dot)com, andres(at)anarazel(dot)de, pgsql-hackers(at)postgresql(dot)org, andrew(at)tao11(dot)riddles(dot)org(dot)uk
Subject: Re: FETCH FIRST clause PERCENT option
Date: 2019-02-28 20:16:25
Message-ID: fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/28/19 12:26 PM, Kyotaro HORIGUCHI wrote:
> Hello.
>
> At Sat, 23 Feb 2019 22:27:44 +0100, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <81a5c0e9-c17d-28f3-4647-8a4659cdfdb1(at)2ndquadrant(dot)com>
>>
>>
>> On 2/23/19 8:53 AM, Surafel Temesgen wrote:
>>>
>>>
>>> On Sun, Feb 10, 2019 at 2:22 AM Tomas Vondra
>>> <tomas(dot)vondra(at)2ndquadrant(dot)com <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>> wrote:
>>>  
>>>
>>>
>>> I'm not sure I understand - are you saying every time the user does a
>>> FETCH, we have to run the outer plan from scratch? I don't see why would
>>> that be necessary? And if it is, how come there's no noticeable
>>> performance difference?
>>>
>>> Can you share a patch implementing the incremental approach, and a query
>>> demonstrating the issue?
>>>
>>>
>>> I didn't implement it but its obvious that it doesn't work similarly
>>> with previous approach.
>>>
>>
>> Sure, but that's hardly a sufficient argument for the current approach.
>>
>>> We need different implementation and my plan was to use tuplestore per
>>> call and clear
>>>
>>> it after returning tuple but I see that the plan will not go far because
>>> mainly the last returned
>>>
>>> slot is not the last slot we get from outerPlan execution
>>>
>>
>> I'm sorry, I still don't understand what the supposed problem is. I
>> don't think it's all that different from what nodeMaterial.c does, for
>> example.
>>
>> As I explained before, having to execute the outer plan till completion
>> before returning any tuples is an issue. So either it needs fixing or an
>> explanation why it's not an issue.
>
> One biggest issue seems to be we don't know the total number of
> outer tuples before actually reading a null tuple. I doubt of
> general shortcut for that. It also seems preventing limit node
> from just using materialized outer.
>

Sure, if you actually want all tuples, you'll have to execute the outer
plan till completion. But that's not what I'm talking about - what if we
only ever need to read one row from the limit?

To give you a (admittedly, somewhat contrived and artificial example):

SELECT * FROM t1 WHERE id IN (
SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
);

Maybe this example is bogus and/or does not really matter in practice. I
don't know, but I've been unable to convince myself that's the case.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-02-28 20:18:39 Re: Protect syscache from bloating with negative cache entries
Previous Message Tomas Vondra 2019-02-28 19:56:08 Re: [HACKERS] PATCH: multivariate histograms and MCV lists