Re: FETCH FIRST clause PERCENT option

From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, vik(dot)fearing(at)2ndquadrant(dot)com, Mark Dilger <hornschnorter(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, andrew(at)tao11(dot)riddles(dot)org(dot)uk
Subject: Re: FETCH FIRST clause PERCENT option
Date: 2019-03-01 12:47:43
Message-ID: CALAY4q950E1veZENqCB9akHqK8KbC0KkCjOXVWRZaa0-ku8R7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 1, 2019 at 4:33 AM Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello.
>
> At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <
> fbd08ad3-5dd8-3169-6cba-38d610d7be7f(at)2ndquadrant(dot)com>
> > > One biggest issue seems to be we don't know the total number of
>
> # One *of* the biggest *issues*?
>
> > > outer tuples before actually reading a null tuple. I doubt of
> > > general shortcut for that. It also seems preventing limit node
> > > from just using materialized outer.
> > >
> >
> > Sure, if you actually want all tuples, you'll have to execute the outer
> > plan till completion. But that's not what I'm talking about - what if we
> > only ever need to read one row from the limit?
>
> We have no choice than once reading all tuples just to find we
> are to return just one row, since estimator is not guaranteed to
> be exact as required for this purpose.
>
> > To give you a (admittedly, somewhat contrived and artificial example):
> >
> > SELECT * FROM t1 WHERE id IN (
> > SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
> > );
> >
> > Maybe this example is bogus and/or does not really matter in practice. I
> > don't know, but I've been unable to convince myself that's the case.
>
> I see such kind of idiom common. Even in the quite simple example
> above, *we* cannot tell how many tuples the inner should return
> unless we actually fetch all tuples in t2. This is the same
> problem with count(*).
>
> The query is equivalent to the folloing one.
>
> SELECT * FROM t1 WHERE id IN (
> SELECT id FROM t2 ORDER BY x
> FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
> );
>
> This scans t2 twice, but this patch does only one full scan
> moving another partial scan to tuplestore. We would win if the
> outer is complex enough.
>

Okay here is the previous implementation with uptread review comment
included and it also consider OFFSET clause in percentage calculation

regards
Surafel

Attachment Content-Type Size
fetch-with-percent-v9.patch text/x-patch 40.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-03-01 12:53:03 Re: NOT IN subquery optimization
Previous Message Etsuro Fujita 2019-03-01 12:42:47 Re: Problems with plan estimates in postgres_fdw