Re: FETCH FIRST clause PERCENT option

From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Vik Fearing <vik(dot)fearing(at)2ndquadrant(dot)com>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: FETCH FIRST clause PERCENT option
Date: 2019-08-07 07:20:09
Message-ID: CALAY4q98xbVHtZ4yj9DcCMG2-s1_JuRr7fYaNfW+bKmr22OiVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi
On Wed, Aug 7, 2019 at 6:11 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

>
> I have some comments.
>
> This patch uses distinct parameters for exact number and
> percentage. On the othe hand planner has a notion of
> tuple_fraction covering the both. The same handling is also used
> for tuple number estimation. Using the same scheme will make
> things far simpler. See the comment of grouping_planner().
>
>
Its because of data type difference .In planner the data type is the same

In executor part, addition to LimiteState.position, this patch
> uses LimiteState.backwardPosition to count how many tuples we're
> back from the end of the current tuplestore. But things will get
> simpler by just asking the tuplestore for the number of holding
> tuples.
>
>
backwardPosition hold how many tuple returned in backward scan

>
> + slot = node->subSlot;
>
> Why is this needed? The variable is properly set before use and
> the assignment is bogus in the first place.
>
>
its because Tuplestore needs initialized slot.

> The new code block in LIMIT_RESCAN in ExecLimit is useless since
> it is exatctly the same with existing code block. Why didn't you
> use the existing if block?
>

But they test different scenario

>
> > if (node->limitOption == PERCENTAGE)
> + {
> + node->perExactCount = ceil(node->percent *
> node->position / 100.0);
> +
> +
>
> node->position is the number of tuples returned to upper node so
> far (not the number of tuples this node has read from the lower
> node so far). I don't understand what the expression means.
>

node->position hold the number of tuples this node has read from the lower
node so far. see LIMIT_RESCAN state

>
>
> + if (node->perExactCount == node->perExactCount + 1)
> + node->perExactCount++;
>
> What? The condition never be true. As the result, the following
> if block below won't run.
>

it became true according to the number of tuple returned in from the lower
node so far
and percentage specification.

>
> > /*
> + * Return the tuple up to the number of exact count in
> OFFSET
> + * clause without percentage value consideration.
> + */
> + if (node->perExactCount > 0)
> + {
> +
>
>
>
>
> + /*
> + * We may needed this tuple in backward scan so put it into
> + * tuplestore.
> + */
> + if (node->limitOption == PERCENTAGE)
> + {
> + tuplestore_puttupleslot(node->tupleStore, slot);
> + tuplestore_advance(node->tupleStore, true);
> + }
>
> "needed"->"need" ? The comment says that this is needed for
> backward scan, but it is actually required even in forward
> scan. More to the point, tuplestore_advance lacks comment.

ok

>
> Anyway, the code in LIMIT_RESCAN is broken in some ways. For
> example, if it is used as the inner scan of a NestLoop, the
> tuplestore accumulates tuples by every scan. You will see that
> the tuplestore stores up to 1000 tuples (10 times of the inner)
> by the following query.
>

It this because in percentage we scan the whole table

regards
Surafel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2019-08-07 07:24:17 Handling RestrictInfo in expression_tree_walker
Previous Message Amit Langote 2019-08-07 06:38:53 Re: Problem with default partition pruning