Re: [POC] Faster processing at Gather node

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] Faster processing at Gather node
Date: 2017-11-07 19:32:00
Message-ID: 20171107193200.5jkjvuhqclcg4bj2@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-11-06 10:56:43 +0530, Amit Kapila wrote:
> On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund <andres(at)anarazel(dot)de> wrote
> > On 2017-11-05 01:05:59 +0100, Robert Haas wrote:
> >> skip-gather-project-v1.patch does what it says on the tin. I still
> >> don't have a test case for this, and I didn't find that it helped very
> >> much,
>
> I am also wondering in which case it can help and I can't think of the
> case.

I'm confused? Isn't it fairly obvious that unnecessarily projecting
at the gather node is wasteful? Obviously depending on the query you'll
see smaller / bigger gains, but that there's beenfdits should be fairly obvious?

> Basically, as part of projection in the gather, I think we are just
> deforming the tuple which we anyway need to perform before sending the
> tuple to the client (printtup) or probably at the upper level of the
> node.

But in most cases you're not going to print millions of tuples, instead
you're going to apply some further operators ontop (e.g. the
OFFSET/LIMIT in my example).

> >> and you said this looked like a big bottleneck in your
> >> testing, so here you go.

> Is it possible that it shows the bottleneck only for 'explain analyze'
> statement as we don't deform the tuple for that at a later stage?

Doesn't matter, there's a OFFSET/LIMIT ontop of the query. Could just as
well be a sort node or something.

> > The query where that showed a big benefit was
> >
> > SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 1000000000 LIMIT 1;
> >
> > (i.e a not very selective filter, and then just throwing the results away)
> >
> > still shows quite massive benefits:
>
> Do you see the benefit if the query is executed without using Explain Analyze?

Yes.

Before:
tpch_5[11878][1]=# SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 1000000000 LIMIT 1;^[[A
...
Time: 7590.196 ms (00:07.590)

After:
Time: 3862.955 ms (00:03.863)

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2017-11-07 20:00:43 Re: [PATCH] Generic type subscripting
Previous Message Jacob Champion 2017-11-07 17:26:58 Re: [PATCH] Assert that the correct locks are held when calling PageGetLSN()