Re: [HACKERS] why not parallel seq scan for slow functions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] why not parallel seq scan for slow functions
Date: 2018-03-24 13:40:30
Message-ID: CAA4eK1LNxzZLKme6eTj=svJ245H5z_w5CvCOSuxCr_oJ1HuXqQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 24, 2018 at 8:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Mar 23, 2018 at 12:12 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Yeah, sometimes that kind of stuff change performance characteristics,
>> but I think what is going on here is that create_projection_plan is
>> causing the lower node to build physical tlist which takes some
>> additional time. I have tried below change on top of the patch series
>> and it brings back the performance for me.
>
> I tried another approach inspired by this, which is to altogether skip
> building the child scan tlist if it will just be replaced. See 0006.
> In testing here, that seems to be a bit better than your proposal, but
> I wonder what your results will be.
>
..
>
> It looks in my testing like this still underperforms master on your
> test case. Do you get the same result?
>

For me, it is equivalent to master. The average of ten runs on
master is 20664.3683 and with all the patches applied it is
20590.4734. There is some run-to-run variation, but more or less
there is no visible degradation. I think we have found the root
cause and eliminated it. OTOH, I have found another case where the
new patch series seems to degrade performance.

Test case
--------------
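DO $$
DECLARE
    count integer;
BEGIN
    -- Plan (EXPLAIN without ANALYZE, so the query is never executed)
    -- the same statement one million times to stress the planner.
    FOR count IN 1..1000000 LOOP
        EXECUTE 'explain Select count(ten) from tenk1';
    END LOOP;
END;
$$;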

The average of ten runs on master is 31593.9533 and with all the
patches applied it is 34008.7341, so the patched build takes
approximately 7.6% more time. I think this patch series is doing
something costly in the common code path. I am also worried that the
new code you proposed in the 0003* patch might degrade planner
performance for partitioned rels, though I have not tested it yet.
It is difficult to say without testing, but before going there, I
think we should first investigate what's happening in the
non-partitioned case.
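
As a quick check of the arithmetic behind the two comparisons above,
here is a minimal sketch. The helper name relative_change_pct is
purely illustrative (it is not part of the patch series or of any
PostgreSQL tool), and the numbers are the ten-run averages quoted in
this mail, in whatever unit the timings were taken in.

```python
def relative_change_pct(master_avg: float, patched_avg: float) -> float:
    """Return how much slower (+) or faster (-) the patched build is, in percent."""
    return (patched_avg - master_avg) / master_avg * 100.0


# Ten-run averages quoted in this mail (hypothetical variable names).
first_case = relative_change_pct(20664.3683, 20590.4734)
explain_loop = relative_change_pct(31593.9533, 34008.7341)

print(f"first test case:   {first_case:+.2f}%")    # -0.36%
print(f"EXPLAIN loop case: {explain_loop:+.2f}%")  # +7.64%
```

The first difference is well below one percent, which is consistent
with treating it as run-to-run variation, while the second is the
roughly 7.6% slowdown mentioned above.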

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
