Re: [HACKERS] why not parallel seq scan for slow functions

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] why not parallel seq scan for slow functions
Date: 2018-03-13 05:03:45
Message-ID: CAFjFpRferOawbdTc35ebm+uD=KjW5Vgfoj+5vHv8-OyWOW4_gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 11, 2018 at 5:49 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

>
>> + /* Add projection step if needed */
>> + if (target && simple_gather_path->pathtarget != target)
>>
>> If the target was copied someplace, this test will fail. Probably we want to
>> check containts of the PathTarget structure? Right now copy_pathtarget() is not
>> called from many places and all those places modify the copied target. So this
>> isn't a problem. But we can't guarantee that in future. Similar comment for
>> gather_merge path creation.
>>
>
> I am not sure if there is any use of copying the path target unless
> you want to modify it , but anyway we use the check similar to what is
> used in patch in the multiple places in code. See
> create_ordered_paths. So, we need to change all those places first if
> we sense any such danger.

Even if the test fails and we add a projection path here, while
creating the plan we avoid adding a Result node when the projection
target and underlying plan's target look same
(create_projection_plan()), so this works. An advantage with this
simple check (although it miscosts the projection) is that we don't do
expensive target equality check for every path. The expensive check
happens only on the chosen path.

>
>>
>> deallocate tenk1_count;
>> +explain (costs off) select ten, costly_func(ten) from tenk1;
>>
>> verbose output will show that the parallel seq scan's targetlist has
>> costly_func() in it. Isn't that what we want to test here?
>>
>
> Not exactly, we want to just test whether the parallel plan is
> selected when the costly function is used in the target list.

Parallel plan may be selected whether or not costly function exists in
the targetlist, if the underlying scan is optimal with parallel scan.
AFAIU, this patch is about pushing down the costly functions into the
parllel scan's targetlist. I think that can be verified only by
looking at the targetlist of parallel seq scan plan.

The solution here addresses only parallel scan requirement. In future
if we implement a solution which also addresses requirements of FDW
and custom plan (i.e. ability to handle targetlists by FDW and custom
plan), the changes made here will need to be reverted. That would be a
painful exercsize.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Gould 2018-03-13 06:14:17 Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Previous Message Ashutosh Bapat 2018-03-13 04:35:54 Re: parallel append vs. simple UNION ALL