Re: [HACKERS] why not parallel seq scan for slow functions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] why not parallel seq scan for slow functions
Date: 2018-03-11 12:19:08
Message-ID: CAA4eK1+EGQ-2jBoNP4q8BTUMbbiLe51QBBgJo0jeMOmTcwCPVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>
>

After recent commits, the patch doesn't get applied cleanly, so
rebased it and along the way addressed the comments raised by you.

> Here are some comments on the patch.
>
> + /*
> + * Except for the topmost scan/join rel, consider gathering
> + * partial paths. We'll do the same for the topmost scan/join
> This function only works on join relations. Mentioning scan rel is confusing.
>

Okay, removed the 'scan' word from the comment.

> index 6e842f9..5206da7 100644
> --- a/src/backend/optimizer/path/allpaths.c
> +++ b/src/backend/optimizer/path/allpaths.c
> @@ -481,14 +481,21 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
> }
>
> + *
> + * Also, if this is the topmost scan/join rel (that is, the only baserel),
> + * we postpone this until the final scan/join targelist is available (see
>
> Mentioning join rel here is confusing since we deal with base relations here.
>

Okay, removed the 'join' word from the comment.

> + bms_membership(root->all_baserels) != BMS_SINGLETON)
>
> set_tablesample_rel_pathlist() is also using this method to decide whether
> there are any joins in the query. May be macro-ize this and use that macro at
> these two places?
>

maybe, but I am not sure if it improves the readability. I am open to
changing it if somebody else also feels it is better to macro-ize this
usage.

> - * for the specified relation. (Otherwise, add_partial_path might delete a
> + * for the specified relation. (Otherwise, add_partial_path might delete a
>
> Unrelated change?
>

oops, removed.

> + /* Add projection step if needed */
> + if (target && simple_gather_path->pathtarget != target)
>
> If the target was copied someplace, this test will fail. Probably we want to
> check containts of the PathTarget structure? Right now copy_pathtarget() is not
> called from many places and all those places modify the copied target. So this
> isn't a problem. But we can't guarantee that in future. Similar comment for
> gather_merge path creation.
>

I am not sure if there is any use of copying the path target unless
you want to modify it , but anyway we use the check similar to what is
used in patch in the multiple places in code. See
create_ordered_paths. So, we need to change all those places first if
we sense any such danger.

> + simple_gather_path = apply_projection_to_path(root,
> + rel,
> + simple_gather_path,
> + target);
> +
>
> Why don't we incorporate those changes in create_gather_path() by passing it
> the projection target instead of rel->reltarget? Similar comment for
> gather_merge path creation.
>

Nothing important, just for the sake of code consistency, some other
places in code uses it this way. See create_ordered_paths. Also, I am
not sure if there is any advantage of doing it inside
create_gather_path.

> + /*
> + * Except for the topmost scan/join rel, consider gathering
> + * partial paths. We'll do the same for the topmost scan/join rel
>
> Mentioning scan rel is confusing here.
>

Okay, changed.

>
> deallocate tenk1_count;
> +explain (costs off) select ten, costly_func(ten) from tenk1;
>
> verbose output will show that the parallel seq scan's targetlist has
> costly_func() in it. Isn't that what we want to test here?
>

Not exactly, we want to just test whether the parallel plan is
selected when the costly function is used in the target list.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel_paths_include_tlist_cost_v9.patch application/octet-stream 13.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2018-03-11 13:05:39 Re: disable SSL compression?
Previous Message Alexander Korotkov 2018-03-11 11:19:34 Re: [HACKERS] GUC for cleanup indexes threshold.