Re: [HACKERS] why not parallel seq scan for slow functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] why not parallel seq scan for slow functions
Date: 2018-03-19 19:53:48
Message-ID: CA+TgmoaQMeuLvARpzNbWkUOd+4LgAjTTvWdEbVhPA5Y68tYSRw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Test-1
> ----------
> DO $$
> DECLARE count integer;
> BEGIN
> For count In 1..1000000 Loop
> Execute 'explain Select ten from tenk1';
> END LOOP;
> END;
> $$;
>
> In the above block, I am explaining a simple statement that has
> just one path, so there will be one additional path projection
> and removal cycle for this statement. I executed the above block
> in psql with the \timing option 'on'; the average timing over ten
> runs is 21292.388 ms on HEAD, 22405.2466 ms with patches
> (0001.* ~ 0003), and 22537.1362 ms with patches (0001.* ~ 0005.*).
> These results indicate an increase of approximately 5~6% in
> planning time.

Ugh. I'm able to reproduce this, more or less -- with master, this
test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but
with 0001-0003 applied, it took 43925.959 ms, 43619.004 ms,
43648.426 ms. However, I have a feeling there's more going on here,
because the following patch on top of 0001-0003 made the time go back
down to 42353.548 ms, 41797.757 ms, 41891.194 ms.

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bf0b3e75ea..0542b3e40b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1947,12 +1947,19 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
         {
             Path *subpath = (Path *) lfirst(lc);
             Path *path;
+            Path *path2;
 
             Assert(subpath->param_info == NULL);
-            path = (Path *) create_projection_path(root, tlist_rel,
+            path2 = (Path *) create_projection_path(root, tlist_rel,
                                                    subpath, scanjoin_target);
-            add_path(tlist_rel, path);
+            path = (Path *) apply_projection_to_path(root, tlist_rel,
+                                                     subpath, scanjoin_target);
+            if (path == path2)
+                elog(ERROR, "oops");
+            lfirst(lc) = path;
         }
+        tlist_rel->pathlist = current_rel->pathlist;
+        current_rel->pathlist = NIL;
 
         /*
          * If we can produce partial paths for the tlist rel, for possible use
It seems pretty obvious that creating an extra projection path that is
just thrown away can't "really" be making this faster, so there's
evidently some other effect here involving how the code is laid out,
or CPU cache effects, or, uh, something.
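
To put rough numbers on the timings quoted above, here is a quick
sketch (Python, not part of any patch in this thread) that averages
the three runs of each configuration and reports the change relative
to master. The variable and function names are made up for
illustration, and three runs per configuration is a small sample, so
treat the result as indicative only.

```python
from statistics import mean

# Wall-clock timings in milliseconds, copied from the three runs of
# each configuration reported in this message.
master = [42089.484, 41935.849, 42519.336]
patched = [43925.959, 43619.004, 43648.426]      # 0001-0003 applied
experiment = [42353.548, 41797.757, 41891.194]   # 0001-0003 plus the diff above


def percent_change(baseline, candidate):
    """Return the change in mean runtime relative to baseline, in percent."""
    base = mean(baseline)
    return (mean(candidate) - base) / base * 100.0


if __name__ == "__main__":
    print(f"0001-0003 vs master:  {percent_change(master, patched):+.2f}%")
    print(f"experiment vs master: {percent_change(master, experiment):+.2f}%")
```

On these numbers, 0001-0003 comes out roughly 3.7% slower than
master, while the experimental diff lands about 0.4% below master,
which is within the run-to-run spread of the master timings
themselves.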

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
