Re: Top-N sorts verses parallelism

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Top-N sorts verses parallelism
Date: 2017-12-17 20:29:08
Message-ID: CA+TgmoZrghu=F=o8D6LR9MYhibEzjHhwWa_ZHkjtZrThsdsM1A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 15, 2017 at 4:13 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Looks right to me. Commit 3452dc52 just forgot to tell the planner.
> I'm pleased about that because it makes this a slam-dunk bug-fix and
> not some confusing hard to justify costing problem.

Jeff Janes inquired off-list about other places where cost_sort() gets called.

In costsize.c, it is called from initial_cost_mergejoin, which seems
like it does not present an opportunity for pushing down limits, since
we don't know how many rows we'll have to join to get a certain number
of outputs.

In createplan.c, it is called only from label_sort_with_costsize().
That in turn is called from create_merge_append_plan(), passing the tuple
limit from the path being converted to a plan, which seems
unimpeachable. It's also called from make_unique_plan() and
create_mergejoin_plan() with -1, which seems OK since in neither case
do we know how many input rows we need to read.

In planner.c, it's called from plan_cluster_use_sort() with -1;
CLUSTER has to read the whole input, so that's fine.

In prepunion.c, it's called from choose_hashed_setop() with -1. I
think set operations also need to read the whole input.

In pathnode.c, it's called from create_merge_append_path,
create_unique_path, create_gather_merge_path, create_groupingsets_path,
and create_sort_path. create_merge_append_path passes any limit
applicable to the subpath. create_unique_path passes -1.
create_gather_merge_path also passes -1, which as Jeff also pointed
out seems possibly wrong. create_sort_path also passes -1, and so
does create_groupingsets_path.

I went through the callers to create_sort_path and the only one that
looks like it can pass a limit is the one you and Jeff already
identified. So I think the question is just whether
create_gather_merge_path needs a similar fix.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
