Re: Gather Merge

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Gather Merge
Date: 2016-11-16 09:40:57
Message-ID: CAGPqQf2vXWuCWTECwiMhpPOo4cm1y_RUew0D2Cj=tqXunwUiWQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 14, 2016 at 3:51 PM, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:

> On Sat, Nov 12, 2016 at 1:56 AM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> > On Fri, Nov 4, 2016 at 8:30 AM, Thomas Munro
> > <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> >> + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
> >> + * Portions Copyright (c) 1994, Regents of the University of California
> >>
> >> Shouldn't this say just "(c) 2016, PostgreSQL Global Development
> >> Group"?
> >
> > Fixed.
>
> The year also needs updating to 2016 in nodeGatherMerge.h.
>

Oops sorry, fixed now.

>
> >> + /* Per-tuple heap maintenance cost */
> >> + run_cost += path->path.rows * comparison_cost * 2.0 * logN;
> >>
> >> Why multiply by two? The comment above this code says "about log2(N)
> >> comparisons to delete the top heap entry and another log2(N)
> >> comparisons to insert its successor". In fact gather_merge_getnext
> >> calls binaryheap_replace_first, which replaces the top element without
> >> any comparisons at all and then performs a sift-down in log2(N)
> >> comparisons to find its new position. There is no per-tuple "delete"
> >> involved. We "replace" the top element with the value it already had,
> >> just to trigger the sift-down, because we know that our comparator
> >> function might have a new opinion of the sort order of this element.
> >> Very clever! The comment and the 2.0 factor in cost_gather_merge seem
> >> to be wrong though -- or am I misreading the code?
> >>
> > See cost_merge_append.
>
> That just got tweaked in commit 34ca0905.
>

Fixed.
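
For reference, a sketch of what the corrected per-tuple term could look like,
mirroring what commit 34ca0905 did for cost_merge_append (variable names
follow the snippet quoted above; this is illustrative, not the patched
gather merge code itself):

    /*
     * Per-tuple heap maintenance cost: binaryheap_replace_first() does a
     * single sift-down of about log2(N) comparisons per returned tuple,
     * with no separate "delete" step, so charge logN rather than
     * 2.0 * logN.
     */
    run_cost += path->path.rows * comparison_cost * logN;

    /* small per-tuple charge for heap management, as in cost_merge_append */
    run_cost += cpu_operator_cost * path->path.rows;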

>
> > Looking at the plan I realize that this is happening because of wrong
> > costing for Gather Merge. Here in the plan we can see that the row count
> > estimated by Gather Merge is wrong. This is because the earlier patch had
> > GM considering rows = subpath->rows, which is not true since the subpath
> > is a partial path. So we need to multiply it by the number of workers.
> > The attached patch also fixes this issue. I also ran the TPC-H benchmark
> > with the patch and the results are the same as earlier.
>
> In create_grouping_paths:
> + double total_groups = gmpath->rows * gmpath->parallel_workers;
>
> This hides a variable of the same name in the enclosing scope. Maybe
> confusing?
>
> In some other places like create_ordered_paths:
> + double total_groups = path->rows * path->parallel_workers;
>
> Though it probably made sense to use this variable name in
> create_grouping_paths, wouldn't total_rows be better here?
>
>
Initially I just copied this from the other places. I agree with you that
total_rows makes more sense in create_ordered_paths.
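
A rename sketch for create_ordered_paths, keeping the same arithmetic
(illustrative only, not the exact patch hunk):

    double      total_rows = path->rows * path->parallel_workers;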

> It feels weird to be working back to a total row count estimate from
> the partial one by simply multiplying by path->parallel_workers.
> Gather Merge will underestimate the total rows when parallel_workers <
> 4, if using partial row estimates ultimately from cost_seqscan which
> assume some leader contribution. I don't have a better idea though.
> Reversing cost_seqscan's logic certainly doesn't seem right. I don't
> know how to make them agree on the leader's contribution AND give
> principled answers, since there seems to be some kind of cyclic
> dependency in the costing logic (cost_seqscan really needs to be given
> a leader contribution estimate from its superpath which knows whether
> it will allow the leader to pull tuples greedily/fairly or not, but
> that superpath hasn't been created yet; cost_gather_merge needs the
> row count from its subpath). Or maybe I'm just confused.
>
>
Yes, I agree with you, but we can't really change cost_seqscan. Another
option I can think of is to calculate the rows for Gather Merge by reversing
the formula used in cost_seqscan. That way we can remove the rows argument
from create_gather_merge_path() entirely and instead calculate total_rows
inside create_gather_merge_path() using the same formula as cost_seqscan.
This is working fine, but I'm not quite sure about the approach, so I have
attached that part of the changes as a separate patch. Any suggestions?
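
For what it's worth, the "reverse formula" might look roughly like this
inside create_gather_merge_path() (a sketch only, mirroring the
leader-contribution logic in cost_seqscan; the variable names are
illustrative and not taken from the attached patch):

    /*
     * Undo the divisor that cost_seqscan applied to the partial path's
     * per-process row estimate, so that Gather Merge reports a total
     * row count again.
     */
    double      parallel_divisor = subpath->parallel_workers;
    double      leader_contribution = 1.0 - (0.3 * subpath->parallel_workers);

    if (leader_contribution > 0)
        parallel_divisor += leader_contribution;

    total_rows = subpath->rows * parallel_divisor;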

--
Rushabh Lathia
www.EnterpriseDB.com

Attachment Content-Type Size
gather_merge_v4_minor_changes.patch application/x-download 50.5 KB
gm_v4_plus_rows_estimate.patch application/x-download 4.0 KB
