Re: Gather Merge

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andreas Joseph Krogh <andreas(at)visena(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Subject: Re: Gather Merge
Date: 2017-03-14 09:47:20
Message-ID: CAGPqQf0pFPMYVTs-m77zRSN8zUdnByJCqVR+3CrzNyALxeg0WQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 13, 2017 at 10:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Mar 10, 2017 at 7:59 AM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> > Error coming from create_gather_merge_plan() from below condition:
> >
> > if (memcmp(sortColIdx, gm_plan->sortColIdx,
> > numsortkeys * sizeof(AttrNumber)) != 0)
> > elog(ERROR, "GatherMerge child's targetlist doesn't match
> > GatherMerge");
> >
> > Above condition checks the sort column numbers explicitly, to ensure the
> > tlists
> > really do match up. This been copied from the create_merge_append_plan().
> > Now
> > this make sense as for MergeAppend as its not projection capable plan
> (see
> > is_projection_capable_plan()). But like Gather, GatherMerge is the
> > projection
> > capable and its target list can be different from the subplan, so I don't
> > think this
> > condition make sense for the GatherMerge.
> >
> > Here is the some one the debugging info, through which I was able to
> reach
> > to this conclusion:
> >
> > - targetlist for the GatherMerge and subpath is same during
> > create_gather_merge_path().
> >
> > - targetlist for the GatherMerge is getting changed into
> > create_gather_merge_plan().
> >
> > postgres=# explain (analyze, verbose) select t2.j from t1 JOIN t2 ON
> > t1.k=t2.k where t1.i=1 order by t1.j desc;
> > NOTICE: path parthtarget: {PATHTARGET :exprs ({VAR :varno 2 :varattno 2
> > :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2
> :varoattno
> > 2 :location 34} {VAR :varno 1 :varattno 2 :vartype 23 :vartypmod -1
> > :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 2 :location 90})
> > :sortgrouprefs 0 1 :cost.startup 0.00 :cost.per_tuple 0.00 :width 8}
> >
> > NOTICE: subpath parthtarget: {PATHTARGET :exprs ({VAR :varno 1
> :varattno 2
> > :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1
> :varoattno
> > 2 :location 90} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1
> > :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 34})
> > :cost.startup 0.00 :cost.per_tuple 0.00 :width 8}
> >
> > - Attached memory watch point and found that target list for GatherMerge
> is
> > getting
> > changed into groupping_planner() -> apply_projection_to_path().
> >
> > PFA patch to fix this issue.
>
> I don't think this fix is correct, partly on theoretical grounds and
> partly because I managed to make it crash. The problem is that
> prepare_sort_for_pathkeys() actually alters the output tlist of Gather
> Merge, which is inconsistent with the idea that Gather Merge can do
> projection. It's going to produce whatever
> prepare_sort_for_pathkeys() says it's going to produce, which may or
> may not be what was there before. Using Kuntal's dump file and your
> patch:
>
> set min_parallel_table_scan_size = 0;
> set parallel_setup_cost = 0;
> set parallel_tuple_cost = 0;
> set enable_sort = false;
> set enable_bitmapscan = false;
> alter table t1 alter column j type text;
> select t2.i from t1 join t2 on t1.k=t2.k where t1.i=1 order by t1.j desc;
>
> Crash. Abbreviated stack trace:
>
> #0 pg_detoast_datum_packed (datum=0xbc) at fmgr.c:2176
> #1 0x000000010160e707 in varstrfastcmp_locale (x=188, y=819,
> ssup=0x7fe1ea06a568) at varlena.c:1997
> #2 0x00000001013efc73 in ApplySortComparator [inlined] () at
> /Users/rhaas/pgsql/src/include/utils/sortsupport.h:225
> #3 0x00000001013efc73 in heap_compare_slots (a=<value temporarily
> unavailable, due to optimizations>, b=<value temporarily unavailable,
> due to optimizations>, arg=0x7fe1ea04e590) at sortsupport.h:681
> #4 0x00000001014057b2 in sift_down (heap=0x7fe1ea079458,
> node_off=<value temporarily unavailable, due to optimizations>) at
> binaryheap.c:274
> #5 0x000000010140573a in binaryheap_build (heap=0x7fe1ea079458) at
> binaryheap.c:131
> #6 0x00000001013ef771 in gather_merge_getnext [inlined] () at
> /Users/rhaas/pgsql/src/backend/executor/nodeGatherMerge.c:421
> #7 0x00000001013ef771 in ExecGatherMerge (node=0x7fe1ea04e590) at
> nodeGatherMerge.c:240
>
> Obviously, this is happening because we're trying to apply a
> comparator for text to a value of type integer. I propose the
> attached, slightly more involved fix, which rips out the first call to
> prepare_sort_from_pathkeys() altogether.
>
> Thanks Robert for the patch and the explanation.

I studied the patch and that look right to me. I performed manual testing,
run the scripts which I created during the gather merge patch also run
the tpch queries to make sure that it all working good.

I haven't found any regression the that changes.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Rushabh Lathia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-03-14 09:50:04 Re: parallelize queries containing initplans
Previous Message Mengxing Liu 2017-03-14 09:34:36 Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions