Re: Gather Merge

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Gather Merge
Date: 2017-03-09 13:21:41
Message-ID: CAGPqQf2ojLdYhvDy3pRomyBpapWiiYM1W6q-rB7t-sMZ_UNfSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 9, 2017 at 6:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Mar 8, 2017 at 11:59 PM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> > Here is another version of patch with the suggested changes.
>
> Committed.
>
>
Thanks Robert for committing this.

My colleague Neha Sharma found one regression with the patch. I was about
to send this mail and noticed that you committed the patch.

Here is the small example:

Test setup:

1) ./db/bin/pgbench postgres -i -F 100 -s 20

2) update pgbench_accounts set filler = 'foo' where aid%10 = 0;

3) vacuum analyze pgbench_accounts

4)

postgres=# set max_parallel_workers_per_gather = 4;
SET

postgres=# explain select aid from pgbench_accounts where aid % 25= 0 group
by aid;
ERROR: ORDER/GROUP BY expression not found in targetlist

postgres=# set enable_indexonlyscan = off;
SET
postgres=# explain select aid from pgbench_accounts where aid % 25= 0 group
by aid;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------
Group (cost=44708.21..45936.81 rows=10001 width=4)
Group Key: aid
-> Gather Merge (cost=44708.21..45911.81 rows=10000 width=0)
Workers Planned: 4
-> Group (cost=43708.15..43720.65 rows=2500 width=4)
Group Key: aid
-> Sort (cost=43708.15..43714.40 rows=2500 width=4)
Sort Key: aid
-> Parallel Seq Scan on pgbench_accounts
(cost=0.00..43567.06 rows=2500 width=4)
Filter: ((aid % 25) = 0)
(10 rows)

- Index only scan with GM do work - but with ORDER BY clause
postgres=# set enable_seqscan = off;
SET
postgres=# explain select aid from pgbench_accounts where aid % 25= 0 order
by aid;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=1000.49..113924.61 rows=10001 width=4)
Workers Planned: 4
-> Parallel Index Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=0.43..111733.33 rows=2500 width=4)
Filter: ((aid % 25) = 0)
(4 rows)

Debugging further I found that problem only when IndexOnlyScan under GM and
that to only with grouping. Debugging problem I found that ressortgroupref
is
not getting set. That lead me thought that it might be because
create_gather_merge_plan()
is building tlist, with build_path_tlist. Another problem I found is that
create_grouping_paths() is passing NULL for the targets while calling
create_gather_merge_path(). (fix_target_gm.patch)

With those changes above test is running fine, but it broke other things.
Like

postgres=# explain select distinct(bid) from pgbench_accounts where filler
like '%foo%' group by aid;
ERROR: GatherMerge child's targetlist doesn't match GatherMerge

Another thing I noticed that, if we stop adding gathermerge during the
generate_gather_paths() then all the test working fine.
(comment_gm_from_generate_gather.patch)

I will continue looking at this problem.

Attaching both the patch for reference.

regards,
Rushabh Lathia
www.EnterpriseDB.com

Attachment Content-Type Size
comment_gm_from_generate_gather.patch application/x-download 987 bytes
fix_target_gm.patch application/x-download 1.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Artur Zakirov 2017-03-09 13:25:00 Re: Should buffer of initialization fork have a BM_PERMANENT flag
Previous Message Michael Paquier 2017-03-09 12:59:09 Re: CREATE/ALTER ROLE PASSWORD ('value' USING 'method')