Re: [BUGS] [postgresql 10 beta3] unrecognized node type: 90

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Adam, Etienne (Nokia-TECH/Issy Les Moulineaux)" <etienne(dot)adam(at)nokia(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, "Duquesne, Pierre (Nokia-TECH/Issy Les Moulineaux)" <pierre(dot)duquesne(at)nokia(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Date: 2017-08-17 04:37:42
Message-ID: CAA4eK1J_dRv+pGt2bquwmwd_K9+Nght2=tbAc3uUzCinJTMmLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Tue, Aug 15, 2017 at 7:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Aug 15, 2017 at 7:31 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Attached patch fixes the issue for me. I have locally verified that
>>> the gather merge gets executed in rescan path. I haven't added a test
>>> case for the same as having gather or gather merge on the inner side
>>> of join can be time-consuming. However, if you or others feel that it
>>> is important to have a test to cover this code path, then I can try to
>>> produce one.
>
>> Committed.
>
>> I believe that between this commit and the test-coverage commit from
>> Andres, this open item is reasonably well addressed. If someone
>> thinks more needs to be done, please specify. Thanks.
>
> How big a deal do we think test coverage is? It looks like
> ExecReScanGatherMerge is identical logic to ExecReScanGather,
> which *is* covered according to coverage.postgresql.org, but
> it wouldn't be too surprising if they diverge in future.
>
> I should think it wouldn't be that expensive to create a test
> case, if you already have test cases that invoke GatherMerge.
> Adding a right join against a VALUES clause with a small number of
> entries, and a non-mergeable/hashable join clause, ought to do it.
>

I have done some experiments based on this idea to generate a test,
but I think it is not as straightforward as it appears. Below are
some of my experiments:

Query that uses GatherMerge in regression tests
---------------------------------------------------------------------

regression=# explain (costs off) select string4, count((unique2))
from tenk1 group by string4 order by string4;
QUERY PLAN
----------------------------------------------------
Finalize GroupAggregate
Group Key: string4
-> Gather Merge
Workers Planned: 2
-> Partial GroupAggregate
Group Key: string4
-> Sort
Sort Key: string4
-> Parallel Seq Scan on tenk1
(9 rows)

Modified Query
----------------------
regression=# explain (costs off) select tenk1.hundred,
count((unique2)) from tenk1 right join (values (1,100), (2,200)) as t
(two, hundred) on t.two
> 1 and tenk1.hundred > 1 group by tenk1.hundred order by tenk1.hundred;
QUERY PLAN
--------------------------------------------------------------------------
Sort
Sort Key: tenk1.hundred
-> HashAggregate
Group Key: tenk1.hundred
-> Nested Loop Left Join
Join Filter: ("*VALUES*".column1 > 1)
-> Values Scan on "*VALUES*"
-> Gather
Workers Planned: 4
-> Parallel Index Scan using tenk1_hundred on tenk1
Index Cond: (hundred > 1)
(11 rows)

The cost of GatherMerge is always higher than Gather in this case
which is quite obvious as GatherMerge has to perform some additional
task. I am not aware of a way such that Grouping and Sorting can be
pushed below parallel node (Gather/GatherMerge) in this case, if there
is any such possibility, then it might prefer GatherMerge.

Another way to make it parallel is, add a new guc enable_gather
similar to enable_gathermerge and then set that to off, it will prefer
GatherMerge in that case. I think it is anyway good to have such a
guc. I will go and do it this way unless you have a better idea.

Note - enable_gathermerge is not present in postgresql.conf. I think
we should add it in the postgresql.conf.sample file.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2017-08-17 04:53:05 Re: \copy produces CSV output that cannot be read by \copy
Previous Message Tatsuo Ishii 2017-08-17 01:50:30 Re: [HACKERS] Replication to Postgres 10 on Windows is broken

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-08-17 04:56:53 Re: expanding inheritance in partition bound order
Previous Message Tom Lane 2017-08-17 04:28:31 Re: recovery_target_time = 'now' is not an error but still impractical setting