Re: Fix BUG #17335: Duplicate result rows in Gather node

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Fix BUG #17335: Duplicate result rows in Gather node
Date: 2021-12-30 12:29:51
Message-ID: CAFiTN-ukx33XeaPJwo9BgdTyPWv3Ue8nLj3uwre6kubuxEXNPQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 30, 2021 at 4:44 PM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
wrote:

> Good day, hackers.
>
> Problem:
> - An Append path is created with parallel_aware = true set explicitly.
> - It has two children: one is trivial, the other has parallel_aware = false.
> The trivial child is dropped.
> - The Gather/GatherMerge path takes the Append path as its child and thinks
> its child has parallel_aware = true.
> - But the Append path is removed at the end, since it has only one child.
> - Now Gather/GatherMerge thinks its child is parallel_aware, but it
> is not.
> Gather/GatherMerge runs its child twice, in a worker and in the leader,
> and gathers the same rows twice.
>
> Reproduction code is attached (repro.sql; it is included as a test as well).
>

Yeah, this is a problem.
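
To make the failure mode concrete, here is a toy model (Python, not PostgreSQL code) of a Gather sitting on top of a child it wrongly believes to be parallel aware. The names `Scan`, `run`, and `gather` are invented for illustration, and the way rows are split between participants is a simplifying assumption; the real executor divides work differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scan:
    """Hypothetical stand-in for a plan node; not a PostgreSQL structure."""
    rows: List[int]
    parallel_aware: bool = False

    def run(self, participant: int, n_participants: int) -> List[int]:
        if self.parallel_aware:
            # Assumption: a parallel-aware node splits its rows among participants.
            return self.rows[participant::n_participants]
        # A node that is not parallel aware returns every row to every caller.
        return list(self.rows)


def gather(child: Scan, n_workers: int = 1) -> List[int]:
    """Collect rows from the leader plus n_workers workers."""
    n_participants = n_workers + 1
    out: List[int] = []
    for participant in range(n_participants):
        out.extend(child.run(participant, n_participants))
    return out


rows = [1, 2, 3]
aware = gather(Scan(rows, parallel_aware=True))
unaware = gather(Scan(rows, parallel_aware=False))
print(sorted(aware))    # each row appears once
print(sorted(unaware))  # each row appears once per participant
```

With one worker plus the leader, the child that is not parallel aware hands back its full result to both participants, which matches the duplicate rows described in the report.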

>
> Suggested quick (and valid) fix, in the attached patch:
> - If Append has a single child, then copy its parallel awareness.
>
> The bug was introduced by commit 8edd0e79460b414b1d971895312e549e95e12e4f
> "Suppress Append and MergeAppend plan nodes that have a single child."
>
> During the discussion, it was suggested [1] that those fields should be copied:
>
> > I haven't looked into whether this does the right things for parallel
> > planning --- possibly create_[merge]append_path need to propagate up
> > parallel-related path fields from the single child?
>
> But it was not so obvious [2].
>
> A better fix could also remove the Gather/GatherMerge node if
> its child is not parallel aware.
>
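
For reference, the quick fix quoted above (copy the single child's parallel awareness into the Append) can be sketched as a toy model. This is Python rather than planner code, and `Path` and `create_append_path_sketch` are hypothetical names chosen for illustration; the attached patch itself is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Path:
    """Hypothetical, heavily simplified path node."""
    kind: str
    parallel_aware: bool
    children: List["Path"] = field(default_factory=list)


def create_append_path_sketch(children: List[Path], parallel_aware: bool) -> Path:
    # Assumption: trivial children are modelled as kind == "Dummy" and dropped.
    live = [c for c in children if c.kind != "Dummy"]
    path = Path("Append", parallel_aware, live)
    if len(live) == 1:
        # The Append will later be replaced by its only child, so it should
        # advertise that child's parallel awareness rather than its own.
        path.parallel_aware = live[0].parallel_aware
    return path


scan = Path("SeqScan", parallel_aware=False)
dummy = Path("Dummy", parallel_aware=False)
append = create_append_path_sketch([scan, dummy], parallel_aware=True)
print(append.parallel_aware)
```

In this model the Append no longer claims to be parallel aware once only the non-parallel-aware scan is left underneath it.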

The Gather path will only be created if we have an underlying partial path.
So I think that when we are generating the Append path only from non-partial
paths, we can check whether the number of child nodes is just 1 and, if so,
not generate the partial Append path. Then you will not generate the
partial join either, and eventually the Gather will be avoided.
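
A rough sketch of that idea, as a toy model in Python rather than planner code: `ChildPath`, `PartialAppend`, and `consider_partial_append` are hypothetical names, and treating zero remaining children the same as one is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChildPath:
    """Hypothetical child path; is_dummy marks a trivial child that gets dropped."""
    name: str
    is_dummy: bool = False


@dataclass
class PartialAppend:
    """Hypothetical partial Append built only from non-partial children."""
    children: List[ChildPath]
    parallel_aware: bool = True


def consider_partial_append(nonpartial_children: List[ChildPath]) -> Optional[PartialAppend]:
    live = [c for c in nonpartial_children if not c.is_dummy]
    if len(live) < 2:
        # With a single real child there is no partial Append, so there is
        # no partial path for a Gather to be built on.
        return None
    return PartialAppend(children=live)


single = consider_partial_append([ChildPath("t1"), ChildPath("empty", is_dummy=True)])
several = consider_partial_append([ChildPath("t1"), ChildPath("t2")])
print(single, several)
```

The point of doing the check at this stage is that nothing above the Append ever sees a partial path, so no Gather needs to be removed afterwards.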

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
