Re: Fix BUG #17335: Duplicate result rows in Gather node

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fix BUG #17335: Duplicate result rows in Gather node
Date: 2022-01-19 20:32:17
Message-ID: CAApHDvqMUshyci-F2CZ-jkSbmuhzCudhF5t5_z0shFxd9D_hVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> Suggested quick (and valid) fix in the patch attached:
> - If Append has single child, then copy its parallel awareness.

I've been looking at this and I've gone through changing my mind about
what's the right fix quite a number of times.

My current thoughts are that I don't really like the fact that we can
have plans in the following shape:

Finalize Aggregate
-> Gather
Workers Planned: 1
-> Partial Aggregate
-> Parallel Hash Left Join
Hash Cond: (gather_append_1.fk = gather_append_2.fk)
-> Index Scan using gather_append_1_ix on gather_append_1
Index Cond: (f = true)
-> Parallel Hash
-> Parallel Seq Scan on gather_append_2

It's only made safe by the fact that Gather will only use 1 worker.
To me, it just seems too fragile to assume that's always going to be
the case. I feel like this fix just relies on the fact that
create_gather_path() and create_gather_merge_path() do
"pathnode->num_workers = subpath->parallel_workers;". If someone
decided that was to work a different way, then we risk this breaking
again. Additionally, today we have Gather and GatherMerge, but we may
one day end up with more node types that gather results from parallel
workers, or even a completely different way of executing plans.

I think a safer way to fix this is to just not remove the
Append/MergeAppend node if the parallel_aware flag of the only-child
and the Append/MergeAppend don't match. I've done that in the
attached.

I believe the code at the end of add_paths_to_append_rel() can remain as is.

David

Attachment Content-Type Size
dont_remove_singlechild_appends_with_mismatching_parallel_awareness.patch application/octet-stream 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-01-19 20:40:12 Re: Replace uses of deprecated Python module distutils.sysconfig
Previous Message Andres Freund 2022-01-19 20:29:26 Re: Adding CI to our tree