Allow parallel DISTINCT

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Allow parallel DISTINCT
Date: 2021-08-11 04:51:02
Message-ID: CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Back in March 2016, e06a38965 added support for parallel aggregation.
IIRC, because it was fairly late in the release cycle, I dropped
parallel DISTINCT to reduce the scope a little. It's been on my list
of things to fix since then. I just didn't get around to it until
today.

The patch is just some plumbing work to connect all the correct paths
up to make it work. It's all fairly trivial.

I thought about refactoring things a bit more to get rid of the
additional calls to grouping_is_sortable() and grouping_is_hashable(),
but I just don't think it's worth making the code ugly for. We'll
only call them again if we're considering a parallel plan, in which
case it's most likely not a trivial query. Those functions are pretty
cheap anyway.

I understand that there's another patch in the September commitfest
that does some stuff with Parallel DISTINCT, but that goes about
things a completely different way by creating multiple queues to
distribute values by hash. I don't think there's any overlap here.
We'd likely want to still have the planner consider both methods if we
get that patch sometime.

David

Attachment Content-Type Size
parallel_distinct.patch application/octet-stream 10.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-08-11 04:56:27 Re: Fix around conn_duration in pgbench
Previous Message David G. Johnston 2021-08-11 04:29:49 Re: use-regular-expressions-to-simplify-less_greater-and-not_equals.patch