From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Allow parallel DISTINCT |
Date: | 2021-08-11 04:51:02 |
Message-ID: | CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com |
Lists: | pgsql-hackers |
Back in March 2016, e06a38965 added support for parallel aggregation.
IIRC, because it was fairly late in the release cycle, I dropped
parallel DISTINCT to reduce the scope a little. It's been on my list
of things to fix since then. I just didn't get around to it until
today.
The patch is just some plumbing work to connect all the correct paths
up to make it work. It's all fairly trivial.
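
For anyone who wants a mental model of the plan shape, here is a rough sketch. The message above doesn't spell it out, so the two-phase arrangement is an assumption borrowed from how parallel aggregation works: each worker de-duplicates its own share of the rows, and the leader de-duplicates again over the gathered results. All function names are hypothetical and nothing here is PostgreSQL code.

```python
def partial_distinct(chunk):
    """Phase 1 (assumed): a worker removes duplicates within its own rows."""
    return set(chunk)


def final_distinct(partial_results):
    """Phase 2 (assumed): the leader removes duplicates across workers."""
    seen = set()
    for partial in partial_results:
        seen |= partial
    return seen


def parallel_distinct(rows, n_workers=3):
    # A round-robin split stands in for a parallel scan handing out rows.
    chunks = [rows[i::n_workers] for i in range(n_workers)]
    # In a real plan these would run concurrently in separate workers.
    partials = [partial_distinct(chunk) for chunk in chunks]
    # Combining the partial sets stands in for Gather plus a final DISTINCT.
    return final_distinct(partials)


result = parallel_distinct([1, 2, 2, 3, 1, 4, 4, 4])
```

The second phase is needed because the same value can turn up in more than one worker's share of the rows.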
I thought about refactoring things a bit more to get rid of the
additional calls to grouping_is_sortable() and grouping_is_hashable(),
but I just don't think it's worth making the code ugly for. We'll
only call them again if we're considering a parallel plan, in which
case it's most likely not a trivial query. Those functions are pretty
cheap anyway.
I understand that there's another patch in the September commitfest
that also works on Parallel DISTINCT, but it goes about things in a
completely different way, by creating multiple queues to distribute
values by hash. I don't think there's any overlap here. We'd likely
still want the planner to consider both methods if we get that patch
sometime.
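
To illustrate how the hash-distribution idea differs, here is a rough sketch based only on the one-line description above, not on the other patch itself. Each value is routed to a worker by its hash, so every copy of a value lands in the same worker and no cross-worker de-duplication is needed afterwards. The function name and the in-memory lists standing in for queues are hypothetical.

```python
def hash_partitioned_distinct(rows, n_workers=3):
    # One list per worker stands in for the per-worker queues.
    queues = [[] for _ in range(n_workers)]
    for row in rows:
        # Equal values hash equally, so they always share a queue.
        queues[hash(row) % n_workers].append(row)

    result = []
    for queue in queues:
        # Each worker de-duplicates only its own queue.
        result.extend(set(queue))
    # The partitions are disjoint, so plain concatenation is already distinct.
    return result


values = hash_partitioned_distinct([1, 2, 2, 3, 1, 4, 4, 4])
```

The trade-off in this sketch is that rows have to be redistributed before any de-duplication happens, which is why it makes sense for a planner to cost both methods.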
David
Attachment | Content-Type | Size |
---|---|---|
parallel_distinct.patch | application/octet-stream | 10.6 KB |