Re: Discard ORDER BY/DISTINCT when an ANY/IN sublink is pulled up to a join

From: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Discard ORDER BY/DISTINCT when an ANY/IN sublink is pulled up to a join
Date: 2026-06-10 22:33:47
Message-ID: CAN4CZFPSMmef7+j32SW017wgknhKiA4CSQHC8NOtaMy9DOHMQA@mail.gmail.com
Lists: pgsql-hackers

Hello

I verified that the patch is generally faster in my benchmarks, with
one exception: anti-joins with heavy duplication end up being
significantly slower. For example:

create table ao (a int not null);
create table ai (k int not null);
insert into ao select g from generate_series(1,100000) g;
insert into ai select g % 50 from generate_series(1,2000000) g;
analyze ao;
analyze ai;
\timing on
explain (analyze, costs off, timing off, summary off)
select count(*) from ao where a not in (select distinct k from ai);

This seems related to parallelization: in these scenarios the patched
version chooses a serial plan, whereas master uses a parallelized
deduplication, and the patched version ends up being 2-4x slower.
If I force it to use parallel workers, the patched version is faster
even in these cases.
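
For anyone reproducing this, one possible way to push the planner
toward a parallel plan is to make parallelism look cheap for the
session. This is only a sketch: the values below are assumptions
chosen for illustration, not necessarily the settings behind the
numbers above.

-- assumed values, session-local; undo with RESET ALL
set max_parallel_workers_per_gather = 4;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
explain (analyze, costs off, timing off, summary off)
select count(*) from ao where a not in (select distinct k from ai);

If the plan still has no Gather node, the planner did not consider a
parallel path usable for this query shape, and the settings above
will not change that.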
