Re: Using each rel as both outer and inner for JOIN_ANTI

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Emre Hasegeli <emre(at)hasegeli(dot)com>
Subject: Re: Using each rel as both outer and inner for JOIN_ANTI
Date: 2022-08-11 02:58:32
Message-ID: CAMbWs49mh1Fsnw1Zu5MzOzhHsh_Gcrz0egVk0MGAOq16ezXZPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 10, 2022 at 4:40 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> On 2022-Aug-10, Richard Guo wrote:
>
> > The right-anti join plan has the same cost estimation with right join
> > plan in this case. So would you please help to test what the right join
> > plan looks like in your env for the query below?
> >
> > select * from foo left join bar on foo.a = bar.c;
>
> You're right, it does.
>
> 55432 16devel 475322=# explain (analyze, buffers) select * from foo left
> join bar on foo.a = bar.c;
> QUERY PLAN
>
>
> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Hash Right Join (cost=1.23..90875.24 rows=10 width=20) (actual
> time=456.410..456.415 rows=10 loops=1)
> Hash Cond: (bar.c = foo.a)
> Buffers: shared hit=15852 read=6273
> -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12)
> (actual time=0.036..210.468 rows=5000000 loops=1)
> Buffers: shared hit=15852 read=6272
> -> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.037..0.038
> rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 9kB
> Buffers: shared read=1
> -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual
> time=0.022..0.026 rows=10 loops=1)
> Buffers: shared read=1
> Planning:
> Buffers: shared hit=92 read=13
> Planning Time: 1.077 ms
> Execution Time: 456.458 ms
> (14 filas)

Thanks for help testing. Comparing the anti join plan and the right join
plan, the estimated cost and the execution time mismatch a lot. Seems
the cost estimate of hashjoin path is not that precise for this case
even in the unpatched codes. Maybe this is something we need to improve.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2022-08-11 03:33:39 Re: [RFC] building postgres with meson
Previous Message John Naylor 2022-08-11 02:50:54 Re: optimize lookups in snapshot [sub]xip arrays