Re: Join optimisation Quandry

From: Ceri Storey <cez-misc(dot)pgsql-perf(at)necrofish(dot)org(dot)uk>
To: Ceri Storey <cez(at)necrofish(dot)org(dot)uk>
Cc: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Join optimisation Quandry
Date: 2004-01-17 11:58:26
Message-ID: 20040117115826.GD43530@mrtall.compsoc.man.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Jan 17, 2004 at 01:03:34AM +0000, Ceri Storey wrote:
> Okay, from top to bottom:
>
> SELECT p1.chan_name, p1.prog_start AS now_start, p1.prog_id, p1.title_text,
> p2.prog_start AS next_start, p2.prog_id, p2.title_text,
> p1.title_wanted, p2.title_wanted, p1.chan_id
> FROM (programme natural join channel NATURAL JOIN title) AS p1
> LEFT OUTER JOIN (programme NATURAL JOIN title) AS p2
> ON p1.prog_next = p2.prog_id
> WHERE p1.prog_start <= timestamp 'now' AND p1.prog_stop > timestamp 'now'
> ORDER BY p1.chan_id ASC

Although, as I've just found, another bottleneck is the title table.
PostgreSQL seems to inst on doing a Seq Scan on the entire table.

Compare:
tv=> explain analyse SELECT * FROM tid LEFT OUTER JOIN title ON t1 = title_id OR t2 = title_id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=190.83..267285.83 rows=2000 width=35) (actual time=222.776..2430.073 rows=33 loops=1)
Join Filter: (("outer".t1 = "inner".title_id) OR ("outer".t2 = "inner".title_id))
-> Seq Scan on tid (cost=0.00..20.00 rows=1000 width=8) (actual time=0.028..10.457 rows=17 loops=1)
-> Materialize (cost=190.83..297.66 rows=10683 width=27) (actual time=0.197..57.918 rows=10767 loops=17)
-> Seq Scan on title (cost=0.00..190.83 rows=10683 width=27) (actual time=0.045..64.988 rows=10767 loops=1)
Total runtime: 2435.059 ms
(6 rows)

With:
tv=> explain analyse select * from title where title_id IN (SELECT t1 FROM tid UNION SELECT t2 FROM tid);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=205.16..451.40 rows=200 width=27) (actual time=3.065..82.689 rows=33 loops=1)
Hash Cond: ("outer".title_id = "inner".t1)
-> Seq Scan on title (cost=0.00..190.83 rows=10683 width=27) (actual time=0.010..36.325 rows=10767 loops=1)
-> Hash (cost=204.66..204.66 rows=200 width=4) (actual time=1.464..1.464 rows=0 loops=1)
-> HashAggregate (cost=204.66..204.66 rows=200 width=4) (actual time=1.234..1.355 rows=33 loops=1)
-> Subquery Scan "IN_subquery" (cost=169.66..199.66 rows=2000 width=4) (actual time=0.735..1.104 rows=33 loops=1)
-> Unique (cost=169.66..179.66 rows=2000 width=4) (actual time=0.728..0.934 rows=33 loops=1)
-> Sort (cost=169.66..174.66 rows=2000 width=4) (actual time=0.722..0.779 rows=34 loops=1)
Sort Key: t1
-> Append (cost=0.00..60.00 rows=2000 width=4) (actual time=0.054..0.534 rows=34 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..30.00 rows=1000 width=4) (actual time=0.050..0.228 rows=17 loops=1)
-> Seq Scan on tid (cost=0.00..20.00 rows=1000 width=4) (actual time=0.041..0.126 rows=17 loops=1)
-> Subquery Scan "*SELECT* 2" (cost=0.00..30.00 rows=1000 width=4) (actual time=0.014..0.183 rows=17 loops=1)
-> Seq Scan on tid (cost=0.00..20.00 rows=1000 width=4) (actual time=0.008..0.087 rows=17 loops=1)
Total runtime: 83.214 ms
(15 rows)

--
Ceri Storey <cez(at)necrofish(dot)org(dot)uk>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message John Siracusa 2004-01-17 15:34:22 Idle postmaster taking up a lot of CPU
Previous Message Ceri Storey 2004-01-17 10:40:49 Re: Join optimisation Quandry