Re: Join performance

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Pepe Barbe <elventear(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join performance
Date: 2007-11-10 05:26:17
Message-ID: 473540F9.7020307@pws.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Pepe Barbe wrote:
> Hello,
>
> I am having an issue on PostgreSQL 8.0.12. In the past we had
> performance issues with the query planner for queries on some tables
> where we knew we had indexes and it was doing a sequential scan, and for
> this reason we issue "SET enable_seqscan = FALSE" for some queries.
>
> Recently we have stumbled upon one of these kind of queries that is
> giving terrible performance, because seqscan is disabled. I've reduced
> the problem to a a command like this one:
>
> SELECT * from gsm_sector_metrics NATURAL JOIN gsm_amr_metrics INNER JOIN
> temp_busy_hr USING(start_time,bsc_id,sect_id);
>
> Where temp_busy_hr is a temporary table.

Have you tried analyzing the temp_busy_hr table?
Possibly adding an index to the temp table can help if you are doing lots of queries.

>
> If the previous is issued with seqscan TRUE, it runs within reasonable
> time, else it runs for ever. The query plan for the previous query with
> enable_seqscan = TRUE:

It would be worth know how far the estimates are out. Also, have you tried altering the statistics target
for relevant columns to increase the accuracy?

>
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Limit (cost=0.00..384555.98 rows=1 width=3092)
> -> Nested Loop (cost=0.00..384555.98 rows=1 width=3092)
> Join Filter: (("inner".bsc_id = "outer".bsc_id) AND ("inner".site_id = "outer".site_id) AND ("inner".sect_id = "outer".sect_id))
> -> Nested Loop (cost=0.00..368645.64 rows=28 width=1192)
> Join Filter: (("outer".sect_id = "inner".sect_id) AND ("outer".bsc_id = "inner".bsc_id))
> -> Seq Scan on temp_busy_hr (cost=0.00..24.00 rows=1400 width=24)
> -> Index Scan using gsm_amr_start_time_idx on gsm_amr_metrics (cost=0.00..226.66 rows=2094 width=1168)
> Index Cond: ("outer".start_time = gsm_amr_metrics.start_time)
> -> Index Scan using gsm_sector_start_time_idx on gsm_sector_metrics t1 (cost=0.00..528.77 rows=1973 width=1936)
> Index Cond: (t1.start_time = "outer".start_time)
> (10 rows)
>
> and the plan for enable_seqscan = FALSE:
>
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Limit (cost=100000097.16.. 100720844.011111 rows=1 width=3092)
> -> Nested Loop (cost=100000097.16..100720844.01 rows=1 width=3092)
> Join Filter: (("inner".bsc_id = "outer".bsc_id) AND ("inner".site_id = "outer".site_id) AND ("inner".sect_id = "outer".sect_id))
> -> Merge Join (cost=100000097.16..100704933.67 rows=28 width=1192)
> Merge Cond: ("outer".start_time = "inner".start_time)
> Join Filter: (("inner".sect_id = "outer".sect_id) AND ("inner".bsc_id = "outer".bsc_id))
> -> Index Scan using gsm_amr_start_time_idx on gsm_amr_metrics (cost=0.00..631211.45 rows=6005551 width=1168)
> -> Sort (cost=100000097.16..100000100.66 rows=1400 width=24)
> Sort Key: temp_busy_hr.start_time
> -> Seq Scan on temp_busy_hr (cost=100000000.00..100000024.00 rows=1400 width=24)
> -> Index Scan using gsm_sector_start_time_idx on gsm_sector_metrics t1 (cost=0.00..528.77 rows=1973 width=1936)
> Index Cond: (t1.start_time = "outer".start_time)
> (12 rows)
>
> Any ideas what could I try to fix this problem?
>
> Thanks,
> Pepe
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jean-David Beyer 2007-11-10 16:27:07 Curious about dead rows.
Previous Message Alvaro Herrera 2007-11-09 21:49:21 Re: Can I Determine if AutoVacuum Does Anything?