speeding up a join query that utilizes a view

From: Kirk Wythers <kwythers(at)umn(dot)edu>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: speeding up a join query that utilizes a view
Date: 2013-01-17 05:15:56
Message-ID: 0A3B5B90-CE27-47FC-A0D0-2B76932B49B0@umn.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I am looking for advice on a performance problem. I'm pretty sure that the culprit of my slow performance is a view that is several hundred million records in size. Because it is a view, I can only index the underlying table, but because the view generates an "un-pivoted" version of the underlying table with un unnest function, I can't index the important column in the underlying table, because it doesn't exist until after the un-pivot or stacking function of the view… I know… this is all very circular.

Here is the join query that uses the view. I have

SELECT
data_key.site, data_key.canopy, data_key.measurement_interval, data_key.treatment_code, data_key.treatment_abbr, data_key.plot, fifteen_min_stacked_view.*
FROM
data_key, fifteen_min_stacked_view WHERE data_key.variable_channel = fifteen_min_stacked_view.variable AND data_key.block_name = fifteen_min_stacked_view.block_name AND fifteen_min_stacked_view.variable ~ 'tsoil'

I have tried adding indexes where I can on the join colums in the data_key table
Here is the EXPLAIN.

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=195.20..548004.70 rows=196 width=192) (actual time=3.295..443523.222 rows=28779376 loops=1)
Hash Cond: ((fifteen_min_stacked_propper.variable = (data_key.variable_channel)::text) AND ((fifteen_min_stacked_propper.block_name)::text = (data_key.block_name)::text))
-> Subquery Scan on fifteen_min_stacked_propper (cost=0.00..547620.47 rows=2878 width=156) (actual time=0.247..424911.643 rows=28779376 loops=1)
Filter: (fifteen_min_stacked_propper.variable ~ 'tsoil'::text)
-> Index Scan using fifteen_min_pkey on fifteen_min (cost=0.00..525136.58 rows=1798711 width=1072) (actual time=0.034..96077.588 rows=428093218 loops=1)
-> Hash (cost=124.28..124.28 rows=4728 width=55) (actual time=3.036..3.036 rows=4728 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 437kB
-> Seq Scan on data_key (cost=0.00..124.28 rows=4728 width=55) (actual time=0.007..1.277 rows=4728 loops=1)
Total runtime: 444912.792 ms
(9 rows)

Any ideas would be much appreciated

Responses

Browse pgsql-general by date

  From Date Subject
Next Message ning chan 2013-01-17 05:17:30 Streaming Replication Failover
Previous Message classical_89 2013-01-17 03:48:49 Logging affected rows