Re: speeding up a join query that utilizes a view

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Kirk Wythers <kwythers(at)umn(dot)edu>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: speeding up a join query that utilizes a view
Date: 2013-01-17 15:12:42
Message-ID: A76B25F2823E954C9E45E32FA49D70EC08F7283E@mail.corp.perceptron.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> -----Original Message-----
> From: Kirk Wythers [mailto:kwythers(at)umn(dot)edu]
> Sent: Thursday, January 17, 2013 12:16 AM
> To: pgsql-general(at)postgresql(dot)org
> Subject: speeding up a join query that utilizes a view
>
> I am looking for advice on a performance problem. I'm pretty sure that
> the culprit of my slow performance is a view that is several hundred
> million records in size. Because it is a view, I can only index the
> underlying table, but because the view generates an "un-pivoted"
> version of the underlying table with un unnest function, I can't index
> the important column in the underlying table, because it doesn't exist
> until after the un-pivot or stacking function of the view... I know... this
> is all very circular.
>
> Here is the join query that uses the view. I have
>
> SELECT
> data_key.site,
> data_key.canopy,
> data_key.measurement_interval,
> data_key.treatment_code,
> data_key.treatment_abbr,
> data_key.plot,
> fifteen_min_stacked_view.*
> FROM
> data_key,
> fifteen_min_stacked_view
> WHERE
> data_key.variable_channel =
> fifteen_min_stacked_view.variable AND data_key.block_name =
> fifteen_min_stacked_view.block_name
> AND fifteen_min_stacked_view.variable ~ 'tsoil'
>
> I have tried adding indexes where I can on the join colums in the
> data_key table Here is the EXPLAIN.
>
>
> QUERY PLAN
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> --------------------------------
> Hash Join (cost=195.20..548004.70 rows=196 width=192) (actual
> time=3.295..443523.222 rows=28779376 loops=1)
> Hash Cond: ((fifteen_min_stacked_propper.variable =
> (data_key.variable_channel)::text) AND
> ((fifteen_min_stacked_propper.block_name)::text =
> (data_key.block_name)::text))
> -> Subquery Scan on fifteen_min_stacked_propper
> (cost=0.00..547620.47 rows=2878 width=156) (actual
> time=0.247..424911.643 rows=28779376 loops=1)
> Filter: (fifteen_min_stacked_propper.variable ~ 'tsoil'::text)
> -> Index Scan using fifteen_min_pkey on fifteen_min
> (cost=0.00..525136.58 rows=1798711 width=1072) (actual
> time=0.034..96077.588 rows=428093218 loops=1)
> -> Hash (cost=124.28..124.28 rows=4728 width=55) (actual
> time=3.036..3.036 rows=4728 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 437kB
> -> Seq Scan on data_key (cost=0.00..124.28 rows=4728
> width=55) (actual time=0.007..1.277 rows=4728 loops=1) Total runtime:
> 444912.792 ms
> (9 rows)
>
>
> Any ideas would be much appreciated

Not enough information:

Postgres version?
OS?
Some Postgres configuration parameters, specifically related to "RESOURCE USAGE" and " QUERY TUNING"?
Table structures (including indexes) for: fifteen_min_stacked_propper, fifteen_min, and data_key?
View definition for fifteen_min_stacked_view?

Regards,
Igor Neyman

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2013-01-17 15:35:50 Re: String comparison and the SQL standard
Previous Message Миша Тюрин 2013-01-17 14:16:12 standby, pg_basebackup and last xlog file