Re: [Gmod-schema] Re: performace problem after VACUUM

From: Scott Cain <cain(at)cshl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, gmod schema <gmod-schema(at)lists(dot)sourceforge(dot)net>
Subject: Re: [Gmod-schema] Re: performace problem after VACUUM
Date: 2003-02-14 19:22:51
Message-ID: 1045250571.1486.625.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

An update: I ran alter table as suggested, ie,

alter table featureloc alter srcfeature_id set statistics 100;

on each column in the table, running vacuum analyze and explain analyze
on the query in between each alter to see if it made any difference. It
did not. Postgres still instists on doing a seq scan on featureloc:

Unique (cost=336831.46..337179.45 rows=2320 width=47) (actual
time=27219.62..27220.30 rows=179 loops=1)
-> Sort (cost=336831.46..336831.46 rows=23200 width=47) (actual
time=27219.61..27219.80 rows=186 loops=1)
-> Nested Loop (cost=0.00..334732.77 rows=23200 width=47)
(actual time=1003.04..27217.99 rows=186 loops=1)
-> Seq Scan on featureloc fl (cost=0.00..261709.31
rows=23200 width=14) (actual time=814.68..26094.18 rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=6.03..6.03 rows=1
loops=186)
Total runtime: 27220.63 msec

On Fri, 2003-02-14 at 12:29, Scott Cain wrote:
> Tom,
>
> Sorry about that: I'll try to briefly give the information you are
> looking for. I've read the docs on ALTER TABLE, but it is not clear to
> me what columns I should change STATISTICS on, or should I just do it on
> all of the columns for which indexes exist?
>
> Here's the query again:
>
> select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
> from feature f, featureloc fl
> where
> fl.srcfeature_id = 1 and
> ((fl.strand=1 and fl.nbeg <= 393164 and fl.nend >= 390956) OR
> (fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and
> f.feature_id = fl.feature_id
>
> --------------------------------------------------------------------------
>
> Naive database:
>
> Unique (cost=75513.46..75513.48 rows=1 width=167) (actual
> time=22815.25..22815.93 rows=179 loops=1)
> -> Sort (cost=75513.46..75513.46 rows=1 width=167) (actual
> time=22815.24..22815.43 rows=186 loops=1)
> -> Nested Loop (cost=0.00..75513.45 rows=1 width=167) (actual
> time=2471.25..22814.01 rows=186 loops=1)
> -> Index Scan using featureloc_idx2 on featureloc fl
> (cost=0.00..75508.43 rows=1 width=14) (actual time=2463.83..22796.50
> rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..5.01 rows=1 width=153) (actual time=0.08..0.08 rows=1
> loops=186)
> Total runtime: 22816.63 msec
> --------------------------------------------------------------------------
>
> Naive database after featureloc_idx2 dropped:
>
> Unique (cost=75545.46..75545.48 rows=1 width=167) (actual
> time=5232.36..5234.51 rows=179 loops=1)
> -> Sort (cost=75545.46..75545.46 rows=1 width=167) (actual
> time=5232.35..5232.54 rows=186 loops=1)
> -> Nested Loop (cost=0.00..75545.45 rows=1 width=167) (actual
> time=291.46..5220.69 rows=186 loops=1)
> -> Index Scan using featureloc_src_strand_beg_end on
> featureloc fl (cost=0.00..75540.43 rows=1 width=14) (actual
> time=291.30..5214.46 rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..5.01 rows=1 width=153) (actual time=0.02..0.03 rows=1
> loops=186)
> Total runtime: 5234.89 msec
> --------------------------------------------------------------------------
>
> Database after VACUUM ANALYZE was run:
>
> Unique (cost=344377.70..344759.85 rows=2548 width=47) (actual
> time=26466.82..26467.51 rows=179 loops=1)
> -> Sort (cost=344377.70..344377.70 rows=25477 width=47) (actual
> time=26466.82..26467.01 rows=186 loops=1)
> -> Nested Loop (cost=0.00..342053.97 rows=25477 width=47)
> (actual time=262.66..26465.63 rows=186 loops=1)
> -> Seq Scan on featureloc fl (cost=0.00..261709.31
> rows=25477 width=14) (actual time=118.62..26006.05 rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=2.45..2.46 rows=1
> loops=186)
> Total runtime: 26467.85 msec
> --------------------------------------------------------------------------
>
> After disallowing seqscans (set enable_seqscan=0):
>
> Unique (cost=356513.46..356895.61 rows=2548 width=47) (actual
> time=27494.62..27495.34 rows=179 loops=1)
> -> Sort (cost=356513.46..356513.46 rows=25477 width=47) (actual
> time=27494.61..27494.83 rows=186 loops=1)
> -> Nested Loop (cost=0.00..354189.73 rows=25477 width=47)
> (actual time=198.88..27493.48 rows=186 loops=1)
> -> Index Scan using featureloc_idx1 on featureloc fl
> (cost=0.00..273845.08 rows=25477 width=14) (actual time=129.30..27280.95
> rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=1.13..1.13 rows=1
> loops=186)
> Total runtime: 27495.66 msec
> --------------------------------------------------------------------------
>
> After dropping featureloc_idx1:
>
> Unique (cost=1310195.21..1310577.36 rows=2548 width=47) (actual
> time=21692.69..21693.37 rows=179 loops=1)
> -> Sort (cost=1310195.21..1310195.21 rows=25477 width=47) (actual
> time=21692.69..21692.88 rows=186 loops=1)
> -> Nested Loop (cost=0.00..1307871.48 rows=25477 width=47)
> (actual time=2197.65..21691.39 rows=186 loops=1)
> -> Index Scan using featureloc_idx2 on featureloc fl
> (cost=0.00..1227526.82 rows=25477 width=14) (actual
> time=2197.49..21618.89 rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=0.37..0.38 rows=1
> loops=186)
> Total runtime: 21693.72 msec
> --------------------------------------------------------------------------
>
> After dropping featureloc_idx2:
>
> Unique (cost=1414516.98..1414899.13 rows=2548 width=47) (actual
> time=1669.17..1669.86 rows=179 loops=1)
> -> Sort (cost=1414516.98..1414516.98 rows=25477 width=47) (actual
> time=1669.17..1669.36 rows=186 loops=1)
> -> Nested Loop (cost=0.00..1412193.25 rows=25477 width=47)
> (actual time=122.69..1668.08 rows=186 loops=1)
> -> Index Scan using featureloc_src_strand_beg_end on
> featureloc fl (cost=0.00..1331848.60 rows=25477 width=14) (actual
> time=122.51..1661.81 rows=186 loops=1)
> -> Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=0.02..0.03 rows=1
> loops=186)
> Total runtime: 1670.20 msec
>
>
> On Fri, 2003-02-14 at 12:00, Tom Lane wrote:
> > Scott Cain <cain(at)cshl(dot)org> writes:
> > > [ much stuff ]
> >
> > Could we see EXPLAIN ANALYZE, not just EXPLAIN, output for all these
> > alternatives? Your question boils down to "why is the planner
> > misestimating these queries" ... which is a difficult question to
> > answer when given only the estimates and not the reality.
> >
> > A suggestion though is that you might need to raise the statistics
> > target on the indexed columns, so that ANALYZE will collect
> > finer-grained statistics. (See ALTER TABLE ... SET STATISTICS.)
> > Try booting it up to 100 (from the default 10), re-analyze, and
> > then see if/how the plans change.
> >
> > regards, tom lane
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain(at)cshl(dot)org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message ShengQiang Shu 2003-02-14 21:04:05 Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE
Previous Message SLetovsky 2003-02-14 19:07:49 Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE