Re: performace problem after VACUUM ANALYZE

From: Scott Cain <cain(at)cshl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, gmod schema <gmod-schema(at)lists(dot)sourceforge(dot)net>
Subject: Re: performace problem after VACUUM ANALYZE
Date: 2003-02-14 17:29:36
Message-ID: 1045243776.1485.617.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom,

Sorry about that: I'll try to briefly give the information you are
looking for. I've read the docs on ALTER TABLE, but it is not clear to
me what columns I should change STATISTICS on, or should I just do it on
all of the columns for which indexes exist?

Here's the query again:

select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
from feature f, featureloc fl
where
fl.srcfeature_id = 1 and
((fl.strand=1 and fl.nbeg <= 393164 and fl.nend >= 390956) OR
(fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and
f.feature_id = fl.feature_id

--------------------------------------------------------------------------

Naive database:

Unique (cost=75513.46..75513.48 rows=1 width=167) (actual
time=22815.25..22815.93 rows=179 loops=1)
-> Sort (cost=75513.46..75513.46 rows=1 width=167) (actual
time=22815.24..22815.43 rows=186 loops=1)
-> Nested Loop (cost=0.00..75513.45 rows=1 width=167) (actual
time=2471.25..22814.01 rows=186 loops=1)
-> Index Scan using featureloc_idx2 on featureloc fl
(cost=0.00..75508.43 rows=1 width=14) (actual time=2463.83..22796.50
rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..5.01 rows=1 width=153) (actual time=0.08..0.08 rows=1
loops=186)
Total runtime: 22816.63 msec
--------------------------------------------------------------------------

Naive database after featureloc_idx2 dropped:

Unique (cost=75545.46..75545.48 rows=1 width=167) (actual
time=5232.36..5234.51 rows=179 loops=1)
-> Sort (cost=75545.46..75545.46 rows=1 width=167) (actual
time=5232.35..5232.54 rows=186 loops=1)
-> Nested Loop (cost=0.00..75545.45 rows=1 width=167) (actual
time=291.46..5220.69 rows=186 loops=1)
-> Index Scan using featureloc_src_strand_beg_end on
featureloc fl (cost=0.00..75540.43 rows=1 width=14) (actual
time=291.30..5214.46 rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..5.01 rows=1 width=153) (actual time=0.02..0.03 rows=1
loops=186)
Total runtime: 5234.89 msec
--------------------------------------------------------------------------

Database after VACUUM ANALYZE was run:

Unique (cost=344377.70..344759.85 rows=2548 width=47) (actual
time=26466.82..26467.51 rows=179 loops=1)
-> Sort (cost=344377.70..344377.70 rows=25477 width=47) (actual
time=26466.82..26467.01 rows=186 loops=1)
-> Nested Loop (cost=0.00..342053.97 rows=25477 width=47)
(actual time=262.66..26465.63 rows=186 loops=1)
-> Seq Scan on featureloc fl (cost=0.00..261709.31
rows=25477 width=14) (actual time=118.62..26006.05 rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=2.45..2.46 rows=1
loops=186)
Total runtime: 26467.85 msec
--------------------------------------------------------------------------

After disallowing seqscans (set enable_seqscan=0):

Unique (cost=356513.46..356895.61 rows=2548 width=47) (actual
time=27494.62..27495.34 rows=179 loops=1)
-> Sort (cost=356513.46..356513.46 rows=25477 width=47) (actual
time=27494.61..27494.83 rows=186 loops=1)
-> Nested Loop (cost=0.00..354189.73 rows=25477 width=47)
(actual time=198.88..27493.48 rows=186 loops=1)
-> Index Scan using featureloc_idx1 on featureloc fl
(cost=0.00..273845.08 rows=25477 width=14) (actual time=129.30..27280.95
rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=1.13..1.13 rows=1
loops=186)
Total runtime: 27495.66 msec
--------------------------------------------------------------------------

After dropping featureloc_idx1:

Unique (cost=1310195.21..1310577.36 rows=2548 width=47) (actual
time=21692.69..21693.37 rows=179 loops=1)
-> Sort (cost=1310195.21..1310195.21 rows=25477 width=47) (actual
time=21692.69..21692.88 rows=186 loops=1)
-> Nested Loop (cost=0.00..1307871.48 rows=25477 width=47)
(actual time=2197.65..21691.39 rows=186 loops=1)
-> Index Scan using featureloc_idx2 on featureloc fl
(cost=0.00..1227526.82 rows=25477 width=14) (actual
time=2197.49..21618.89 rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=0.37..0.38 rows=1
loops=186)
Total runtime: 21693.72 msec
--------------------------------------------------------------------------

After dropping featureloc_idx2:

Unique (cost=1414516.98..1414899.13 rows=2548 width=47) (actual
time=1669.17..1669.86 rows=179 loops=1)
-> Sort (cost=1414516.98..1414516.98 rows=25477 width=47) (actual
time=1669.17..1669.36 rows=186 loops=1)
-> Nested Loop (cost=0.00..1412193.25 rows=25477 width=47)
(actual time=122.69..1668.08 rows=186 loops=1)
-> Index Scan using featureloc_src_strand_beg_end on
featureloc fl (cost=0.00..1331848.60 rows=25477 width=14) (actual
time=122.51..1661.81 rows=186 loops=1)
-> Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=0.02..0.03 rows=1
loops=186)
Total runtime: 1670.20 msec

On Fri, 2003-02-14 at 12:00, Tom Lane wrote:
> Scott Cain <cain(at)cshl(dot)org> writes:
> > [ much stuff ]
>
> Could we see EXPLAIN ANALYZE, not just EXPLAIN, output for all these
> alternatives? Your question boils down to "why is the planner
> misestimating these queries" ... which is a difficult question to
> answer when given only the estimates and not the reality.
>
> A suggestion though is that you might need to raise the statistics
> target on the indexed columns, so that ANALYZE will collect
> finer-grained statistics. (See ALTER TABLE ... SET STATISTICS.)
> Try booting it up to 100 (from the default 10), re-analyze, and
> then see if/how the plans change.
>
> regards, tom lane
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain(at)cshl(dot)org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2003-02-14 17:36:02 Re: Tuning scenarios (was Changing the default configuration)
Previous Message Tom Lane 2003-02-14 17:00:18 Re: performace problem after VACUUM ANALYZE