Re: Large # of rows in query extremely slow, not using

From: Stephen Crowley <stephen(dot)crowley(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Manfred Koizar <mkoi-pg(at)aon(dot)at>, Pierre-Frédéric Cai llaud <lists(at)boutiquenumerique(dot)com>
Subject: Re: Large # of rows in query extremely slow, not using
Date: 2004-09-18 00:23:44
Message-ID: 3f71fdf1040917172395a3be5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ok.. now I ran "VACUUM FULL' and things seem to be working as they should..

explain analyze select * from history where date='2004-09-07' and stock='MSFT';

Seq Scan on island_history (cost=0.00..275359.13 rows=292274
width=83) (actual time=50.000..411683.000 rows=265632 loops=1)
Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))
Total runtime: 412703.000 ms

random_page_cost and effective_cache_size are both default, 8 and 1000

explain analyze select * from history where date='2004-09-07' and stock='ORCL';

"Index Scan using island_history_date_stock_time on island_history
(cost=0.00..181540.07 rows=102166 width=83) (actual
time=551.000..200268.000 rows=159618 loops=1)"
" Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
"Total runtime: 201009.000 ms"

So now this in all in proportion and works as expected.. the question
is, why would the fact that it needs to be vaccumed cause such a huge
hit in performance? When i vacuumed it did free up nearly 25% of the
space.

--Stephen

On Fri, 17 Sep 2004 22:44:05 +0200, Manfred Koizar <mkoi-pg(at)aon(dot)at> wrote:
> On Thu, 16 Sep 2004 20:51:11 -0500, Stephen Crowley
> <stephen(dot)crowley(at)gmail(dot)com> wrote:
> >explain analyze select * from history where date='2004-09-07' and
> >stock='ORCL' LIMIT 10;
>
> >" -> Index Scan using island_history_date_stock_time on
> >island_history (cost=0.00..183099.72 rows=102166 width=83) (actual
> >time=1612.000..1702.000 rows=10 loops=1)"
> ^^
> LIMIT 10 hides what would be the most interesting info here. I don't
> believe that
> EXPLAIN ANALYSE SELECT * FROM history WHERE ...
> consumes lots of memory. Please try it.
>
> And when you post the results please include your Postgres version, some
> info about hardware and OS, and your non-default settings, especially
> random_page_cost and effective_cache_size.
>
> May I guess that the correlation of the physical order of tuples in your
> table to the contents of the date column is pretty good (examine
> correlation in pg_stats) and that island_history_date_stock_time is a
> 3-column index?
>
> It is well known that the optimizer overestimates the cost of index
> scans in those situations. This can be compensated to a certain degree
> by increasing effective_cache_size and/or decreasing random_page_cost
> (which might harm other planner decisions).
>
> You could also try
> CREATE INDEX history_date_stock ON history("date", stock);
>
> This will slow down INSERTs and UPDATEs, though.
>
> Servus
> Manfred
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message mudfoot 2004-09-18 01:22:43 Re: Tryint to match Solaris-Oracle performance with directio?
Previous Message Gaetano Mendola 2004-09-17 23:42:25 Re: Tryint to match Solaris-Oracle performance with directio?