Re: Large index scan perfomance and indexCorrelation (PG

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Sagulin <andrews42(at)yandex(dot)ru>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Large index scan perfomance and indexCorrelation (PG
Date: 2006-06-27 22:04:17
Message-ID: 1151445858.2479.170.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2006-06-27 at 16:14 +0400, Andrew Sagulin wrote:

> Result showed up that there were no page seq scan at all - true random access
> only.
> The simple model which can explain the situation: the sequence of numbers 2, 1,
> 4, 3, 6, 5, ..., 100, 99 has correlation about 0,9994. Let's imagine it's the page
> order of an index scan. H'm, bad choice, isn't it?

Your example is only possible if whole blocks of data were out of order,
which I guess is possible within a multi-million row table. Slightly out
of order values would be ignored, since I/O works at the block rather
than the tuple level.

ANALYZE doesn't cope well with tables as large as you have. It doesn't
sample enough rows, nor does it look within single blocks/groups to
discover anomalies such as yours. As a result, the data that is sampled
looks almost perfectly ordered, though the main bulk is not.

I think what you are also pointing out is that the assumption of the
effects of correlation doesn't match the current readahead logic of
filesystems. If we were to actively force a readahead stride of 2 for
this scan (somehow), then the lack of correlation would disappear
completely. IIRC the current filesystem readahead logic would find that
such a sequence would be seen as random, and so no readahead would be
performed at all - even though the data is highly correlated. That isn't
PostgreSQL's fault directly, since the readahead ought to work better
than it does, but we fail indirectly by relying upon it in this case.

> I think indexCorrelation can help to estimate page count but not page
> fetch cost. Why not to use formula
>
> min_IO_cost = ceil(indexSelectivity * T) * random_page_cost
>
> instead of
>
> min_IO_cost = ceil(indexSelectivity * T) ?

That part is sensible. The min_IO_cost is when the access is sequential,
which by definition has a cost of 1.0.

The bit you might have issue with is how we extrapolate from the
min_IO_cost and correlation to arrive at a cost.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew Sagulin 2006-06-28 09:33:55 Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)
Previous Message Andrus 2006-06-27 17:50:23 Re: why group expressions cause query to run forever