Use "average field correlation per hard disk page" instead of global one?

From: Alexey Nalbat <alexey(at)price(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Use "average field correlation per hard disk page" instead of global one?
Date: 2004-03-03 15:58:51
Message-ID: 200403031858.51680.alexey@price.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

I have a table of 2'500'000 tuples and 100'000 pages, and an index
on non-unique field, to each key value corresponds approximately
50'000 tuples.

Due to the updating algorithm the physical order of tuples in the
table happens to be such that all equal keys are placed together,
but not ordered globally. Correlation computed by "VACUUM ANALYZE"
is 0.15.

When computing indexscan cost for query with clause "key = ?"
the planner makes it closer to "Mackert and Lohman formula" value
than to "selectivity * pages". As a result it chooses seqscan
rather than indexscan while in fact indexscan is 20 times faster.

The question is, which is the best way to correct this behavior?

Maybe "VACUUM ANALYZE" could calculate some average of "field
correlation per page" and even use this value somewhere inside
(not outside) "Mackert and Lohman formula"?

Are there any better ideas?

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2004-03-03 16:00:37 Client side copy functionality
Previous Message Magnus Hagander 2004-03-03 15:22:30 Re: [HACKERS] Tablespaces