Re: Use of index for 50% column restriction

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use of index for 50% column restriction
Date: 2016-06-08 21:07:34
Message-ID: 20160608210734.GA9614@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 8, 2016 at 01:28:54PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > As part of my research on the parsing/planning behavior of PREPARE, I
> > found a surprising behavior --- a WHERE clause that is 50% restrictive
> > is using an index. I thought only <10% restrictions used indexes.
>
> There's no such hard-and-fast rule. The cost estimate break point depends
> greatly on the index order correlation (which is 100% in your example),
> as well as some other factors like the index size versus
> effective_cache_size.
>
> For randomly-ordered data I believe the cutover is actually well below 10%.

Ah, I had not considered the correlation order of the rows in the table.
This test returns the sequential scan I expected by using floor(random()
* 2):

DROP TABLE IF EXISTS test;
CREATE TABLE test (c1 INT, c2 INT, c3 INT);
INSERT INTO test SELECT c1, floor(random() * 2), 0 FROM generate_series(1, 10000) AS a(c1);
INSERT INTO test SELECT c1, floor(random() * 2), 1 FROM generate_series(10001, 20000) AS a(c1);
CREATE INDEX i_test_c2 ON test (c2);
ANALYZE test;
EXPLAIN SELECT * FROM test WHERE c2 = 0;

Thanks.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-06-08 21:36:08 Re: Use of index for 50% column restriction
Previous Message Oleg Bartunov 2016-06-08 20:39:42 Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?