Re: kill_prior_tuple and index scan costing

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: kill_prior_tuple and index scan costing
Date: 2020-03-22 04:53:05
Message-ID: 20200322045305.GC2563@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 21, 2020 at 07:33:02PM -0700, Andres Freund wrote:
> While your recent btree work ensures that we get the heap tids for an
> equality lookup in heap order (right?),

I think when I tested the TID tiebreaker patch, it didn't help for our case,
which is for inequality: (timestamptz >= start AND timestamptz < end).

That seems to explain why, although I don't understand why it wouldn't also
apply to inequality comparison ?

|template1=# CREATE TABLE t(i int,j int); CREATE INDEX ON t(i); INSERT INTO t SELECT (0.0001*a+9*(random()-0.5))::int FROM generate_series(1,99999999) a; VACUUM ANALYZE t;
|template1=# explain (analyze,buffers) SELECT * FROM t WHERE i BETWEEN 2000 AND 3000;
| Index Scan using t_i_idx on t (cost=0.44..277164.86 rows=10026349 width=8) (actual time=0.199..6839.564 rows=10010076 loops=1)
| Index Cond: ((i >= 2000) AND (i <= 3000))
| Buffers: shared hit=394701 read=52699

vs.

|template1=# SET enable_seqscan=off; SET enable_indexscan=off; explain (analyze,buffers) SELECT * FROM t WHERE i BETWEEN 2000 AND 3000;
| Bitmap Heap Scan on t (cost=135038.52..1977571.10 rows=10026349 width=8) (actual time=743.649..3760.643 rows=10010076 loops=1)
| Recheck Cond: ((i >= 2000) AND (i <= 3000))
| Heap Blocks: exact=44685
| Buffers: shared read=52700
| -> Bitmap Index Scan on t_i_idx (cost=0.00..132531.93 rows=10026349 width=0) (actual time=726.474..726.475 rows=10010076 loops=1)
| Index Cond: ((i >= 2000) AND (i <= 3000))
| Buffers: shared read=8015

I'm not concerned with the "actual" time or hit vs cached, but the total buffer
pages. Indexscan accessed 450k buffers vs 52k for bitmapscan.

> I don't think we currently have
> the planner infrastructure to know that that's the case (since other
> index types don't guarantee that) / take it into account for planning?

Right, since correlation is a property of the table column and not of the
index. See also:
https://www.postgresql.org/message-id/14438.1512499811@sss.pgh.pa.us

Years ago I had a patch to make correlation a property of indexes.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-03-22 05:03:30 Re: kill_prior_tuple and index scan costing
Previous Message Tom Lane 2020-03-22 03:23:03 Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan