Re: Index Skip Scan

From: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
To: Floris Van Nee <florisvannee(at)optiver(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, James Coleman <jtc331(at)gmail(dot)com>
Subject: Re: Index Skip Scan
Date: 2019-06-01 10:03:30
Message-ID: CA+FpmFeKh9BAhDTAf-ZWQ=pvVaRy+Lnahb_8=OOFqcVXJDBWWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 1 Jun 2019 at 06:10, Floris Van Nee <florisvannee(at)optiver(dot)com> wrote:
>
> Actually I'd like to add something to this. I think I've found a bug in the current implementation. Would someone be able to check?
>
I am willing to give it a try.
> Given a table definition of (market text, feedcode text, updated_at timestamptz, value float8) and an index on (market, feedcode, updated_at desc) (note that this table slightly deviates from what I described in my previous mail) and filling it with data.
>
>
> The following query uses an index skip scan and returns just 1 row (incorrect!)
>
> select distinct on (market, feedcode) market, feedcode
> from streams.base_price
> where market='TEST'
>
> The following query still uses the regular index scan and returns many more rows (correct)
> select distinct on (market, feedcode) *
> from streams.base_price
> where market='TEST'
>
Aren't those two queries different?
select distinct on (market, feedcode) market, feedcode vs select
distinct on (market, feedcode)*
Anyhow, it's just the difference in projection so doesn't matter much.
I verified this scenario at my end and you are right, there is a bug.
Here is my repeatable test case,

create table t (market text, feedcode text, updated_at timestamptz,
value float8) ;
create index on t (market, feedcode, updated_at desc);
insert into t values('TEST', 'abcdef', (select timestamp '2019-01-10
20:00:00' + random() * (timestamp '2014-01-20 20:00:00' - timestamp
'2019-01-20 20:00:00') ), generate_series(1,100)*9.88);
insert into t values('TEST', 'jsgfhdfjd', (select timestamp
'2019-01-10 20:00:00' + random() * (timestamp '2014-01-20 20:00:00' -
timestamp '2019-01-20 20:00:00') ), generate_series(1,100)*9.88);

Now, without the patch,
select distinct on (market, feedcode) market, feedcode from t where
market='TEST';
market | feedcode
--------+-----------
TEST | abcdef
TEST | jsgfhdfjd
(2 rows)
explain select distinct on (market, feedcode) market, feedcode from t
where market='TEST';
QUERY PLAN
----------------------------------------------------------------
Unique (cost=12.20..13.21 rows=2 width=13)
-> Sort (cost=12.20..12.70 rows=201 width=13)
Sort Key: feedcode
-> Seq Scan on t (cost=0.00..4.51 rows=201 width=13)
Filter: (market = 'TEST'::text)
(5 rows)

And with the patch,
select distinct on (market, feedcode) market, feedcode from t where
market='TEST';
market | feedcode
--------+----------
TEST | abcdef
(1 row)

explain select distinct on (market, feedcode) market, feedcode from t
where market='TEST';
QUERY PLAN
------------------------------------------------------------------------------------------------
Index Only Scan using t_market_feedcode_updated_at_idx on t
(cost=0.14..0.29 rows=2 width=13)
Scan mode: Skip scan
Index Cond: (market = 'TEST'::text)
(3 rows)

Notice that in the explain statement it shows correct number of rows
to be skipped.

--
Regards,
Rafia Sabih

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2019-06-01 10:28:28 Re: Index Skip Scan
Previous Message Alvaro Herrera 2019-06-01 04:55:47 Re: coverage additions