Quick Links

Re: Index Skip Scan

From:	Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
To:	jesper(dot)pedersen(at)redhat(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc:	pg(at)bowt(dot)ie, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, bhushan(dot)uparkar(at)gmail(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject:	Re: Index Skip Scan
Date:	2018-09-13 13:01:13
Message-ID:	522092c7-3cb7-a1ec-410b-dfa97ae722f3@postgrespro.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Jesper,

While testing this patch I noticed that current implementation doesn't
perform well when we have lots of small groups of equal values. Here is
the execution time of index skip scan vs unique over index scan, in ms,
depending on the size of group. The benchmark script is attached.

group size    skip unique
1             2,293.85    132.55
5    464.40 106.59
10      239.61    102.02
50 56.59       98.74
100       32.56   103.04
500 6.08      97.09

So, the current implementation can lead to performance regression, and
the choice of the plan depends on the notoriously unreliable ndistinct
statistics. The regression is probably because skip scan always does
_bt_search to find the next unique tuple. I think we can improve this,
and the skip scan can be strictly faster than index scan regardless of
the data. As a first approximation, imagine that we somehow skipped
equal tuples inside _bt_next instead of sending them to the parent
Unique node. This would already be marginally faster than Unique + Index
scan. A more practical implementation would be to remember our position
in tree (that is, BTStack returned by _bt_search) and use it to skip
pages in bulk. This looks straightforward to implement for a tree that
does not change, but I'm not sure how to make it work with concurrent
modifications. Still, this looks a worthwhile direction to me, because
if we have a strictly faster skip scan, we can just use it always and
not worry about our unreliable statistics. What do you think?

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment	Content-Type	Size
test-skip.sql	application/sql	571 bytes

In response to

Re: Index Skip Scan at 2018-09-11 13:21:57 from Jesper Pedersen

Responses

Re: Index Skip Scan at 2018-09-13 15:39:46 from Jesper Pedersen

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Laurenz Albe	2018-09-13 13:55:20	Re: Loaded footgun open_datasync on Windows
Previous Message	Kyotaro HORIGUCHI	2018-09-13 12:40:59	Re: Protect syscache from bloating with negative cache entries