Quick Links

Re: Improving DISTINCT with LooseScan node

From:	Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
To:	Adrien Nayrat <adrien(dot)nayrat(at)dalibo(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Improving DISTINCT with LooseScan node
Date:	2017-09-18 08:59:56
Message-ID:	54531D0A-0ACD-4FAF-AD79-F5C74CA931FF@yandex.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Yep. Now i can see 2 use cases for this feature:
1. DISTINCT queries.
2. Effectively scanning multicolumn index if first column is omitted and has low cardinality

> Not an answer to your question, but generally +1 for working on this
> area. I did some early proto-hacking a while ago, which I haven't had
> time to get back to yet:
>
> https://www.postgresql.org/message-id/flat/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com <https://www.postgresql.org/message-id/flat/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com>
>
> That was based on the idea that a DISTINCT scan using a btree index to
> skip ahead is always going to be using the leading N columns and
> always going to be covered by the index, so I might as well teach
> Index Only Scan how to do it directly rather than making a new node to
> sit on top. As you can see in that thread I did start thinking about
> using a new node to sit on top and behave a bit like a nested loop for
> the more advanced feature of an Index Skip Scan (trying every value of
> (a) where you had an index on (a, b, c) but had a WHERE clause qual on
> (b, c)), but the only feedback I had so far was from Robert Haas who
> thought that too should probably be pushed into the index scan.

Thank you for information, i will look at this thread.

> FWIW I'd vote for 'skip' rather than 'loose' as a term to use in this
> family of related features (DISTINCT being the easiest to build). It
> seems more descriptive than the MySQL term. (DB2 added this a little
> while ago and went with 'index jump scan', suggesting that we should
> consider 'hop'... (weak humour, 'a hop, skip and a jump' being an
> English idiom meaning a short distance)).

Maybe skip would be better, but there will be no problems with something like patents?
I mean database which name beginning with big letter «O»? As i know, it has Index Skip Scan.

In response to

Re: Improving DISTINCT with LooseScan node at 2017-09-18 08:14:39 from Adrien Nayrat

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Rafal Pietrak	2017-09-18 09:12:19	global indices
Previous Message	Andres Freund	2017-09-18 08:52:44	Reporting query on crash even if completed