Re: [WIP PATCH] Index scan offset optimisation using visibility map

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date: 2018-07-13 21:17:30
Message-ID: 2c171aa4-3df4-3c90-c649-be6f9d1d16dc@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/05/18 18:43, Michail Nikolaev wrote:
> Hello everyone.
> This letter related to “Extended support for index-only-scan” from my
> previous message in the thread.
>
> WIP version of the patch is ready for a while now and I think it is time to
> resume the work on the feature. BTW, I found a small article about Oracle
> vs Postgres focusing this issue -
> https://blog.dbi-services.com/postgres-vs-oracle-access-paths-viii/
>
> Current WIP version of the patch is located here -
> https://github.com/postgres/postgres/compare/88ba0ae2aa4aaba8ea0d85c0ff81cc46912d9308...michail-nikolaev:index_only_fetch,
> passing all checks. In addition, patch includes small optimization for
> caching of amcostestimate results.

Please submit an actual path, extracted e.g. with "git format-patch
-n1", rather than a link to an external site. That is a requirement for
archival purposes, so that people reading the email archives later on
can see what was being discussed. (And that link doesn't return a proper
diff, anyway.)

> For now, I decide to name the plan as “Index Only Fetch Scan”. Therefore:
> * In case of “Index Scan” – we touch the index and heap for EVERY tuple we
> need to test
> * For “Index Only Scan” – we touch the index for every tuple and NEVER
> touch the heap
> * For “Index Only Fetch Scan” – we touch the index for every tuple and
> touch the heap for those tuples we need to RETURN ONLY.

Hmm. IIRC there was some discussion on doing that, when index-only scans
were implemented. It's not generally OK to evaluate expressions based on
data that has already been deleted from the table. As an example of the
kind of problems you might get:

Imagine that a user does "DELETE * FROM table WHERE div = 0; SELECT *
FROM table WHERE 100 / div < 10". Would you expect the query to throw a
"division by zero error"? If there was an index on 'div', you might
evaluate the "100 / div" expression based on values from the index,
which still includes entries for the just-deleted tuples with div = 0.
They would be filtered out later, after performing the visibility
checks, but it's too late if you already threw an error.

Now, maybe there's some way around that, but I don't know what. Without
some kind of a solution, this won't work.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2018-07-13 21:19:29 Re: pgsql: Fix parallel index and index-only scans to fall back to serial.
Previous Message Alvaro Herrera 2018-07-13 21:11:31 Re: partition pruning doesn't work with IS NULL clause in multikey range partition case