Re: [WIP PATCH] Index scan offset optimisation using visibility map

From: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date: 2018-03-20 15:41:46
Message-ID: CANtu0ohoXVT3KHonFPPvQkN_m92tX9=iM9BYUah+z2DGZNxbQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello everyone.

I need an advice.

I was reworking the patch: added support for the planner, added support for
queries with projection, addded support for predicates which could be
executed over index data.
And.. I realized that my IndexScan is even more index-only than the
original IndexOnlyScan. So, it seems to be a wrong way.

I think the task could be splitted into two:

1. Extended support for index-only-scan

Currently IndexOnlyScan is used only in case when target data is fully
located in
index. If we need some additional columns - regular index scan is used
anyway.

For example, let's consider such table and index:

CREATE TABLE test_events (
time timestamp ,
event varchar(255),
data jsonb
);

CREATE INDEX on test_events USING btree(time, event);

It is some kind of big table with log events. And let's consinder such
query:

SELECT data->>'event_id'
FROM test_events
WHERE
time > (now() - interval '2 year') AND -- indexqual
event = 'tax' AND -- indexqual
extract(ISODOW from time) = 1 --qpquals
ORDER BY time DESC

At the moment IndexScan plan will be used for such query due to result
data. But 1/7 of all heap access will be lost. At the same time
"extract(ISODOW from time) = 1" (qpqualsl) could be easily calculated over
index data.

The idea is simple: extend IndexOnly scan to be used if all query
predicates (both indexqual and qpquals) could be calculated over index
data. And if all checks are passed - just load tuple from heap to return.
It seems like index-data access is really cheap now and such plan will
be faster even for qpquals without high selectivity. At least for
READCOMMITED.

I think I will able to create prototype within a few days (most of work
is done in current patch rework).

Probably it is not an ne idea - so, is it worth implementation? Maybe
I've missed something huge.

2. For extented IndexOnlyScan - add support to avoid heap fetch in case of
OFFSET applied to tuple.

If first part is implemented - OFFSET optimisation is much easier to
achieve.

Thanks,
Michail.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Finzel 2018-03-20 16:19:04 Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Previous Message Robert Haas 2018-03-20 15:28:52 Re: Define variable only in the scope that needs it