Re: [WIP PATCH] Index scan offset optimisation using visibility map

From: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date: 2018-03-13 16:47:10
Message-ID: CANtu0ohuBMXimT-qriqZMcrch+4qdJ1HxnW8BKBqcdER39M3Qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

Tom, thanks a lot for your thorough review.

> What you've done to
> IndexNext() is a complete disaster from a modularity standpoint: it now
> knows all about the interactions between index_getnext, index_getnext_tid,
> and index_fetch_heap.

I was looking into the current IndexOnlyNext implementation as a starting
point - and it knows about index_getnext_tid and index_fetch_heap already.
At the same time I was trying to keep patch non-invasive.
Patched IndexNext now only knowns about index_getnext_tid and
index_fetch_heap - the same as IndexOnlyNext.
But yes, it probably could be done better.

> I'm not sure about a nicer way to refactor that, but I'd suggest that
> maybe you need an additional function in indexam.c that hides all this
> knowledge about the internal behavior of an IndexScanDesc.

I'll try to split index_getnext into two functions. A new one will do
everything index_getnext does except index_fetch_heap.

> Or that is, it knows too much and still not enough,
> because it's flat out wrong for the case that xs_continue_hot is set.
> You can't call index_getnext_tid when that's still true from last time.

Oh.. Yes, clear error here.

< The PredicateLockPage call also troubles me quite a bit, not only from
< a modularity standpoint but because that implies a somewhat-user-visible
< behavioral change when this optimization activates. People who are using
< serializable mode are not going to find it to be an improvement if their
< queries fail and need retries a lot more often than they did before.
< I don't know if that problem is bad enough that we should disable skipping
< when serializable mode is active, but it's something to think about.

Current IndexOnlyScan already does that. And I think a user should expect
such a change in serializable mode.

> You haven't documented the behavior required for tuple-skipping in any
> meaningful fashion, particularly not the expectation that the child plan
> node will still return tuples that just need not contain any valid
> content.

Only nodeLimit could receive such tuples and they are immediately
discarded. I'll add some comment to it.

> is a gross hack and probably wrong. You could use ExecStoreAllNullTuple,
> perhaps.

Oh, nice, missed it.

> I'm disturbed by the fact that you've not taught the planner about the
> potential cost saving from this, so that it won't have any particular
> reason to pick a regular indexscan over some other plan type when this
> optimization applies. Maybe there's no practical way to do that, or maybe
> it wouldn't really matter in practice; I've not looked into it. But not
> doing anything feels like a hack.

I was trying to do it. But current planner architecture does not provide a
nice way to achive it due to the way it handles limit and offset.
So, I think it is better to to be avoided for now.

> Setting this back to Waiting on Author.

I'll try to make the required changes in a few days.

Thanks.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-13 17:28:17 Re: PATCH: Configurable file mode mask
Previous Message Alvaro Herrera 2018-03-13 16:46:42 Re: PATCH: Unlogged tables re-initialization tests