From: | Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [WIP PATCH] Index scan offset optimisation using visibility map |
Date: | 2018-03-13 16:47:10 |
Message-ID: | CANtu0ohuBMXimT-qriqZMcrch+4qdJ1HxnW8BKBqcdER39M3Qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello.
Tom, thanks a lot for your thorough review.
> What you've done to
> IndexNext() is a complete disaster from a modularity standpoint: it now
> knows all about the interactions between index_getnext, index_getnext_tid,
> and index_fetch_heap.
I was looking into the current IndexOnlyNext implementation as a starting
point - and it knows about index_getnext_tid and index_fetch_heap already.
At the same time I was trying to keep patch non-invasive.
Patched IndexNext now only knowns about index_getnext_tid and
index_fetch_heap - the same as IndexOnlyNext.
But yes, it probably could be done better.
> I'm not sure about a nicer way to refactor that, but I'd suggest that
> maybe you need an additional function in indexam.c that hides all this
> knowledge about the internal behavior of an IndexScanDesc.
I'll try to split index_getnext into two functions. A new one will do
everything index_getnext does except index_fetch_heap.
> Or that is, it knows too much and still not enough,
> because it's flat out wrong for the case that xs_continue_hot is set.
> You can't call index_getnext_tid when that's still true from last time.
Oh.. Yes, clear error here.
< The PredicateLockPage call also troubles me quite a bit, not only from
< a modularity standpoint but because that implies a somewhat-user-visible
< behavioral change when this optimization activates. People who are using
< serializable mode are not going to find it to be an improvement if their
< queries fail and need retries a lot more often than they did before.
< I don't know if that problem is bad enough that we should disable skipping
< when serializable mode is active, but it's something to think about.
Current IndexOnlyScan already does that. And I think a user should expect
such a change in serializable mode.
> You haven't documented the behavior required for tuple-skipping in any
> meaningful fashion, particularly not the expectation that the child plan
> node will still return tuples that just need not contain any valid
> content.
Only nodeLimit could receive such tuples and they are immediately
discarded. I'll add some comment to it.
> is a gross hack and probably wrong. You could use ExecStoreAllNullTuple,
> perhaps.
Oh, nice, missed it.
> I'm disturbed by the fact that you've not taught the planner about the
> potential cost saving from this, so that it won't have any particular
> reason to pick a regular indexscan over some other plan type when this
> optimization applies. Maybe there's no practical way to do that, or maybe
> it wouldn't really matter in practice; I've not looked into it. But not
> doing anything feels like a hack.
I was trying to do it. But current planner architecture does not provide a
nice way to achive it due to the way it handles limit and offset.
So, I think it is better to to be avoided for now.
> Setting this back to Waiting on Author.
I'll try to make the required changes in a few days.
Thanks.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-03-13 17:28:17 | Re: PATCH: Configurable file mode mask |
Previous Message | Alvaro Herrera | 2018-03-13 16:46:42 | Re: PATCH: Unlogged tables re-initialization tests |