Re: Pluggable Storage - Andres's take

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-07-16 13:44:53
Message-ID: 20180716134453.oinsvuoqnh737nxa@alap3.anarazel.de

Hi,

On 2018-07-05 15:25:25 +0300, Alexander Korotkov wrote:
> > - I think the move of the indexing from outside the table layer into the
> >   storage layer isn't a good idea. It led to having to pass EState into
> > the tableam, a callback API to perform index updates, etc. This seems
> > to have at least partially been triggered by the speculative insertion
> > codepaths. I've reverted this part of the changes. The speculative
> > insertion / confirm codepaths are now exposed to tableam.h - I think
> > that's the right thing because we'll likely want to have that
> > functionality across more than a single tuple in the future.
>
> I agree that passing EState into tableam doesn't look good. But I
> believe that tableam needs way more control over indexes than it has
> in your version of the patch. Tableam-independent insertion into
> indexes on tuple insert is more or less OK, but on update we need
> something smarter than just inserting index tuples depending on an
> "update_indexes" flag. A tableam-specific update method may decide to
> update only some of the indexes. For example, when zheap performs an
> update in-place, it inserts only into those indexes whose fields were
> updated. And I think any undo-log based storage would have similar
> behavior. Moreover, it might be required to do something with existing
> index tuples (for instance, as far as I know, zheap sets a "deleted"
> flag on index tuples related to previous values of the updated fields).

I agree that we probably need more - I'm just inclined to think that we
need a more concrete target to work against. Currently zheap's indexing
logic is still fairly naive; I don't think we'll get the interface right
without having worked further on the zheap side of things.

> > - The visibility functions relied on the *caller* performing buffer
> > locking. That's not a great idea, because generic code shouldn't know
> > about the locking scheme a particular AM needs. I've changed the
> > external visibility functions to instead take a slot, and perform the
> > necessary locking inside.
>
> Makes sense to me. But would it cause extra locking/unlocking and in
> turn a performance impact?

I don't think so - nearly all the performance-relevant cases do all the
visibility logic inside the AM. Where needed, the underlying functions
that do not take the lock can still be used. Pretty much all the
converted places just had manual LockBuffer calls.

> > - HOT was encoded in the API in a bunch of places. That doesn't look
> > right to me. I tried to improve a bit on that, but I'm not yet quite
> > sure I like it. Needs written explanation & arguments...
>
> Yes, HOT is a heapam-specific feature. Other tableams might not have
> HOT. But it appears that we still expose the hot_search_buffer()
> function in the tableam API. That function has no callers, so it's
> just redundant and can be removed.

Yea, that was a leftover.

> > - the heap tableam did a heap_copytuple() nearly everywhere. Leading to
> > a higher memory usage, because the resulting tuples weren't freed or
> > anything. There might be a reason for doing such a change - we've
> > certainly discussed that before - but I'm *vehemently* against doing
> > that at the same time we introduce pluggable storage. Analyzing the
> > performance effects will be hard enough without changes like this.
>
> I think once we've switched to slots, doing heap_copytuple() so
> frequently is not required anymore.

It's mostly gone now.

> > - I've for now backed out the heap rewrite changes, partially. Mostly
> > because I didn't like the way the abstraction looks, but haven't quite
> > figured out how it should look like.
>
> Yeah, it's hard part, but we need to invent something in this area...

I agree. But I really don't yet quite know what. I somewhat wonder if we
should just add a cluster_rel() callback to the tableam and let it deal
with everything :(. As previously proposed, the interface wouldn't have
worked with anything not losslessly encodable into a heaptuple, which is
unlikely to be sufficient.

FWIW, I plan to be mostly out until Thursday this week, and then I'll
rebase onto the new version of the abstract slot patch and then try to
split up the patchset. Once that's done, I'll do a prototype conversion
of zheap, which I'm sure will show up a lot of weaknesses in the current
abstraction. Once that's done I hope we can collaborate / divide &
conquer to get the individual pieces into commit shape.

If either of you wants to get a head start separating something out,
let's try to organize who does what. The EPQ and trigger slotification
are probably good candidates.

Greetings,

Andres Freund
