Re: Pluggable storage

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable storage
Date: 2017-07-17 20:24:58
Message-ID: CAPpHfdvMEAK2pQf7Oj=Zewmz4Pjw3d31Gu3_PSWay3dm_UEd5Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 17, 2017 at 7:51 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Mon, Jul 17, 2017 at 3:22 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > I think that "retail index tuple deletion" is the feature which could give
> > us some advantages even independently from pluggable storages. For example,
> > imagine very large table with only small amount of dead tuples. In this
> > case, it would be cheaper to delete index links to those dead tuples one by
> > one using "retail index tuple deletion", rather than do full scan of every
> > index to perform "bulk delete" of index tuples. One may argue that you
> > shouldn't do vacuum of large table when only small amount of tuples are
> > dead. But in terms of index bloat mitigation, very aggressive vacuum
> > strategy could be justified.
>
> Yes, definitely. Especially with the visibility map. Even still, I
> tend to think that for unique indexes, true duplicates should be
> disallowed, and dealt with with an additional layer of indirection. So
> this would be for secondary indexes.
>

It probably depends on the particular storage (once we have pluggable
storages). Some storages would have an additional level of indirection while
others wouldn't. But even if a unique index contains no true duplicates, a
true delete can still happen. Then we still have to delete the tuple even
from the unique index.
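
As a rough illustration of the "retail" versus "bulk" index deletion trade-off
discussed in the quoted text above, here is a toy Python sketch. It is not
PostgreSQL code: the index is modeled as a sorted list of (key, tid) pairs, the
names bulk_delete and retail_delete are hypothetical, and the returned counters
only stand in for real I/O cost. It assumes the key of each dead tuple is known,
since a retail lookup needs the key to find the index tuple.

```python
import bisect
from typing import List, Set, Tuple

IndexTuple = Tuple[int, int]  # (key, heap TID); toy stand-in for a B-tree leaf entry


def bulk_delete(index: List[IndexTuple], dead_tids: Set[int]) -> Tuple[List[IndexTuple], int]:
    """Scan every index tuple and drop those pointing to dead heap tuples."""
    kept = [entry for entry in index if entry[1] not in dead_tids]
    return kept, len(index)  # second value: number of index tuples visited


def retail_delete(index: List[IndexTuple], dead: List[IndexTuple]) -> int:
    """Look up each dead (key, tid) pair individually and remove it in place."""
    probes = 0
    for entry in dead:
        pos = bisect.bisect_left(index, entry)  # binary search in the sorted index
        probes += 1
        if pos < len(index) and index[pos] == entry:
            del index[pos]
    return probes  # number of index lookups performed
```

With a large index and only a handful of dead tuples, the bulk path visits every
index tuple while the retail path performs one lookup per dead tuple.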

> >> I agree with Robert that being able to store an arbitrary payload as a
> >> TID is probably not going to ever work very well.
> >
> >
> > Support of arbitrary payload as a TID doesn't sound easy. However, that
> > doesn't mean it's unachievable. For me, it's more like long way which could
> > be traveled step by step.
>
> To be fair, it probably is achievable. Where there is a will, there is
> a way. I just think that it will be easier to find a different way of
> realizing similar benefits. I'm mostly talking about benefits around
> making it cheap to have many secondary indexes by having logical
> indirection instead of physical pointers (doesn't *have* to be
> user-visible primary key values).

It's possible to add an indirection layer "on demand". Initially, index
tuples point directly to the heap tuple. If a tuple gets updated and no
longer fits on its page, it's moved to another place, with a redirect left in
the old place. I think that, if carefully designed, it's possible to
guarantee there is at most one redirect.
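
One way the "at most one redirect" property could be kept is sketched below in
toy Python. This is only an assumption about a possible design, not an existing
PostgreSQL mechanism: a moved tuple remembers its original ("home") slot, and
every later move rewrites the redirect in that home slot instead of chaining a
new one. The names ToyHeap, Slot, move and fetch are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

TID = Tuple[int, int]  # (page number, slot number); hypothetical layout


@dataclass
class Slot:
    data: Optional[str] = None      # tuple payload, if the tuple lives here
    redirect: Optional[TID] = None  # forwarding pointer, if the tuple moved away
    home: Optional[TID] = None      # back-pointer to the original slot, if moved


class ToyHeap:
    def __init__(self) -> None:
        self.slots: Dict[TID, Slot] = {}

    def insert(self, tid: TID, data: str) -> None:
        self.slots[tid] = Slot(data=data)

    def move(self, current: TID, new: TID) -> None:
        """Relocate a tuple, keeping the redirect chain at most one hop long."""
        slot = self.slots.pop(current)
        home = slot.home if slot.home is not None else current
        self.slots[new] = Slot(data=slot.data, home=home)
        # The home slot always points straight at the latest location.
        self.slots[home] = Slot(redirect=new)

    def fetch(self, tid: TID) -> Tuple[Optional[str], int]:
        """Return (data, redirects followed) for a TID held by an index tuple."""
        slot = self.slots[tid]
        hops = 0
        if slot.redirect is not None:
            slot = self.slots[slot.redirect]
            hops = 1
        return slot.data, hops
```

Index tuples keep pointing at the home slot, so they never need to be touched
when the tuple moves, and a fetch follows zero or one redirect.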

But I still think that avoiding arbitrary payload for indexes only delays the
inevitable, if we want pluggable storages and want them to reuse
existing index AMs. For example, arbitrary payload together with the
ability to update this payload allows us to make indexes separately
versioned (with a separate garbage collection process more or less unrelated
to the heap). Despite the overhead caused by MVCC attributes, I think such
indexes could give significant advantages in various workloads.
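
A minimal sketch of what "separately versioned" index tuples might look like,
again in toy Python and under loud assumptions: the payload is modeled as
xmin/xmax fields on the index tuple, all transactions are treated as committed,
and a snapshot simply sees every xid below its own. The names
VersionedIndexTuple, visible and index_gc are hypothetical and do not correspond
to any existing index AM interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VersionedIndexTuple:
    key: int
    tid: int                    # pointer into the storage (hypothetical)
    xmin: int                   # transaction that inserted this index tuple
    xmax: Optional[int] = None  # transaction that deleted it, if any


def visible(t: VersionedIndexTuple, snapshot_xid: int) -> bool:
    """Toy rule: every xid is committed; a snapshot sees all xids below its own."""
    inserted = t.xmin < snapshot_xid
    deleted = t.xmax is not None and t.xmax < snapshot_xid
    return inserted and not deleted


def index_gc(index: List[VersionedIndexTuple],
             oldest_snapshot_xid: int) -> List[VersionedIndexTuple]:
    """Drop index tuples whose deletion every snapshot can see; no heap access."""
    return [t for t in index if t.xmax is None or t.xmax >= oldest_snapshot_xid]
```

The point of the sketch is only that both the visibility check and the garbage
collection use information stored in the index itself, at the cost of the extra
per-tuple MVCC fields.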

> HOT simply isn't effective enough at
> preventing UPDATE index tuple insertions for indexes on unchanged
> attributes, often just because pruning can fail to happen in time,
> which WARM will not fix.
>

Right. I think HOT and WARM depend on factors which are hard to control:
the distribution of UPDATEs between heap pages, the oldest snapshot and so on.
It's quite hard for a DBA to understand why a table starts getting bloated
when it didn't before.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
