Re: Patch: Write Amplification Reduction Method (WARM)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-01-31 13:51:21
Message-ID: 20170131135121.sy6yzt4yr6tqbfhs@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pavan Deolasee wrote:
> On Thu, Jan 26, 2017 at 2:38 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
> wrote:

> > The simple_heap_update + CatalogUpdateIndexes pattern is getting
> > obnoxious. How about creating something like catalog_heap_update which
> > does both things at once, and stop bothering each callsite with the WARM
> > stuff?
>
> What I realised that there are really 2 patterns:
> 1. simple_heap_insert, CatalogUpdateIndexes
> 2. simple_heap_update, CatalogUpdateIndexes
>
> There are only couple of places where we already have indexes open or have
> more than one tuple to update, so we call CatalogIndexInsert directly. What
> I ended up doing in the attached patch is add two new APIs which combines
> the two steps of each of these patterns. It seems much cleaner to me and
> also less buggy for future users. I hope I am not missing a reason not to
> do combine these steps.

CatalogUpdateIndexes was just added as a convenience function on top of
a very common pattern. If we now have a reason to create a second one
because there are now two very common patterns, it seems reasonable to
have two functions. I think I would commit the refactoring to create
these functions ahead of the larger WARM patch, since I think it'd be
bulky and largely mechanical. (I'm going from this description; didn't
read your actual code.)

> > +#define HeapTupleHeaderGetNextTid(tup, next_ctid) \
> > +do { \
> > + AssertMacro(!((tup)->t_infomask2 & HEAP_LATEST_TUPLE)); \
> > + ItemPointerCopy(&(tup)->t_ctid, (next_ctid)); \
> > +} while (0)
>
> > Actually, I think this macro could just return the TID so that it can be
> > used as struct assignment, just like ItemPointerCopy does internally --
> > callers can do
> > ctid = HeapTupleHeaderGetNextTid(tup);
>
> While I agree with your proposal, I wonder why we have ItemPointerCopy() in
> the first place because we freely copy TIDs as struct assignment. Is there
> a reason for that? And if there is, does it impact this specific case?

I dunno. This macro is present in our very first commit d31084e9d1118b.
Maybe it's an artifact from the Lisp to C conversion. Even then, we had
some cases of iptrs being copied by struct assignment, so it's not like
it didn't work. Perhaps somebody envisioned that the internal details
could change, but that hasn't happened in two decades so why should we
worry about it now? If somebody needs it later, it can be changed then.

> There is one issue that bothers me. The current implementation lacks
> ability to convert WARM chains into HOT chains. The README.WARM has some
> proposal to do that. But it requires additional free bit in tuple header
> (which we don't have) and of course, it needs to be vetted and implemented.
> If the heap ends up with many WARM tuples, then index-only-scans will
> become ineffective because index-only-scan can not skip a heap page, if it
> contains a WARM tuple. Alternate ideas/suggestions and review of the design
> are welcome!

t_infomask2 contains one last unused bit, and we could reuse vacuum
full's bits (HEAP_MOVED_OUT, HEAP_MOVED_IN), but that will need some
thinking ahead. Maybe now's the time to start versioning relations so
that we can ensure clusters upgraded to pg10 do not contain any of those
bits in any tuple headers.

I don't have any ideas regarding the estate passed to recheck yet --
haven't looked at the callsites in detail. I'll give this another look
later.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-01-31 13:57:07 Re: patch: function xmltable
Previous Message Abbas Butt 2017-01-31 13:48:08 Re: An issue in remote query optimization