Re: Patch: Write Amplification Reduction Method (WARM)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-04-05 13:36:47
Message-ID: CA+TgmoYUfxy1LseDzsw8uuuLUJHH0r8NCD-Up-HZMC1fYDPH3Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 4, 2017 at 11:43 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> Well, better than causing a deadlock ;-)

Yep.

> Lets see if we want to go down the path of blocking WARM when tuples have
> toasted attributes. I submitted a patch yesterday, but having slept over it,
> I think I made mistakes there. It might not be enough to look at the caller
> supplied new tuple because that may not have any toasted values, but the
> final tuple that gets written to the heap may be toasted.

Yes, you have to make whatever decision you're going to make here
after any toast-ing has been done.

> We could look at
> the new tuple's attributes to find if any indexed attributes are toasted,
> but that might suck as well. Or we can simply block WARM if the old or the
> new tuple has external attributes i.e. HeapTupleHasExternal() returns true.
> That could be overly restrictive because irrespective of whether the indexed
> attributes are toasted or just some other attribute is toasted, we will
> block WARM on such updates. May be that's not a problem.

Well, I think that there's some danger of whittling down this
optimization to the point where it still incurs most of the costs --
in bit-space if not in CPU cycles -- but no longer yields much of the
benefit. Even though the speed-up might still be substantial in the
cases where the optimization kicks in, if a substantial number of
users doing basically normal things sometimes fail to get the
optimization, this isn't going to be very exciting outside of
synthetic benchmarks.

Backing up a little bit, it seems like the root of the issue here is
that, at a certain point in what was once a HOT chain, you make a WARM
update, and you make a decision about which indexes to update at that
point. Now, later on, when you traverse that chain, you need to be
able to figure what decide you made before; otherwise, you might make
a bad decision about whether an index pointer applies to a particular
tuple. If the index tuple is WARM, then the answer is "yes" if the
heap tuple is also WARM, and "no" if the heap tuple is CLEAR (which is
an odd antonym to WARM, but leave that aside). If the index tuple is
CLEAR, then the answer is "yes" if the heap tuple is also CLEAR, and
"maybe" if the heap tuple is WARM.
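The four flag combinations above can be sketched as a small decision
function (a hypothetical illustration only; the names are invented and
not the patch's actual API):

```python
def pointer_applies(index_is_warm: bool, heap_is_warm: bool) -> str:
    """Does an index pointer apply to a heap tuple in a WARM chain?

    Returns "yes", "no", or "maybe" per the four flag combinations.
    """
    if index_is_warm:
        # A WARM index tuple applies only to WARM heap tuples.
        return "yes" if heap_is_warm else "no"
    # A CLEAR index tuple always covers CLEAR heap tuples...
    if not heap_is_warm:
        return "yes"
    # ...but for a WARM heap tuple we must reconstruct the decision made
    # at update time: did the WARM update insert a new entry here?
    return "maybe"
```

The whole design question discussed below is how to resolve that
"maybe" branch deterministically.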

In that "maybe" case, we are trying to reconstruct the decision that
we made when we did the update. If, at the time of the update, we
decided to insert a new index entry, then the answer is "no"; if not,
it's "yes". From an integrity point of view, it doesn't really matter
how we make the decision; what matters is that we're consistent. More
specifically, if we sometimes insert a new index tuple even when the
value has not changed in any user-visible way, I think that would be
fine, provided that later chain traversals can tell that we did that.
As an extreme example, suppose that the WARM update inserted in some
magical way a bitmap of which attributes had changed into the new
tuple. Then, when we are walking the chain following a CLEAR index
tuple, we test whether the index columns overlap with that bitmap; if
they do, then that index got a new entry; if not, then it didn't. It
would actually be fine (apart from efficiency) to set extra bits in
this bitmap; extra indexes would get updated, but chain traversal
would know exactly which ones, so no problem. This is of course just
a gedankenexperiment, but the point is that as long as the insert
itself and later chain traversals agree on the rule, there's no
integrity problem. I think.
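The gedankenexperiment amounts to a simple overlap test; a sketch, with
invented names (the patch stores no such bitmap today):

```python
def index_got_new_entry(changed_attrs: frozenset, index_attrs: frozenset) -> bool:
    """Did a WARM update insert a new entry into this index?

    changed_attrs: attribute numbers the update flagged as changed.
    index_attrs: attribute numbers the index is built on.
    """
    # The index got a new entry iff any of its columns is in the
    # changed-attribute bitmap. Overstating changed_attrs is safe:
    # extra indexes get updated, and chain traversal, running the same
    # test, reaches the same conclusion.
    return bool(changed_attrs & index_attrs)
```

For example, with columns 2 and 5 flagged as changed, an index on
columns 5 and 7 would be treated as updated, while an index on columns
1 and 3 would not.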

The first idea I had for an actual solution to this problem was to
make the decision as to whether to insert new index entries based on
whether the indexed attributes in the final tuple (post-TOAST) are
byte-for-byte identical with the original tuple. If somebody injects
a new compression algorithm into the system, or just changes the
storage parameters on a column, or we re-insert an identical value
into the TOAST table when we could have reused the old TOAST pointer,
then you might have some potentially-WARM updates that end up being
done as regular updates, but that's OK. When you are walking the
chain, you will KNOW whether you inserted new index entries or not,
because you can do the exact same comparison that was done before and
be sure of getting the same answer. But that's actually not really a
solution, because it doesn't work if all of the CLEAR tuples are gone
-- all you have is the index tuple and the new heap tuple; there's no
old heap tuple with which to compare.

The only other idea that I have for a really clean solution here is to
support this only for index types that are amcanreturn, and actually
compare the value stored in the index tuple with the one stored in the
heap tuple, ensuring that new index tuples are inserted whenever they
don't match and then using the exact same test to determine the
applicability of a given index pointer to a given heap tuple. I'm not
sure how viable that is either, but hopefully you see my underlying
point here: it would be OK for there to be cases where we fall back to
a non-WARM update because a logically equal value changed at the
physical level, especially if those cases are likely to be rare in
practice, but it can never be allowed to happen that chain traversal
gets confused about which indexes actually got touched by a particular
WARM update.
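The invariant in that last idea is that insertion and chain traversal
share one comparison. A minimal sketch, assuming the index can return
the stored value byte-for-byte (hypothetical helper names, not
PostgreSQL's actual amcanreturn interface):

```python
def needs_new_index_entry(index_stored_value: bytes, heap_value: bytes) -> bool:
    # Insert a new index entry whenever the value reconstructed from the
    # index tuple is not byte-for-byte identical to the heap value.
    return index_stored_value != heap_value

def pointer_matches(index_stored_value: bytes, heap_value: bytes) -> bool:
    # The applicability test during chain traversal is exactly the
    # insertion test, negated, so the two sides can never disagree.
    return not needs_new_index_entry(index_stored_value, heap_value)
```

A logically equal but physically different value (say, recompressed
TOAST data) would then fall back to a non-WARM update, which is
acceptable as long as both sides agree.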

By the way, the "Converting WARM chains back to HOT chains" section of
README.WARM seems to be out of date. Any chance you could update that
to reflect the current state and thinking of the patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
