Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-30 12:25:49
Message-ID: CABOikdPuDh9w-LvNLZe4ECB87Ce=QbUEOeHw9YvunfaQu_CftQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 30, 2017 at 5:27 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

>
>
> How have you verified that? Have you checked that in
> heap_prepare_insert it has called toast_insert_or_update() and then
> returned a tuple different from what the input tup is? Basically, I
> am easily able to see it and even the reason why the heap and index
> tuples will be different. Let me try to explain,
> toast_insert_or_update returns a new tuple which contains compressed
> data and this tuple is inserted in heap where as slot still refers to
> original tuple (uncompressed one) which is passed to heap_insert.
> Now, ExecInsertIndexTuples and the calls under it like FormIndexDatum
> will refer to the tuple in slot which is uncompressed and form the
> values[] using uncompressed value.

Ah, yes. You're right. Not sure why I saw things differently. That doesn't
change anything though because during recheck we'll get compressed value and not
do anything with it. In the index we already have compressed value and we
can compare them. Even if we decide to decompress everything and do the
comparison, that should be possible. So I don't see a problem as far as
correctness goes.
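
To make the "decompress everything and compare" option concrete, here is a
toy sketch. It is not code from the patch: zlib stands in for pglz, and the
(is_compressed, payload) pair and all function names are hypothetical
stand-ins for a varlena datum and the recheck logic.

```python
import zlib

# Toy model only. zlib is a stand-in for PostgreSQL's pglz, and a datum is
# modelled as an (is_compressed, payload) pair. All names are hypothetical.

def compress_datum(raw: bytes):
    """Return a compressed datum if compression saves space, else the raw one."""
    packed = zlib.compress(raw)
    if len(packed) < len(raw):
        return (True, packed)
    return (False, raw)

def detoast(datum):
    """Return the uncompressed bytes of a datum."""
    is_compressed, payload = datum
    return zlib.decompress(payload) if is_compressed else payload

def recheck_equal(heap_datum, index_datum):
    """Compare two datums by value, whatever their stored representation."""
    return detoast(heap_datum) == detoast(index_datum)

value = b"abc" * 1000
heap_datum = compress_datum(value)  # what ends up in the heap tuple
index_datum = (False, value)        # values[] formed from the uncompressed slot

# The stored representations differ, but the values still compare equal.
assert heap_datum != index_datum
assert recheck_equal(heap_datum, index_datum)
```

The point is only that a representation mismatch between heap and index
does not by itself break the comparison, as long as both sides are
normalised first.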

>
> So IIUC, in above test during initialization you have one WARM update
> and then during actual test all are HOT updates, won't in such a case
> the WARM chain will be converted to HOT by vacuum and then all updates
> from thereon will be HOT and probably no rechecks?
>

There is no autovacuum (AV). Just 1 tuple being HOT updated out of 100 tuples.
Confirmed by looking at pg_stat_user_tables. Also made sure that the tuple
doesn't get non-HOT updated in between, thus breaking the WARM chain.
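
For anyone repeating the check, a rough sketch of the kind of verification
meant here: take the n_tup_upd and n_tup_hot_upd counters from
pg_stat_user_tables before and after the run and compare the deltas. The
helper name and the snapshot tuples below are assumptions made for
illustration, not the script actually used.

```python
# Query against the pg_stat_user_tables view; the relname parameter is a
# placeholder to be supplied by whatever client runs the query.
STATS_QUERY = """
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = %s
"""

def hot_update_summary(before, after):
    """Summarise updates between two (n_tup_upd, n_tup_hot_upd) snapshots."""
    total = after[0] - before[0]
    hot = after[1] - before[1]
    return {
        "updates": total,
        "hot": hot,
        "non_hot": total - hot,
        # True only if something was updated and every update was HOT.
        "all_hot": total > 0 and hot == total,
    }

# Hypothetical snapshots: one non-HOT (WARM) update during initialization,
# then 1000 updates during the test, all of them HOT.
summary = hot_update_summary(before=(1, 0), after=(1001, 1000))
assert summary["all_hot"]
```

A non-zero "non_hot" count would mean the WARM chain could have been
broken in between.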

>
>
> >
> > I then also repeated the tests, but this time using compressible values. The
> > regression in this case is much higher, may be 15% or more.
> >
>
> Sounds on higher side.
>
>
Yes, definitely. If we can't reduce that, we might want to provide a
table-level option to explicitly turn WARM off on such tables.

> IIUC, by the time you are comparing tuple attrs to check for modified
> columns, you don't have the compressed values for new tuple.
>
>
I think it depends. If the value is not being modified, then we will get
both values as compressed. At least I confirmed with your example and
running an update which only changes c1. Don't know if that holds for all
cases.

> > I know you had
> > raised concerns, but Robert confirmed that (IIUC) it's not a problem today.
> >
>
> Yeah, but I am not sure if we can take Robert's statement as some sort
> of endorsement for what the patch does.
>
>
Sure.

> > We will figure out how to deal with it if we ever add support for different
> > compression algorithms or compression levels. And I also think this is kinda
> > synthetic use case and the fact that there is not much regression with
> > indexes as large as 2K bytes seems quite comforting to me.
> >
>
> I am not sure if we can consider it as completely synthetic because we
> might see some similar cases for json datatypes. Can we once try to
> see the impact when the same test runs from multiple clients?

Ok. It might become hard to control HOT behaviour though, or we will need to
do a mix of WARM/HOT updates. I will see if this is easily doable by
setting a high FF etc.

> For
> your information, I am also trying to setup some tests along with one
> of my colleague and we will report the results once the tests are
> complete.
>
>
That'll be extremely helpful, especially if it's something close to a
real-world scenario. Thanks for doing that.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
