Re: Patch: Write Amplification Reduction Method (WARM)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, klaussfreire(at)gmail(dot)com
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-04-12 20:34:59
Message-ID: CAH2-WzmisFPFGQ5cmDK3T1YiNY1hVNi2hDXen6otkLefQn+oLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 12, 2017 at 10:12 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I may have missed something, but there is no intention to ignore known
>> regressions/reviews. Of course, I don't think that every regression will be
>> solvable, like if you run a CPU-bound workload, setting it up in a way such
>> that you repeatedly exercise the area where WARM is doing additional work,
>> without providing any benefit, may be you can still find regression. I am
>> willing to fix them as long as they are fixable and we are comfortable with
>> the additional code complexity. IMHO certain trade-offs are good, but I
>> understand that not everybody will agree with my views and that's ok.
>
> The point here is that we can't make intelligent decisions about
> whether to commit this feature unless we know which situations get
> better and which get worse and by how much. I don't accept as a
> general principle the idea that CPU-bound workloads don't matter.
> Obviously, I/O-bound workloads matter too, but we can't throw
> CPU-bound workloads under the bus. Now, avoiding index bloat does
> also save CPU, so it is easy to imagine that WARM could come out ahead
> even if each update consumes slightly more CPU when actually updating,
> so we might not actually regress. If we do, I guess I'd want to know
> why.

I myself wonder if this CPU overhead is at all related to LP_DEAD
recycling during page splits. I have my suspicions that the recyling
has some relationship to locality, which leads me to want to
investigate how Claudio Freire's patch to consistently treat heap TID
as part of the B-Tree sort order could help, both in general, and for
WARM.

Bear in mind that the recycling has to happen with an exclusive buffer
lock held on a leaf page, which could hold up rather a lot of scans
that need to visit the same value even if it's on some other,
relatively removed leaf page.

This is just a theory.

--
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-04-12 20:57:29 Re: pg_statistic_ext.staenabled might not be the best column name
Previous Message Peter Eisentraut 2017-04-12 20:25:01 Re: logical replication and PANIC during shutdown checkpoint in publisher