Re: Patch: Write Amplification Reduction Method (WARM)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2016-10-05 08:13:23
Message-ID: bdae5d5a-c10e-bdba-d9aa-acc393e2f5ba@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/05/2016 06:53 AM, Pavan Deolasee wrote:
>
>
> On Thu, Sep 1, 2016 at 9:44 PM, Bruce Momjian <bruce(at)momjian(dot)us
> <mailto:bruce(at)momjian(dot)us>> wrote:
>
> On Thu, Sep 1, 2016 at 02:37:40PM +0530, Pavan Deolasee wrote:
> > I like the simplified approach, as long as it doesn't block further
> > improvements.
> >
> >
> >
> > Yes, the proposed approach is simple yet does not stop us from improving things
> > further. Moreover it has shown good performance characteristics and I believe
> > it's a good first step.
>
> Agreed. This is BIG. Do you think it can be done for PG 10?
>
>
> I definitely think so. The patches as submitted are fully functional and
> sufficient. Of course, there are XXX and TODOs that I hope to sort out
> during the review process. There are also further tests needed to ensure
> that the feature does not cause significant regression in the worst
> cases. Again something I'm willing to do once I get some feedback on the
> broader design and test cases. What I am looking at this stage is to
> know if I've missed something important in terms of design or if there
> is some show stopper that I overlooked.
>
> Latest patches rebased with current master are attached. I also added a
> few more comments to the code. I forgot to give a brief about the
> patches, so including that as well.
>
> 0001_track_root_lp_v4.patch: This patch uses a free t_infomask2 bit to
> track latest tuple in an update chain. The t_ctid.ip_posid is used to
> track the root line pointer of the update chain. We do this only in the
> latest tuple in the chain because most often that tuple will be updated
> and we need to quickly find the root only during update.
>
> 0002_warm_updates_v4.patch: This patch implements the core of WARM
> logic. During WARM update, we only insert new entries in the indexes
> whose key has changed. But instead of indexing the real TID of the new
> tuple, we index the root line pointer and then use additional recheck
> logic to ensure only correct tuples are returned from such potentially
> broken HOT chains. Each index AM must implement a amrecheck method to
> support WARM. The patch currently implements this for hash and btree
> indexes.
>

Hi,

I've been looking at the patch over the past few days, running a bunch
of benchmarks etc. I can confirm the significant speedup, often by more
than 75% (depending on number of indexes, whether the data set fits into
RAM, etc.). Similarly for the amount of WAL generated, although that's a
bit more difficult to evaluate due to full_page_writes.

I'm not going to send detailed results, as that probably does not make
much sense at this stage of the development - I can repeat the tests
once the open questions get resolved.

There's a lot of useful and important feedback in the thread(s) so far,
particularly the descriptions of various failure cases. I think it'd be
very useful to collect those examples and turn them into regression
tests - that's something the patch should include anyway.

I don't really have much comments regarding the code, but during the
testing I noticed a bit strange behavior when updating statistics.
Consider a table like this:

create table t (a int, b int, c int) with (fillfactor = 10);
insert into t select i, i from generate_series(1,1000) s(i);
create index on t(a);
create index on t(b);

and update:

update t set a = a+1, b=b+1;

which has to update all indexes on the table, but:

select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables

n_tup_upd | n_tup_hot_upd
-----------+---------------
1000 | 1000

So it's still counted as "WARM" - does it make sense? I mean, we're
creating a WARM chain on the page, yet we have to add pointers into all
indexes (so not really saving anything). Doesn't this waste the one WARM
update per HOT chain without actually getting anything in return?

The way this is piggy-backed on the current HOT statistics seems a bit
strange for another reason, although WARM is a relaxed version of HOT.
Until now, HOT was "all or nothing" - we've either added index entries
to all indexes or none of them. So the n_tup_hot_upd was fine.

But WARM changes that - it allows adding index entries only to a subset
of indexes, which means the "per row" n_tup_hot_upd counter is not
sufficient. When you have a table with 10 indexes, and the counter
increases by 1, does that mean the update added index tuple to 1 index
or 9 of them?

So I think we'll need two counters to track WARM - number of index
tuples we've added, and number of index tuples we've skipped. So
something like blks_hit and blks_read. I'm not sure whether we should
replace the n_tup_hot_upd entirely, or keep it for backwards
compatibility (and to track perfectly HOT updates).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oleg Bartunov 2016-10-05 08:15:46 Re: [PATCH] Generic type subscription
Previous Message Amit Kapila 2016-10-05 08:03:38 Re: Speed up Clog Access by increasing CLOG buffers