Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-21 12:41:21
Message-ID: CABOikdOdvqJXPATNqmyJMMpYHesQuZm-Y4Q9O033pOwoAyrOjA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 21, 2017 at 5:34 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Mar 21, 2017 at 6:56 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >> Hmm, that test case isn't all that synthetic. It's just a single
> >> column bulk update, which isn't anything all that crazy, and 5-10%
> >> isn't nothing.
> >>
> >> I'm kinda surprised it made that much difference, though.
> >>
> >
> > I think it is because heap_getattr() is not that cheap. We have
> > noticed the similar problem during development of scan key push down
> > work [1].
>
> Yeah. So what's the deal with this? Is somebody working on figuring
> out a different approach that would reduce this overhead? Are we
> going to defer WARM to v11? Or is the intent to just ignore the 5-10%
> slowdown on a single column update and commit everything anyway?

I think I should clarify something. The test case does a single column
update, but it also has very wide columns and an index on many columns
(and it updates a column early in that list). In addition, in the test
Mithun updated all 10 million rows of the table in a single transaction,
used an UNLOGGED table, and had fsync turned off.

TBH I see many artificial scenarios here. It would be very useful if he could
rerun the query with some of these restrictions lifted. I'm all for
addressing whatever we can, but I am not sure that this test demonstrates
real-world usage.

Having said that, maybe we can do a few things to reduce the overhead.

- Check if the page has enough free space to perform a HOT/WARM update. If
not, don't look for all index keys.
- Pass bitmaps separately for each index and bail out early if we conclude
neither HOT nor WARM is possible. In this case since there is just one
index and as soon as we check the second column we know neither HOT nor
WARM is possible, we will return early. It might complicate the API a lot,
but I can give it a shot if that's what is needed to make progress.
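
A rough, standalone sketch of how those two checks could fit together. This
is not code from the patch: the names (classify_update, column_changed,
UpdateKind), the use of plain int columns, and the 64-bit per-index masks are
all made up for illustration. It also assumes, as a simplification, that HOT
needs no index to be modified and that WARM is only worthwhile when at least
one index is left unmodified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; the real patch may classify updates differently. */
typedef enum { UPDATE_HOT, UPDATE_WARM, UPDATE_REGULAR } UpdateKind;

/* Counts column comparisons so the saving from bailing out early is visible. */
static int comparisons_done = 0;

/* Stand-in for the expensive fetch-and-compare of one attribute. */
static bool
column_changed(const int *old_row, const int *new_row, int attno)
{
    comparisons_done++;
    return old_row[attno] != new_row[attno];
}

/*
 * index_attrs[i] is a bitmask for index i: bit "att" is set when column
 * "att" is part of that index (assumes at most 64 columns).
 */
static UpdateKind
classify_update(size_t free_space, size_t new_tuple_len,
                const uint64_t *index_attrs, int nindexes,
                const int *old_row, const int *new_row, int natts)
{
    int modified_indexes = 0;

    /* Check 1: no room on the page, so skip every column comparison. */
    if (free_space < new_tuple_len)
        return UPDATE_REGULAR;

    /* Check 2: per index, stop at the first changed column. */
    for (int i = 0; i < nindexes; i++)
    {
        for (int att = 0; att < natts; att++)
        {
            if ((index_attrs[i] & (UINT64_C(1) << att)) == 0)
                continue;       /* column is not part of this index */
            if (column_changed(old_row, new_row, att))
            {
                modified_indexes++;
                break;          /* this index is modified; skip its other columns */
            }
        }
    }

    if (modified_indexes == 0)
        return UPDATE_HOT;
    if (modified_indexes < nindexes)
        return UPDATE_WARM;
    return UPDATE_REGULAR;
}
```

With a single index over four columns and a change in the second column, this
does two comparisons instead of four before concluding that neither HOT nor
WARM applies; with no free space on the page it does none.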

Any other ideas?

Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
