Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-01-30 17:53:37
Message-ID: CABOikdO26jksD0HwHCU3UXejjd6GO-ZF3qe+Aat+3y400v6LEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 26, 2017 at 2:38 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Looking at your 0002 patch now. It no longer applies, but the conflicts
> are trivial to fix. Please rebase and resubmit.
>
>
Thanks.

>
> Maybe I will spend some time trying to
> convert it to Perl using PostgresNode.
>
>
Agree. I put together a test harness to hammer the WARM code as much as we
can. This harness has already discovered some bugs, especially around index
creation part. It also discovered one outstanding bug in master, so it's
been useful. But I agree to rewrite it using perl.

>
> I think having the "recheck" index methods create an ExecutorState looks
> out of place. How difficult is it to pass the estate from the calling
> code?
>

I couldn't find an easy way given the place where recheck is required. Can
you suggest something?

>
> IMO heap_get_root_tuple_one should be called just heap_get_root_tuple().
> That function and its plural sibling heap_get_root_tuples() should
> indicate in their own comments what the expectations are regarding the
> root_offsets output argument, rather than deferring to the comments in
> the "internal" function, since they differ on that point; for the rest
> of the invariants I think it makes sense to say "Also see the comment
> for heap_get_root_tuples_internal". I wonder if heap_get_root_tuple
> should just return the ctid instead of assigning the value to a
> passed-in pointer, i.e.
> OffsetNumber
> heap_get_root_tuple(Page page, OffsetNumber target_offnum)
> {
> OffsetNumber off;
> heap_get_root_tuples_internal(page, target_offnum, &off);
> return off;
> }
>
>
Yes, all of that makes sense. Will fix.

>
> The simple_heap_update + CatalogUpdateIndexes pattern is getting
> obnoxious. How about creating something like catalog_heap_update which
> does both things at once, and stop bothering each callsite with the WARM
> stuff? In fact, given that CatalogUpdateIndexes is used in other
> places, maybe we should leave its API alone and create another function,
> so that we don't have to change the many places that only do
> simple_heap_insert. (Places like OperatorCreate which do either insert
> or update could just move the index update call into each branch.)
>
>
What I ended up doing is I added two new APIs.
- CatalogUpdateHeapAndIndex
- CatalogInsertHeapAndIndex

I could replace almost all occurrences of simple_heap_update +
CatalogUpdateIndexes with the first API and simple_heap_insert +
CatalogUpdateIndexes with the second API. This looks like a good
improvement to me anyways since there are about 180 places where these
functions are called almost in the same pattern. May be it will also avoid
a bug when someone forgets to update the index after inserting/updating
heap.

> .
> I wonder if heap_hot_search_buffer() and heap_hot_search() should return
> a tri-valued enum instead of boolean; that idea looks reasonable in
> theory but callers have to do more work afterwards, so maybe not.
>
>
Ok. I'll try to rearrange it a bit. May be we just have one API after all?
There are only a very few callers of these APIs.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2017-01-30 17:55:27 Re: Subtle bug in "Simplify tape block format" commit
Previous Message Tom Lane 2017-01-30 17:47:44 Re: Improvements in psql hooks for variables