Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-14 18:41:11
Message-ID: CABOikdPAOmO7m=sFiNPoP6bd24Rq+_0yaRyRLKgdpQMqu7Y13g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 14, 2017 at 7:17 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> > @@ -234,6 +236,21 @@ index_beginscan(Relation heapRelation,
> > scan->heapRelation = heapRelation;
> > scan->xs_snapshot = snapshot;
> >
> > + /*
> > + * If the index supports recheck, make sure that index tuple is
> saved
> > + * during index scans.
> > + *
> > + * XXX Ideally, we should look at all indexes on the table and
> check if
> > + * WARM is at all supported on the base table. If WARM is not
> supported
> > + * then we don't need to do any recheck.
> RelationGetIndexAttrBitmap() does
> > + * do that and sets rd_supportswarm after looking at all indexes.
> But we
> > + * don't know if the function was called earlier in the session
> when we're
> > + * here. We can't call it now because there exists a risk of
> causing
> > + * deadlock.
> > + */
> > + if (indexRelation->rd_amroutine->amrecheck)
> > + scan->xs_want_itup = true;
> > +
> > return scan;
> > }
>
> I didn't like this comment very much. But it's not necessary: you have
> already given relcache responsibility for setting rd_supportswarm. The
> only problem seems to be that you set it in RelationGetIndexAttrBitmap
> instead of RelationGetIndexList, but it's not clear to me why.

Hmm. I think you're right. Will fix that way and test.

>
> I noticed that nbtinsert.c and nbtree.c have a bunch of new includes
> that they don't actually need. Let's remove those. nbtutils.c does
> need them because of btrecheck().

Right. It's probably a left over from the way I wrote the first version.
Will fix.

Speaking of which:
>
> I have already commented about the executor involvement in btrecheck();
> that doesn't seem good. I previously suggested to pass the EState down
> from caller, but that's not a great idea either since you still need to
> do the actual FormIndexDatum. I now think that a workable option would
> be to compute the values/isnulls arrays so that btrecheck gets them
> already computed.

I agree with your complaint about modularity violation. What I am unclear
is how passing values/isnulls array will fix that. The way code is
structured currently, recheck routines are called by index_fetch_heap(). So
if we try to compute values/isnulls in that function, we'll still need
access EState, which AFAIU will lead to similar violation. Or am I
mis-reading your idea?

I wonder if we should instead invent something similar to IndexRecheck(),
but instead of running ExecQual(), this new routine will compare the index
values by the given HeapTuple against given IndexTuple. ISTM that for this
to work we'll need to modify all callers of index_getnext() and teach them
to invoke the AM specific recheck method if xs_tuple_recheck flag is set to
true by index_getnext().

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-03-14 18:41:28 Re: logical replication access control patches
Previous Message Tom Lane 2017-03-14 18:36:24 Re: Write Ahead Logging for Hash Indexes