Re: Reduce pinning in btree indexes

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: kgrittn(at)ymail(dot)com
Cc: hlinnakangas(at)vmware(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reduce pinning in btree indexes
Date: 2015-03-04 07:59:33
Message-ID: 20150304.165933.95524488.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> It looks like the remaining performance regression was indeed a
> result of code alignment. I found two "paranoia" assignments I had
> accidentally failed to put back with the rest of the mark/restore
> optimization; after that trivial change the patched version is
> ever-so-slightly faster than master on all tests.

The performance which your test shows looks great. The gain might
comes from removing of buffer locking on _bt_next. I also would
like to see this to come along with 9.5 but it is the matter for
committers.

Apart from it, I looked into the patch itself more
closely. Please reconcile as you like among contradicting
comments below:)

In nbtutils.c, _bt_killitems:

- This is not in charge of this patch, but the comment for
_bt_killitems looks to have a trivial typo.

> * _bt_killitems - set LP_DEAD state for items an indexscan caller has
> * told us were killed
> *
> * scan->so contains information about the current page and killed tuples
> * thereon (generally, this should only be called if so->numKilled > 0).

I suppose "scan->so" should be "scan->opaque" and
"so->numKilled" be "opaque->numKilled".

- The following comment looks somewhat wrong.

> /* Since the else condition needs page, get it here, too. */

"the else condition needs page" means "the page of the buffer
is needed later"? But, I suppose the comment itself is not
necessary.

- If (BTScanPosIsPinned(so->currPos)).

As I mention below for nbtsearch.c, the page aquired in the
if-then block may be vacuumed so the LSN check done in the
if-else block is need also in the if-then block. It will be
accomplished only by changing the position of the end of the
if-else block.

- _bt_killitems is called without pin when rescanning from
before, so I think the previous code has already considered the
unpinned case ("if (ItemPointerEquals(&ituple..." does
that). Vacuum rearranges line pointers after deleting LP_DEAD
items so the previous check seems enough for the purpose. The
previous code is more effeicient becuase the mathing items will
be processed even after vacuum.

In nbtsearch.c

- _bt_first(): It now releases the share lock on the page before
the items to be killed is processed. This allows other backends
to insert items into the page simultaneously. It seems to be
dangerous alone, but _bt_killitems already considers the
case. Anyway I think It'll be better to add a comment
mentioning unlocking is not dangerous.

... Sorry, time's up for now.

regards,

> Average of 3 `make check-world` runs:
> master: 336.288 seconds
> patch: 332.237 seconds
>
> Average of 6 `make check` runs:
> master: 25.409 seconds
> patch: 25.270 seconds
>
> The following were all run 12 times, the worst 2 dropped, the rest
> averaged:
>
> Kyotaro-san's 1m mark "worst case" benchmark:
> master: 962.581 ms, 6.765 stdev
> patch: 947.518 ms, 3.584 stdev
>
> Kyotaro-san's 500k mark, 500k restore "worst case" benchmark:
> master: 1366.639 ms, 5.314 stdev
> patch: 1363.844 ms, 5.896 stdev
>
> pgbench -i -s 16 pgbench / pgbench -c 16 -j 4 -T 500 pgbench
> master: 265.011 tps, 4.952 stdev
> patch: 267.164 tps, 9.094 stdev
>
> How do people feel about the idea of me pushing this for 9.5 (after
> I clean up all the affected comments and README files)? I know
> this first appeared in the last CF, but the footprint is fairly
> small and the only user-visible behavior change is that a btree
> index scan of a WAL-logged table using an MVCC snapshot no longer
> blocks a vacuum indefinitely. (If there are objections I will move
> this to the first CF for the next release.)
>
> src/backend/access/nbtree/nbtree.c | 50 +++++-------
> src/backend/access/nbtree/nbtsearch.c | 141 +++++++++++++++++++++++++---------
> src/backend/access/nbtree/nbtutils.c | 58 ++++++++++----
> src/include/access/nbtree.h | 36 ++++++++-
> 4 files changed, 201 insertions(+), 84 deletions(-)

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2015-03-04 08:00:17 Re: Join push-down support for foreign tables
Previous Message Albe Laurenz 2015-03-04 07:58:32 Re: Optimization for updating foreign tables in Postgres FDW