Re: Resurrecting per-page cleaner for btree

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, teramoto(dot)junji(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: Resurrecting per-page cleaner for btree
Date: 2006-07-26 05:05:08
Message-ID: 20060726130209.55D7.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> I've applied this but I'm now having some second thoughts about it,
> because I'm seeing an actual *decrease* in pgbench numbers from the
> immediately prior CVS HEAD code.

> Had you done any performance testing on this patch, and if so what
> tests did you use? I'm a bit hesitant to try to fix it on the basis
> of pgbench results alone.

Thank you for applying and adding documentation.
But hmm... I tested on DBT-2 and pgbench, and the patch worked well on both.
I used disks with battery backup for WAL and enabled writeback cache, so the
extra WAL records might be of no significance.
On DBT-2, the bottleneck was completely the data disks. Cache usage was
important there, so avoiding splitting was useful for saving momory.
On pgbench, full dataset was cached in memory, so the bottleneck was CPUs.
Especially cleanup for branches_pkey was effective because it was frequently
updated and had many LP_DELETE'd tuples. Degradation was eased on my machine.

> The problem is that we've traded splitting a page every few hundred
> inserts for doing a PageIndexMultiDelete, and emitting an extra WAL
> record, on *every* insert. This is not good.

I suspect PageIndexMultiDelete() consumes CPU. If there are one or two
dead tuples, PageIndexTupleDelete() is called and memmove(4KB average)
and adjustment of the linepointer-offsets are performed everytime.
I think this is a heavy operation. But if the size of most upper index
entry is same with the dead tuple, we can only move the upper to the hole
and avoid to modify all tuples. Is this change acceptable?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2006-07-26 06:12:04 Re: GUC with units, details
Previous Message Bruce Momjian 2006-07-26 03:59:20 Re: [COMMITTERS] pgsql: /contrib/cube improvements: Update

Browse pgsql-patches by date

  From Date Subject
Next Message Bernd Helmle 2006-07-26 07:40:47 Re: Patch for updatable views
Previous Message Bruce Momjian 2006-07-26 03:59:20 Re: [COMMITTERS] pgsql: /contrib/cube improvements: Update