Re: Sorted writes in checkpoint

From:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
To:	"ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc:	"PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Subject:	Re: Sorted writes in checkpoint
Date:	2007-06-14 17:50:17
Message-ID:	1181843417.5776.118.camel@silverbirch.site
Lists:	pgsql-hackers pgsql-patches

On Thu, 2007-06-14 at 16:39 +0900, ITAGAKI Takahiro wrote:
> Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
>
> > On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
> > > If the kernel can treat sequential writes better than random writes, is
> > > it worth sorting dirty buffers in block order per file at the start of
> > > checkpoints?
>
> I wrote and tested the attached sorted-writes patch based on Heikki's
> ldc-justwrites-1.patch. There was an obvious performance win on the OLTP workload.
>
> tests                     | pgbench | DBT-2 response time (avg/90%/max)
> --------------------------+---------+-----------------------------------
> LDC only                  | 181 tps | 1.12 / 4.38 / 12.13 s
> + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
> + Sorted writes           | 224 tps | 0.36 / 0.80 /  8.11 s
>
> (*) Don't write buffers that were dirtied after starting the checkpoint.
>
> machine : 2GB-ram, SCSI*4 RAID-5
> pgbench : -s400 -t40000 -c10 (about 5GB of database)
> DBT-2 : 60WH (about 6GB of database)

I'm very surprised by the BM_CHECKPOINT_NEEDED results. What percentage
of writes was saved by doing that? We would expect only a small
percentage of blocks to be skipped, so that shouldn't make a significant
difference. I thought we discussed this before, about a year ago. It
would be easy to get that wrong and skip a block that had been
re-dirtied after the start of the checkpoint but was already dirty
beforehand. How long was the write phase of the checkpoint, and how long
was it between checkpoints?
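
To make the failure mode above concrete, here is a minimal sketch of the
marking logic, not the code from the patch. The flag names, the
SketchBuffer struct and all three functions are hypothetical and only
loosely modelled on buffer header flags. The point is that the skip
decision has to depend on a flag set once at checkpoint start, never on
when the buffer was last dirtied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only. */
#define BUF_DIRTY             (1u << 0)
#define BUF_CHECKPOINT_NEEDED (1u << 1)

typedef struct {
    uint32_t flags;
} SketchBuffer;

/* At checkpoint start: tag every buffer that is dirty right now.
 * Returns how many buffers were tagged. */
size_t mark_buffers_for_checkpoint(SketchBuffer *bufs, size_t n)
{
    size_t marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (bufs[i].flags & BUF_DIRTY) {
            bufs[i].flags |= BUF_CHECKPOINT_NEEDED;
            marked++;
        }
    }
    return marked;
}

/* A backend modifies a page. This sets the dirty bit and deliberately
 * leaves BUF_CHECKPOINT_NEEDED untouched, so a buffer that was already
 * dirty before the checkpoint started is still written by it. */
void dirty_buffer(SketchBuffer *buf)
{
    buf->flags |= BUF_DIRTY;
}

/* The checkpoint writes a buffer only if it was tagged at the start. */
bool checkpoint_must_write(const SketchBuffer *buf)
{
    return (buf->flags & BUF_CHECKPOINT_NEEDED) != 0;
}
```

A buffer that was clean at checkpoint start and dirtied afterwards is
skipped; a buffer that was dirty at the start and dirtied again is still
written.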

I can see the sorted writes having an effect because the OS may not
receive blocks within a sufficient time window to fully optimise them.
That effect would grow with increasing shared_buffers and decrease with
the size of the controller cache. How big was the shared_buffers
setting? Which OS I/O scheduler are you using? The effect would be
greatest when using Deadline.
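
For reference, the ordering being discussed (block order per file) can
be sketched as below. This is an illustration under assumptions, not the
attached patch: DirtyEntry, its fields and both function names are
hypothetical, and file_id stands in for whatever identifies a relation
file.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of one dirty buffer collected at checkpoint start. */
typedef struct {
    uint32_t file_id;   /* stand-in for the file the block belongs to */
    uint32_t block_no;  /* block number within that file */
    int      buf_id;    /* index of the buffer in the buffer pool */
} DirtyEntry;

/* Order by file first, then by block number within the file. */
int dirty_entry_cmp(const void *a, const void *b)
{
    const DirtyEntry *x = a;
    const DirtyEntry *y = b;

    if (x->file_id != y->file_id)
        return x->file_id < y->file_id ? -1 : 1;
    if (x->block_no != y->block_no)
        return x->block_no < y->block_no ? -1 : 1;
    return 0;
}

/* Sort the collected entries so writes are issued sequentially per file. */
void sort_dirty_entries(DirtyEntry *entries, size_t n)
{
    qsort(entries, n, sizeof(DirtyEntry), dirty_entry_cmp);
}
```

With the writes issued in this order, the OS sees each file's blocks in
ascending sequence regardless of how large the time window between
individual writes is.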

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Sorted writes in checkpoint at 2007-06-14 07:39:37 from ITAGAKI Takahiro

Responses

Re: Sorted writes in checkpoint at 2007-06-15 02:37:14 from Gregory Maxwell
Re: Sorted writes in checkpoint at 2007-06-15 09:14:20 from Zeugswetter Andreas ADI SD
Re: Sorted writes in checkpoint at 2007-06-15 09:33:47 from ITAGAKI Takahiro
