Re: patch to allow disable of WAL recycling

From: Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch to allow disable of WAL recycling
Date: 2018-07-06 12:44:18
Message-ID: CACPQ5Fo29F0VG7GZURW+2wEpRj5cOvh7nxTCcybwDoB0W41Aqw@mail.gmail.com
Lists: pgsql-hackers

Thomas,

We're using a ZFS recordsize of 8k to match the PG block size of 8k, so what
you're describing is not the issue here.

Thanks,
Jerry

On Thu, Jul 5, 2018 at 3:44 PM, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Fri, Jul 6, 2018 at 3:37 AM, Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>
> wrote:
> >> If the problem is specifically the file system caching behavior, then we
> >> could also consider using the dreaded posix_fadvise().
> >
> > I'm not sure that solves the problem for non-cached files, which is where
> > we've observed the performance impact of recycling, where what should be a
> > write intensive workload turns into a read-modify-write workload because
> > we're now reading an old WAL file that is many hours, or even days, old and
> > has thus fallen out of the memory-cached data for the filesystem. The disk
> > reads still have to happen.
>
> What ZFS record size are you using? PostgreSQL's XLOG_BLCKSZ is usually
> 8192 bytes. When XLogWrite() calls write(some multiple of XLOG_BLCKSZ), on
> a traditional filesystem the kernel will say 'oh, that's overwriting whole
> pages exactly, so I have no need to read it from disk' (for example in
> FreeBSD ffs_vnops.c ffs_write() see the comment "We must peform a
> read-before-write if the transfer size does not cover the entire buffer").
> I assume ZFS has a similar optimisation, but it uses much larger records
> than the traditional 4096 byte pages, defaulting to 128KB. Is that the
> reason for this?
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>
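
For readers following the record-size question quoted above, here is a rough
sketch of the arithmetic involved. It is a simplified model and not how ZFS or
any kernel actually decides; the function name partial_records and its
parameters are made up for illustration. It lists which fixed-size records a
write touches only partially, which are the ones that would need a
read-before-write if they are not already cached.

```python
XLOG_BLCKSZ = 8192  # usual PostgreSQL WAL block size, as mentioned in the thread


def partial_records(offset, length, recordsize):
    """Return the indices of records that a write covers only partially.

    Simplified model (an assumption, not ZFS internals): the file is split
    into fixed-size records, and a record that is only partly overwritten
    must be read first unless it is already cached.
    """
    if length <= 0:
        return []
    end = offset + length                 # one past the last byte written
    first = offset // recordsize          # first record touched
    last = (end - 1) // recordsize        # last record touched
    partial = []
    for rec in range(first, last + 1):
        rec_start = rec * recordsize
        rec_end = rec_start + recordsize
        # The write fully covers this record only if it spans both edges.
        if offset > rec_start or end < rec_end:
            partial.append(rec)
    return partial


# An aligned XLOG_BLCKSZ write on 8k records covers each record entirely.
print(partial_records(0, XLOG_BLCKSZ, 8192))
# The same write on 128KB records covers only part of record 0.
print(partial_records(0, XLOG_BLCKSZ, 128 * 1024))
```

Under this model, aligned 8k writes on an 8k recordsize leave no partially
covered records, which matches the point made in the reply at the top: with
matching sizes, the record-size mismatch does not explain the reads.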
