Re: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Mel Gorman <mgorman(at)suse(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Trond Myklebust <trondmy(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Jonathan Corbet <corbet(at)lwn(dot)net>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, James Bottomley <James(dot)Bottomley(at)HansenPartnership(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, lsf-pc(at)lists(dot)linux-foundation(dot)org, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Date: 2014-01-21 16:36:08
Message-ID: 20140121163608.GB5325@momjian.us
Lists: pgsql-hackers

On Fri, Jan 17, 2014 at 04:31:48PM +0000, Mel Gorman wrote:
> NUMA Optimisations
> ------------------
>
> The primary one that showed up was zone_reclaim_mode. Enabling that parameter
> is a disaster for many workloads and apparently Postgres is one. It might
> be time to revisit leaving that thing disabled by default and explicitly
> requiring that NUMA-aware workloads that are correctly partitioned enable it.
> Otherwise NUMA considerations are not that much of a concern right now.

Here is a blog post about our zone_reclaim_mode-disable recommendations:

http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
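
For anyone who wants to check the setting on their own machine, here is a
minimal sketch in Python. It is not taken from the blog post; the function
name is made up for illustration, and it only assumes the standard Linux
procfs location of the sysctl, /proc/sys/vm/zone_reclaim_mode.

```python
def zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return the current zone_reclaim_mode as an int, or None if unavailable.

    Hypothetical helper for illustration; the default path is the usual
    procfs location of the vm.zone_reclaim_mode sysctl on Linux.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Not Linux, no NUMA support, or unexpected file contents.
        return None


if __name__ == "__main__":
    mode = zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode is not available on this system")
    elif mode != 0:
        print(f"zone_reclaim_mode={mode}; consider vm.zone_reclaim_mode=0")
    else:
        print("zone_reclaim_mode is disabled")
```

A non-zero value means some form of zone reclaim is enabled, which is the
case the recommendation above is about.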

> Direct IO, buffered IO, double buffering and wishlists
> ------------------------------------------------------
> 6. Only writeback pages if explicitly synced. Postgres has strict write
> ordering requirements. In the words of Tom Lane -- "As things currently
> stand, we dirty the page in our internal buffers, and we don't write
> it to the kernel until we've written and fsync'd the WAL data that
> needs to get to disk first". mmap() would avoid double buffering but
> it has no control about the write ordering which is a show-stopper.
> As Andres Freund described;

What was not explicitly stated here is that the Postgres design takes
advantage of the double-buffering "feature": it writes to its own
memory copy of the page while there is still an unmodified copy in the
kernel cache, or on disk. In the case of a crash, we rely on the fact
that the disk page is unchanged. Certainly any design that requires the
kernel to manage two different copies of the same page is going to be
confusing.
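
To make the ordering concrete, here is a toy sketch in Python. It is not
Postgres code: update_page and its arguments are hypothetical, and in
Postgres the data page usually reaches the kernel much later (for example
at checkpoint time) rather than right after the WAL flush. The only point
is the order of the steps.

```python
import os

PAGE_SIZE = 8192  # Postgres' default block size


def update_page(wal_fd, data_fd, page_no, new_page, wal_record):
    """Toy write-ahead ordering (hypothetical helper, not Postgres code).

    wal_fd is assumed to be opened with O_APPEND, data_fd for writing.
    """
    # 1. The page has been modified only in our own buffer (new_page).
    #    The copy in the kernel cache or on disk is still the old version,
    #    which is what crash recovery relies on.
    if len(new_page) != PAGE_SIZE:
        raise ValueError("new_page must be exactly one page")

    # 2. Append the WAL record and force it to stable storage first.
    os.write(wal_fd, wal_record)
    os.fsync(wal_fd)

    # 3. Only now hand the modified page to the kernel. From this point on
    #    the kernel may write it back whenever it likes.
    os.pwrite(data_fd, new_page, page_no * PAGE_SIZE)
```

With mmap() there is no step 3 for the application to delay: the kernel
could write the dirty page back before step 2 has completed.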

One larger question is how many of these things that Postgres needs are
needed by other applications? I doubt Postgres is large enough to
warrant changes on its own.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
