Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Trond Myklebust <trondmy(at)gmail(dot)com>
To: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-14 00:48:56
Message-ID: 190E6EA3-7B06-4315-9E4C-33FBEC961531@gmail.com
Lists: pgsql-hackers


On Jan 13, 2014, at 19:03, Hannu Krosing <hannu(at)2ndQuadrant(dot)com> wrote:

> On 01/13/2014 09:53 PM, Trond Myklebust wrote:
>> On Jan 13, 2014, at 15:40, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>
>>> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
>>>> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>>>> I notice, Josh, that you didn't mention the problems many people
>>>>> have run into with Transparent Huge Page defrag and with NUMA
>>>>> access.
>>>> Amen to that. Actually, I think NUMA can be (mostly?) fixed by
>>>> setting zone_reclaim_mode; is there some other problem besides that?
>>> I think that fixes some of the worst instances, but I've seen machines
>>> spending horrible amounts of CPU (& BUS) time in page reclaim
>>> nonetheless. If I analyzed it correctly it's in RAM << working set
>>> workloads where RAM is pretty large and most of it is used as page
>>> cache. The kernel ends up spending a huge percentage of time finding and
>>> potentially defragmenting pages when looking for victim buffers.
>>>
>>>> On a related note, there's also the problem of double-buffering. When
>>>> we read a page into shared_buffers, we leave a copy behind in the OS
>>>> buffers, and similarly on write-out. It's very unclear what to do
>>>> about this, since the kernel and PostgreSQL don't have intimate
>>>> knowledge of what each other are doing, but it would be nice to solve
>>>> somehow.
>>> I've wondered before if there wouldn't be a chance for postgres to say
>>> "my dear OS, that the file range 0-8192 of file x contains y, no need to
>>> reread" and do that when we evict a page from s_b but I never dared to
>>> actually propose that to kernel people...
>> O_DIRECT was specifically designed to solve the problem of double
>> buffering between applications and the kernel. Why are you not able to
>> use that in these situations?
> What is asked is the opposite of O_DIRECT - the write from a buffer inside
> postgresql to linux *buffercache* and telling linux that it is the same
> as what is currently on disk, so don't bother to write it back ever.

I don’t understand. Are we talking about mmap()ed files here? Why would the kernel be trying to write back pages that aren’t dirty?
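
For reference, the O_DIRECT approach raised in the quoted text can be sketched roughly as follows. This is a minimal illustration, not PostgreSQL code: the names `BLOCK`, `read_block` and `_pread_aligned` are hypothetical, the 8192-byte block size is taken from the "0-8192" range mentioned earlier in the thread, and Linux is assumed (`os.O_DIRECT` and `os.preadv` are not available everywhere). O_DIRECT generally needs an aligned buffer, offset and length, so the sketch reads into an anonymous `mmap`, which is page-aligned, and falls back to ordinary buffered I/O when the filesystem rejects direct I/O.

```python
import mmap
import os

BLOCK = 8192  # assumed block size; hypothetical constant for this sketch


def _pread_aligned(fd, block_no):
    """Read one block into a page-aligned buffer and return it as bytes."""
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping, page-aligned
    try:
        n = os.preadv(fd, [buf], block_no * BLOCK)
        return buf[:n]  # slicing an mmap copies the data into bytes
    finally:
        buf.close()


def read_block(path, block_no):
    """Read block `block_no`, bypassing the page cache when possible."""
    direct = getattr(os, "O_DIRECT", 0)  # 0 where the flag does not exist
    for flags in (os.O_RDONLY | direct, os.O_RDONLY):
        try:
            fd = os.open(path, flags)
        except OSError:
            continue  # e.g. the filesystem refuses O_DIRECT at open time
        try:
            return _pread_aligned(fd, block_no)
        except OSError:
            continue  # e.g. EINVAL on read; retry with buffered I/O
        finally:
            os.close(fd)
    raise OSError("could not read block %d of %s" % (block_no, path))
```

With a read like this the application's own buffer holds the only cached copy, which removes the double buffering but also gives up the kernel page cache as a second-level cache. That trade-off is the one the quoted reply objects to.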

> This would avoid current double-buffering between postgresql and linux
> buffer caches while still making use of linux cache when possible.
>
> The use case is pages that postgresql has moved into its buffer cache
> but which it has not modified. They will at some point be evicted from the
> postgresql cache, but it is likely that they will still be needed
> sometime soon, so what is required is "writing them back" to the original
> file, only they should not really be written - or marked dirty to be
> written later - more levels than just to the linux cache, as they
> *already* are on the disk.
>
> It is probably ok to put them in the LRU position as they are "written"
> out from postgresql, though it may be better if we get some more control
> over where in the LRU order they would be placed. It may make sense to put
> them there based on when they were last read while residing inside
> postgresql cache.
>
> Cheers
>
>
> --
> Hannu Krosing
> PostgreSQL Consultant
> Performance, Scalability and High Availability
> 2ndQuadrant Nordic OÜ
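
The "clean write" described in the quoted text has no direct counterpart in the existing interfaces discussed here. The nearest standard hint is probably `posix_fadvise()`, sketched below under stated assumptions: the function names and the `BLOCK` constant are hypothetical, and the 8192-byte block size comes from the range mentioned earlier in the thread. Note the difference from what is being asked for: `POSIX_FADV_WILLNEED` only asks the kernel to load the range into the page cache, so if the kernel has already dropped the page this costs a real disk read. It does not hand the kernel a copy of data the application already holds.

```python
import os

BLOCK = 8192  # assumed block size; hypothetical constant for this sketch


def hint_needed_soon(fd, block_no):
    """On evicting a clean page: ask the kernel to keep or reload the block.

    This is only advice. The kernel may ignore it, and it may trigger
    a disk read if the block is no longer in the page cache.
    """
    os.posix_fadvise(fd, block_no * BLOCK, BLOCK, os.POSIX_FADV_WILLNEED)


def hint_not_needed(fd, block_no):
    """After caching a page in the application: let the kernel drop its copy."""
    os.posix_fadvise(fd, block_no * BLOCK, BLOCK, os.POSIX_FADV_DONTNEED)
```

Neither call gives any control over where the page lands in the kernel's LRU order, which is the second half of the request in the quoted text.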
