Re: Linux kernel impact on PostgreSQL performance

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>
Subject: Re: Linux kernel impact on PostgreSQL performance
Date: 2014-01-15 06:00:37
Message-ID: CAGTBQpZwtN=7UgQcTZ-87-36oFTjKaWYHis2M4fLqzhPKkEPYw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 1:07 AM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>>
>>> Though, it also occurs to me... perhaps it would be better for us to
>>> simply
>>> map temp objects to memory and let the kernel swap them out if needed...
>>
>>
>>
>> Oum... bad idea.
>>
>> Swap logic has very poor taste for I/O patterns.
>
>
> Well, to be honest, so do we. Practically zero in fact...

I've used mmap'd files for years. They're great for sharing mutable
memory across unrelated (as in out-of-hierarchy) processes.
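
A minimal sketch of that kind of sharing, in Python for brevity. The
helper name open_shared and the file name shared.bin are made up for
illustration, and both mappings live in one process here only to keep
the example self-contained; two unrelated processes mapping the same
path would get the same behavior.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def open_shared(path, size):
    """Map `path` read-write as a shared mapping (hypothetical helper).

    Writes through the returned mapping go to the file's pages, so any
    other mapping of the same file sees them.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Grow the backing file if needed; mapping past EOF is an error.
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)
    finally:
        # mmap keeps its own duplicate of the descriptor.
        os.close(fd)


path = os.path.join(tempfile.mkdtemp(), "shared.bin")
writer = open_shared(path, PAGE)   # stands in for process A
reader = open_shared(path, PAGE)   # stands in for process B

writer[0:5] = b"hello"
seen = bytes(reader[0:5])
```

Neither side issues a read or write call for the data itself, which is
the convenience, and also why the kernel has to guess the access
pattern once the pages no longer fit in memory.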

And my experience is that when swapping to and from disk is expected to
be a significant percentage of the workload, explicit I/O of even the
dumbest kind far outperforms swap-based I/O.

I've read the kernel code and I'm not 100% sure why that is, but I
have a suspicion.

My completely unproven theory is that swapping is overwhelmed by
near-misses, i.e. a process touches a page and, before it's actually
swapped in, another process touches it too, blocking on the first
process's read. But the second process's access to that page isn't
accounted for when evaluating predictive models (i.e. read-ahead), so
the next I/O by process 2 is unexpected to the kernel. Then the same
happens with process 1, and so on. In essence, swap, by a fluke of its
implementation, fails utterly to predict the I/O pattern, and results
in far-from-optimal reads.

Explicit I/O is free from that effect: every read call is accounted
for, and that makes a difference.
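
For contrast, a sketch of the "dumbest kind" of explicit I/O, again in
Python. The block size CHUNK, the helper read_blocks and the file name
temp.bin are assumptions for illustration, not anything from
PostgreSQL. It only shows the access style, where each block fetch is a
visible read call on a file descriptor; it makes no attempt to measure
the performance difference.

```python
import os
import tempfile

CHUNK = 8192  # hypothetical block size


def read_blocks(path, block_numbers, chunk=CHUNK):
    """Fetch the requested blocks with explicit seek + read calls."""
    blocks = []
    # buffering=0 so each f.read() maps to a read on the descriptor
    # instead of being served from a Python-level buffer.
    with open(path, "rb", buffering=0) as f:
        for n in block_numbers:
            f.seek(n * chunk)
            blocks.append(f.read(chunk))
    return blocks


# Build a small file of 4 blocks; block i is filled with byte value i.
path = os.path.join(tempfile.mkdtemp(), "temp.bin")
with open(path, "wb") as f:
    for i in range(4):
        f.write(bytes([i]) * CHUNK)

result = read_blocks(path, [2, 0])
```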

Maybe, if the kernel could be fixed in that respect, you could
consider mmap'd files as a suitable form of temporary storage. But
that would depend on the success and availability of such a fix/patch.
