Re: Why we are going to have to go DirectIO

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why we are going to have to go DirectIO
Date: 2013-12-04 08:47:24
Message-ID: 529EEC1C.2040207@vmware.com
Lists: pgsql-hackers

On 12/04/2013 01:08 AM, Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> Would certainly be nice. Realistically, getting good automated
>>> performance tests will require paying someone like Greg S., Mark or me
>>> for 6 solid months to develop them, since worthwhile open source
>>> performance test platforms currently don't exist. That money has never
>>> been available; maybe I should do a kickstarter.
>
>> So in order to get *testing* we need to pay somebody. But to build a great
>> database server, we can rely on volunteer efforts or sponsorship from
>> companies who are interested in moving the project forward?
>
> And even more to the point, volunteers to reinvent the kernel I/O stack
> can be found on every street corner?

Actually, yes, I think so. That's a lot more exciting to work on than a
regression test suite.

> And those volunteers won't need any
> test scaffolding to be sure that *their* version never has performance
> regressions? (Well, no, they won't, because no such thing will ever be
> built. But we do need better test scaffolding for real problems.)

Maybe we should lie, and *say* that we want direct I/O, but require that
all submissions come with a test suite to prove that it's a gain. Then
someone might actually write one, as a sidekick of a direct I/O patch.
Then we could toss out the direct I/O stuff and take only the test
framework.

FWIW, I also think that it'd be a folly to reimplement the I/O stack.
The kernel does a lot of things for us. It might not do a great job, but
it's good enough. As one datapoint, before my time, the VMware vPostgres
team actually did use direct I/O in vPostgres. We shipped that in a few
releases. It was a lot of effort to get the code right, and for DBAs, it
made correct tuning of shared_buffers a lot more important - set it too
low and you won't take full advantage of your RAM, set it too high and
you won't have memory available for other things. To be a good VM
citizen, they also had to implement a memory ballooning module inside
Postgres, to release shared buffers if the system hosting the VM is
under memory pressure. What did we gain by doing all that, compared to
just letting the kernel handle it? Some extra performance in some use
cases, and a loss in others. Not worth the trouble.

- Heikki
