Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: James Bottomley <James(dot)Bottomley(at)hansenpartnership(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Trond Myklebust <trondmy(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-14 17:09:04
Message-ID: 1389719344.92536.YahooMailNeo@web122304.mail.ne1.yahoo.com
Lists: pgsql-hackers

Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> James Bottomley <James(dot)Bottomley(at)hansenpartnership(dot)com> wrote:

>>> I don't understand why this has to be absolute: if you advise
>>> us to hold the pages dirty and we do up until it becomes a
>>> choice to hold on to the pages or to thrash the system into a
>>> livelock, why would you ever choose the latter?

Because the former creates database corruption and the latter does
not.

>>> And if, as I'm assuming, you never would,

That assumption is totally wrong.

>>> why don't you want the kernel to make that choice for you?
>>
>> If you don't understand how write-ahead logging works, this
>> conversation is going nowhere.  Suffice it to say that the word
>> "ahead" is not optional.
>
> In essence, if you do flush when you shouldn't, and there is a
> hardware failure, or kernel panic, or anything that stops the
> rest of the writes from succeeding, your database is kaputt, and
> you've got to restore a backup.
>
> Ie: very very bad.

Yup.  And when that's a few terabytes, you will certainly find
yourself wishing that you had been able to do a recovery up to the
end of the last successfully committed transaction rather than a
restore from backup.

Now, as Tom said, if there were an API to create write boundaries
between particular dirty pages we could leave it to the OS.  Each
WAL record's write would be conditional on the previous one and
each data page write would be conditional on the WAL record for the
last update to the page.  But nobody seems to think that would
yield acceptable performance.
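
To make those two ordering rules concrete, here is a toy sketch of the
dependency check such an API would have to enforce.  Everything in it
(WriteOrderModel, update_page, write_wal, write_page) is a hypothetical
name invented for illustration; it is not a kernel interface or
PostgreSQL code, and it models only the ordering, not the I/O.

```python
from dataclasses import dataclass, field


@dataclass
class WriteOrderModel:
    """Toy model of the write-ordering rules (hypothetical, illustration only)."""
    wal_count: int = 0     # WAL records generated so far, numbered from 1
    durable_wal: int = 0   # records 1..durable_wal are assumed to be on stable storage
    page_lsn: dict = field(default_factory=dict)  # page id -> last WAL record touching it

    def update_page(self, page_id):
        """Modify a page in memory: emit a WAL record and note it as the page's latest."""
        self.wal_count += 1
        self.page_lsn[page_id] = self.wal_count
        return self.wal_count

    def can_write_wal(self, record):
        """Rule 1: a WAL record's write is conditional on the previous one."""
        return record == self.durable_wal + 1 and record <= self.wal_count

    def write_wal(self, record):
        if not self.can_write_wal(record):
            raise RuntimeError(f"WAL record {record} would be written out of order")
        self.durable_wal = record

    def can_write_page(self, page_id):
        """Rule 2: a data page's write is conditional on the WAL record
        for the last update to that page already being durable."""
        return self.page_lsn.get(page_id, 0) <= self.durable_wal

    def write_page(self, page_id):
        if not self.can_write_page(page_id):
            raise RuntimeError(f"page {page_id!r} would reach disk before its WAL record")
        # A real system would write the page image here; the model only checks ordering.


model = WriteOrderModel()
rec_a = model.update_page("A")   # WAL record 1
rec_b = model.update_page("B")   # WAL record 2

before = model.can_write_page("A")   # no WAL is durable yet
model.write_wal(rec_a)
after = model.can_write_page("A")    # record 1 is durable now
page_b = model.can_write_page("B")   # record 2 is still not durable
```

In this model a writer under memory pressure may flush page "A" only
after record 1 is durable, and page "B" must wait for record 2.  Flushing
either page early is exactly the "flush when you shouldn't" case quoted
above.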

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
