Re: We really ought to do something about O_DIRECT and data=journalled on ext4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We really ought to do something about O_DIRECT and data=journalled on ext4
Date: 2010-12-01 18:09:05
Message-ID: 19015.1291226945@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> It's a bug and it's our bug.

No, it's a filesystem bug that this particular filesystem doesn't
support a perfectly reasonable combination of options, and doesn't
even fail gracefully as it could easily do. But assigning blame
doesn't help much.

> Back when we added O_DIRECT, we assumed
> that support for O_DIRECT/opensync could be determined on an OS/kernel
> basis, because that was the information we had. Now it turns out that
> support can vary *by filesystem* and *between remounts*. We didn't have
> any way of knowing different back in 2004, but that doesn't mean we
> don't need to fix our mistaken assumption now.

> Ideally, we would change our code to test support for O_DIRECT on
> startup, rather than at compile time, and backport *that*.

I'm not convinced that a startup-time test would be enough either,
since as you note a remount might be enough to change the situation.

I think the best answer is to get out of the business of using
O_DIRECT by default, especially seeing that available evidence
suggests it might not be a performance win anyway.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2010-12-01 18:19:40 Re: We really ought to do something about O_DIRECT and data=journalled on ext4
Previous Message Dmitriy Igrishin 2010-12-01 18:08:49 Re: DELETE with LIMIT (or my first hack)