Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Jan Kara <jack(at)suse(dot)cz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jan Kara <jack(at)suse(dot)cz>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, Bottomley James <James(dot)Bottomley(at)hansenpartnership(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Trond Myklebust <trondmy(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-16 02:58:46
Message-ID: 20140116025846.GB25833@quack.suse.cz
Lists: pgsql-hackers

On Wed 15-01-14 21:37:16, Robert Haas wrote:
> On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara <jack(at)suse(dot)cz> wrote:
> > On Wed 15-01-14 10:12:38, Robert Haas wrote:
> >> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara <jack(at)suse(dot)cz> wrote:
> >> > Filesystems could in theory provide facility like atomic write (at least up
> >> > to a certain size say in MB range) but it's not so easy and when there are
> >> > no strong usecases fs people are reluctant to make their code more complex
> >> > unnecessarily. OTOH without widespread atomic write support I understand
> >> > application developers have similar stance. So it's kind of chicken and egg
> >> > problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
> >> > due to its data=journal mode so if someone on the PostgreSQL side wanted to
> >> > research on this, knitting some experimental ext4 patches should be doable.
> >>
> >> Atomic 8kB writes would improve performance for us quite a lot. Full
> >> page writes to WAL are very expensive. I don't remember what
> >> percentage of write-ahead log traffic that accounts for, but it's not
> >> small.
> > OK, and do you need atomic writes on per-IO basis or per-file is enough?
> > It basically boils down to - is all or most of IO to a file going to be
> > atomic or it's a smaller fraction?
>
> The write-ahead log wouldn't need it, but data files writes would. So
> we'd need it a lot, but not for absolutely everything.
>
> For any given file, we'd either care about writes being atomic, or we
> wouldn't.
OK, if for any given file either all writes should be atomic or none of
them should be, can you try the following:

  chattr +j <file>

This turns on data journalling for <file> on an ext3/ext4 filesystem.
Currently it *won't* guarantee atomicity in all cases, but the
performance should be very similar to what it would be if it did. You
might also want to increase the filesystem journal size with
'tune2fs -J size=XXX /dev/yyy', where XXX is the desired journal size in
MB. The default is 128 MB, I think, but with intensive data journalling
you might want it in the GB range. I'd be interested in hearing what
impact turning on 'atomic write' support in PostgreSQL and using data
journalling on ext4 has.
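
For anyone who wants to script the experiment, here is a minimal sketch of
the two commands above wrapped in shell functions. The function names, the
RUN variable and the dry-run behaviour are my own illustration and are not
part of any existing tool. By default the functions only print what they
would run; setting RUN=1 executes the commands, which needs root and an
ext3/ext4 filesystem.

```shell
#!/bin/sh
# Hypothetical helper sketch: dry-run by default, RUN=1 to execute.
RUN="${RUN:-0}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$RUN" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Turn on data journalling (the 'j' attribute) for each file given.
journal_files() {
    for f in "$@"; do
        run chattr +j "$f"
    done
}

# Ask for a journal of $2 megabytes on device $1.
# Assumption: on a filesystem that already has a journal, the old
# journal probably has to be removed first, with the filesystem unmounted.
resize_journal() {
    run tune2fs -J size="$2" "$1"
}
```

For example, `journal_files /path/to/relation_file` prints
`chattr +j /path/to/relation_file` (the path is a placeholder), and
`lsattr` on the file can be used afterwards to check that the 'j' flag is
set. On the PostgreSQL side, 'atomic write' support presumably corresponds
to running with full_page_writes turned off; since the journalling above
does not yet guarantee atomicity, that is only reasonable on a throwaway
benchmark cluster.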

Honza
--
Jan Kara <jack(at)suse(dot)cz>
SUSE Labs, CR
