| From: | Jan Kara <jack(at)suse(dot)cz> | 
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
| Cc: | Jan Kara <jack(at)suse(dot)cz>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, Bottomley James <James(dot)Bottomley(at)hansenpartnership(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Trond Myklebust <trondmy(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net> | 
| Subject: | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance | 
| Date: | 2014-01-16 01:41:34 | 
| Message-ID: | 20140116014134.GA24867@quack.suse.cz | 
| Lists: | pgsql-hackers | 
On Wed 15-01-14 10:12:38, Robert Haas wrote:
> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara <jack(at)suse(dot)cz> wrote:
> > Filesystems could in theory provide facility like atomic write (at least up
> > to a certain size say in MB range) but it's not so easy and when there are
> > no strong usecases fs people are reluctant to make their code more complex
> > unnecessarily. OTOH without widespread atomic write support I understand
> > application developers have similar stance. So it's kind of chicken and egg
> > problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
> > due to its data=journal mode so if someone on the PostgreSQL side wanted to
> > research on this, knitting some experimental ext4 patches should be doable.
> 
> Atomic 8kB writes would improve performance for us quite a lot.  Full
> page writes to WAL are very expensive.  I don't remember what
> percentage of write-ahead log traffic that accounts for, but it's not
> small.
  OK, and do you need atomic writes on a per-IO basis, or is per-file enough?
It basically boils down to this: is all or most of the IO to a file going to
be atomic, or only a smaller fraction?
As Dave notes, unless there is HW support (which is coming with the newest
solid state drives), ext4/xfs will have to implement this by writing the data
to the filesystem journal and, after transaction commit, checkpointing it to
its final location. That is exactly what you already do with your WAL, so
it's not clear it will be a performance win. But it is easy enough to code
for ext4 that I'm willing to try...
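
To make the journal-then-checkpoint scheme concrete, here is a rough
user-space sketch in Python. It is only an illustration of the general
technique, not ext4/jbd2 code and not how PostgreSQL's WAL is implemented.
The names (atomic_page_write, recover, the one-record journal file and its
header layout) are all hypothetical, and it skips details a real
implementation needs, such as fsyncing the parent directory.

```python
import os
import struct
import zlib

PAGE_SIZE = 8192  # 8 kB page, as in the quoted discussion
# Hypothetical record header: target offset in the data file, CRC32 of the page.
HEADER = struct.Struct("<QI")


def _apply(data_path, offset, page):
    """Write the page at its final location and flush it to stable storage."""
    dfd = os.open(data_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.lseek(dfd, offset, os.SEEK_SET)
        os.write(dfd, page)
        os.fsync(dfd)
    finally:
        os.close(dfd)


def atomic_page_write(data_path, journal_path, offset, page):
    """Write one page so a crash leaves either the old or the new contents."""
    assert len(page) == PAGE_SIZE
    record = HEADER.pack(offset, zlib.crc32(page)) + page

    # Step 1: log the full page to the journal; the fsync is the "commit".
    jfd = os.open(journal_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(jfd, record)
        os.fsync(jfd)
    finally:
        os.close(jfd)

    # Step 2: checkpoint, i.e. copy the page to its final location.
    _apply(data_path, offset, page)

    # Step 3: the journal record is no longer needed.
    os.unlink(journal_path)


def recover(data_path, journal_path):
    """After a crash, replay a complete journal record and drop a torn one."""
    if not os.path.exists(journal_path):
        return False
    with open(journal_path, "rb") as f:
        raw = f.read()
    replayed = False
    if len(raw) == HEADER.size + PAGE_SIZE:
        offset, crc = HEADER.unpack(raw[:HEADER.size])
        page = raw[HEADER.size:]
        if zlib.crc32(page) == crc:
            _apply(data_path, offset, page)
            replayed = True
    os.unlink(journal_path)
    return replayed
```

Note that every page is written and synced twice here (once to the journal,
once in place), which is the same kind of overhead as logging a full page
image to the WAL.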
								Honza
-- 
Jan Kara <jack(at)suse(dot)cz>
SUSE Labs, CR