Re: 2nd Level Buffer Cache

From: rsmogura <rsmogura(at)softperience(dot)eu>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: <josh(at)agliodbs(dot)com>, <gsstark(at)mit(dot)edu>, <jim(at)nasby(dot)net>, <robertmhaas(at)gmail(dot)com>, <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: 2nd Level Buffer Cache
Date: 2011-03-31 13:53:01
Message-ID: 420c1ace890f429adcdb3df12bd35257@mail.softperience.eu
Lists: pgsql-hackers

On Sat, 26 Mar 2011 08:33:42 -0400, Merlin Moncure wrote:
> On Fri, Mar 25, 2011 at 11:02 PM, Radosław Smogura
> <rsmogura(at)softperience(dot)eu> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> Thursday 24 March 2011 15:50:36
>>> On Thu, Mar 24, 2011 at 1:25 AM, Radosław Smogura
>>>
>>> <rsmogura(at)softperience(dot)eu> wrote:
>>> > Merlin Moncure <mmoncure(at)gmail(dot)com> Wednesday 23 March 2011 21:26:16
>>> >
>>> >> On Wed, Mar 23, 2011 at 3:23 PM, Radosław Smogura
>>> >>
>>> >> <rsmogura(at)softperience(dot)eu> wrote:
>>> >> > Simply allocating the whole file and doing pointer addition (as I
>>> >> > found on some forum, too),
>>> >>
>>> >> got a link for that?
>>> >>
>>> >> > is a performance killer. The query executes 2.5x slower. Adding
>>> >> > mlock is the next performance killer, hehe.
>>> >>
>>> >> there is no reason to mlock in postgres.
>>> >>
>>> >> > I saw that the mmapped code is really sensitive. Commenting or
>>> >> > uncommenting a statement that contributes nothing to the code flow
>>> >> > may kill performance; maybe the kernel swaps out pages.
>>> >>
>>> >> hm. you are sure mmap is slower??
>>> >>
>>> >> merlin
>>> >
>>> > I found da light,
>>> >
>>> > Hehe. When I switched to mmap "whole file", I compiled pg with debug,
>>> > no optimization and casserts!!!
>>> >
>>> > mmap is really faster: a query that took 450ms went down to 410ms,
>>> > and when I bootstrap mmapping the query takes 430ms (situation: one
>>> > query, one backend).
>>>
>>> This is really good news. I ran several tests and mmap is
>>> outperforming read() by a factor of 2x in some cases and
>>> underperforming in others.  I'm still not sure it will work out to
>>> win in the end.
>>>
>>> I did some more looking in terms of how deeply you can replace the
>>> shared buffer implementation.  Hooking into the current bufmgr is the
>>> simpler approach.  Critical logic is fired (XLogFlush) when buffers
>>> leave the shared buffer system.  Insertion into WAL (XLogInsert),
>>> however, is managed outside of bufmgr.
>>>
>>> shared buffers play two critical roles: they do buffer pages on top
>>> of the file cache but also stage dirty data so you are not constantly
>>> flushing xlog.  my idea yesterday would not perform xlog caching.
>>> however, the server would probably perform better with buffers
>>> reserved strictly for write caching.
>>>
>>> Just thinking out loud. I'm learning as I go.
>>>
>>> merlin
>> I think read is done (no locals, no API cleanup); the "final" solution
>> was not to mmap only the whole file, but to mmap it with the maximum
>> size, the segment size :)
>>
>> In addition, there is a crash report API with a simple generator. If
>> you compile with -ggdb -g3, nice results may be printed in case of a
>> crash.
>>
>> Added simple tests and switch for mmap in configure.
>>
>> I don't know if I will have time to look at mmap this weekend.
>
> thanks -- I'll take a look.
>
> merlin
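
For readers following the quoted discussion, the idea of mapping a file at the full segment size and comparing it with read() can be sketched roughly as below. This is only an illustration in Python, not code from the patch: BLOCK_SIZE matches PostgreSQL's default 8 kB page, SEGMENT_SIZE is a tiny stand-in for the default 1 GB segment, and every function name is hypothetical. Extending a short file with ftruncate before mapping it is an assumption made here so that no access lands past end of file; the patch may handle this differently.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 8192                 # PostgreSQL's default page size
SEGMENT_SIZE = 16 * BLOCK_SIZE    # toy stand-in for the 1 GB default segment


def map_segment(fd):
    """Map a full segment; a shorter file is extended first (assumption)."""
    if os.fstat(fd).st_size < SEGMENT_SIZE:
        os.ftruncate(fd, SEGMENT_SIZE)
    return mmap.mmap(fd, SEGMENT_SIZE, access=mmap.ACCESS_WRITE)


def block_via_mmap(seg, blkno):
    """Fetch a block by offset arithmetic on the mapping."""
    start = blkno * BLOCK_SIZE
    return seg[start:start + BLOCK_SIZE]


def block_via_read(fd, blkno):
    """Fetch the same block with the classic seek + read path."""
    os.lseek(fd, blkno * BLOCK_SIZE, os.SEEK_SET)
    return os.read(fd, BLOCK_SIZE)


def demo():
    with tempfile.TemporaryFile() as f:
        # Three pages, each filled with its own block number.
        for blkno in range(3):
            f.write(bytes([blkno]) * BLOCK_SIZE)
        f.flush()
        fd = f.fileno()
        seg = map_segment(fd)
        try:
            same = block_via_mmap(seg, 2) == block_via_read(fd, 2)
            # Write block 5 through the mapping, then read it back via read().
            start = 5 * BLOCK_SIZE
            seg[start:start + BLOCK_SIZE] = b"\x07" * BLOCK_SIZE
            seg.flush()
            written = block_via_read(fd, 5)
        finally:
            seg.close()
        return {"same": same, "written": written,
                "size": os.fstat(fd).st_size}
```

Calling demo() returns a small dict so the two access paths can be compared. Mapping the whole segment once means a block beyond the current end of file can be written without remapping, at the cost of the file being grown (sparsely, on most filesystems) up front.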

I think I have done this, at least at a simple level, without any big
optimization. At least it works for one client. You still have to initdb
from the original sources: there is a bug somewhere in initdb (a read or
write past the mmapped segment). Autovacuum etc. is a killer. I think I
preserve WAL-before-data, and I hope I use shared buffers. It's still
really buggy.
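
To make "WAL before data" concrete, here is a toy sketch of the ordering rule: a dirty page may only be written out once the WAL has been flushed at least up to that page's LSN. It is plain Python with hypothetical names (ToyWAL, ToyBufferPool), not PostgreSQL code and not what the attached patch does. With a shared writable mapping the kernel can write a dirty page back on its own schedule, so a real mmap-based design has to enforce this rule by some other means that this sketch does not cover.

```python
class ToyWAL:
    """Hypothetical in-memory log; the LSN is the 1-based record position."""

    def __init__(self):
        self.records = []
        self.flushed_lsn = 0   # records with lsn <= this count as durable

    def insert(self, payload):
        self.records.append(payload)
        return len(self.records)

    def flush(self, upto_lsn):
        if upto_lsn > self.flushed_lsn:
            self.flushed_lsn = min(upto_lsn, len(self.records))


class ToyPage:
    def __init__(self):
        self.data = b""
        self.lsn = 0           # LSN of the last WAL record touching the page
        self.dirty = False


class ToyBufferPool:
    """Hypothetical buffer pool that enforces WAL-before-data on write-out."""

    def __init__(self, wal):
        self.wal = wal
        self.pages = {}
        self.disk = {}         # stand-in for the data file
        self.log = []          # event order, for inspection

    def modify(self, blkno, data):
        page = self.pages.setdefault(blkno, ToyPage())
        page.lsn = self.wal.insert((blkno, data))   # log the change first
        page.data = data
        page.dirty = True

    def write_out(self, blkno):
        page = self.pages[blkno]
        if not page.dirty:
            return
        # The rule: the WAL must be durable up to the page's LSN first.
        if self.wal.flushed_lsn < page.lsn:
            self.wal.flush(page.lsn)
            self.log.append(("wal_flush", page.lsn))
        self.disk[blkno] = page.data
        page.dirty = False
        self.log.append(("page_write", blkno))
```

Modifying blocks 0 and 1 and then writing out block 1 first forces one WAL flush up to LSN 2; writing out block 0 afterwards needs no further flush because its record is already durable.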

Sometimes the db crashes from other processes, and I didn't attach crash
reports for those.

Regards,
Radek

Attachment Content-Type Size
pg_mmap_20110331_writing.diff.bz2 application/x-bzip2 132.2 KB
