Re: Streaming base backups

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming base backups
Date: 2011-01-10 20:48:30
Message-ID: 4D2B709E.4000507@kaltenbrunner.cc
Lists: pgsql-hackers

On 01/10/2011 08:13 PM, Cédric Villemain wrote:
> 2011/1/10 Magnus Hagander<magnus(at)hagander(dot)net>:
>> On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>> 2011/1/7 Magnus Hagander<magnus(at)hagander(dot)net>:
>>>> On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
>>>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>>>> 2011/1/5 Magnus Hagander<magnus(at)hagander(dot)net>:
>>>>>> On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine<dimitri(at)2ndquadrant(dot)fr> wrote:
>>>>>>> Magnus Hagander<magnus(at)hagander(dot)net> writes:
>>>>>>>> * Stefan mentioned it might be useful to put some
>>>>>>>> posix_fadvise(POSIX_FADV_DONTNEED)
>>>>>>>> in the process that streams all the files out. Seems useful, as long as that
>>>>>>>> doesn't kick them out of the cache *completely*, for other backends as well.
>>>>>>>> Do we know if that is the case?
>>>>>>>
>>>>>>> Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
>>>>>>> not already in SHM?
>>>>>>
>>>>>> I think that's way more complex than we want to go here.
>>>>>>
>>>>>
>>>>> DONTNEED will remove the block from the OS buffer cache every time.
>>>>
>>>> Then we definitely don't want to use it - because some other backend
>>>> might well want the file. Better leave it up to the standard logic in
>>>> the kernel.
>>>
>>> Looking at the patch, it is (very) easy to add the support for that in
>>> basebackup.c
>>> That supposes allowing mincore(), and so mmap(), and so probably
>>> switching the fopen() to an open() (or adding an open() just for the
>>> mmap requirement...)
>>>
>>> Let's go?
>>
>> Per above, I still don't think we *should* do this. We don't want to
>> kick things out of the cache underneath other backends, and since we
>
> we are dropping stuff underneath other backends anyway but I
> understand your point.
>
>> can't control that. Either way, it shouldn't happen in the beginning,
>> and if it does, should be backed with proper benchmarks.
>
> I agree.

Well, I want to point out that the link I provided upthread actually
provides a (Linux-centric) way to get the property of interest here:

* if the data blocks are already in the OS buffer cache, just leave them
alone; if they are NOT, tell the OS that "this current user" is not
interested in having them there
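
To make that concrete, here is a rough sketch of how it could look in C on
Linux. It is not code from the patch or from the link mentioned upthread,
only an illustration under assumptions: the function name
read_file_cache_neutral is hypothetical, the file is simply read and
discarded where a real implementation would stream it to the client, and
mincore() is used as the Linux-specific way to ask which pages are resident
before we touch them.

```c
#define _GNU_SOURCE             /* exposes mincore() and posix_fadvise() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): read the whole file, then
 * tell the kernel we no longer need the pages that were NOT in the OS
 * cache before we started. Pages that were already cached are left alone.
 * Returns the number of bytes read, or -1 on error.
 */
static long
read_file_cache_neutral(const char *path)
{
    struct stat     st;
    unsigned char  *was_resident = NULL;
    void           *map;
    char            buf[65536];
    long            total = -1;
    long            pagesize = sysconf(_SC_PAGESIZE);
    size_t          len = 0, npages = 0, i;
    ssize_t         n;
    int             fd;

    assert(pagesize > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0)
        goto done;
    if (st.st_size == 0)
    {
        total = 0;              /* nothing to read, nothing to advise */
        goto done;
    }

    len = (size_t) st.st_size;
    npages = (len + (size_t) pagesize - 1) / (size_t) pagesize;
    was_resident = malloc(npages);
    if (was_resident == NULL)
        goto done;

    /* Map the file only to query residency; the mapping is never read. */
    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto done;
    if (mincore(map, len, was_resident) != 0)
    {
        munmap(map, len);
        goto done;
    }
    munmap(map, len);

    /* Stand-in for streaming the file out. */
    total = 0;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    if (n < 0)
    {
        total = -1;
        goto done;
    }

    /* Bit 0 of each mincore() entry says whether the page was resident. */
    for (i = 0; i < npages; i++)
    {
        if ((was_resident[i] & 1) == 0)
            (void) posix_fadvise(fd, (off_t) i * pagesize, (off_t) pagesize,
                                 POSIX_FADV_DONTNEED);
    }

done:
    free(was_resident);
    close(fd);
    return total;
}
```

The residency snapshot is only a hint: another backend may fault a page in
between the mincore() call and the posix_fadvise() call, so this narrows
the window rather than closing it. Issuing one advise call per page also
keeps the sketch short; coalescing adjacent pages into ranges would be the
obvious refinement.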

I would like to see something like that implemented in the backend
sometime, maybe even as a GUC of some sort. That way we could also use
it for, say, a pg_dump run. I have seen the response times of big boxes
tank not because of the CPU and lock load pg_dump imposes, but because
it can pollute the OS buffer cache with not-really-important data.

Anyway, I agree that the (positive and/or negative) effect of something
like that needs to be measured, but this effect is not easy to see in
very simple setups...

Stefan
