Re: Page replacement algorithm in buffer cache

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>
Cc: "'Greg Smith'" <greg(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page replacement algorithm in buffer cache
Date: 2013-04-06 03:08:31
Message-ID: 007001ce3274$05774220$1065c660$@kapila@huawei.com
Lists: pgsql-hackers

On Saturday, April 06, 2013 12:38 AM Robert Haas wrote:
> On Fri, Apr 5, 2013 at 1:12 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> > If we just put it to freelist, then next time if it get allocated directly
> > from bufhash table, then who will remove it from freelist
> > or do you think that, in BufferAlloc, if it gets from bufhash table, then it
> > should verify if it's in freelist, then remove from freelist.
>
> No, I don't think that's necessary. We already have the following
> guard in StrategyGetBuffer:
>
> if (buf->refcount == 0 && buf->usage_count == 0)
> {
>     if (strategy != NULL)
>         AddBufferToRing(strategy, buf);
>     return buf;
> }
>
> If a buffer is allocated from the freelist and it turns out that it
> actually has a non-zero reference count or a non-zero usage count, we
> just discard it and pull the next buffer off the freelist instead.
> So, in the scenario you describe, the buffer gets reallocated (due to
> a non-NULL BufferAccessStrategy, presumably) and then somebody comes
> along and pulls it off the freelist. But, since the buffer has just
> been used by someone else, it'll most likely be pinned or have a
> non-zero usage count, so we'll just skip it and allocate some other
> buffer instead. No harm done.

Yes, you are right; I missed that part of the code while thinking through
this scenario. However, I was also talking about the NULL
BufferAccessStrategy case.
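
For illustration, the effect of the guard quoted above can be sketched as a
freelist pop that silently skips stale entries. This is only a simplified
model, not the actual StrategyGetBuffer code: SketchBuf, SketchFreelist and
sketch_get_free_buffer are hypothetical names, there is no locking, and the
FREENEXT_* sentinel values are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel values for the freeNext link (assumed for this sketch). */
#define FREENEXT_END_OF_LIST (-1)
#define FREENEXT_NOT_IN_LIST (-2)
#define SKETCH_NBUFFERS 4

/* Hypothetical, heavily simplified stand-in for a buffer header. */
typedef struct SketchBuf
{
    int refcount;     /* number of pins */
    int usage_count;  /* clock-sweep usage counter */
    int freeNext;     /* next buffer on the freelist, or a sentinel */
} SketchBuf;

typedef struct SketchFreelist
{
    SketchBuf bufs[SKETCH_NBUFFERS];
    int       firstFree;  /* head of the freelist, or FREENEXT_END_OF_LIST */
} SketchFreelist;

/*
 * Pop buffers off the freelist until one is neither pinned nor recently
 * used. Returns the buffer index, or -1 if the freelist is exhausted
 * (a real implementation would then fall back to the clock sweep).
 */
static int
sketch_get_free_buffer(SketchFreelist *fl)
{
    assert(fl != NULL);

    while (fl->firstFree >= 0)
    {
        int        id = fl->firstFree;
        SketchBuf *buf = &fl->bufs[id];

        /* Unconditionally unlink the buffer from the freelist. */
        fl->firstFree = buf->freeNext;
        buf->freeNext = FREENEXT_NOT_IN_LIST;

        /* Same test as the quoted guard: usable only if idle. */
        if (buf->refcount == 0 && buf->usage_count == 0)
            return id;

        /* Otherwise it was reused since being freed: discard and go on. */
    }
    return -1;
}
```

In this model a buffer that was put on the freelist and then re-acquired
through the bufhash table is simply dropped from the list the next time
someone reaches it, so nobody has to remove it explicitly.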

I still have one more doubt. Consider the scenario below, which compares
invalidating buffers while moving them to the freelist against just moving
them to the freelist:

A backend gets a buffer from the freelist for a request for page-9 (the
number 9 is arbitrary, just for explanation), but the buffer is still
associated with another page, page-10. The backend needs to add the buffer
to the bufhash table under the new tag (new page association) and remove the
entry for the oldTag (old page association).

The benefit of just moving the buffer to the freelist is that if a request
for the same page arrives before somebody else reuses the buffer for another
page, it saves a read I/O. On the other hand, in many cases the backend will
need an extra partition lock to remove the oldTag (which can lead to some
bottleneck).

I think saving the read I/O is more beneficial, but I am not sure it is the
best choice, as such cases might be infrequent.
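
The trade-off above can be sketched roughly as follows. This is a toy model
under stated assumptions, not the real bufhash code: SketchTag, the hash
function, the partition count of 16 and all sketch_* helpers are
hypothetical and chosen only to make the lock counting visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed number of bufhash partitions, for illustration only. */
#define SKETCH_NUM_PARTITIONS 16

/* Hypothetical, simplified buffer tag: relation id plus block number. */
typedef struct SketchTag
{
    uint32_t rel;
    uint32_t block;
} SketchTag;

/* Toy hash; the real hash function is different. */
static uint32_t
sketch_tag_hash(SketchTag t)
{
    return (t.rel * 31u) + t.block;
}

static int
sketch_partition(SketchTag t)
{
    return (int) (sketch_tag_hash(t) % SKETCH_NUM_PARTITIONS);
}

/*
 * Number of partition locks a backend needs to reuse a buffer for newTag.
 * If the buffer was invalidated when it went onto the freelist, there is
 * no old hash entry to remove, so only the new partition is locked.
 */
static int
sketch_locks_needed(SketchTag newTag, bool oldTagValid, SketchTag oldTag)
{
    if (!oldTagValid)
        return 1;
    return (sketch_partition(newTag) == sketch_partition(oldTag)) ? 1 : 2;
}

/*
 * The benefit side: a buffer that kept its tag can still satisfy a later
 * request for that same page without a read I/O.
 */
static bool
sketch_hit_without_io(SketchTag wanted, bool tagValid, SketchTag bufTag)
{
    return tagValid && wanted.rel == bufTag.rel && wanted.block == bufTag.block;
}
```

With these toy values, reusing a buffer that still holds page-10 for page-9
touches two partitions, while an invalidated buffer touches one but can
never be found again by a request for page-10.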

> Now, it is possible that the buffer could get added to the freelist,
> then allocated via a BufferAccessStrategy, and then the clock sweep
> could hit it and push the usage count back to 0. But that's no big
> deal either: if we go to put it on the freelist and see (via
> buf->freeNext) that it's already there, we can just leave it where it
> is (or maybe move it to the end). On a related note, we probably need
> a variant of StrategyFreeBuffer which pushes buffers onto the end of
> the freelist rather than the front. It makes sense to stick
> invalidated buffers on the front of the list (which is what
> StrategyFreeBuffer does), but non-invalidated buffers should be placed
> at the end to more closely approximate LRU.

Okay.
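
As a rough sketch of that suggestion, a single routine could check freeNext
first and then link the buffer at either end of the list. This is not the
existing StrategyFreeBuffer: SketchList, sketch_list_init,
sketch_free_buffer and the at_tail flag are hypothetical, locking is
omitted, and the FREENEXT_* sentinel values are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Sentinel values for the freeNext link (assumed for this sketch). */
#define FREENEXT_END_OF_LIST (-1)
#define FREENEXT_NOT_IN_LIST (-2)
#define SKETCH_NBUFFERS 4

/* Hypothetical freelist: one freeNext link per buffer plus head and tail. */
typedef struct SketchList
{
    int freeNext[SKETCH_NBUFFERS];
    int firstFree;
    int lastFree;
} SketchList;

static void
sketch_list_init(SketchList *l)
{
    for (int i = 0; i < SKETCH_NBUFFERS; i++)
        l->freeNext[i] = FREENEXT_NOT_IN_LIST;
    l->firstFree = FREENEXT_END_OF_LIST;
    l->lastFree = FREENEXT_END_OF_LIST;
}

/*
 * Put buffer id on the freelist. Invalidated buffers would go to the head
 * (at_tail = false); still-valid buffers would go to the tail
 * (at_tail = true) to approximate LRU. Returns false and changes nothing
 * if the buffer is already on the list.
 */
static bool
sketch_free_buffer(SketchList *l, int id, bool at_tail)
{
    assert(id >= 0 && id < SKETCH_NBUFFERS);

    if (l->freeNext[id] != FREENEXT_NOT_IN_LIST)
        return false;               /* already listed: leave it in place */

    if (l->firstFree < 0)
    {
        /* Empty list: the buffer becomes both head and tail. */
        l->freeNext[id] = FREENEXT_END_OF_LIST;
        l->firstFree = id;
        l->lastFree = id;
    }
    else if (at_tail)
    {
        l->freeNext[id] = FREENEXT_END_OF_LIST;
        l->freeNext[l->lastFree] = id;
        l->lastFree = id;
    }
    else
    {
        l->freeNext[id] = l->firstFree;
        l->firstFree = id;
    }
    return true;
}
```

The sketch leaves an already-listed buffer where it is; moving it to the end
instead would need a doubly linked list or a scan to unlink it first.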

Last time, the following tests were executed to validate the results:

Test suite - pgbench
DB Size - 16 GB
RAM - 24 GB
Shared Buffers - 2G, 5G, 7G, 10G
Concurrency - 8, 16, 32, 64 clients
Pre-warm the buffers before start of test

Shall we try any other scenarios, or are the above okay for initial testing
of the patch?

With Regards,
Amit Kapila.
