Re: [WIP PATCH] for Performance Improvement in Buffer Management

From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP PATCH] for Performance Improvement in Buffer Management
Date: 2012-09-06 09:08:15
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C3828530C8F@szxeml509-mbs
Lists: pgsql-hackers


On Tuesday, September 04, 2012 6:55 PM Amit kapila wrote:
On Tuesday, September 04, 2012 12:42 AM Jeff Janes wrote:
On Mon, Sep 3, 2012 at 7:15 AM, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>>> This patch is based on the below TODO item:
>>>
>>> Consider adding buffers the background writer finds reusable to the
>>> free list
>>>
>>> I have tried implementing it and have taken readings for SELECT when
>>> all the data is in either OS buffers or shared buffers.
>>>
>>> The patch has a simple implementation: the bgwriter or checkpoint
>>> process moves unused buffers (unpinned, with zero usage_count) onto
>>> the freelist.

>> I don't think InvalidateBuffer can be safely used in this way.  It says
>> "We assume that no other backend could possibly be interested in using
>> the page", which is not true here.

> As I understand and have analyzed based on the above, the problem in the
> attached patch is that in InvalidateBuffer(), after UnlockBufHdr() and
> before the partition lock is taken, some backend can start using the
> buffer and raise its usage count to 1, yet InvalidateBuffer() will still
> remove the buffer from the hash table and put it on the freelist.
> I have modified the code to address this: the refcount and usage_count
> are now re-checked while holding the partition lock and LockBufHdr, and
> only then is the buffer moved to the freelist, similar to what
> InvalidateBuffer does.
> In the actual code we can optimize this by adding an extra parameter to
> InvalidateBuffer.

> Please let me know whether I have understood you correctly, or whether
> you meant something else by the above comment?
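
To make the re-check concrete, the modified path is shaped roughly like the
sketch below. This is an illustration only, not the actual patch: the helper
name MoveCleanBufferToFreelist is hypothetical, and the body is modeled on
InvalidateBuffer() in bufmgr.c.

/*
 * Sketch only: re-check the buffer under the mapping partition lock and
 * the buffer header spinlock before unhashing it and pushing it onto the
 * freelist, mirroring InvalidateBuffer().
 */
static void
MoveCleanBufferToFreelist(volatile BufferDesc *buf)
{
    BufferTag   tag = buf->tag;     /* copied without lock; re-checked below */
    uint32      hashcode = BufTableHashCode(&tag);
    LWLockId    partitionLock = BufMappingPartitionLock(hashcode);

    LWLockAcquire(partitionLock, LW_EXCLUSIVE);
    LockBufHdr(buf);

    /* Proceed only if no backend has started using the buffer meanwhile. */
    if (buf->refcount == 0 && buf->usage_count == 0 &&
        !(buf->flags & BM_DIRTY) && BUFFERTAGS_EQUAL(buf->tag, tag))
    {
        CLEAR_BUFFERTAG(buf->tag);      /* mark the buffer invalid */
        buf->flags = 0;
        UnlockBufHdr(buf);

        BufTableDelete(&tag, hashcode); /* drop the hash table entry */
        LWLockRelease(partitionLock);

        StrategyFreeBuffer(buf);        /* link it into the freelist */
    }
    else
    {
        /* Some backend grabbed the buffer; leave it alone. */
        UnlockBufHdr(buf);
        LWLockRelease(partitionLock);
    }
}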

The results for the updated code are attached to this mail.
The scenario is the same as in the original mail:
1. Load all the files of all tables and indexes into OS buffers (using pg_prewarm with the 'read' operation).
2. Try to fill shared buffers with "pgbench_accounts" table and "pgbench_accounts_pkey" pages (using pg_prewarm with the 'buffers' operation).
3. Run pgbench with select-only transactions for 20 minutes.

Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GB

Server Configuration:
shared_buffers = 5GB (about 1/4 of RAM)
Total data size = 16GB
Pgbench configuration:
transaction type: SELECT only
scaling factor: 1200
query mode: simple
number of clients: <varying from 8 to 64 >
number of threads: <varying from 8 to 64 >
duration: 1200 s

I shall take further readings for the following configurations and post them:
1. With the configuration below, the defined test case will sometimes incur I/O, so I want to check the impact of that:

Shared_buffers - 7 GB
number of clients: <varying from 8 to 64 >
number of threads: <varying from 8 to 64 >
transaction type: SELECT only

2. With the configuration below, the memory kept for shared buffers is less than recommended, so I want to check the impact of that:
Shared_buffers - 2 GB
number of clients: <varying from 8 to 64 >
number of threads: <varying from 8 to 64 >
transaction type: SELECT only

3. With the configuration below, the test case runs a mix of DML operations, where the DML will cause I/O, so I want to check the impact of that:
Shared_buffers - 5GB
number of clients: <varying from 8 to 64 >
number of threads: <varying from 8 to 64 >
transaction type: tpc_b

> One problem I could see with the proposed change is that in some cases
> the usage count of a buffer just allocated from the freelist will get
> decremented immediately, because that buffer can also be the
> nextVictimBuffer of the clock sweep.
> However, there can be a solution to this problem.
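
For reference, the decrement happens in the clock sweep inside
StrategyGetBuffer(). A simplified sketch of that loop (freelist.c, with the
strategy-ring and trycounter details omitted) shows why a buffer just handed
out from the freelist can lose its freshly set usage_count:

/*
 * Simplified sketch of the clock sweep in StrategyGetBuffer().  A buffer
 * taken from the freelist can still be sitting at the clock hand, so the
 * sweep may decrement its usage_count right after allocation.
 */
for (;;)
{
    volatile BufferDesc *buf =
        &BufferDescriptors[StrategyControl->nextVictimBuffer];

    /* advance the clock hand, wrapping around at NBuffers */
    if (++StrategyControl->nextVictimBuffer >= NBuffers)
        StrategyControl->nextVictimBuffer = 0;

    LockBufHdr(buf);
    if (buf->refcount == 0)
    {
        if (buf->usage_count > 0)
            buf->usage_count--;     /* can hit a just-allocated buffer too */
        else
            return buf;             /* found a victim; still header-locked */
    }
    UnlockBufHdr(buf);
}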

With Regards,
Amit Kapila.

Attachment Content-Type Size
Results_v2_patch.html text/html 28.1 KB
