Re: Priority table or Cache table

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Priority table or Cache table
Date: 2014-03-17 04:16:43
Message-ID: CAJrrPGfHA_XzcaH4vTJKd0yQMws5JgirH9jicU2GYxFJNf0Qfg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 21, 2014 at 12:02 PM, Haribabu Kommi
<kommi(dot)haribabu(at)gmail(dot)com> wrote:
> On Thu, Feb 20, 2014 at 10:06 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>
>> On Thu, Feb 20, 2014 at 10:23 AM, Haribabu Kommi
>> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>>>
>>> On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>>> wrote:
>>>>
>>>> On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
>>>> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>>>> > On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> >> > I want to propose a new feature called "priority table" or "cache
>>>> >> > table".
>>>> >> > This is same as regular table except the pages of these tables are
>>>> >> > having
>>>> >> > high priority than normal tables. These tables are very useful,
>>>> >> > where a
>>>> >> > faster query processing on some particular tables is expected.
>>>> >>
>>>> >> Why exactly does the existing LRU behavior of shared buffers not do
>>>> >> what you need?
>>>> >
>>>> >
>>>> > Lets assume a database having 3 tables, which are accessed regularly.
>>>> > The
>>>> > user is expecting a faster query results on one table.
>>>> > Because of LRU behavior which is not happening some times.

I implemented a proof-of-concept patch to see whether splitting the
buffer pool improves performance.

Summary of the changes:
1. The priority buffers are allocated contiguously with the shared buffers.
2. Added a new reloption called "buffer_pool" to specify which buffer
pool the user wants the table to use.
3. Two free lists are created, one for each of the two buffer pools.
4. When a buffer is allocated, it is taken from the pool that corresponds
to the table type.
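
To make the four points above concrete, here is a toy model in Python. It is only a sketch of the idea, not the C code in the patch: the class name, the pool names "default" and "priority", and the error raised when a pool is empty are all assumptions made for illustration, and eviction is left out entirely.

```python
from collections import deque

# Hypothetical pool names; the actual reloption values are not given in this mail.
DEFAULT_POOL = "default"
PRIORITY_POOL = "priority"


class SplitBufferPool:
    """Toy model: one contiguous range of buffer ids, two free lists."""

    def __init__(self, n_shared, n_priority):
        self.n_shared = n_shared
        # Point 1: priority buffers follow the shared buffers directly,
        # so ids 0..n_shared-1 are default and the rest are priority.
        # Point 3: each pool has its own free list.
        self.free = {
            DEFAULT_POOL: deque(range(n_shared)),
            PRIORITY_POOL: deque(range(n_shared, n_shared + n_priority)),
        }

    def allocate(self, reloptions):
        # Point 2: the table's "buffer_pool" reloption selects the pool;
        # tables without it fall back to the default pool.
        pool = reloptions.get("buffer_pool", DEFAULT_POOL)
        free_list = self.free[pool]
        if not free_list:
            # A real buffer manager would pick a victim buffer here.
            raise RuntimeError("no free buffer in pool %r" % pool)
        # Point 4: the buffer always comes from the matching pool.
        return free_list.popleft()

    def release(self, buf_id):
        # The id range tells us which free list the buffer belongs to.
        pool = DEFAULT_POOL if buf_id < self.n_shared else PRIORITY_POOL
        self.free[pool].append(buf_id)
```

With this model, a table created with the reloption set to the priority pool never competes for buffers with tables in the default pool, which is the effect the split is meant to achieve.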

The performance test was carried out as follows:
1. Create all the pgbench tables and indexes in the new buffer pool.
2. Initialize the pgbench test with a scale factor of 75, which
corresponds to a size of 1GB.
3. Create another load test table, 1GB in size, in the default
buffer pool.
4. In parallel with the performance test, run select and update
operations on the load test table (single thread).

Configuration changes:
Head: shared_buffers - 1536MB.
Patched: shared_buffers - 512MB, Priority_buffers - 1024MB
(the same 1536MB in total).
synchronous_commit - off, wal_buffers - 16MB, checkpoint_segments - 255,
checkpoint_timeout - 15min.

Threads   Head   Patched   Diff
1         25     25        0%
2         35     59        68%
4         52     79        51%
8         79     150       89%
In my testing the patch shows a very good improvement with more than one
thread: 51% to 89% over head at 2 to 8 threads, and no change at 1 thread.

The POC patch and the test script used for the performance testing are
attached to this mail.
The modified pgbench.c, which uses the newly created buffer pool instead
of the default one for test purposes, is also attached.
Copy the test script to the installation folder and execute it as
./rub_bg.sh ./run_reading.sh 1 1

Please let me know your suggestions.

Regards,
Hari Babu
Fujitsu Australia

Attachment Content-Type Size
test_script.zip application/zip 25.1 KB
cache_table_poc.patch application/octet-stream 69.0 KB
