Re: V4 of PITR performance improvement for 8.4

From: Koichi Suzuki <koichi(dot)szk(at)gmail(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: V4 of PITR performance improvement for 8.4
Date: 2009-01-23 08:30:28
Message-ID: a778a7260901230030x234e53c8gf875bb654e241e9c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Please find enclosed 2nd patch of pg_readahead which include a patch
to bufer manager to skip prefetch of pages already in shared buffer.

This is complete which works with current head and ready for further
review and to be included in 8.4.

Because sync.rep is still under work, I'm planning to post additional
patch to make pg_readahead work with sync.rep.

It will be additional patch and not necessary for usual operation.

Regards;

2009/1/20 Koichi Suzuki <koichi(dot)szk(at)gmail(dot)com>:
> Thanks a lot for the review.
>
> Enclosed is a revised patch reflecting your comments. Also, some
> reply to the comments are included inline.
>
> 2009/1/14 ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>:
>>
>> "Koichi Suzuki" <koichi(dot)szk(at)gmail(dot)com> wrote:
>>
>>> > This patc also include additional patch for posix_fadvise to skip
>>> > prefetch of pages which does not exist.
>>
>> I reviewed your patch and found items to be fixed and improved.
>>
>> - You should not claim your copyrights.
>> There are your copyright claims in two files newly added, but they
>> should be included by PostgreSQL Global Development Group.
>> | * Portions Copyright (c) 2008, Nippon Telegraph and Telephone Corporation
>> | * Portions Copyright (c) 1996-2004, PostgreSQL Global Development Group
>
> Okay, fixed.
>
>>
>> - Avoid pending wal records on continuous REDO.
>> In archive recovery, ReadRecord() waits until the next segment comes.
>> Should we avoid pending WAL records during the waiting?
>> RedoRecords() might be needed before calling RestoreArchivedFile().
>> It would be better to redo records and restore archived files
>> in parallel, but we will need to replace system() to other system
>> because system() blocks until the process finishes.
>
> Yes, I added this feature. Because system() blocks, it does not run
> in parallel.
>
>>
>> - Should check the buffer pool before calling smgrprefetch.
>> If the page is in the shared buffer pool, we don't need to
>> prefetch the buffer from disks. I think it is good to add
>> PrefetchBufferWithoutRelcache() like ReadBufferWithoutRelcache().
>
> Agreed. This needs an addition to buffer manager API. I'd like to
> submit this change as a separate patch by the end of this week.
>
>>
>> - Is possible to remove calls of ReadAheadHasRoom()?
>> If we checks rooms of readahead-queue, xxx_readahead functions
>> don't need to call ReadAheadHasRoom. The maximum readaheads are
>> at most 3, so we'll flush the queue when the rooms of the queue
>> are less than 3. And if the queue is full unexpectedly, we can
>> ignore the added items because it is only hint.
>
> More than three pages are involved in GiST WAL record so I think the
> above is needed.
>
>>
>> - Should avoid large static variables.
>> | [src/backend/access/transam/xlog]
>> | + static char RecordQueueBuf[XLogSegSize * 2];
>> | [readahead.c]
>> | + static XLogReadAhead ReadAheadQueue[ReadAheadQueueSize];
>> I think it is a bad idea to place a large object as a static variable.
>> It should be a pointer and allocated with palloc() in runtime.
>> Also those variable are only used by startup process.
>
> In startup process, most of memory allocation is done by malloc(), not
> palloc(). Replacing malloc() by palloc() may be done separately.
>
>>
>> - Merge readahead.h to xlog.h or something.
>> It export just a few functions and the functionality belongs
>> xlog and recovery.
>
> Agreed.
>
>>
>> Regards,
>> ---
>> ITAGAKI Takahiro
>> NTT Open Source Software Center
>>
>>
>>
>
>
>
> --
> ------
> Koichi Suzuki
>

--
------
Koichi Suzuki

Attachment Content-Type Size
readahead-20090121.patch text/x-diff 39.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2009-01-23 08:35:21 Controlling hot standby
Previous Message Peter Eisentraut 2009-01-23 08:25:30 Re: autovacuum and reloptions