Re: GSoC 2011: Fast GiST index build

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC 2011: Fast GiST index build
Date: 2011-04-27 07:27:31
Message-ID: 4DB7C563.9060605@enterprisedb.com
Lists: pgsql-hackers

On 27.04.2011 09:51, Alexander Korotkov wrote:
> On Tue, Apr 26, 2011 at 1:10 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
>> Since the algorithm is focused on reducing I/O, we should expect the best
>> acceleration when the index doesn't fit in memory. The size of the
>> buffers is comparable to the size of the whole index. That means that if we
>> can hold the buffers in memory, then we can mostly hold the whole index in
>> memory. That's why I think we should have simple on-disk buffer management
>> for the first representative benchmark.
>>
> Since we need to free all buffers after the index is built, I believe the
> buffers should be stored separately. If not, the index becomes bloated
> immediately after creation. We probably need to create a temporary relation
> to store the buffers in. If I'm right, is there any example of using a
> temporary relation?

A temporary relation is a bit heavy-weight for this; a temporary file
should be enough. See src/backend/storage/file/buffile.c, the
BufFileCreateTemp() function in particular. Or perhaps a tuplestore
would suit better, see src/backend/utils/sort/tuplestore.c; that's simpler
to use if you're storing tuples. tuplestore only supports storing heap
tuples at the moment, but it could easily be extended to store index
tuples, like tuplesort.c does.
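
To illustrate the temporary-file idea outside the backend, here is a
minimal standalone sketch. It is not the BufFile API: it uses the C
standard library's tmpfile() as a stand-in for BufFileCreateTemp(), and
the SpillBuffer type and spill_* function names are hypothetical, made
up for this example. It only shows the general shape: append
length-prefixed records to an anonymous file, rewind, read them back,
and let the file vanish on close so nothing is left behind in the index.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical spill buffer backed by an anonymous temporary file.
 * tmpfile() stands in for a backend temp file; this is not PostgreSQL code. */
typedef struct SpillBuffer
{
    FILE   *fp;      /* temp file, removed automatically when closed */
    size_t  nitems;  /* number of records appended so far */
} SpillBuffer;

/* Create an empty spill buffer; returns NULL on failure. */
SpillBuffer *spill_create(void)
{
    SpillBuffer *sb = malloc(sizeof(*sb));

    if (sb == NULL)
        return NULL;
    sb->fp = tmpfile();
    if (sb->fp == NULL)
    {
        free(sb);
        return NULL;
    }
    sb->nitems = 0;
    return sb;
}

/* Append one length-prefixed record. Returns 0 on success, -1 on error. */
int spill_append(SpillBuffer *sb, const void *data, size_t len)
{
    assert(sb != NULL && data != NULL);
    if (fseek(sb->fp, 0L, SEEK_END) != 0)
        return -1;
    if (fwrite(&len, sizeof(len), 1, sb->fp) != 1)
        return -1;
    if (len > 0 && fwrite(data, 1, len, sb->fp) != len)
        return -1;
    sb->nitems++;
    return 0;
}

/* Position the buffer at its first record, ready for reading. */
int spill_rewind(SpillBuffer *sb)
{
    if (fflush(sb->fp) != 0)
        return -1;
    return fseek(sb->fp, 0L, SEEK_SET);
}

/* Read the next record into out (capacity cap bytes).
 * Returns the record length, or -1 at end of data or on error. */
long spill_next(SpillBuffer *sb, void *out, size_t cap)
{
    size_t len;

    if (fread(&len, sizeof(len), 1, sb->fp) != 1)
        return -1;
    if (len > cap)
        return -1;
    if (len > 0 && fread(out, 1, len, sb->fp) != len)
        return -1;
    return (long) len;
}

/* Close the temp file (which deletes it) and free the handle. */
void spill_destroy(SpillBuffer *sb)
{
    if (sb == NULL)
        return;
    fclose(sb->fp);
    free(sb);
}
```

Inside the backend the same pattern would presumably go through the
BufFile routines in buffile.c instead of stdio, or through a tuplestore
if the records are tuples; check the current signatures in the source
before relying on them.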

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
