Re: still gin index creation takes forever

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>, pgsql-general(at)postgresql(dot)org
Subject: Re: still gin index creation takes forever
Date: 2008-11-12 17:12:00
Message-ID: 491B0E60.4020706@sigaev.ru
Lists: pgsql-general

>> GIN's build algorithm could use bulk insert of ItemPointers if and only if they
>> should be inserted on rightmost page (exact piece of code - dataPlaceToPage() in
>> gindatapage.c, lines 407-427)
>
> I'm not following. Rightmost page of what --- it can't be the whole
> index, can it, or the case would hardly ever apply?

A GIN index contains a btree over keys (the entry tree), and for each key
it stores either a list of ItemPointers (a posting list) or a btree over
ItemPointers (a posting tree, or data tree), depending on how many there
are. The bulk insertion process collects keys and sorted arrays of
ItemPointers in memory, and then, for each key, tries to insert every
ItemPointer from the array into the corresponding data tree one by one.
But if the smallest ItemPointer in the array is greater than the largest
one already stored, the algorithm inserts the whole array on the
rightmost page of the data tree.

So in that case the process can insert about 1000 ItemPointers per data
tree lookup; otherwise it does 1000 lookups in the data tree.
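
To make the difference concrete, here is a rough toy model in Python. It is
not the code from gindatapage.c: the PostingTree class, its lookups counter
and the (block, offset) tuples are made up for illustration, the data tree is
flattened into one sorted list, and page capacity is ignored (the real code
can only append as many ItemPointers as fit on the rightmost page).

```python
from bisect import bisect_left


class PostingTree:
    """Toy stand-in for a GIN data tree: one sorted list of item pointers."""

    def __init__(self):
        self.items = []    # sorted (block, offset) tuples
        self.lookups = 0   # number of simulated tree descents

    def _descend(self, pointer):
        # Each call models one root-to-leaf lookup in the data tree.
        self.lookups += 1
        return bisect_left(self.items, pointer)

    def insert_array(self, pointers):
        """Insert a sorted array of item pointers collected for one key."""
        if not pointers:
            return
        if not self.items or pointers[0] > self.items[-1]:
            # Fast path: everything sorts after what is stored, so a single
            # descent to the rightmost page takes the whole array.
            self.lookups += 1
            self.items.extend(pointers)
            return
        # Slow path: one lookup per item pointer.
        for pointer in pointers:
            pos = self._descend(pointer)
            if pos == len(self.items) or self.items[pos] != pointer:
                self.items.insert(pos, pointer)


tree = PostingTree()
tree.insert_array([(10, off) for off in range(1, 1001)])  # empty tree: fast path
tree.insert_array([(20, off) for off in range(1, 1001)])  # all greater: fast path
fast_lookups = tree.lookups
tree.insert_array([(5, off) for off in range(1, 1001)])   # smaller: slow path
slow_lookups = tree.lookups - fast_lookups
print(fast_lookups, slow_lookups)
```

The two arrays that arrive in increasing order cost one lookup each, while
the array that sorts before the stored data costs one lookup per ItemPointer.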
