Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

From: Andrew Borodin <borodin(at)octonica(dot)com>
To: Andrew Borodin <amborodin(at)acm(dot)org>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Sergey Mirvoda <sergey(at)mirvoda(dot)com>
Subject: Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Date: 2016-07-04 11:22:37
Message-ID: CAJEAwVHc2OwjePsfc6Kz5jUV0zSkrGZNV7xMN6pgXYgPwD2eiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here is a patch with corrections from Alexander Korotkov.
My tests, check and check-world show no problems against Ubuntu LTS 14 x64 VM.

Best regards, Andrey Borodin, Octonica & Ural Federal University.

2016-07-04 10:41 GMT+05:00 Andrew Borodin <borodin(at)octonica(dot)com>:
> Hi!
>>I think you should implement PageReplaceItem() version and add it to the commitfest.
> Here is the patch.
> I've called function PageIndexTupleOverwrite() because it's suitable
> only for indices. It works on my tests and performance is the same as
> in proof-of-concept (despite some sanity checks copied from
> PageIndexTupleDelete).
> Next weekend I'll do more testing, and then add it to commitfest.
>
> Best regards, Andrey Borodin, Octonica & Ural Federal University.
>
> 2016-07-03 15:21 GMT+05:00 Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>:
>> Hi!
>>
>> On Sun, Jul 3, 2016 at 12:24 PM, Andrew Borodin <borodin(at)octonica(dot)com>
>> wrote:
>>>
>>> I think there is some room for improving GiST inserts. Following is
>>> the description of what I think is performance problem.
>>>
>>> Function gistplacetopage in file /src/backend/access/gist/gist.c is
>>> responsible for updating or appending new tuple on GiST page.
>>> Currently, after checking necessity of page split due to overflow, it
>>> essentially executes following:
>>>
>>> if (OffsetNumberIsValid(oldoffnum))
>>> PageIndexTupleDelete(page, oldoffnum);
>>> gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
>>>
>>> That is: remove old tuple if it’s there, then place updated tuple at the
>>> end.
>>>
>>> Half of the old data have to be shifted my memmove inside
>>> PageIndexTupleDelete() call, half of the linp-s have to be corrected.
>>>
>>> If the updated tuple has same size as already residing on page we can
>>> just overwrite it. Attached patch demonstrates that concept. Attached
>>> test.sql inserts million rows into GiST index based on cube extension.
>>> My machine is Hyper-V VM running Ubuntu on i5-2500 CPU with SSD
>>> storage. Before patch, insert part of test is executed on average
>>> within 159 second, after patch application: insert part is executed
>>> within 77 seconds on average. That is almost twice faster (for
>>> CPU\Mem-bounded inserts, disk-constrained test will show no
>>> improvement). But it works only for fixed-size tuple inserts.
>>
>>
>> Very promising results!
>>
>>> I know that code in patch is far from beautiful: it operates with
>>> three different levels of abstraction within 5 lines of code. Those
>>> are low level memmove(), system-wide PageAddItem() and GiST private
>>> gistfillBuffer().
>>>
>>> By the way PageAddItem() have overwrite flag, but it only works with
>>> unused ItemId’s. Marking old ItemId as unused before PageAddItem()
>>> didn’t work for me. Unfortunately bufpage.c routines do not contain
>>> one for updating(replacing with new) tuple on page. It is important
>>> for me because I’m working on advanced GiST page layout (
>>>
>>> https://www.postgresql.org/message-id/CAJEAwVE0rrr%2BOBT-P0gDCtXbVDkBBG_WcXwCBK%3DGHo4fewu3Yg%40mail.gmail.com
>>> ), current approach is to use skip-tuples which can be used to skip
>>> many flowing tuples with one key check. Obviously, this design cares
>>> about tuples order. And update in a fashion “place updated tuple at
>>> the end” won’t work for me.
>>>
>>> So, I think it would be better to implement PageReplaceItem()
>>> functionality to make code better, to make existing GiST inserts
>>> faster and to enable new advanced page layouts in indices.
>>
>>
>> +1 for PageReplaceItem()
>> Even if item is not the same size, we can move the tail of page once instead
>> of twice.
>> I think you should implement PageReplaceItem() version and add it to the
>> commitfest.
>>
>> ------
>> Alexander Korotkov
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company

Attachment Content-Type Size
PageIndexTupleOverwrite v2.patch application/octet-stream 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-07-04 12:31:05 Re: Declarative partitioning
Previous Message Greg Stark 2016-07-04 11:10:57 oddity in initdb probing of max_connections/shared_buffers