Re: Small improvement to compactify_tuples

From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)gmail(dot)com>
Subject: Re: Small improvement to compactify_tuples
Date: 2017-05-15 13:18:10
Message-ID: e49befcc6f1d7099834c6fdf5c675a60@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sokolov Yura писал 2017-05-15 15:08:
> Heikki Linnakangas писал 2017-05-15 12:06:
>> On 05/14/2017 09:47 PM, Sokolov Yura wrote:
>>> Good day, everyone.
>>>
>>> I've been playing a bit with unlogged tables - just random updates on
>>> simple
>>> key-value table. I've noticed amount of cpu spent in a
>>> compactify_tuples
>>> (called by PageRepareFragmentaion). Most of time were spent in qsort
>>> of
>>> itemidbase items.
>>
>> Ah, I played with this too a couple of years ago, see
>> https://www.postgresql.org/message-id/546B89DE.7030906%40vmware.com,
>> but got distracted by other things and never got around to commit
>> that.
>>
>>> itemidbase array is bounded by number of tuples in a page, and
>>> itemIdSortData
>>> structure is simple, so specialized version could be a better choice.
>>>
>>> Attached patch adds combination of one pass of prefix sort with
>>> insertion
>>> sort for larger array and shell sort for smaller array.
>>> Insertion sort and shell sort are implemented as macros and could be
>>> reused.
>>
>> Cool! Could you compare that against the bucket sort I posted in the
>> above thread, please?
>>
>> At a quick glance, your "prefix sort" seems to be the the same
>> algorithm as the bucket sort that I implemented. You chose 256
>> buckets, where I picked 32. And you're adding a shell sort
>> implementation, for small arrays, while I used a straight insertion
>> sort. Not sure what these differences mean in practice.
>>
>> - Heikki
>
> Thank you for attention.
>
> My first version of big page sort was almost exactly same to yours.
> I had a bug in my insertion sort, so I had to refactor it.
> (bug were fixed)
>
> I found that items in itemidbase are almost sorted, so it is important
> to try keep its order in prefix sort. So I've changed --count[i] to
> count[i+1]++.
>
> And it looks like it is better to have more buckets:
> - with 256 buckets, size of single bucket is almost always less than 2,
> so array is almost always sorted after prefix sort pass.
>
> But it looks like it is better to sort each bucket separately, as you
> did, and as it was in my early version.
>
> Also I used your names for functions and some comments.
>
> I attached new version of the patch.
>
> I left memcpy intact cause it looks like it doesn't take noticable
> cpu time.

In a sequel, I propose to simplify PageRepairFragmentation in attached
patch.

--
Sokolov Yura aka funny.falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
0002-bufpage.c-simplify-PageRepairFragmentation.patch text/x-diff 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2017-05-15 13:49:42 Re: Event triggers + table partitioning cause server crash in current master
Previous Message Amit Kapila 2017-05-15 13:12:42 Re: NOT NULL constraints on range partition key columns