Re: Small improvement to compactify_tuples

From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)gmail(dot)com>
Subject: Re: Small improvement to compactify_tuples
Date: 2017-05-15 12:08:40
Message-ID: 0dcb7a5afa3fae6001bd1712253cf81d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas писал 2017-05-15 12:06:
> On 05/14/2017 09:47 PM, Sokolov Yura wrote:
>> Good day, everyone.
>>
>> I've been playing a bit with unlogged tables - just random updates on
>> simple
>> key-value table. I've noticed amount of cpu spent in a
>> compactify_tuples
>> (called by PageRepareFragmentaion). Most of time were spent in qsort
>> of
>> itemidbase items.
>
> Ah, I played with this too a couple of years ago, see
> https://www.postgresql.org/message-id/546B89DE.7030906%40vmware.com,
> but got distracted by other things and never got around to commit
> that.
>
>> itemidbase array is bounded by number of tuples in a page, and
>> itemIdSortData
>> structure is simple, so specialized version could be a better choice.
>>
>> Attached patch adds combination of one pass of prefix sort with
>> insertion
>> sort for larger array and shell sort for smaller array.
>> Insertion sort and shell sort are implemented as macros and could be
>> reused.
>
> Cool! Could you compare that against the bucket sort I posted in the
> above thread, please?
>
> At a quick glance, your "prefix sort" seems to be the the same
> algorithm as the bucket sort that I implemented. You chose 256
> buckets, where I picked 32. And you're adding a shell sort
> implementation, for small arrays, while I used a straight insertion
> sort. Not sure what these differences mean in practice.
>
> - Heikki

Thank you for attention.

My first version of big page sort was almost exactly same to yours.
I had a bug in my insertion sort, so I had to refactor it.
(bug were fixed)

I found that items in itemidbase are almost sorted, so it is important
to try keep its order in prefix sort. So I've changed --count[i] to
count[i+1]++.

And it looks like it is better to have more buckets:
- with 256 buckets, size of single bucket is almost always less than 2,
so array is almost always sorted after prefix sort pass.

But it looks like it is better to sort each bucket separately, as you
did, and as it was in my early version.

Also I used your names for functions and some comments.

I attached new version of the patch.

I left memcpy intact cause it looks like it doesn't take noticable
cpu time.

--
Sokolov Yura aka funny.falcon
Postgres Professional: https://postgrespro.ru
The Russian PostgreSQL Company

Attachment Content-Type Size
0001-storage-page-bufpage.c-improve-compactify_tuples-v02.patch text/x-diff 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-05-15 12:13:52 Re: Bug in ExecModifyTable function and trigger issues for foreign tables
Previous Message Robert Haas 2017-05-15 11:58:04 Re: UPDATE of partition key