Re: GIN improvements part 1: additional information

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: GIN improvements part 1: additional information
Date: 2014-01-13 17:07:24
Message-ID: CAPpHfdskbPbjWYJhkd-FkZCKEUZd03PRAw05NyT0Hd-jxWOyfg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 11, 2014 at 6:15 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:

> On 8.1.2014 22:58, Alexander Korotkov wrote:
> > Thanks for reporting. Fixed version is attached.
>
> I've tried to rerun the 'archie' benchmark with the current patch, and
> once again I got
>
> PANIC: could not split GIN page, didn't fit
>
> I reran it with '--enable-cassert' and with that I got
>
> TRAP: FailedAssertion("!(ginCompareItemPointers(&items[i - 1],
> &items[i]) < 0)", File: "gindatapage.c", Line: 149)
> LOG: server process (PID 5364) was terminated by signal 6: Aborted
> DETAIL: Failed process was running: INSERT INTO messages ...
>
> so the assert in GinDataLeafPageGetUncompressed fails for some reason.
>
> I can easily reproduce it, but my knowledge in this area is rather
> limited so I'm not entirely sure what to look for.

I've fixed this bug and many other bugs. The patch now passes the test
suite that I used earlier. The results are:

Operations time:
         event         |     period
-----------------------+-----------------
 index_build           | 00:01:47.53915
 index_build_recovery  | 00:00:04
 index_update          | 00:05:24.388163
 index_update_recovery | 00:00:53
 search_new            | 00:24:02.289384
 search_updated        | 00:27:09.193343
(6 rows)

Index sizes:
     label     |   size
---------------+-----------
 new           | 384761856
 after_updates | 667942912
(2 rows)

I also made the following changes to the algorithms:

- There is now a limit on the number of uncompressed TIDs in a page.
After reaching this limit, they are encoded regardless of whether they
would still fit in the page. That seems to me the more desirable behaviour,
and it also makes searches somewhat faster. Before this change the times
were as follows:

         event         |     period
-----------------------+-----------------
 index_build           | 00:01:51.467888
 index_build_recovery  | 00:00:04
 index_update          | 00:05:03.315155
 index_update_recovery | 00:00:51
 search_new            | 00:24:43.194882
 search_updated        | 00:28:36.316784
(6 rows)

- A page is not fully re-encoded if it's enough to re-encode just the last
segment.
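
The two changes above can be illustrated with a small toy model. This is a
minimal Python sketch and not the patch's C code: TIDs are simplified to
plain integers, and the names LeafPage, MAX_UNCOMPRESSED and SEGMENT_ITEMS
as well as their values are hypothetical. It only shows the policy described
here: uncompressed TIDs are encoded once a count limit is reached, whether or
not they would still fit, and only the last segment is re-encoded when all
pending TIDs sort after the start of that segment.

```python
MAX_UNCOMPRESSED = 4  # hypothetical limit on uncompressed TIDs per page
SEGMENT_ITEMS = 4     # hypothetical number of TIDs per encoded segment


def encode_segment(tids):
    """Varbyte-encode sorted integers as deltas, 7 data bits per byte."""
    out = bytearray()
    prev = 0
    for tid in tids:
        delta = tid - prev
        prev = tid
        while delta >= 0x80:
            out.append((delta & 0x7F) | 0x80)  # high bit set: more bytes follow
            delta >>= 7
        out.append(delta)
    return bytes(out)


def decode_segment(data):
    """Inverse of encode_segment."""
    tids, cur, delta, shift = [], 0, 0, 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            cur += delta
            tids.append(cur)
            delta, shift = 0, 0
    return tids


class LeafPage:
    """Toy leaf page: encoded segments plus a short uncompressed list."""

    def __init__(self):
        self.segments = []      # each segment is independently decodable
        self.uncompressed = []  # recently inserted TIDs, not yet encoded
        self.tail_reencodes = 0
        self.full_reencodes = 0

    def insert(self, tid):
        self.uncompressed.append(tid)
        # Encode as soon as the limit is reached, regardless of free space.
        if len(self.uncompressed) >= MAX_UNCOMPRESSED:
            self.flush()

    def flush(self):
        pending = sorted(self.uncompressed)
        self.uncompressed = []
        if not pending:
            return
        last = decode_segment(self.segments[-1]) if self.segments else []
        if not last or pending[0] > last[0]:
            # Earlier segments are unaffected: re-encode only the last one.
            keep = self.segments[:-1]
            items = sorted(set(last + pending))
            self.tail_reencodes += 1
        else:
            # Some pending TID belongs to an earlier segment: re-encode all.
            keep = []
            encoded = [t for seg in self.segments for t in decode_segment(seg)]
            items = sorted(set(encoded + pending))
            self.full_reencodes += 1
        self.segments = keep + [
            encode_segment(items[i:i + SEGMENT_ITEMS])
            for i in range(0, len(items), SEGMENT_ITEMS)
        ]

    def all_tids(self):
        encoded = [t for seg in self.segments for t in decode_segment(seg)]
        return sorted(encoded + self.uncompressed)


# Ascending inserts only ever touch the last segment.
page = LeafPage()
ascending = [10, 20, 300, 4000, 5000, 6000, 7000, 8000, 9000, 9100, 9200, 9300]
for tid in ascending:
    page.insert(tid)

# TIDs that sort before the last segment force a full re-encode.
earlier = [15, 25, 35, 45]
for tid in earlier:
    page.insert(tid)
```

In this toy run the twelve ascending inserts are handled by re-encoding only
the last segment, while the four smaller TIDs trigger one full re-encode.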

The README is updated.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin-packed-postinglists-varbyte5.patch.gz application/x-gzip 30.5 KB
