Quick Links

Re: GIN improvements part 1: additional information

From:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To:	obartunov(at)gmail(dot)com
Cc:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: GIN improvements part 1: additional information
Date:	2013-12-19 08:44:07
Message-ID:	52B2B1D7.2050802@vmware.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 12/19/2013 08:37 AM, Oleg Bartunov wrote:
> Guys,
>
> before digging deep into the art of comp/decomp world I'd like to know
> if you familiar with results of
> http://wwwconference.org/www2008/papers/pdf/p387-zhangA.pdf paper and
> some newer research ?

Yeah, I saw that paper.

> Do we agree in what we really want ? Basically,
> there are three main features: size, compression, decompression speed
> - we should take two :)

According to that Zhang et al paper you linked, the Vbyte method
actually performs the worst on all of those measures. The other
algorithms are quite similar in terms of size (PForDelta being the most
efficient), while PForDelta is significantly faster to compress/decompress.

Just by looking at those numbers, PForDelta looks like a clear winner.
However, it operates on much bigger batches than the other algorithms; I
haven't looked at it in detail, but Zhang et al used 128 integer
batches, and they say that 32 integers is the minimum batch size. If we
want to use it for the inline posting lists stored in entry tuples, that
would be quite wasteful if there are only a few item pointers on the tuple.

Also, in the tests I've run, the compression/decompression speed is not
a significant factor in total performance, with either varbyte encoding
or Simple9-like encoding I hacked together.

Actually, now that I think about this a bit more, maybe we should go
with Rice encoding after all? It's the most efficient in terms of size,
and I believe it would be fast enough.

> Should we design sort of plugin, which could support independent
> storage on disk, so users can apply different techniques, depending on
> data.
>
> What I want to say is that we certainly can play with this very
> challenged task, but we have limited time before 9.4 and we should
> think in positive direction.

Once we have the code in place to deal with one encoding, it's easy to
switch the implementation. Making it user-configurable or pluggable
would be overkill IMHO.

What I'm saying is that we should make sure we get the page format right
(in particular, I strongly feel we should use the self-contained
PostingListSegment struct instead of item-indees that I mentioned in the
other post), with the implementation details hidden in the functions in
ginpostinglist.c. Then we can easily experiment with different algorithms.

- Heikki

In response to

Re: GIN improvements part 1: additional information at 2013-12-19 06:37:06 from Oleg Bartunov

Responses

Re: GIN improvements part 1: additional information at 2013-12-20 15:59:06 from Heikki Linnakangas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bernd Helmle	2013-12-19 10:03:06	Re: pgsql: Allow time delayed standbys and recovery
Previous Message	Pavel Stehule	2013-12-19 08:05:29	Re: array_length(anyarray)