Re: [WIP] Effective storage of duplicates in B-tree index.

From: Thom Brown <thom(at)linux(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] Effective storage of duplicates in B-tree index.
Date: 2016-01-28 17:14:10
Message-ID: CAA-aLv610dx+4KyTCpKhvX+vSSJcO7i7BwhTz4jJCyEX2k1rwA@mail.gmail.com
Lists: pgsql-hackers

On 28 January 2016 at 17:09, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Thu, Jan 28, 2016 at 9:03 AM, Thom Brown <thom(at)linux(dot)com> wrote:
>> I'm surprised that efficiencies can't be realised beyond this point. Your results show a sweet spot at around 1000 / 10000000, with it getting slightly worse beyond that. I kind of expected a lot of efficiency where all the values are the same, but perhaps that's due to my lack of understanding regarding the way they're being stored.
>
> I think that you'd need an I/O bound workload to see significant
> benefits. That seems unsurprising. I believe that random I/O from
> index writes is a big problem for us.

I was thinking more from the point of view of index size. An
index containing 10 million duplicate values is around 40% of the size
of an index with 10 million unique values.
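For reference, a rough sketch of how a comparison like this might be
reproduced (table and index names here are hypothetical, and the exact
sizes will depend on the build; the size reduction would of course only
show up with the patch applied):

    -- ~10 million rows with heavily duplicated keys
    CREATE TABLE dup_keys (v int);
    INSERT INTO dup_keys SELECT 1 FROM generate_series(1, 10000000);
    CREATE INDEX dup_keys_idx ON dup_keys (v);

    -- ~10 million rows with unique keys
    CREATE TABLE uniq_keys (v int);
    INSERT INTO uniq_keys SELECT g FROM generate_series(1, 10000000) g;
    CREATE INDEX uniq_keys_idx ON uniq_keys (v);

    -- compare the on-disk size of the two indexes
    SELECT pg_size_pretty(pg_relation_size('dup_keys_idx'))  AS dup_idx_size,
           pg_size_pretty(pg_relation_size('uniq_keys_idx')) AS uniq_idx_size;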

Thom
