Re: pgsql 10: hash indexes testing

From:	AP <ap(at)zip(dot)com(dot)au>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: pgsql 10: hash indexes testing
Date:	2017-07-06 13:10:34
Message-ID:	20170706131034.vgnkkc3pbg7ugujk@zip.com.au
Lists:	pgsql-hackers

On Wed, Jul 05, 2017 at 07:31:39PM +1000, AP wrote:
> On Tue, Jul 04, 2017 at 08:23:20PM -0700, Jeff Janes wrote:
> > On Tue, Jul 4, 2017 at 3:57 AM, AP <ap(at)zip(dot)com(dot)au> wrote:
> > > The data being indexed is BYTEA, (quasi)random and 64 bytes in size.
> > > The table has over 2 billion entries. The data is not unique. There's
> > > an average of 10 duplicates for every unique value.
> >
> > What is the number of duplicates for the most common value?
>
> Damn. Was going to collect this info as I was doing a fresh upload but
> it fell through the cracks of my mind. It'll probably take at least
> half a day to collect (a simple count(*) on the table takes 1.5-1.75
> hours parallelised across 11 processes) so I'll probably have this in
> around 24 hours if all goes well. (and I don't stuff up the SQL :) )

Well...

 num_ids |  count
---------+----------
       1 | 91456442
       2 | 56224976
       4 | 14403515
      16 | 13665967
       3 | 12929363
      17 | 12093367
      15 | 10347006

So the most common case is a value that appears only once, followed by a
value that appears twice (one dupe).
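The query behind these numbers is not shown in the message. As a rough sketch only, a histogram of this shape can be built by grouping twice: first counting rows per distinct value, then counting how many values share each row count. The table name `hashes`, the column name `digest`, and the use of an in-memory SQLite database are all assumptions made for illustration; the original data lives in PostgreSQL and its schema is not given.

```python
import sqlite3


def duplicate_histogram(conn, table="hashes", column="digest"):
    """Return (num_ids, value_count) rows, most frequent row count first.

    num_ids     = how many rows share one distinct value
    value_count = how many distinct values have exactly that many rows
    Table and column names are hypothetical placeholders.
    """
    sql = f"""
        SELECT num_ids, COUNT(*) AS value_count
        FROM (SELECT COUNT(*) AS num_ids FROM {table} GROUP BY {column}) AS per_value
        GROUP BY num_ids
        ORDER BY value_count DESC, num_ids ASC
    """
    return conn.execute(sql).fetchall()


def max_duplicates(conn, table="hashes", column="digest"):
    """Return the row count of the single most duplicated value."""
    sql = f"""
        SELECT MAX(num_ids)
        FROM (SELECT COUNT(*) AS num_ids FROM {table} GROUP BY {column}) AS per_value
    """
    return conn.execute(sql).fetchone()[0]


# Tiny made-up data set: a appears once, b and c twice, d three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (digest BLOB)")
values = [b"a", b"b", b"b", b"c", b"c", b"d", b"d", b"d"]
conn.executemany("INSERT INTO hashes VALUES (?)", [(v,) for v in values])

hist = duplicate_histogram(conn)
worst = max_duplicates(conn)
print(hist, worst)
```

Note that the histogram sorted by frequency shows which duplicate counts are typical, while `max_duplicates` answers the narrower question of how many rows the single most common value has.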

AP.
