Re: pgsql 10: hash indexes testing

From: AP <ap(at)zip(dot)com(dot)au>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql 10: hash indexes testing
Date: 2017-08-02 23:48:29
Message-ID: 20170802234829.riodbytcuhr4slyu@zip.com.au
Lists: pgsql-hackers

On Wed, Aug 02, 2017 at 11:34:13AM -0400, Robert Haas wrote:
> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > It seems so. Basically, in the case of a large number of duplicates,
> > we hit the maximum number of overflow pages. There is a theoretical
> > possibility of hitting it, but it could also be that we are not
> > freeing the existing unused overflow pages, in which case the count
> > keeps growing until it hits the limit. I have asked upthread for
> > verification of whether that is happening in this case and am still
> > waiting for it. The squeeze operation does free such unused overflow
> > pages after cleaning them. As it is a costly operation and needs a
> > cleanup lock, we currently perform it only during vacuum and during
> > the next split of a bucket that may have redundant overflow pages.
>
> Oops. It was rather short-sighted of us not to increase
> HASH_MAX_BITMAPS when we bumped HASH_VERSION. Actually removing that
> limit is hard, but we could easily have bumped it from 128 to, say, 1024
> without (I think) causing any problem, which would have given us quite
> a bit of headroom here. I suppose we could still try to jam that
> change in before beta3 (bumping HASH_VERSION again) but that might be
> asking for trouble.
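
To make that concrete: as I understand it, the change being discussed is
essentially a one-line bump of the constant, plus the matching HASH_VERSION
bump, since the metapage layout depends on it. A rough sketch (the names and
the 128 -> 1024 values are from the above; the file location and comments are
my assumption):

/*
 * In src/include/access/hash.h (sketch only).  The metapage stores an
 * array of HASH_MAX_BITMAPS bitmap-page block numbers, so changing this
 * constant changes the on-disk metapage layout and therefore needs a
 * matching HASH_VERSION bump.
 */
#define HASH_MAX_BITMAPS    1024    /* previously 128 */

Each bitmap page tracks a fixed number of overflow pages, so raising the
number of bitmap pages raises the overflow-page ceiling proportionally.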

I, for one, would be grateful for such a bump (or better). Currently I'm
having more fun than I'd like trying to figure out where my inserts will
begin to fail, so that I don't make a mess of things. Right now I can't
match the data going in with sane partitioning points in storage. If I can
go "3 months' worth of data, then partition" or the like, and have enough
room for variation in the data, then I can be pretty happy. At the moment I'm
not getting anywhere near that and am tempted to chuck it all in, eat
the 3-4x disk space cost and go back to btree, which would cost me terabytes.
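
For what it's worth, here is a rough back-of-envelope sketch of the ceiling
the current limit implies -- my own arithmetic, not something stated in this
thread. It assumes the default 8kB block size and that each bitmap page can
flag on the order of 32k overflow pages (the authoritative figure lives in
the index metapage as hashm_bmsize):

/*
 * Hypothetical estimate only: HASH_MAX_BITMAPS bitmap pages, each
 * tracking one bit per overflow page, with an assumed usable bitmap of
 * about half an 8kB block.
 */
#include <stdio.h>

#define BLCKSZ            8192      /* default PostgreSQL block size */
#define HASH_MAX_BITMAPS  128       /* the limit discussed above */

int
main(void)
{
    long long bits_per_bitmap_page = (BLCKSZ / 2) * 8;   /* assumed ~32k */
    long long max_overflow_pages =
        (long long) HASH_MAX_BITMAPS * bits_per_bitmap_page;

    printf("~%lld overflow pages, ~%lld GB of overflow space\n",
           max_overflow_pages,
           max_overflow_pages * BLCKSZ / (1024LL * 1024 * 1024));
    return 0;
}

Under those assumptions the ceiling comes out to a few million overflow
pages, i.e. a few tens of GB of overflow space.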

AP
