Re: pgsql 10: hash indexes testing

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: AP <ap(at)zip(dot)com(dot)au>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql 10: hash indexes testing
Date: 2017-07-05 04:59:09
Message-ID: CAA4eK1KrDmAWHjMj6ovc6R-C72vXW2wJ=w0jUNub7EYVOD6Kvg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 5, 2017 at 9:53 AM, AP <ap(at)zip(dot)com(dot)au> wrote:
> On Wed, Jul 05, 2017 at 08:10:10AM +0530, Amit Kapila wrote:
>> On Tue, Jul 4, 2017 at 4:27 PM, AP <ap(at)zip(dot)com(dot)au> wrote:
>> > There is one index that caused an issue. Towards the end of an import
>> > I got the following error:
>> >
>> > out of overflow pages in hash index
>> >
>> > The data being indexed is BYTEA, (quasi)random and 64 bytes in size.
>> > The table has over 2 billion entries. The data is not unique. There's
>> > an average of 10 duplicates for every unique value.
>> >
>> > Is this a valid error message or have I spotted a bug?
>>
>> It is difficult to say at this stage, but I think we can figure it
>> out. We can get such a message if we consume the maximum number of
>> overflow pages that a hash index can support, which is limited by
>> the number of bitmap pages. Can you try to use the pgstattuple
>> extension and get us the results of select * from
>> pgstathashindex('index_name');? If the number of bitmap pages is 128
>> and total overflow pages are 128 * 4096, then that would mean that
>> all the pages are used. Then maybe we can
>
> Hmm. Unless I misunderstood that'd mean that overflow_pages/4096 should
> result in a number <= 128 at the moment, right?
>

No, sorry, my calculation above was missing a factor. It should be
128 * 4096 * 8. The way to compute this number is
no_bitmap_pages * no_bits_used_to_represent_overflow_pages.

In each bitmap page (8K), we can use 4K bytes (to meet the power-of-2
criterion) to track overflow pages, one bit per overflow page, so each
bitmap page accounts for 4096 * 8 = 32768 overflow pages. Also, you
need to include the bitmap pages themselves in overflow_pages, as we
consider them a subset of the overflow pages.
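
As a rough sanity check (assuming the default 8KB block size, so 4096
usable bytes per bitmap page and at most 128 bitmap pages), the ceiling
works out to about 4 million overflow pages:

select 128 * 4096 * 8;
 ?column?
----------
  4194304
(1 row)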

> If so then something is
> amiss:
>
> # select * from pgstathashindex('link_datum_id_hash_idx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items |   free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+------------------
>        3 |     10485760 |        2131192 |           66 |            0 | 2975444240 |          0 | 1065.19942179026
> (1 row)
>
> oldmdstash=# select 2131192/4096;
> ?column?
> ----------
> 520
> (1 row)
>

You need to divide 520 by 8 to get the number of bitmap pages. Is
this the index on which you got the error, or is it the one you have
REINDEXed?
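
In other words, the 2131192 overflow pages reported above need about
65 bitmap pages to track them:

select 2131192 / (4096 * 8);
 ?column?
----------
       65
(1 row)

which is consistent with the 66 bitmap_pages reported by
pgstathashindex (the last one only partly used), and still well short
of the 128 bitmap page limit.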

> And I do appear to have an odd percentage of free space. :)
>

It looks like Vacuum hasn't been triggered.

> This index was created yesterday so it has been around for maybe 18 hours.
> Autovac is likely to have hit it by now.
>

Do you have any deletes? How have you verified whether autovacuum has
been triggered or not?
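
For instance (just a sketch; 'your_table' is a placeholder for the
table that link_datum_id_hash_idx is defined on), pg_stat_all_tables
shows the delete activity and the last autovacuum run:

select relname, n_tup_del, last_autovacuum, autovacuum_count
from pg_stat_all_tables
where relname = 'your_table';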

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
