Re: 10.1: hash index size exploding on vacuum full analyze

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: AP <pgsql(at)inml(dot)weebeastie(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: 10.1: hash index size exploding on vacuum full analyze
Date: 2017-12-25 07:43:19
Message-ID: CAE9k0Pn52GXCWtCVDXjgKOjLn57obKKe8M0kOrxgoruefqqzRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Dec 19, 2017 at 5:30 PM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> Hi,
>
> On Sat, Dec 16, 2017 at 12:56 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Sat, Dec 16, 2017 at 12:27 PM, AP <pgsql(at)inml(dot)weebeastie(dot)net> wrote:
>> > On Sat, Dec 16, 2017 at 09:08:23AM +0530, Amit Kapila wrote:
>> >> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> >> The estimation depends on the type of columns and stats. I think we
>> >> need to use schema and stats in the way AP is using to see the effect
>> >> AP is seeing. I was under impression that AP will help us in
>> >> verifying the problem as he can reproduce it, but it seems he is busy.
>> >
>> > Different fires keep springing up and they are causing delay. This
>> > is still on my mind and I'll get back to it as soon as I can.
>> >
>>
>> Okay. I think Ashutosh has reproduced it with a standalone test, let
>> us see if that suffice the need. In any case, feel free to verify in
>> the meantime.
>
> I am able to reproduce the problem reported by AP with the following
> test-case. My earlier test-case-[1] was also helpful in diagnosing the
> problem and understanding Amit's patch -[2] but this test-case exposes
> the problem more than my earlier test-case.
>
> test-case
> =======
> 1) CREATE TABLE hash_index_table(datum_id BYTEA NOT NULL);
> 2) INSERT INTO hash_index_table SELECT repeat(stringu1, 30)::bytea FROM tenk1;
>
> 3) ALTER TABLE hash_index_table ALTER COLUMN datum_id SET STATISTICS 0;
>
> 4) ANALYZE hash_index_table;
>
> 5) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);
>
> 6) select oid, relname, relpages, reltuples from pg_class where
> relname = 'hash_index';
>
> 7) DROP INDEX hash_index;
>
> 8) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);
>
> 9) select oid, relname, relpages, reltuples from pg_class where
> relname = 'hash_index';
>
> 10) VACUUM FULL hash_index_table;
>
> 11) select oid, relname, relpages, reltuples from pg_class where
> relname = 'hash_index';
>
> The output of above two test-cases without patch is as follows,
>
> Output (without patch):
> ================
> postgres[72965]=# select oid, relname, relpages, reltuples from
> pg_class where relname = 'hash_index';
> +--------+------------+----------+-----------+
> | oid | relname | relpages | reltuples |
> +--------+------------+----------+-----------+
> | 287169 | hash_index | 69 | 10000 |
> +--------+------------+----------+-----------+
> (1 row)
>
> Time: 0.381 ms
> postgres[72965]=#
> postgres[72965]=# VACUUM FULL hash_index_table;
> VACUUM
> Time: 55.703 ms
> postgres[72965]=#
> postgres[72965]=# select oid, relname, relpages, reltuples from
> pg_class where relname = 'hash_index';
> +--------+------------+----------+-----------+
> | oid | relname | relpages | reltuples |
> +--------+------------+----------+-----------+
> | 287169 | hash_index | 130 | 10000 |
> +--------+------------+----------+-----------+
> (1 row)
>
> Time: 0.904 ms
>
> Output (with patch):
> ==============
> postgres[85460]=# select oid, relname, relpages, reltuples from
> pg_class where relname = 'hash_index';
> +-------+------------+----------+-----------+
> | oid | relname | relpages | reltuples |
> +-------+------------+----------+-----------+
> | 26394 | hash_index | 69 | 10000 |
> +-------+------------+----------+-----------+
> (1 row)
>
> Time: 0.370 ms
> postgres[85460]=#
> postgres[85460]=# VACUUM FULL hash_index_table;
> VACUUM
> Time: 68.351 ms
> postgres[85460]=#
> postgres[85460]=# select oid, relname, relpages, reltuples from
> pg_class where relname = 'hash_index';
> +-------+------------+----------+-----------+
> | oid | relname | relpages | reltuples |
> +-------+------------+----------+-----------+
> | 26394 | hash_index | 69 | 10000 |
> +-------+------------+----------+-----------+
> (1 row)
>
> Time: 0.838 ms
>
> Please note that i have tried running above test-case both with and
> without ANALYZE hash_index_table (step #4) and the problem is observed
> in both the cases.
>
> [1]- https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com
>
> [2]- https://www.postgresql.org/message-id/CAA4eK1%2B6BSGrm%2BNtUDhx59CNR51Ehbnmch9LpswMEyrLBBjKLg%40mail.gmail.com
>

I'm not sure when AP is planning to share his test-results. But, I've
shared mine test-results -[1] and also reviewed the patch. As
mentioned in my earlier update -[1], the patch looks good to me and it
fixes the issue. I am therefore moving the patch to 'Ready for
Committer'. Thanks.

[1]-https://www.postgresql.org/message-id/CAE9k0PmTZKLA2hKAPT6OCinH%2BXX%2BXVSej3jx17j9SMBJr%3DFvkA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

>
>>
>>
>> --
>> With Regards,
>> Amit Kapila.
>> EnterpriseDB: http://www.enterprisedb.com
>>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Devrim Gündüz 2017-12-26 11:38:55 Re: BUG #14991: postgis-2.4.so: undefined symbol: GEOSMinimumClearance
Previous Message yunlong.gao 2017-12-25 02:34:48 Re: May be a jsonb type bug