Re: 10.1: hash index size exploding on vacuum full analyze

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: AP <pgsql(at)inml(dot)weebeastie(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: 10.1: hash index size exploding on vacuum full analyze
Date: 2017-12-19 12:00:24
Message-ID: CAE9k0PmTZKLA2hKAPT6OCinH+XX+XVSej3jx17j9SMBJr=FvkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On Sat, Dec 16, 2017 at 12:56 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sat, Dec 16, 2017 at 12:27 PM, AP <pgsql(at)inml(dot)weebeastie(dot)net> wrote:
> > On Sat, Dec 16, 2017 at 09:08:23AM +0530, Amit Kapila wrote:
> >> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> >> The estimation depends on the type of columns and stats. I think we
> >> need to use schema and stats in the way AP is using to see the effect
> >> AP is seeing. I was under impression that AP will help us in
> >> verifying the problem as he can reproduce it, but it seems he is busy.
> >
> > Different fires keep springing up and they are causing delay. This
> > is still on my mind and I'll get back to it as soon as I can.
> >
>
> Okay. I think Ashutosh has reproduced it with a standalone test, let
> us see if that suffice the need. In any case, feel free to verify in
> the meantime.

I am able to reproduce the problem reported by AP with the following
test-case. My earlier test-case-[1] was also helpful in diagnosing the
problem and understanding Amit's patch -[2] but this test-case exposes
the problem more than my earlier test-case.

test-case
=======
1) CREATE TABLE hash_index_table(datum_id BYTEA NOT NULL);
2) INSERT INTO hash_index_table SELECT repeat(stringu1, 30)::bytea FROM tenk1;

3) ALTER TABLE hash_index_table ALTER COLUMN datum_id SET STATISTICS 0;

4) ANALYZE hash_index_table;

5) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);

6) select oid, relname, relpages, reltuples from pg_class where
relname = 'hash_index';

7) DROP INDEX hash_index;

8) CREATE INDEX hash_index ON hash_index_table USING hash (datum_id);

9) select oid, relname, relpages, reltuples from pg_class where
relname = 'hash_index';

10) VACUUM FULL hash_index_table;

11) select oid, relname, relpages, reltuples from pg_class where
relname = 'hash_index';

The output of above two test-cases without patch is as follows,

Output (without patch):
================
postgres[72965]=# select oid, relname, relpages, reltuples from
pg_class where relname = 'hash_index';
+--------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+--------+------------+----------+-----------+
| 287169 | hash_index | 69 | 10000 |
+--------+------------+----------+-----------+
(1 row)

Time: 0.381 ms
postgres[72965]=#
postgres[72965]=# VACUUM FULL hash_index_table;
VACUUM
Time: 55.703 ms
postgres[72965]=#
postgres[72965]=# select oid, relname, relpages, reltuples from
pg_class where relname = 'hash_index';
+--------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+--------+------------+----------+-----------+
| 287169 | hash_index | 130 | 10000 |
+--------+------------+----------+-----------+
(1 row)

Time: 0.904 ms

Output (with patch):
==============
postgres[85460]=# select oid, relname, relpages, reltuples from
pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 26394 | hash_index | 69 | 10000 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.370 ms
postgres[85460]=#
postgres[85460]=# VACUUM FULL hash_index_table;
VACUUM
Time: 68.351 ms
postgres[85460]=#
postgres[85460]=# select oid, relname, relpages, reltuples from
pg_class where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 26394 | hash_index | 69 | 10000 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.838 ms

Please note that i have tried running above test-case both with and
without ANALYZE hash_index_table (step #4) and the problem is observed
in both the cases.

[1]- https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com

[2]- https://www.postgresql.org/message-id/CAA4eK1%2B6BSGrm%2BNtUDhx59CNR51Ehbnmch9LpswMEyrLBBjKLg%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Greg Stark 2017-12-19 16:43:46 Re: BUG #14891: Old cancel request presented by pgbouncer honored after skipping a query.
Previous Message Michael Paquier 2017-12-19 07:14:05 Re: BUG #14973: hung queries