Re: Estimate maintenance_work_mem for CREATE INDEX

From: Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>
To: pgsql-admin(at)lists(dot)postgresql(dot)org, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Estimate maintenance_work_mem for CREATE INDEX
Date: 2017-12-19 10:00:01
Message-ID: CACACo5SbX3e90EYMfp+UDYm5Db1TaPU7crFoxnFFWhnQe=A2_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

On Tue, Dec 19, 2017 at 10:47 AM, Oleksandr Shulgin <
oleksandr(dot)shulgin(at)zalando(dot)de> wrote:

> (cross-posting admin and hackers)
>
> Hello,
>
> I wonder if I'm alone in my wish to have a way for estimating how much
> maintenance work memory would suffice to allocate for a session when
> creating an index and avoid spilling to disk?
>
> Recently I had to re-create some indexes on a 9.6 server and I had some
> input on the on-disk index size: one was around 30 GB, the other -- a bit
> over 60 GB according to \di+ output. The total number of live tuples in
> the table itself was close to 1.3e+9, the table had an estimated 25% bloat.
>
> I had some spare memory on the machine so I've given it 60 GB for
> maintenance_work_mem and expected that at least the smaller of the two will
> fit in memory completely. To my surprise that didn't suffice and both
> indexes were building with some disk spill.
>
> Is anyone aware of a query to estimate the memory requirements for CREATE
> INDEX [CONCURRENTLY]?
>
> I've looked in the postgres wiki, but didn't find anything to that end.
> Nor searching the archives of pgsql-admin did help.
>
> I understand that there were some changes in recent releases related to
> memory allocation (e.g. allowing huge allocation in 9.4), but at least
> targeting 9.6 or 10 would make sense. There are also a lot of ways how one
> CREATE INDEX can be different from the other, but in the most simple case
> where you have fixed-width columns and building the full index (i.e. no
> WHERE clause), it should be possible.
>

Now I see I fail to mention this is the default btree index with all
default options. Obviously other indexes can be very different in memory
requirements.

Not hasting to look in the source to calculate all the sizeof()s yet:
> waiting on your reply and suggestions. ;-)
>

If there would be an option in the database itself to provide those
estimation, we wouldn't even need to figure out estimation queries.
"EXPLAIN CREATE INDEX" anyone?

Regards,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Om Prakash Jaiswal 2017-12-19 10:25:35 Couldn't create plperl language
Previous Message Oleksandr Shulgin 2017-12-19 09:47:12 Estimate maintenance_work_mem for CREATE INDEX

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-12-19 10:01:05 Re: [HACKERS] parallel.c oblivion of worker-startup failures
Previous Message Magnus Hagander 2017-12-19 09:56:33 Basebackups reported as idle