Remove size limitations of vacuum's dead_tuples array

From: Ants Aasma <ants(at)cybertec(dot)at>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Remove size limitations of vacuum's dead_tuples array
Date: 2019-10-09 12:58:11
Message-ID: CANwKhkO7+FzZsS_cbDFRKb-M3vVhWksfEcZamnjDqxDF6XgCUg@mail.gmail.com
Lists: pgsql-hackers

While dealing with a case where a 2TB table had 3 billion dead tuples, I
discovered that vacuum currently can't make use of more than 1GB of
maintenance_work_mem, which holds only about 179M dead tuples. This caused
excessive amounts of index scanning even though there was plenty of memory
available.
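
(For reference, each dead tuple is tracked as a 6-byte ItemPointerData, so
the 1GB cap works out to 1073741824 / 6 ≈ 179 million entries; the 3 billion
dead tuples above therefore need on the order of 17 rounds of index
vacuuming.)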

I didn't see any good reason for keeping this limit, so here is a patch that
makes use of MemoryContextAllocHuge for the allocation and converts the array
indexing to use size_t, which also lifts a second limit that would otherwise
kick in around 12GB (INT_MAX entries of 6 bytes each).
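
Roughly, the allocation change looks like the following. This is a simplified
sketch of the idea, not the exact patch hunk; lazy_space_alloc() and
LVRelStats are the existing names in vacuumlazy.c, and the size_t-typed
counter fields are assumptions made by the sketch:

    /*
     * Simplified sketch of the allocation change in lazy_space_alloc().
     * MemoryContextAllocHuge() lifts the MaxAllocSize (~1GB) ceiling that a
     * plain palloc() enforces; max_dead_tuples is assumed to be size_t here.
     */
    static void
    lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
    {
        size_t      maxtuples;

        if (vacrelstats->useindex)
        {
            maxtuples = ((size_t) maintenance_work_mem * 1024) /
                sizeof(ItemPointerData);
            /* Never allocate more than the relation could possibly need. */
            maxtuples = Min(maxtuples,
                            (size_t) relblocks * MaxHeapTuplesPerPage);
            maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
        }
        else
            maxtuples = MaxHeapTuplesPerPage;

        vacrelstats->num_dead_tuples = 0;
        vacrelstats->max_dead_tuples = maxtuples;
        vacrelstats->dead_tuples = (ItemPointer)
            MemoryContextAllocHuge(CurrentMemoryContext,
                                   maxtuples * sizeof(ItemPointerData));
    }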

One potential problem with allowing larger arrays is that bsearch might no
longer be the best way of determining whether a ctid has been marked dead. It
might pay off to convert the dead tuples array to a hash table to avoid the
O(n log n) runtime when scanning indexes. I haven't done any profiling yet to
see how big a problem this is.
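
For context, the current per-index-tuple check is a binary search over the
sorted ctid array; paraphrased from lazy_tid_reaped() in vacuumlazy.c (the
vac_cmp_itemptr comparator orders by block number, then offset):

    /*
     * Paraphrase of the existing lookup used as the reap callback during
     * index bulk-delete: one bsearch per index tuple.
     */
    static bool
    lazy_tid_reaped(ItemPointer itemptr, void *state)
    {
        LVRelStats *vacrelstats = (LVRelStats *) state;
        ItemPointer res;

        res = (ItemPointer) bsearch((void *) itemptr,
                                    (void *) vacrelstats->dead_tuples,
                                    vacrelstats->num_dead_tuples,
                                    sizeof(ItemPointerData),
                                    vac_cmp_itemptr);
        return (res != NULL);
    }

With billions of dead tuples the array no longer fits in cache, so each of
those binary searches touches many cold cache lines; that is where a hash
table lookup could plausibly win.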

A second issue I noticed is that the dead_tuples array is always allocated at
the maximum allowed size, unless the table can't possibly have that many
tuples. It may make sense to allocate it based on the estimated number of
dead tuples and resize if needed, as sketched below.
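
A rough sketch of what grow-on-demand could look like (the field and sizing
choices here are hypothetical, not part of the patch; repalloc_huge() is the
existing huge-aware counterpart to repalloc()):

    /*
     * Hypothetical grow-on-demand recording of a dead tuple.  Start from an
     * estimate-sized array and double it, up to the memory-budget cap,
     * instead of allocating the maximum up front.  budget_max_dead_tuples is
     * a made-up field holding the maintenance_work_mem-derived ceiling; the
     * caller still triggers an index vacuum pass once that ceiling is
     * reached, as today.
     */
    static void
    lazy_record_dead_tuple(LVRelStats *vacrelstats, ItemPointer itemptr)
    {
        if (vacrelstats->num_dead_tuples >= vacrelstats->max_dead_tuples)
        {
            size_t      newmax;

            newmax = Min(vacrelstats->max_dead_tuples * 2,
                         vacrelstats->budget_max_dead_tuples);
            vacrelstats->dead_tuples = (ItemPointer)
                repalloc_huge(vacrelstats->dead_tuples,
                              newmax * sizeof(ItemPointerData));
            vacrelstats->max_dead_tuples = newmax;
        }
        vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
        vacrelstats->num_dead_tuples++;
    }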

Regards,
Ants Aasma
Web: https://www.cybertec-postgresql.com

Attachment: 0001-Allow-vacuum-to-use-more-than-1GB-of-memory.patch (text/x-patch, 5.1 KB)
