Lowering the ever-growing heap->pd_lower

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Lowering the ever-growing heap->pd_lower
Date: 2021-03-09 15:13:20
Message-ID: CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com
Lists: pgsql-hackers

Hi,

The heap AM's pages only ever grow their pd_linp array; the array never
shrinks, even when trailing entries are marked unused. This means that
up to 14% of the space on a page (=291 unused line pointers) could be
unusable for data storage, which I think is a shame. With a patch in
the works that allows the line pointer array to grow up to one third
of the size of the page [0], it would be quite catastrophic for the
available data space on old-and-often-used pages if this space could
never be reused for data.
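
As a rough check of the numbers above, the arithmetic can be sketched
in C. The constants are assumptions for a default build (a 24-byte page
header, a 24-byte MAXALIGNed heap tuple header and 4-byte line
pointers); the function names are made up for this illustration and
are not PostgreSQL APIs.

```c
#include <assert.h>

/* Assumed sizes for a default build; not taken from the patch. */
enum {
    LINP_PAGE_HEADER_SIZE  = 24, /* page header                   */
    LINP_TUPLE_HEADER_SIZE = 24, /* MAXALIGNed heap tuple header  */
    LINP_ITEMID_SIZE       = 4   /* one line pointer              */
};

/* Same shape as MaxHeapTuplesPerPage: each tuple needs at least a
 * tuple header plus one line pointer. */
int linp_max_tuples(int blcksz)
{
    assert(blcksz > LINP_PAGE_HEADER_SIZE);
    return (blcksz - LINP_PAGE_HEADER_SIZE) /
           (LINP_TUPLE_HEADER_SIZE + LINP_ITEMID_SIZE);
}

/* Bytes taken by a line pointer array of maximum length. */
int linp_max_array_bytes(int blcksz)
{
    return linp_max_tuples(blcksz) * LINP_ITEMID_SIZE;
}

/* Share of the page taken by that array, in tenths of a percent. */
int linp_share_permille(int blcksz)
{
    return linp_max_array_bytes(blcksz) * 1000 / blcksz;
}
```

With blcksz = 8192 this gives 291 line pointers, 1164 bytes, and a
little over 14% of the page.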

The shrinking of the line pointer array is already common practice in
indexes (in which all LP_UNUSED items are removed), but this specific
implementation cannot be used for heap pages due to ItemId
invalidation. One available implementation, however, is that we
truncate the end of this array, as mentioned in [1]. There was a
warning at the top of PageRepairFragmentation about not removing
unused line pointers, but I believe that was about not removing
_intermediate_ unused line pointers (which would imply moving in-use
line pointers); as far as I know there is nothing that relies on only
growing page->pd_lower, and nothing keeping us from shrinking it
whilst holding a pin on the page.

Please find attached a fairly trivial patch which detects the trailing
unused entries on a page and truncates the pd_linp array to drop them,
effectively freeing 4 bytes per line pointer truncated away (up to
1164 bytes for pages with MaxHeapTuplesPerPage LP_UNUSED line
pointers).
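
The truncation described above can be sketched on a simplified model
of the line pointer array. This is not the attached patch: the page is
reduced to an array of used/unused flags, and sketch_truncated_nline
and sketch_new_pd_lower are hypothetical helpers, with the 24-byte
page header and 4-byte line pointer sizes assumed for a default build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for a default build; not taken from the patch. */
#define SKETCH_PAGE_HEADER_SIZE 24
#define SKETCH_ITEMID_SIZE      4

/*
 * lp_used[i] is nonzero when line pointer i is in use (anything other
 * than LP_UNUSED) and zero when it is unused.
 *
 * Returns the number of line pointers to keep: everything up to and
 * including the last in-use entry. Unused entries in the middle are
 * kept, so no in-use line pointer changes its offset number.
 */
size_t sketch_truncated_nline(const uint8_t *lp_used, size_t nline)
{
    assert(lp_used != NULL || nline == 0);

    while (nline > 0 && !lp_used[nline - 1])
        nline--;
    return nline;
}

/* pd_lower points just past the end of the line pointer array. */
size_t sketch_new_pd_lower(size_t nline)
{
    return SKETCH_PAGE_HEADER_SIZE + nline * SKETCH_ITEMID_SIZE;
}
```

For a page whose five line pointers are used, unused, used, unused,
unused, the sketch keeps the first three entries (including the unused
one in the middle) and pd_lower moves down by 8 bytes.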

One unexpected benefit from this patch is that the PD_HAS_FREE_LINES
hint bit can now be false more often, increasing the chances of not
having to scan the whole array to find an empty slot.
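
A minimal sketch of why this helps, again on a simplified flag array.
It assumes that the hint is recomputed from the entries that survive
truncation and that an insert only scans the array when the hint is
set; hint_has_free_lines and hint_pick_slot are hypothetical names,
not functions from PostgreSQL or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * lp_used[i] is nonzero when line pointer i is in use.
 * Returns true when an unused entry remains among the first nline
 * entries, i.e. when a PD_HAS_FREE_LINES-style hint must stay set.
 */
bool hint_has_free_lines(const uint8_t *lp_used, size_t nline)
{
    assert(lp_used != NULL || nline == 0);

    for (size_t i = 0; i < nline; i++)
        if (!lp_used[i])
            return true;
    return false;
}

/*
 * Picks the slot for a new item. The array is scanned only when the
 * hint is set; otherwise the item is appended at index nline.
 */
size_t hint_pick_slot(const uint8_t *lp_used, size_t nline, bool hint)
{
    if (hint)
    {
        for (size_t i = 0; i < nline; i++)
            if (!lp_used[i])
                return i;
    }
    return nline; /* no reusable slot: extend the array */
}
```

With entries used, used, used, unused, unused, the hint has to be set
while all five entries are kept. Once the array is truncated to three
entries no unused slot remains, the hint can be cleared, and the next
insert appends at index 3 without scanning.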

Note: This does _not_ move valid ItemIds; it only removes invalid
(unused) ItemIds from the end of the space reserved for ItemIds on a
page, keeping valid line pointers intact.

Enjoy,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/flat/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw(at)mail(dot)gmail(dot)com
[1] https://www.postgresql.org/message-id/CAEze2Wjf42g8Ho%3DYsC_OvyNE_ziM0ZkXg6wd9u5KVc2nTbbYXw%40mail.gmail.com

Attachment: v1-0001-Truncate-a-pages-line-pointer-array-when-it-has-t.patch (text/x-patch, 2.4 KB)
