Re: Lowering the ever-growing heap->pd_lower

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lowering the ever-growing heap->pd_lower
Date: 2021-03-09 16:21:44
Message-ID: 2925B5E8-0F4F-446D-8735-A98C8772A309@enterprisedb.com
Lists: pgsql-hackers

> On Mar 9, 2021, at 7:13 AM, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> The heap AMs' pages only grow their pd_linp array, and never shrink
> when trailing entries are marked unused. This means that up to 14% of
> free space (=291 unused line pointers) on a page could be unusable for
> data storage, which I think is a shame. With a patch in the works that
> allows the line pointer array to grow up to one third of the size of
> the page [0], it would be quite catastrophic for the available data
> space on old-and-often-used pages if this could not ever be reused for
> data.
>
> The shrinking of the line pointer array is already common practice in
> indexes (in which all LP_UNUSED items are removed), but this specific
> implementation cannot be used for heap pages due to ItemId
> invalidation. One available implementation, however, is that we
> truncate the end of this array, as mentioned in [1]. There was a
> warning at the top of PageRepairFragmentation about not removing
> unused line pointers, but I believe that was about not removing
> _intermediate_ unused line pointers (which would imply moving in-use
> line pointers); as far as I know there is nothing that relies on only
> growing page->pd_lower, and nothing keeping us from shrinking it
> whilst holding a pin on the page.
>
> Please find attached a fairly trivial patch which detects the last
> unused entry on a page, and truncates the pd_linp array to that entry,
> effectively freeing 4 bytes per line pointer truncated away (up to
> 1164 bytes for pages with MaxHeapTuplesPerPage LP_UNUSED line
> pointers).
>
> One unexpected benefit from this patch is that the PD_HAS_FREE_LINES
> hint bit optimization can now be false more often, increasing the
> chances of not having to check the whole array to find an empty spot.
>
> Note: This does _not_ move valid ItemIds, it only removes invalid
> (unused) ItemIds from the end of the space reserved for ItemIds on a
> page, keeping valid line pointers intact.
>
>
> Enjoy,
>
> Matthias van de Meent
>
> [0] https://www.postgresql.org/message-id/flat/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw(at)mail(dot)gmail(dot)com
> [1] https://www.postgresql.org/message-id/CAEze2Wjf42g8Ho%3DYsC_OvyNE_ziM0ZkXg6wd9u5KVc2nTbbYXw%40mail.gmail.com
> <v1-0001-Truncate-a-pages-line-pointer-array-when-it-has-t.patch>
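
For readers following along, the trailing truncation described in the quoted
message can be sketched roughly as below. This is a simplified standalone
model, not the attached patch and not PostgreSQL's actual page code: the
SketchPage struct, the SKETCH_* constants and both function names are
hypothetical, and it assumes a 24-byte page header, 4-byte line pointers and
at most 291 line pointers per page (the default 8kB block size).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real page layout (assumptions, see above). */
#define SKETCH_HEADER_SIZE   24   /* bytes before the line pointer array */
#define SKETCH_ITEMID_SIZE   4    /* bytes per line pointer */
#define SKETCH_MAX_LINES     291  /* MaxHeapTuplesPerPage for 8kB pages */

#define SKETCH_LP_UNUSED     0
#define SKETCH_LP_NORMAL     1

/* Toy page: only pd_lower and the per-line-pointer state are modelled. */
typedef struct SketchPage
{
    uint16_t pd_lower;                     /* offset to end of line pointers */
    uint8_t  lp_flags[SKETCH_MAX_LINES];   /* state of each line pointer */
} SketchPage;

/* Number of line pointers currently covered by pd_lower. */
static size_t
sketch_nlines(const SketchPage *page)
{
    return (size_t) (page->pd_lower - SKETCH_HEADER_SIZE) / SKETCH_ITEMID_SIZE;
}

/*
 * Drop unused line pointers from the END of the array only.  In-use line
 * pointers are never moved, so their offsets stay valid; unused entries
 * sitting between in-use ones are left alone.  Returns the bytes freed.
 */
static size_t
sketch_truncate_trailing_unused(SketchPage *page)
{
    size_t nlines = sketch_nlines(page);
    size_t keep = nlines;

    /* Walk backwards until the last entry that is still in use. */
    while (keep > 0 && page->lp_flags[keep - 1] == SKETCH_LP_UNUSED)
        keep--;

    page->pd_lower = (uint16_t) (SKETCH_HEADER_SIZE + keep * SKETCH_ITEMID_SIZE);
    return (nlines - keep) * SKETCH_ITEMID_SIZE;
}
```

With all 291 line pointers unused, this model frees 291 * 4 = 1164 bytes,
which is about 14% of an 8192-byte page and matches the figures quoted above.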

For a prior discussion on this topic:

https://www.postgresql.org/message-id/2e78013d0709130606l56539755wb9dbe17225ffe90a%40mail.gmail.com


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
