Re: non-HOT update not looking at FSM for large tuple update

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Floris Van Nee <florisvannee(at)optiver(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: non-HOT update not looking at FSM for large tuple update
Date: 2021-02-24 21:19:47
Message-ID: CAFBsxsFM_by1URS+guGrCWY2yp6YutQMcOZD7tdMDUfHpwjKLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 24, 2021 at 4:52 PM Floris Van Nee <florisvannee(at)optiver(dot)com>
wrote:

> I also understand the temptation to define it based on the relation's
fill factor, as you did in the patch. However, upon some further thought I
wonder if that's useful? A relation with a higher fill factor will have a
lower 'saveFreeSpace' variable, so it's less likely to run into issues in
finding a fresh page, except if the tuple you're inserting/updating is even
larger. However, if that case happens, you'll still be wanting to look for
a page that's completely empty (except for the line items). So the only
proper metric is 'how many unused line items do we expect on empty pages'
and the fillfactor doesn't say much about this. Since this is probably
difficult to estimate at all, we may be better off just defining it off
MaxHeapTupleSize completely?
> For example, we expect 1.5% of the page could be line items, then:
>
> targetFreeSpace = MaxHeapTupleSize * 0.985

That makes sense, although the exact number seems precisely tailored to
your use case. 2% gives 164 bytes of slack and doesn't seem too large.
Updated patch attached.

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-allow-inserting-tuples-into-almost-empty-pages.patch application/octet-stream 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-02-24 21:23:19 Re: Asynchronous and "direct" IO support for PostgreSQL.
Previous Message Andres Freund 2021-02-24 21:03:16 Re: Asynchronous and "direct" IO support for PostgreSQL.