Re: non-HOT update not looking at FSM for large tuple update

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: non-HOT update not looking at FSM for large tuple update
Date: 2021-03-18 21:30:36
Message-ID: CAEze2WhHcC3qusCZL_YUL5xUSA4Z5ztthvDxyVPEaSUJfGrcaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 17 Mar 2021 at 21:52, John Naylor <john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
> On Fri, Mar 12, 2021 at 8:45 AM Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
> >
> > If this case isn't added, the lower else branch will fail to find
> > fitting pages for len > maxPaddedFsmRequest tuples; potentially
> > extending the relation when there is actually still enough space
> > available.
>
> Okay, with that it looks better to go back to using Max().
>
> Also in v4:
>
> - With the separate constant you suggested, I split up the comment into two parts.

> + * The minimum space on a page for it to be considered "empty" regardless
> + * of fillfactor. We base this on MaxHeapTupleSize, minus a small amount
> + * of slack. We allow slack equal to 1/8 the maximum space that could be
> + * taken by line pointers, which is somewhat arbitrary.

> + * We want to allow inserting a large tuple into an empty page even if
> + * that would violate the fillfactor. Otherwise, we would unnecessarily
> + * extend the relation. Instead, ask the FSM for maxPaddedFsmRequest
> + * bytes. This will allow it to return a page that is not quite empty
> + * because of unused line pointers

How about

+ * Because pages that have no items left can still have space allocated
+ * to item pointers, we consider pages "empty" for FSM requests when they
+ * have at most 1/8 of their MaxHeapTuplesPerPage item pointers' space
+ * allocated. This is a somewhat arbitrary number, but should prevent
+ * most unnecessary relation extensions due to not finding "empty" pages
+ * while inserting combinations of large tuples with low fillfactors.

+ * When the free space to be requested from the FSM is greater than
+ * maxPaddedFsmRequest, this is considered equivalent to requesting
+ * insertion on an "empty" page, so instead of failing to find a page
+ * with more empty space than an "empty" page and extend the relation,
+ * we try to find a page which is considered "emtpy".

This is slightly more verbose, but I think this clarifies the
reasoning why we need this a bit better. Feel free to reject or adapt
as needed.

> - I've added a regression test to insert.sql similar to your earlier test, but we cannot use vacuum, since in parallel tests there could still be tuples visible to other transactions. It's still possible to test almost-all-free by inserting a small tuple.
> - Draft commit message

Apart from these mainly readability changes in comments, I think this is ready.

> --
> John Naylor
> EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2021-03-18 21:44:53 Re: cleanup temporary files after crash
Previous Message Tomas Vondra 2021-03-18 21:14:03 Re: GROUP BY DISTINCT