Re: Problems with the FSM, heap fillfactor, and temporal locality

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problems with the FSM, heap fillfactor, and temporal locality
Date: 2020-08-21 10:09:59
Message-ID: CACPNZCvii_nZNL13QuzFAGQjH1OSnbeH+M1JHszWRSQEcXpWaA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 21, 2020 at 2:48 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> I'm concerned about how the FSM gives out pages to heapam. Disabling
> the FSM entirely helps TPC-C/BenchmarkSQL, which uses non-default heap
> fillfactors for most tables [1].

Hi Peter,

Interesting stuff. Is lower-than-default fillfactor important for the
behavior you see?

> located on the same blocks (or at least on neighboring blocks). It's
> possible that one orderline will span two neighboring blocks here, but
> it will never span three or more blocks. Each order has 5 - 15 order
> lines, and so I was surprised to see that a small minority of order
> line tuples end up occupying as many as 5 or 7 heap pages on the
> master branch (i.e. with the current FSM intact during bulk loading).

Hmm. You focus on the FSM, but I'm also thinking of scenarios where
multiple visibility map bits get cleared (maybe not in this benchmark,
but in others).

> If the FSM tried to get free space from a close-by block, then we
> might at least see related updates that cannot fit a successor tuple
> on the same block behave in a coordinated fashion. We might at least
> have both updates relocate the successor tuple to the same
> mostly-empty block -- they both notice the same nearby free block, so
> both sets of successor tuples end up going on the same mostly-empty
> block. The two updating backends don't actually coordinate, of course
> -- this just happens as a consequence of looking for nearby free
> space.

I'm not sure I follow the "close-by" criterion. It doesn't seem
immediately relevant to what I understand as the main problem you
found, which is about free space. In other words, if we currently
write to 5 blocks, but with smarter FSM logic we could find a single
block, it seems that should be preferred over several close-by blocks
that each have less space? Or are you envisioning scoring by both free
space and distance from the original block?
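To make the question concrete, here is a toy sketch of what scoring by
both criteria might look like (the function name, types, and weight are
hypothetical, not anything in the tree):

```c
#include <assert.h>

/* Illustrative only: BlockNumber here stands in for PostgreSQL's type. */
typedef unsigned int BlockNumber;

/*
 * Score a candidate block by its free space, penalized by how far it
 * is from the block the updater is currently on. A larger
 * distance_weight favors locality; zero reduces to "most free space
 * wins", which is roughly what the FSM does today.
 */
static double
score_candidate(unsigned freespace, BlockNumber cand, BlockNumber orig,
                double distance_weight)
{
    unsigned dist = cand > orig ? cand - orig : orig - cand;

    return (double) freespace - distance_weight * (double) dist;
}
```

With a high enough weight, a nearby block with slightly less space
would win over a distant block with slightly more.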

> supposed to be reserved. It should not be enough for the space used on
> the page to shrink by just 1% (to 89%). Maybe it should have to reach
> as low as 50% or less before we flip it back to "free to take space
> from for new unrelated tuples". The idea is that fill factor space is
> truly reserved for updates -- that should be "sticky" for all of this
> to work well.

Makes sense. If we go this route, I wonder if we should keep the
current behavior and use any free space if the fillfactor is 100%,
since that's in line with the intention. Also, the 50% (or whatever)
figure could be scaled depending on fillfactor. As in, check if
freespace > (100% - fillfactor * 0.6), or something.
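As a toy model of that check (the names and the 0.6 figure are just
illustrative, taking the formula above literally):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: decide whether a page that once reached
 * fillfactor should again be offered to unrelated inserts. Both
 * arguments are fractions in [0, 1]. With fillfactor = 1.0 we keep
 * the current behavior and use any free space at all.
 */
static bool
page_available_again(double free_frac, double fillfactor)
{
    if (fillfactor >= 1.0)
        return free_frac > 0.0;

    /* freespace > (100% - fillfactor * 0.6), scaled by fillfactor */
    return free_frac > 1.0 - fillfactor * 0.6;
}
```

So with the default fillfactor of 90%, a page would have to get back
to roughly 46% free before the FSM hands it out again.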

I'm not sure how to distinguish blocks that have never reached
fillfactor vs. ones that did but haven't gained back enough free space
to be considered again. Naively, we could start by assuming the last
block can always be filled up to fillfactor, but earlier blocks must
use the stricter rule. That's easy since we already try the last block
anyway before extending the relation.
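A sketch of that two-tier rule (all names hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int BlockNumber;

/*
 * Illustrative only: the last block of the relation may be filled up
 * to fillfactor as usual, while earlier blocks must pass a stricter
 * free-space test (e.g. 50% free) before being handed out again.
 */
static bool
block_usable(BlockNumber blkno, BlockNumber last_blkno,
             double free_frac, double fillfactor,
             double strict_threshold)
{
    if (blkno == last_blkno)
        return free_frac > 1.0 - fillfactor;    /* room below fillfactor */

    return free_frac > strict_threshold;        /* "sticky" reservation */
}
```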

The flip side of this is: Why should vacuum be in a hurry to dirty a
page, emit WAL, and update the FSM if it only removes one dead tuple?
This presentation [1] (pages 35-43) from Masahiko Sawada had the idea
of a "garbage map", which keeps track of which parts of the heap have
the most dead space and focuses I/O efforts on those blocks first. It
may or may not be worth the extra complexity by itself, but it seems
it would work synergistically with your proposal: Updating related
tuples would concentrate dead tuples on fewer pages, and vacuums would
more quickly free up space that can actually be used to store multiple
new tuples.

[1] https://www.slideshare.net/masahikosawada98/vacuum-more-efficient-than-ever-99916671
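For illustration, the garbage-map bookkeeping could look something
like this (the range size and names are invented for this sketch, not
taken from the actual proposal):

```c
#include <assert.h>

/* Track approximate dead space per fixed-size range of heap blocks,
 * so vacuum could visit the ranges with the most garbage first.
 * Sizes are arbitrary for illustration. */
#define RANGE_SIZE  32      /* heap blocks per tracked range */
#define MAX_RANGES  1024

static unsigned dead_space[MAX_RANGES];

/* Called (hypothetically) whenever a tuple is known to be dead. */
static void
note_dead_tuple(unsigned blkno, unsigned bytes)
{
    dead_space[blkno / RANGE_SIZE] += bytes;
}

/* The range vacuum would want to process first. */
static int
busiest_range(void)
{
    int best = 0;

    for (int i = 1; i < MAX_RANGES; i++)
        if (dead_space[i] > dead_space[best])
            best = i;

    return best;
}
```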

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
