Re: The Free Space Map: Problems and Opportunities

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Jan Wieck <jan(at)wi3ck(dot)info>, gregsmithpgsql(at)gmail(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: The Free Space Map: Problems and Opportunities
Date: 2021-08-18 01:32:24
Message-ID: CAH2-WznV2AHcarZs2AcEn96ky_yvKt6MS7_CDXkTUEKaZvme3A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 17, 2021 at 11:24 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> OK, I am trying to think of something simple we could test to see the
> benefit, with few downsides. I assume the case you are considering is
> that you have a 10 8k-page table, and one page is 80% full and the
> others are 81% full, and if several backends start adding rows at the
> same time, they will all choose the 80%-full page.

That's one problem (there are so many). They start with that same page,
and then fight each other over other pages again and again. This
emergent behavior is a consequence of the heuristic that has the FSM
look for the heap page with the least space that still satisfies a
given request. I mentioned this problem earlier.
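To make the contention concrete, here is a minimal sketch of the heuristic as described above: pick the page with the least free space that still satisfies the request. It is not the actual FSM code. The function name, the flat list standing in for the map, and the lowest-page-number tie-break are all assumptions made for illustration.

```python
PAGE_SIZE = 8192  # 8 kB heap pages, as in the quoted example


def best_fit_page(free_bytes, request):
    """Return the index of the page with the least free space that still
    fits `request` bytes, or None if no page fits.

    Ties are broken by lowest page index (an assumption of this sketch).
    """
    candidates = [(free, page) for page, free in enumerate(free_bytes)
                  if free >= request]
    if not candidates:
        return None
    return min(candidates)[1]


# Ten pages: page 0 is 80% full, the other nine are 81% full.
free_bytes = [PAGE_SIZE * 20 // 100] + [PAGE_SIZE * 19 // 100] * 9

# Four backends consult the same map with the same request size.
# The choice is deterministic, so they all land on the same page.
choices = [best_fit_page(free_bytes, 100) for _backend in range(4)]
```

Because nothing in the selection depends on which backend is asking, every concurrent inserter gets the same answer until the map is updated, and then they all move on to the next page together.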

> For example:
>
> 1. page with most freespace is 95% free
> 2. pages 2,4,6,8,10 have between 86%-95% free
> 3. five pages
> 4. proc id 14293 % 5 = 3 so use the third page from #2, page 6

Right now my main concern is giving backends/XIDs their own space, in
bulk. Mostly for concurrent bulk inserts, just so we can say that
we're getting the easy cases right -- that is a good starting point
for my prototype to target, I think. But I am also thinking about
stuff like this, as one way to address contention.
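The four quoted steps can be sketched roughly as follows. This is only an illustration of the idea, not a proposed patch. The function name `pick_page`, the `spread` parameter, and the per-page percentages are invented for the example, and treating a remainder of 0 as "the last candidate" is an assumption, since the quoted example does not cover that case.

```python
def pick_page(free_pct, pid, spread=9):
    """Choose a page from those whose free space is within `spread`
    percentage points of the emptiest page, using the backend's
    process ID to spread backends across the candidates.

    free_pct maps page number -> percent free.
    """
    if not free_pct:
        return None
    best = max(free_pct.values())                      # step 1
    candidates = sorted(page for page, pct in free_pct.items()
                        if pct >= best - spread)       # step 2
    n = len(candidates)                                # step 3
    # Step 4: remainder r selects the r-th candidate (1-based).
    # Remainder 0 wraps to the last candidate (assumption).
    index = (pid % n - 1) % n
    return candidates[index]


# Illustrative values only: even-numbered pages are 86%-95% free.
free_pct = {1: 40, 2: 95, 3: 55, 4: 90, 5: 60,
            6: 88, 7: 30, 8: 86, 9: 70, 10: 92}

chosen = pick_page(free_pct, 14293)  # 14293 % 5 == 3, third candidate
```

With these values the candidates are pages 2, 4, 6, 8 and 10, and process 14293 ends up on page 6, matching the quoted example. Backends whose PIDs differ modulo 5 are sent to different pages.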

> This should spread out page usage to be more even, but still favor pages
> with more freespace. Yes, this is simplistic, but it would seem to have
> few downsides and I would be interested to see how much it helps.

I thought about having whole free lists that were owned by XIDs, as
well as shared free lists that are determined by hashing MyProcPid --
or something like that. So I agree that it might very well make
sense.
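A very rough sketch of how those two kinds of free list might fit together, purely as an illustration. Nothing here reflects an actual design or prototype. The class `FreeListSet`, its methods, and the round-robin dealing of pages into shared lists are all hypothetical, and a plain modulo stands in for whatever hash of MyProcPid would really be used.

```python
from collections import deque


class FreeListSet:
    """Hypothetical container: free lists owned by an XID, plus shared
    free lists selected by hashing the backend's process ID."""

    def __init__(self, shared_pages, n_shared_lists):
        # Deal the shared pages round-robin into the shared lists.
        self.shared = [deque() for _ in range(n_shared_lists)]
        for i, page in enumerate(shared_pages):
            self.shared[i % n_shared_lists].append(page)
        self.owned = {}  # xid -> deque of pages reserved for that XID

    def reserve(self, xid, pages):
        """Hand a batch of pages to one transaction, in bulk."""
        self.owned.setdefault(xid, deque()).extend(pages)

    def get_page(self, xid, pid):
        """Prefer the transaction's own list, then fall back to the
        shared list chosen by the process ID. Returns the head page
        without removing it, or None if the chosen list is empty."""
        own = self.owned.get(xid)
        if own:
            return own[0]
        shared = self.shared[pid % len(self.shared)]
        return shared[0] if shared else None


lists = FreeListSet(shared_pages=range(10, 18), n_shared_lists=4)
lists.reserve(xid=700, pages=[1, 2, 3])
```

Here transaction 700 inserts into its own pages regardless of which backend runs it, while transactions without a reservation are spread across the four shared lists by PID.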

--
Peter Geoghegan
