shared buffer manager problems and redesign

From: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: shared buffer manager problems and redesign
Date: 2018-10-03 09:46:04
Message-ID: D09B13F772D2274BB348A310EE3027C637D787@g01jpexmbkw24
Lists: pgsql-hackers

Hello, hackers

(Actually, I’m not sure whether I should post this here or on the pgsql-general mailing list.)
Several recent threads have again raised the time-consuming handling of shared buffers
in TRUNCATE and in VACUUM when it deletes the trailing empty pages of a relation [1],
data corruption on file truncation error [2], and so on.

Buffer Manager redesign/restructure

Andres Freund has worked on this before and described a design and methods for implementing a buffer radix tree. [4]
I think it is worth considering, because many solutions were proposed in previous threads [1] [2] [3] [4],
but we could not reach consensus, since it was pointed out that some of the methods add complexity.
The ordered buffer mapping implementation (replacing the current buffer mapping implementation) keeps coming up
in these discussions, as it would potentially address these problems.
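
To make the appeal of an ordered mapping concrete, here is a rough, self-contained toy in C. It is not PostgreSQL code: every name in it (ToyTag, ToyBuffer, toy_drop_by_scan, toy_drop_by_range) is made up for illustration, and a sorted array stands in for the radix tree only because it is shorter to write down. It contrasts visiting every buffer header with jumping straight to the range of buffers that belong to the truncated tail of one relation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer tag: (relation id, block number). */
typedef struct { uint32_t rel; uint32_t block; } ToyTag;

/* Hypothetical buffer header: just a tag and a valid flag. */
typedef struct { ToyTag tag; int valid; } ToyBuffer;

/* One entry of the ordered mapping: tag -> index into the buffer pool. */
typedef struct { ToyTag tag; size_t buf; } ToyMapEntry;

/*
 * Unordered case: nothing tells us where the buffers of `rel` live, so
 * every header is inspected. *visited reports how many were looked at.
 */
static size_t toy_drop_by_scan(ToyBuffer *pool, size_t n, uint32_t rel,
                               uint32_t first_block, size_t *visited)
{
    size_t dropped = 0;

    assert(visited != NULL);
    *visited = 0;
    for (size_t i = 0; i < n; i++) {
        (*visited)++;
        if (pool[i].valid && pool[i].tag.rel == rel &&
            pool[i].tag.block >= first_block) {
            pool[i].valid = 0;
            dropped++;
        }
    }
    return dropped;
}

/* Order tags by relation first, then by block number. */
static int toy_tag_cmp(const ToyTag *a, const ToyTag *b)
{
    if (a->rel != b->rel)
        return a->rel < b->rel ? -1 : 1;
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    return 0;
}

/*
 * Ordered case: `map` is sorted by toy_tag_cmp. Binary-search the first
 * entry >= (rel, first_block), then walk forward while the relation
 * matches. *visited counts only the entries walked, not the search probes.
 * The toy leaves the map entries in place; a real design would remove them.
 */
static size_t toy_drop_by_range(ToyBuffer *pool, const ToyMapEntry *map,
                                size_t n, uint32_t rel,
                                uint32_t first_block, size_t *visited)
{
    ToyTag key = { rel, first_block };
    size_t lo = 0, hi = n, dropped = 0;

    assert(visited != NULL);
    *visited = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (toy_tag_cmp(&map[mid].tag, &key) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    for (size_t i = lo; i < n && map[i].tag.rel == rel; i++) {
        (*visited)++;
        if (pool[map[i].buf].valid) {
            pool[map[i].buf].valid = 0;
            dropped++;
        }
    }
    return dropped;
}
```

In this toy, the scan costs one visit per buffer in the pool no matter how few pages are being dropped, while the range walk costs a logarithmic search plus one visit per affected buffer. Locking, pinning, dirty buffers and concurrent eviction are all ignored here, and those are exactly where the real complexity of such a redesign would sit.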

But before we can work on a POC patch, I think we should start a common discussion of the potential

a.) data structure design of the modified buffer manager: open relations hash table, buffer radix tree, etc.

b.) buffer tag, locks

c.) implementation, operations (loading pages, flushing/writing out buffers), complexities, etc.

However, from what I understand, realistically speaking, it’s not possible to get this committed for PG12, given the complexity and the time available.
There is also the question of how to resolve some of these problems without redesigning the shared buffer manager.
So I think it is really important to discuss this soon.

I hope to hear more insights, ideas, suggestions, truth-bombs, or so. :)

Thank you very much.

Regards,
Kirk

References
[1] "reloption to prevent VACUUM from truncating empty pages at the end of relation"
https://www.postgresql.org/message-id/flat/CAHGQGwE5UqFqSq1%3DkV3QtTUtXphTdyHA-8rAj4A%3DY%2Be4kyp3BQ%40mail.gmail.com
[2] "Truncation failure in autovacuum results in data corruption (duplicate keys)"
https://www.postgresql.org/message-id/flat/5BBC590AE8DF4ED1A170E4D48F1B53AC(at)tunaPC
[3] "WIP: long transactions on hot standby feedback replica / proof of concept"
https://www.postgresql.org/message-id/flat/c9374921e50a5e8fb1ecf04eb8c6ebc3%40postgrespro.ru
[4] "Reducing the size of BufferTag & remodeling forks"
https://www.postgresql.org/message-id/flat/20150702133619.GB16267%40alap3.anarazel.de
