Re: POC: Cleaning up orphaned files using undo logs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-08-20 18:36:19
Message-ID: CA+Tgmoakh7WvD=be=Gc0oF9mUp1gOS4Dwz6qRfbEMBkXB4XFOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 20, 2019 at 5:02 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> 3. UndoLogDiscard() uses DiscardBuffer() to invalidate any currently
> unpinned buffers, and marks as BM_DISCARDED any that happen to be
> pinned right now, so they can't be immediately invalidated. Such
> buffers are never written back and are eligible for reuse on the next
> clock sweep, even if they're written into by a backend that managed to
> do that when we were trying to discard.

This is definitely more concurrent, but it might be *too much*
concurrency. Suppose that backend #1 is inserting a row and updating
the transaction header for the previous transaction; meanwhile,
backend #2 is discarding the previous transaction. It could happen
that backend #1 locks the transaction header for the previous
transaction and is all set to log the insertion ... but then gets
context-switched out. Now backend #2 swoops in and logs the discard.
Backend #1 now wakes up and finishes logging a change to a page that,
according to the logic of the WAL stream, no longer exists.

It's probably possible to make this work by ignoring WAL references to
discarded pages during replay, but that seems a bit dangerous. At
least, it loses some sanity checking that you might like to have.

It seems to me that you can avoid this if you require that a backend
that wants to set BM_DISCARDED to acquire at least a shared content
lock before doing so. If you do that, then once a backend acquires
content lock(s) on the page(s) containing the transaction header for
the purposes of updating it, it can notice that the BM_DISCARDED flag
is set and choose not to update those pages after all. I think that
would be a smart design choice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-08-20 19:06:15 Re: errbacktrace
Previous Message Andres Freund 2019-08-20 17:45:16 Re: POC: Cleaning up orphaned files using undo logs