Re: POC: Cleaning up orphaned files using undo logs

From: Shawn Debnath <sdn(at)amazon(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-04-23 17:34:59
Message-ID: 20190423173459.GA38699@f01898859afd.ant.amazon.com
Lists: pgsql-hackers

On Wed, Mar 13, 2019 at 02:20:29AM +1300, Thomas Munro wrote:

> === 0002 "Add SmgrId to smgropen() and BufferTag." ===
>
> This is new, and is based on the discussion from another recent
> thread[1] about how we should identify buffers belonging to different
> storage managers. In earlier versions of the patch-set I had used a
> special reserved DB OID for undo data. Tom Lane didn't like that idea
> much, and Anton Shyrabokau (via Shawn Debnath) suggested making
> ForkNumber narrower so we can add a new field to BufferTag, and Andres
> Freund +1'd my proposal to add the extra value as a parameter to
> smgropen(). So, here is a patch that tries those ideas.
>
> Another way to do this would be to widen RelFileNode instead, to avoid
> having to pass around the SMGR ID separately in various places.
> Looking at the number of places that have to change, you can probably
> see why we wanted to use a magic DB OID instead, and I'm not entirely
> convinced that it wasn't better that way, or that I've found all the
> places that need to carry an smgrid alongside a RelFileNode.
>
> Archeological note: smgropen() was like that ~15 years ago before
> commit 87bd9563, but buffer tags didn't include the SMGR ID.
>
> I decided to call md.c's ID "SMGR_RELATION", describing what it really
> holds -- regular relations -- rather than perpetuating the doubly
> anachronistic "magnetic disk" name.
>
> While here, I resurrected the ancient notion of a per-SMGR 'open'
> routine, so that a small amount of md.c-specific stuff could be kicked
> out of smgr.c and future implementations can do their own thing here
> too.
>
> While doing that work I realised that at least pg_rewind needs to
> learn about how different storage managers map blocks to files, so
> that's a new TODO item requiring more thought. I wonder what other
> places know how to map { RelFileNode, ForkNumber, BlockNumber } to a
> path + offset, and I wonder what to think about the fact that some of
> them may be non-backend code...
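
For anyone skimming the last quoted paragraph, here is a minimal sketch
of the kind of block-to-file mapping that md.c performs today and that
tools such as pg_rewind would have to know per storage manager. It
assumes the default build settings (8kB blocks, 1GB segments), the main
fork, and the default tablespace; every name prefixed with sketch_ or
Sketch is hypothetical and not taken from the patch or from the
PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults: 8kB blocks, 131072 blocks (1GB) per segment file. */
#define SKETCH_BLCKSZ      8192u
#define SKETCH_RELSEG_SIZE 131072u

/* Hypothetical result type: where a block lives on disk. */
typedef struct SketchFileLoc
{
    uint32_t segno;   /* segment file number (0 means no suffix) */
    uint64_t offset;  /* byte offset of the block inside that segment */
} SketchFileLoc;

/* Map a block number to a segment number and a byte offset. */
static SketchFileLoc
sketch_md_locate(uint32_t blkno)
{
    SketchFileLoc loc;

    loc.segno = blkno / SKETCH_RELSEG_SIZE;
    loc.offset = (uint64_t) (blkno % SKETCH_RELSEG_SIZE) * SKETCH_BLCKSZ;
    return loc;
}

/*
 * Format the segment path for the main fork in the default tablespace:
 * base/<dbnode>/<relnode> for segment 0, with ".<segno>" appended for
 * later segments.  Returns what snprintf() returns.
 */
static int
sketch_md_path(char *buf, size_t len,
               uint32_t dbnode, uint32_t relnode, uint32_t segno)
{
    assert(buf != NULL && len > 0);

    if (segno == 0)
        return snprintf(buf, len, "base/%u/%u",
                        (unsigned) dbnode, (unsigned) relnode);
    return snprintf(buf, len, "base/%u/%u.%u",
                    (unsigned) dbnode, (unsigned) relnode, (unsigned) segno);
}
```

A different storage manager (undo logs, for example) could use a
completely different layout, which is why any code that hard-wires
arithmetic like the above would need to learn about the SMGR ID.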

Given the scope of this patch, it might be prudent to start a separate
thread for it. So far, this discussion has been buried within other
discussions, and I want to ensure folks don't miss it.

Thanks.

--
Shawn Debnath
Amazon Web Services (AWS)
