Re: [HACKERS] Unlogged tables cleanup

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Unlogged tables cleanup
Date: 2019-05-14 04:33:52
Message-ID: 20190514043352.jtbki3f4ifegk6g3@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-05-14 13:23:28 +0900, Michael Paquier wrote:
> On Mon, May 13, 2019 at 10:37:35AM -0700, Andres Freund wrote:
> > Ugh, this is all such a mess. But, isn't this broken independently of
> > the smgrimmedsync() issue? In a basebackup case, the basebackup could
> > have included the main fork, but not the init fork, and the reverse. WAL
> > replay *solely* needs to be able to recover from that. At the very
> > least we'd have to do the cleanup step after becoming consistent, not
> > just before recovery even started.
>
> Yes, the logic using smgrimmedsync() is race-prone and weaker than the
> index AMs in my opinion, even if the failure window is limited (I
> think that this is mentioned upthread a bit).

How's it limited? On a large database a base backup easily can take
*days*. And e.g. VM and FSM can easily have inodes that are much newer
than the main/init forks, so typical base-backups (via OS/glibc
readdir) will sort them at a later point (or it'll be hashed, in which
case it's entirely random), so the window between when the different
forks are copied is large.
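
To make the ordering point concrete, here is a minimal sketch in Python
(not PostgreSQL code). It assumes the usual on-disk naming of relation
forks ("<relfilenode>" for main, plus "_fsm", "_vm" and "_init"
suffixes, optionally followed by a ".N" segment number). The helper
names fork_spread and copy_order are made up for illustration. Given
file names in the order a directory scan returns them, it reports how
many entries lie between the first and last fork of each relation,
which is a rough proxy for the copy window:

```python
import os
import re

# Assumed fork naming: "16384", "16384_fsm", "16384_vm", "16384_init",
# each optionally followed by a ".N" segment suffix.
FORK_RE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.\d+)?$")


def fork_spread(names):
    """Map each relfilenode to the number of positions between its first
    and last fork file in the given (copy) order."""
    first, last = {}, {}
    for pos, name in enumerate(names):
        m = FORK_RE.match(name)
        if m is None:
            continue  # not a relation fork file (e.g. PG_VERSION)
        rel = m.group(1)
        first.setdefault(rel, pos)
        last[rel] = pos
    return {rel: last[rel] - first[rel] for rel in first}


def copy_order(path):
    """Directory entries in whatever order the OS returns them.
    No sorting is applied, which is the point of the illustration."""
    with os.scandir(path) as it:
        return [entry.name for entry in it]
```

Calling fork_spread(copy_order(some_database_dir)) on a real database
directory (the path is up to the reader) shows how far apart forks of
the same relation can land in scan order; on a large cluster each
position may stand for a lot of copy time.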

> What's actually the reason preventing us from delaying the
> checkpointer like the index AMs for the logging of heap init fork?

I'm not following. What do you mean by "delaying the checkpointer"?

Greetings,

Andres Freund
