Re: Adding hook in BufferSync for backup purposes

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Adding hook in BufferSync for backup purposes
Date: 2017-08-08 05:18:59
Message-ID: A0E68ABA-AA62-4B56-879F-C61EECB34F23@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro, Tom, thank you for your valuable comments.

> Alvaro:
> I remember discussing the topic of differential base-backups with
> somebody (probably Marco and Gabriele). The idea we had was to have a
> new relation fork which stores an LSN for each group of pages,
> indicating the LSN of the newest change to those pages. The backup tool
> "scans" the whole LSN fork, and grabs images of all pages that have LSNs
> newer than the one used for the previous base backup.
Thanks for the pointer, I’ve found the discussions and now I’m in a process of extraction of the knowledge from there

> I think it should be at the point where the buffer is
> modified (i.e. when WAL is written) rather than when it's checkpointed
> out.
WAL is flushed before actual pages are written to disk(sent to kernel). I’d like to notify extensions right after we exactly sure pages were flushed.
But you are right, BufferSync is not good place for this:
1. It lacks LSNs
2. It’s not the only place to flush: bgwriter goes through nearby function FlushBuffer() and many AMs write directly to smgr (for example when matapge is created)

BufferSync() seemed sooo comfortable and efficient place for flashing info on dirty pages, already sorted and grouped by tablespace, but it is absolutely incorrect to do it there. I’ll look for the better place.

>
> 7 авг. 2017 г., в 18:37, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> написал(а):
>
> Yeah. Keep in mind that if the extension does anything at all that could
> possibly throw an error, and if that error condition persists across
> multiple tries, you will have broken the database completely: it will
> be impossible to complete a checkpoint, and your WAL segment pool will
> grow until it exhausts disk. So the idea of doing something that involves
> unspecified extension behavior, especially possible interaction with
> an external backup agent, right there is pretty terrifying.
I think that API for extensions should tend to protect developer from breaking everything, but may allow it with precaution warnings in docs and comments. Please let me know if this assumption is incorrect.

>
> Other problems with the proposed patch: it misses coverage of
> BgBufferSync, and I don't like exposing an ad-hoc structure like
> CkptTsStatus as part of an extension API. The algorithm used by
> BufferSync to schedule buffer writes has changed multiple times
> before and doubtless will again; if we're going to have a hook
> here it should depend as little as possible on those details.
OK, now I see that «buf_internals.h» had word internals for a reason. Thanks for pointing that out, I didn’t knew about changes in these algorithms.

Best regards, Andrey Borodin, Yandex.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-08-08 06:27:56 Re: Adding hook in BufferSync for backup purposes
Previous Message Masahiko Sawada 2017-08-08 04:49:03 Re: pgbench: Skipping the creating primary keys after initialization