Hooks to track changed pages for backup purposes

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Hooks to track changed pages for backup purposes
Date: 2017-08-31 06:02:34
Message-ID: 25EB4D10-2E54-474E-945A-9B1DE780B9A3@yandex-team.ru
Lists: pgsql-hackers

Hi hackers!

Here is a patch with hooks that I consider sufficient for implementing incremental backup with page tracking as an extension.

I recently posted about this in the thread "Adding hook in BufferSync for backup purposes" [0], but I am starting a separate thread here because the subject line of the previous discussion is technically wrong.

Currently, incremental backup tools can use one of these methods to take a diff of a cluster since some LSN:
1. Check LSN of every page
2. Scan WAL and collect block numbers of changed pages
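
For illustration, method 1 could be sketched roughly as below. This is only a sketch under stated assumptions: it relies on the fact that a PostgreSQL page header begins with the page LSN stored as two 32-bit halves (high half first) and on the default 8 kB block size, it reads the halves in host byte order, and it works on an in-memory copy of a relation file. The names SKETCH_BLCKSZ, page_lsn and collect_changed_blocks are hypothetical and do not come from PostgreSQL or from any backup tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: default PostgreSQL block size (BLCKSZ) of 8 kB. */
#define SKETCH_BLCKSZ 8192

/*
 * Read the LSN from the start of a page. The page header begins with
 * two 32-bit halves of the LSN, high half first, in host byte order.
 */
static uint64_t
page_lsn(const unsigned char *page)
{
    uint32_t    hi;
    uint32_t    lo;

    memcpy(&hi, page, sizeof(hi));
    memcpy(&lo, page + sizeof(hi), sizeof(lo));
    return ((uint64_t) hi << 32) | lo;
}

/*
 * Hypothetical helper: walk every page in "pages" and store the block
 * numbers whose LSN is newer than since_lsn. Returns how many were found.
 */
static size_t
collect_changed_blocks(const unsigned char *pages, size_t npages,
                       uint64_t since_lsn,
                       uint32_t *out, size_t out_capacity)
{
    size_t      nchanged = 0;

    for (size_t blk = 0; blk < npages; blk++)
    {
        if (page_lsn(pages + blk * SKETCH_BLCKSZ) > since_lsn)
        {
            assert(nchanged < out_capacity);
            out[nchanged++] = (uint32_t) blk;
        }
    }
    return nchanged;
}
```

The cost of this approach is visible in the loop: every page has to be read, even when only a few have changed.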

I propose adding hooks:
1. When a buffer is registered for WAL insertion.
This hook is supposed to place block numbers in temporary storage, such as a backend-local static array.
2. When a WAL record insertion is started and finished, to transfer block numbers to more concurrency-protected storage.
3. When the WAL segment is switched, to initiate an asynchronous transfer of the accumulated block numbers to durable storage.
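
The flow through these three hook points could be sketched roughly as follows. This is a standalone illustration, not the attached patch: every name here (the hook types, the track_* callbacks, the simulate_* call sites, the capacities) is hypothetical, plain static arrays stand in for backend-local and shared storage, a counter stands in for durable storage, and only the "finished" half of hook 2 is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define LOCAL_CAPACITY  32      /* hypothetical backend-local limit */
#define SHARED_CAPACITY 1024    /* hypothetical shared-storage limit */

/* Hypothetical hook signatures; the real patch may differ. */
typedef void (*buffer_registered_hook_type) (BlockNumber blkno);
typedef void (*insert_finished_hook_type) (void);
typedef void (*segment_switch_hook_type) (void);

static buffer_registered_hook_type buffer_registered_hook = NULL;
static insert_finished_hook_type insert_finished_hook = NULL;
static segment_switch_hook_type segment_switch_hook = NULL;

static BlockNumber local_blocks[LOCAL_CAPACITY];    /* hook 1 target */
static size_t local_count = 0;
static BlockNumber shared_blocks[SHARED_CAPACITY];  /* hook 2 target */
static size_t shared_count = 0;
static size_t flushed_total = 0;    /* stand-in for durable storage */

/* Hook 1: remember the block number in backend-local storage. */
static void
track_buffer_registered(BlockNumber blkno)
{
    assert(local_count < LOCAL_CAPACITY);
    local_blocks[local_count++] = blkno;
}

/* Hook 2: move local block numbers to the shared area.
 * A real extension would have to take a lock around this copy. */
static void
track_insert_finished(void)
{
    for (size_t i = 0; i < local_count; i++)
    {
        assert(shared_count < SHARED_CAPACITY);
        shared_blocks[shared_count++] = local_blocks[i];
    }
    local_count = 0;
}

/* Hook 3: hand the accumulated block numbers to durable storage.
 * Here we only count them; a real extension would write them out. */
static void
track_segment_switched(void)
{
    flushed_total += shared_count;
    shared_count = 0;
}

/* What an extension would do at load time. */
static void
install_tracking_hooks(void)
{
    buffer_registered_hook = track_buffer_registered;
    insert_finished_hook = track_insert_finished;
    segment_switch_hook = track_segment_switched;
}

/* Simulated core call sites: each calls its hook only if one is set. */
static void
simulate_register_buffer(BlockNumber blkno)
{
    if (buffer_registered_hook)
        buffer_registered_hook(blkno);
}

static void
simulate_insert_finished(void)
{
    if (insert_finished_hook)
        insert_finished_hook();
}

static void
simulate_segment_switch(void)
{
    if (segment_switch_hook)
        segment_switch_hook();
}
```

With no extension loaded, all three hook pointers stay NULL and the simulated call sites do nothing, which is the usual reason for exposing this kind of functionality as hooks.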

When we have accumulated diff block numbers for most segments, we can significantly speed up the WAL scanning method. If we have block numbers for all segments, we can skip WAL scanning altogether.

I think these proposed hooks can enable more efficient backups. What do you think?

Any ideas will be appreciated. This patch is influenced by the code of PTRACK (Yury Zhuravlev and Postgres Professional).

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/20051502087457%40webcorp01e(dot)yandex-team(dot)ru#20051502087457(at)webcorp01e(dot)yandex-team(dot)ru

Attachment Content-Type Size
0001-hooks-to-watch-for-changed-pages.patch application/octet-stream 5.4 KB
