Re: Incomplete description of pg_start_backup?

From: Dmitry Koterov <dmitry(at)koterov(dot)ru>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incomplete description of pg_start_backup?
Date: 2013-05-24 18:33:56
Message-ID: CA+CZih6L2w+BcLH4_EmhthdJDiiygGH5oApLuB2UvzSb6bCeag@mail.gmail.com
Lists: pgsql-hackers

I still don't get it.

Suppose we have a data file with blocks with important (non-empty) data:

A B C D

1. I call pg_start_backup().
2. Tar starts copying block A to the destination archive...
3. During this copying, somebody deletes data from a table that is stored
in block B. This data becomes subject to vacuuming, and the block is
marked as free space.
4. Somebody writes data to a table, and this data is placed in the free
space, i.e. in block B. The write is also added to the WAL (so the data is
stored in 2 places: in block B and in the WAL).
5. Tar (at last!) finishes copying block A and begins to copy block B.
6. It finishes, then it copies C and D to the archive too.
7. Then we call pg_stop_backup() and also archive the collected WAL (which
contains the new data of block B, as we saw above).
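
To make the scenario concrete, here is a toy Python model of the steps
above. It is only a sketch: the dictionaries, the write_block helper and the
block contents are all made up for illustration and say nothing about how
PostgreSQL actually stores blocks or WAL records.

```python
# Toy model of the scenario above; not PostgreSQL code.
# Block contents and all names here are invented for illustration.

live = {"A": "a-old", "B": "b-old", "C": "c-old", "D": "d-old"}
backup = {}   # what tar writes into the archive
wal = []      # (block, new_contents) records written after pg_start_backup()


def write_block(block, contents):
    """Change a block in the live data file and log the change to the WAL."""
    live[block] = contents
    wal.append((block, contents))


# Steps 2-4: tar is still busy with A while B is freed and then reused.
backup["A"] = live["A"]
write_block("B", "b-new")

# Steps 5-6: tar copies B (already the new contents), then C and D.
for block in ("B", "C", "D"):
    backup[block] = live[block]

# Step 7: pg_stop_backup(); the collected WAL is archived with the backup.
archived_wal = list(wal)

# Restore: start from the tar copy and replay the archived WAL over it.
restored = dict(backup)
for block, contents in archived_wal:
    restored[block] = contents

print(restored == live)               # does the restore match the live file?
print("b-old" in restored.values())   # are the old contents of B anywhere?
```

In this model the restored copy matches the live file as of
pg_stop_backup(), and the old contents of B appear nowhere in it, which is
exactly the case I am asking about.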

The question is: *where is the OLD data of block B in this scheme?* It seems
it is NOT in the backup, so it cannot be restored. (If we never overwrite
blocks between pg_start_backup...pg_stop_backup, but always append the new
data, this is not a problem.) It seems to me this is not documented at all!
That is what my initial e-mail was about.

(I have one hypothesis on that, but I am not sure. Here it is: does vacuum
save ALL deleted data of block B to the WAL in step 3, prior to deletion? If
yes, it is, of course, part of the backup. But that wastes a lot of space...)

On Tue, May 14, 2013 at 6:05 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Mon, May 13, 2013 at 4:31 PM, Dmitry Koterov <dmitry(at)koterov(dot)ru> wrote:
>
>> Could you please provide a bit more detailed explanation on how it works?
>>
>> And how could postgres write at the middle of archiving files during an
>> active pg_start_backup? if it could, here might be a case when a part of
>> archived data file contains an overridden information "from the future",
>>
>
> The data files cannot contain information from the future. If the backup
> is restored, it must be restored to the time of pg_stop_backup (at least),
> which means the data would at that point be from the past/present, not the
> future.
>
> Cheers,
>
> Jeff
>
