From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Greg Stark <stark(at)mit(dot)edu> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com> |
Subject: | Re: Spreading full-page writes |
Date: | 2014-05-27 09:07:55 |
Message-ID: | 538455EB.6040006@vmware.com |
Lists: | pgsql-hackers |
On 05/26/2014 02:26 PM, Greg Stark wrote:
> On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>
>> The second record is generated before the checkpoint is finished and the
>> checkpoint record is written. So it will be there.
>>
>> (if you crash before the checkpoint is finished, the in-progress
>> checkpoint is no good for recovery anyway, and won't be used)
>
> Another idea would be to have separate checkpoints for each buffer
> partition. You would have to start recovery from the oldest checkpoint of
> any of the partitions.
Yeah. Simon suggested that when we talked about this, but I didn't
understand how that works at the time. I think I do now. The key to
making it work is distinguishing, when starting recovery from the latest
checkpoint, whether a record for a given page can be replayed safely. I
used flags on WAL records in my proposal to achieve this, but using
buffer partitions is simpler.

For simplicity, let's imagine that we have two redo pointers for each
checkpoint record: one for even-numbered pages, and another for
odd-numbered pages. When a checkpoint begins, we first update the
Even-redo pointer to the current WAL insert location, and then flush all
the even-numbered buffers in the buffer cache. Then we do the same for
the Odd-redo pointer and the odd-numbered buffers.

Recovery begins at the Even-redo pointer. Replay works as normal, but
until you reach the Odd-redo pointer, you refrain from replaying any
changes to odd-numbered pages. After reaching the Odd-redo pointer, you
replay everything as normal.
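
To make the even/odd scheme concrete, here is a toy simulation in Python. It is only a sketch under my own assumptions, not PostgreSQL code: the names `Sim`, `Record`, `modify`, `flush` and `recover` are invented for illustration, WAL records are simple integer deltas, and replay uses a page-LSN check (apply a record only if the page's LSN is older) that the message above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Record:
    lsn: int    # position of the record in the toy WAL
    page: int   # page number the record modifies
    delta: int  # change applied to the page's value


class Sim:
    """Toy buffer cache plus WAL; all names here are hypothetical."""

    def __init__(self):
        self.wal = []
        self.buffers = {}  # page -> (page_lsn, value), in memory
        self.disk = {}     # page -> (page_lsn, value), on disk
        self.next_lsn = 1

    def modify(self, page, delta):
        # Log the change first, then apply it to the in-memory buffer.
        rec = Record(self.next_lsn, page, delta)
        self.next_lsn += 1
        self.wal.append(rec)
        _, value = self.buffers.get(page, (0, 0))
        self.buffers[page] = (rec.lsn, value + delta)

    def flush(self, parity):
        # Write every buffered page of the given parity to disk.
        for page, state in self.buffers.items():
            if page % 2 == parity:
                self.disk[page] = state


def recover(disk, wal, even_redo, odd_redo):
    """Replay from even_redo, skipping odd pages until odd_redo."""
    pages = dict(disk)
    skipped = 0
    for rec in wal:
        if rec.lsn < even_redo:
            continue  # before the start of recovery
        if rec.page % 2 == 1 and rec.lsn < odd_redo:
            skipped += 1  # odd page, Odd-redo pointer not reached yet
            continue
        page_lsn, value = pages.get(rec.page, (0, 0))
        if page_lsn < rec.lsn:  # assumed page-LSN check
            pages[rec.page] = (rec.lsn, value + rec.delta)
    return pages, skipped


sim = Sim()
sim.modify(0, 1)
sim.modify(1, 10)

even_redo = sim.next_lsn  # checkpoint begins: set Even-redo, flush even
sim.flush(0)
sim.modify(1, 20)         # activity while the checkpoint is running
sim.modify(2, 5)
odd_redo = sim.next_lsn   # set Odd-redo, flush odd
sim.flush(1)

sim.modify(1, 100)        # changes after the checkpoint, then a crash
sim.modify(0, 2)

recovered, skipped = recover(sim.disk, sim.wal, even_redo, odd_redo)
```

In this run, the one odd-page record that falls between the two pointers is skipped, and the recovered pages still match the in-memory state at the time of the crash, because the odd flush happened after that record was applied.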
Hmm, that seems actually doable...
- Heikki