Re: Deriving Recovery Snapshots

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deriving Recovery Snapshots
Date: 2008-10-22 14:29:48
Message-ID: 48FF38DC.9080206@enterprisedb.com
Lists: pgsql-hackers

Simon Riggs wrote:
> On Wed, 2008-10-22 at 12:29 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2008-10-16 at 18:52 +0300, Heikki Linnakangas wrote:
>>>> Simon Riggs wrote:
>>>>> * The backend slot may not be reused for some time, so we should take
>>>>> additional actions to keep state current and true. So we choose to log a
>>>>> snapshot from the master into WAL after each checkpoint. This can then
>>>>> be used to cleanup any unobserved xids. It also provides us with our
>>>>> initial state data, see later.
>>>> We don't need to log a complete snapshot, do we? Just oldestxmin should
>>>> be enough.
>>> Possibly, but you're thinking that once we're up and running we can use
>>> less info.
>>>
>>> Trouble is, you don't know when/if the standby will crash/be shutdown.
>>> So we need regular full snapshots to allow it to re-establish full
>>> information at regular points. So we may as well drop the whole snapshot
>>> to WAL every checkpoint. To do otherwise would mean more code and less
>>> flexibility.
>> Surely it's less code to write the OldestXmin to the checkpoint record,
>> rather than a full snapshot, no? And to read it off the checkpoint record.
>
> You may be missing my point.
>
> We need an initial state to work from.
>
> I am proposing we write a full snapshot after each checkpoint because it
> allows us to start recovery again from that point. If we wrote only
> OldestXmin as you suggest it would optimise the size of the WAL record
> but it would prevent us from restarting at that point.

Well, you'd just need to treat anything > oldestxmin, and not marked as
finished in clog, as unobserved. Which doesn't seem too bad. Not that
storing the full list of in-progress xids is that bad either, though.
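
As a rough illustration of the rule above, here is a minimal sketch in C. It is not PostgreSQL code: Xid, ToyClog, toy_clog_finished() and derive_unobserved() are hypothetical stand-ins invented for this example, the upper bound next_xid is an assumed input (for instance a next-xid value known at the checkpoint), and xid wraparound is ignored. The range is taken as inclusive of oldestxmin, as a conservative assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;   /* hypothetical stand-in for a transaction id */

typedef enum { XID_IN_PROGRESS = 0, XID_COMMITTED, XID_ABORTED } XidStatus;

/* Toy commit log: status[i] holds the status of xid (base + i). */
typedef struct {
    Xid base;
    size_t len;
    const XidStatus *status;
} ToyClog;

/* True only if the toy clog records the xid as committed or aborted. */
static bool toy_clog_finished(const ToyClog *clog, Xid xid)
{
    if (xid < clog->base || (size_t)(xid - clog->base) >= clog->len)
        return false;   /* no entry: treat as not finished */
    return clog->status[xid - clog->base] != XID_IN_PROGRESS;
}

/*
 * Collect every xid in [oldestxmin, next_xid) that the toy clog does not
 * mark as finished. Those are the xids to treat as unobserved.
 * Writes at most out_cap entries to out and returns how many were written.
 * Assumes no xid wraparound between oldestxmin and next_xid.
 */
size_t derive_unobserved(const ToyClog *clog, Xid oldestxmin, Xid next_xid,
                         Xid *out, size_t out_cap)
{
    size_t n = 0;
    for (Xid xid = oldestxmin; xid < next_xid && n < out_cap; xid++) {
        if (!toy_clog_finished(clog, xid))
            out[n++] = xid;
    }
    return n;
}
```

The cost of this approach is a scan over the xid range at startup, where a logged snapshot would hand over the in-progress list directly.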

Hmm. What about in-progress subtransactions that have overflowed the
shared memory cache? Can we rely on subtrans being up to date, up to the
checkpoint record that recovery starts from?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
