Re: Hot standby, slot ids and stuff

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 11:20:40
Message-ID: 1231500040.18005.350.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> >> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> >>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> >>>> When a backend dies with FATAL, it writes an abort record before exiting.
> >>>>
> >>>> (I was under the impression it doesn't until few minutes ago myself,
> >>>> when I actually read the shutdown code :-))
> >>> Not in all cases; keep reading :-)
> >> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
> >> shared state corrupted, it's only supposed to be a forcible session
> >> termination. Any open transaction should be rolled back.
> >
> > Please look back at the earlier discussion we had on this exact point:
> > http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
>
> I think the running-xacts list we dump to WAL at every checkpoint is
> enough to handle that. Just treat the dead transaction as in-progress
> until the next running-xacts record. It's presumably extremely rare to
> have a process die with FATAL, and not write an abort record.

I agree, but I'll wait for Tom to speak further.
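
To make the quoted suggestion concrete, here is a rough Python sketch of "treat the dead transaction as in-progress until the next running-xacts record". It is not code from the hot standby patch; the class and method names are hypothetical and chosen only for illustration, and it assumes WAL records are replayed strictly in order and that a running-xacts record lists every xid still running on the primary.

```python
class RecoveryXactTracker:
    """Toy model of the standby's view of transactions running on the primary.

    Hypothetical names throughout; not the real recovery-proc machinery.
    """

    def __init__(self):
        self.in_progress = set()

    def on_xact_record(self, xid):
        # Any WAL record carrying an xid implies that transaction is running.
        self.in_progress.add(xid)

    def on_commit_or_abort(self, xid):
        # Normal case: the backend wrote a commit or abort record.
        self.in_progress.discard(xid)

    def on_running_xacts(self, running_xids):
        # Anything still tracked but absent from the list must have ended
        # without an abort record (e.g. a FATAL exit), so drop it now.
        running = set(running_xids)
        stale = self.in_progress - running
        self.in_progress = running
        return stale


tracker = RecoveryXactTracker()
tracker.on_xact_record(100)
tracker.on_xact_record(101)
tracker.on_commit_or_abort(100)            # 100 ends normally
stale = tracker.on_running_xacts([102])    # 101 died without an abort record
```

In this model the dead transaction (101) simply stays in the in-progress set, which is the safe direction for visibility, until the next running-xacts record arrives and it is pruned.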

> A related issue is that currently the recovery PANICs if it runs out of
> recovery procs. I think that's not acceptable, regardless of whether we
> use slotids or some other method to avoid it in normal operation,
> because it means you can't recover at all if you set max_connections too
> low in the standby (or in the primary, and you have to recover from
> crash), or you run out of recovery procs because of an abort failed in
> the primary like discussed on that thread.

> The standby should just
> fast-forward to the next running-xacts record in that case.

What do you mean by "fast forward"?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
