Re: Hot Standby conflict resolution handling

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby conflict resolution handling
Date: 2013-01-17 15:19:23
Message-ID: 8463.1358435963@sss.pgh.pa.us
Lists: pgsql-hackers
Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> writes:
> On Thu, Jan 17, 2013 at 12:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> ISTM that if we dare not interrupt for fear of confusing OpenSSL, we
>> cannot safely attempt to send an error message to the client either;
>> but ereport(FATAL) will try exactly that.

> I thought since FATAL will force the backend to exit, we don't care much
> about corrupted OpenSSL state. I even thought that's why we raise ERROR to
> FATAL so that the backend can start in a clean state. But clearly I'm
> missing a point here because you don't think that way.

If we were to simply exit(1), leaving the kernel to close the client
socket, it'd be safe enough because control would never have returned to
OpenSSL.  But this code doesn't do that.  What we're looking at is that
we've interrupted OpenSSL at some arbitrary point, and now we're going
to make fresh calls to it to try to pump the FATAL error message out to
the client.  It seems fairly unlikely that that's safe.  I'm not sure
I credit Andres' worry of arbitrary code execution, but I do fear that
OpenSSL could get confused to the point of freezing up, or even more
likely that it would transmit garbage to the client, which rather
defeats the purpose.

Don't see a nice fix.  The COMMERROR approach (i.e., don't try to send
anything to the client, only to the log) is not nice at all since the
client would get the impression that the server crashed.  On the other
hand, anything else requires waiting till we get control back from
OpenSSL, which might be a long time, and meanwhile we're still holding
locks that prevent WAL recovery from proceeding.
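
To make the trade-off above concrete, here is a minimal sketch in C. It is not PostgreSQL or OpenSSL code: `ToyTls`, `toy_tls_begin`, `toy_tls_end`, and both `write_*` functions are hypothetical names invented for illustration, and SIGINT merely stands in for whatever signal requests the cancellation. The toy library counts a record as garbled whenever a new record is started while another is still half-written, which is roughly the kind of confusion feared when fresh calls are made into a library that was interrupted at an arbitrary point. The second function shows the deferred alternative, where the handler only sets a flag and the error is sent once the library call has returned.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-reentrant TLS library that keeps
 * per-connection state between the start and end of a record. */
typedef struct ToyTls
{
    bool   mid_record;    /* true while a record is half-written */
    size_t records_sent;  /* number of completed records */
    size_t garbled;       /* records started while another was in flight */
} ToyTls;

/* Set by the signal handler, checked by the main line of control. */
static volatile sig_atomic_t conflict_pending = 0;

/* The handler only records the request; it never touches ToyTls. */
static void on_conflict(int signo)
{
    (void) signo;
    conflict_pending = 1;
}

static void toy_tls_begin(ToyTls *t)
{
    if (t->mid_record)
        t->garbled++;     /* re-entered with inconsistent state */
    t->mid_record = true;
}

static void toy_tls_end(ToyTls *t)
{
    t->mid_record = false;
    t->records_sent++;
}

/* Hazardous pattern: the interrupt path calls back into the library
 * while the interrupted write is still mid-record. */
static void write_then_interrupt_immediately(ToyTls *t)
{
    toy_tls_begin(t);     /* ordinary write in progress */
    toy_tls_begin(t);     /* "send the error now": re-entry mid-record */
    toy_tls_end(t);
    toy_tls_end(t);       /* interrupted write resumes, too late */
}

/* Deferred pattern: finish the library call first, then act on the
 * flag. Returns true if an error record was sent afterwards. */
static bool write_with_deferred_interrupt(ToyTls *t)
{
    conflict_pending = 0;
    signal(SIGINT, on_conflict);

    toy_tls_begin(t);
    raise(SIGINT);        /* arrives mid-record; handler sets the flag */
    toy_tls_end(t);       /* library finishes undisturbed */

    if (conflict_pending)
    {
        assert(!t->mid_record);  /* control is back from the library */
        toy_tls_begin(t);        /* error goes out as a fresh record */
        toy_tls_end(t);
        return true;
    }
    return false;
}
```

Calling `write_then_interrupt_immediately` on a zero-initialized `ToyTls` leaves `garbled` at 1, while `write_with_deferred_interrupt` leaves it at 0 and returns true. The sketch does not model the cost of deferral: in the real case the wait for the library to return may be long, and locks stay held for that whole time.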

			regards, tom lane
