Re: How to simulate crashes of PostgreSQL?

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Sergey Samokhin <prikrutil(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How to simulate crashes of PostgreSQL?
Date: 2009-08-25 02:42:50
Message-ID: 1251168170.12780.26.camel@ayaki
Lists: pgsql-general

On Tue, 2009-08-25 at 00:26 +0400, Sergey Samokhin wrote:

> Though I don't think there are any differences between the crash of
> PostgreSQL itself and the crash of the machine PostgreSQL is running on
> from the client's point of view.

There certainly are!

For one thing, if a client with an established connection sends a packet
to a machine where PostgreSQL has crashed (the backend process has
exited on a signal) it'll receive a TCP RST indicating that the
connection has been broken. The OS will also generally send a FIN to
the client when the backend crashes to inform it that the connection is
closing, so
you'll often find out as soon as the backend dies or at least as soon as
you next try to use the connection. If the issue was just with that
backend, your client can just reconnect, retry its most recent work, and
keep on going.
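
A minimal sketch of that reconnect-and-retry loop, in Python. The names
run_with_reconnect, connect and work are hypothetical: connect stands
for whatever opens a connection in your driver, and work for the unit of
work to repeat. It assumes the work is safe to repeat (for example, a
transaction that never committed) and that a broken connection surfaces
as ConnectionError; a real database driver will likely raise its own
exception type instead.

```python
import time


def run_with_reconnect(connect, work, attempts=3, delay=0.5):
    """Run work(conn), reconnecting and retrying if the connection breaks.

    connect and work are caller-supplied callables (hypothetical names).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    conn = None
    last_error = None
    for attempt in range(attempts):
        try:
            if conn is None:
                conn = connect()  # fresh connection after a failure
            return work(conn)
        except ConnectionError as exc:
            # Covers reset, broken pipe, aborted and refused connections.
            last_error = exc
            conn = None
            if attempt + 1 < attempts:
                time.sleep(delay)  # give the server time to come back
    raise last_error
```

Retrying blindly is only reasonable when you know the failed work did
not take effect; otherwise the client has to check before repeating it.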

Similarly, a new client trying to connect to a machine where the
postmaster has crashed will receive a TCP RST packet indicating that the
connection attempt was actively refused. It'll know immediately that
something's not right and will get a useful error from the TCP stack.
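
To illustrate, here is a small Python sketch that classifies a
connection attempt. The function name probe and its return strings are
made up for this example. A refused connection fails almost instantly,
while a dead machine usually only shows up as a timeout.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something is listening on the port
    except ConnectionRefusedError:
        return "refused"  # host is up, nothing is listening (RST)
    except socket.timeout:
        return "timeout"  # no answer at all within the deadline
    except OSError:
        return "unreachable"  # e.g. no route to host, DNS failure
```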

If, on the other hand, the server has crashed, clients may not receive
any response at all to packets. The server may even stop responding to
ARP requests, in which case the nearest router to it will - eventually,
maybe - send your client an ICMP destination-unreachable message. There will be
long delays either way before the TCP/IP stack decides the connection
has died. Your client will probably block on recv(...) / read(...) for
an extended period.
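
One way to shorten that wait is to set your own timeouts and ask the
kernel for TCP keepalive probes. The Python sketch below is only an
illustration: connect_with_keepalive and its default values are
assumptions, and the TCP_KEEP* options are platform-specific, so they
are applied only where the socket module defines them.

```python
import socket


def connect_with_keepalive(host, port, timeout=5.0,
                           idle=30, interval=10, count=3):
    """Open a TCP connection with a timeout and keepalive probing."""
    # The timeout bounds the connect itself and then stays in effect
    # for later recv()/send() calls on this socket.
    sock = socket.create_connection((host, port), timeout=timeout)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Tune probe timing where the platform exposes these options.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With these example values, an idle connection to a dead machine should
be declared broken after roughly idle + interval * count seconds rather
than after the much longer system defaults.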

If a backend is still running but in a nonresponsive state, the TCP/IP
stack on the server will still ACK packets you send to the backend (at
least until the buffers fill up), but the backend won't be doing
anything with the data. The client's TCP stack won't see anything wrong
because, at the TCP level, nothing is wrong. That can't happen in a
server crash, where the client's packets simply go unanswered.
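
Because the server's kernel keeps answering at the TCP level in this
case (including any keepalive probes), only an application-level
deadline will catch it. A rough Python sketch, where
request_with_deadline is a hypothetical helper and the payload is just
opaque bytes rather than real PostgreSQL protocol traffic:

```python
import socket


def request_with_deadline(sock, payload, timeout):
    """Send payload, then wait up to timeout seconds for a reply.

    Returns the reply bytes (b"" means the peer closed the
    connection), or None if nothing arrived before the deadline.
    """
    sock.settimeout(timeout)
    # This succeeds while the kernel buffers have room, even if the
    # process on the other end never reads the data.
    sock.sendall(payload)
    try:
        return sock.recv(4096)
    except socket.timeout:
        return None  # peer is connected but silent
```

A None result says nothing about whether the request was processed, so
the client still has to decide whether to wait longer, cancel, or drop
the connection and reconnect.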

So, yes, there's a pretty big difference between a crash of PostgreSQL
and a server crash. Behaviour is different from the client perspective
and you need to consider that. Intermediate network issues are different
again, as you might encounter huge latency (possibly randomly only on
some packets), random packet loss, etc. This will cause weird pauses and
delays in communication that your client must cope with.

This, by the way, is one of the reasons you *really* should do all your
database work in a separate worker thread on GUI clients. The GUI must
remain responsive even when you're waiting for a response that'll never
come, or being held up by multi-second network latencies.
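
As a rough illustration of that pattern in Python (the names submit and
poll are invented for this sketch, and task stands for any blocking
database call): the worker thread does the waiting and hands its outcome
back through a queue, which the GUI thread checks without blocking, for
instance from a timer callback.

```python
import queue
import threading


def submit(task, results):
    """Run task() on a worker thread; put its outcome on results."""
    def runner():
        try:
            results.put(("ok", task()))
        except Exception as exc:
            # Hand failures back too, so the GUI can report them.
            results.put(("error", exc))

    # daemon=True: a stuck worker will not keep the program alive.
    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return thread


def poll(results):
    """Return the next finished outcome, or None if none is ready."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None
```

The GUI thread only ever calls poll, which returns immediately, so a
query that never answers costs you a stuck worker rather than a frozen
window.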

--
Craig Ringer
