Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections

From: Sean Laurent <sean(at)studyblue(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: John R Pierce <pierce(at)hogranch(dot)com>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres 9.01, Amazon EC2/EBS, XFS, JDBC and lost connections
Date: 2011-10-11 23:00:17
Message-ID: CAK=aZ=nZ+n7PKehtdys_mxh3kHONHVJWo5wnT2LZpMbDSXrnHA@mail.gmail.com
Lists: pgsql-general

On Tue, Oct 11, 2011 at 12:04 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
> On 11/10/11 12:48, John R Pierce wrote:
>> On 10/10/11 7:44 PM, Craig Ringer wrote:
>>> If blocking writes causes a server failure that persists once writes
>>> have been unblocked, that's a bug IMO. You might have a bit of a backlog
>>> of writes to clear, but after that all should be well, and if it isn't
>>> then something needs fixing.
>>
>> the process is blocked waiting for this disk write to complete,
>> meanwhile, the packets are queuing up and waiting for service.
>>
>> best of luck with all that....
>
> xfs_freeze for long enough to take a snapshot doesn't take long, or it
> shouldn't, anyway.

On average, xfs_freeze takes about 2 seconds for us with 8 EBS volumes
at 60GB each in a software RAID-0 array.

> Even if it did, that shouldn't cause a server failure
> that persists past when disk I/O is resumed, though it might cause
> individual connections to drop.
<DELETED>
> It is totally unreasonable for Pg to *stay* nonfunctional once disk I/O
> resumes. Existing connections should receive responses they're waiting
> on or die, depending on how long it's been, and new connections should
> be accepted fine.

Exactly. I genuinely expect Postgres to be able to withstand a couple
of seconds of blocked disk I/O, especially since this isn't a
heavy-duty transaction processing system. It's under load, but not a
tremendously high load. During our busier times we average somewhere
in the neighborhood of 300-400 transactions per second, which just
doesn't seem like that much.
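
For a rough sense of scale, here is a back-of-the-envelope sketch in
Python. It assumes (pessimistically, and this is only an assumption)
that every transaction arriving during the freeze has to wait until
I/O resumes; the function name is made up for illustration.

```python
def stalled_transactions(tps, freeze_seconds):
    """Upper bound on transactions that arrive while writes are frozen.

    Assumes every arriving transaction blocks until the freeze ends,
    which overstates the backlog if some queries never touch disk.
    """
    return tps * freeze_seconds


FREEZE_SECONDS = 2  # average xfs_freeze duration mentioned earlier

for tps in (300, 400):  # busier-time transaction rates from this message
    backlog = stalled_transactions(tps, FREEZE_SECONDS)
    print(f"{tps} tps x {FREEZE_SECONDS} s = {backlog} transactions waiting")
```

Under that assumption the backlog after a typical freeze would be on
the order of several hundred transactions, not tens of thousands.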

As much as I would like Postgres to withstand a 2-second outage, that
isn't honestly what I care about most. I'd just like to figure out
whether the freeze is actually the problem or whether I should be
looking elsewhere for the cause.
--
Sean Laurent
Director of Operations
StudyBlue, Inc.