Re: select on 22 GB table causes "An I/O error occured while sending to the backend." exception

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: david(at)lang(dot)hm
Cc: "Matthew Wakeling" <matthew(at)flymine(dot)org>, "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: select on 22 GB table causes "An I/O error occured while sending to the backend." exception
Date: 2008-08-29 01:11:50
Message-ID: dcc563d10808281811t5c98dcaexd98c5cfe679f428d@mail.gmail.com
Lists: pgsql-performance
On Thu, Aug 28, 2008 at 5:08 PM,  <david(at)lang(dot)hm> wrote:
> On Thu, 28 Aug 2008, Scott Marlowe wrote:
>
>> On Thu, Aug 28, 2008 at 2:29 PM, Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
>>
>>> Another point is that from a business perspective, a database that has
>>> stopped responding is equally bad regardless of whether that is because
>>> the OOM killer has appeared or because the machine is thrashing. In both
>>> cases, there is a maximum throughput that the machine can handle, and if
>>> requests appear quicker than that the system will collapse, especially
>>> if the requests start timing out and being retried.
>>
>> But there's a HUGE difference between a machine that has bogged down
>> under load so badly that you have to reset it and a machine that's had
>> the postmaster slaughtered by the OOM killer.  In the first situation,
>> while the machine is unresponsive, it should come right back up with a
>> coherent database after the restart.
>>
>> OTOH, a machine with a dead postmaster is far more likely to have a
>> corrupted database when it gets restarted.
>
> Wait a minute here. Postgres is supposed to be able to survive a complete
> box failure without corrupting the database. If killing a process can
> corrupt the database, it sounds like a major problem.

Yes, it is a major problem, but not with PostgreSQL.  It's a major
problem with the Linux OOM killer killing processes that should not be
killed.

Would it be PostgreSQL's fault if it corrupted data because my machine
had bad memory?  Or a bad hard drive?  This is the same kind of
failure.  The postmaster should never be killed.  It's the one thing
holding it all together.
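
One way to keep the OOM killer away from the postmaster, on Linux kernels that expose /proc/<pid>/oom_score_adj, is to write the minimum score adjustment (-1000) for that process. The sketch below is only an illustration and is not something proposed in this thread: the helper names and the proc_root parameter are hypothetical, and writing a lower value to the real /proc normally requires root (CAP_SYS_RESOURCE).

```python
from pathlib import Path

# Range accepted by the Linux oom_score_adj interface.
# -1000 exempts the process from the OOM killer entirely.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def oom_score_adj_path(pid, proc_root="/proc"):
    """Return the path of the per-process OOM adjustment file."""
    return Path(proc_root) / str(pid) / "oom_score_adj"


def read_oom_score_adj(pid, proc_root="/proc"):
    """Read the current OOM score adjustment for a process."""
    return int(oom_score_adj_path(pid, proc_root).read_text().strip())


def protect_from_oom_killer(pid, proc_root="/proc", value=OOM_SCORE_ADJ_MIN):
    """Write an OOM score adjustment for a process (hypothetical helper).

    proc_root is a parameter only so the function can be exercised
    against a fake directory tree instead of the real /proc.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    oom_score_adj_path(pid, proc_root).write_text(f"{value}\n")
```

Calling protect_from_oom_killer(postmaster_pid) as root would apply this to a running postmaster. Child processes inherit the setting, so the backends would then be exempt too unless their adjustment is raised again after they start.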
