Re: page is uninitialized --- fixing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brad Nicholson <bnichols(at)ca(dot)afilias(dot)info>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: page is uninitialized --- fixing
Date: 2008-03-27 14:37:21
Message-ID: 9000.1206628641@sss.pgh.pa.us
Lists: pgsql-general

Brad Nicholson <bnichols(at)ca(dot)afilias(dot)info> writes:
> On Thu, 2008-03-27 at 10:29 -0300, Alvaro Herrera wrote:
>> Brad Nicholson wrote:
>>> It was. This table is an insert-only log table that was being heavily
>>> written to at the time of the crash.
>>
>> Is it possible that there were *two* crashes?

> There was only one crash. However, there were two separate SAN switches
> that were pulled out from under the DB, not sure if that would matter.

To explain the pattern that was shown I think you'd have to assume
something like this:

* Some backend goes to do an insert, finds no space in FSM, obtains
and zeroes page 652139. But before it can do the insert (or at least
before it can emit the WAL record) it blocks for a long time.

* Some other backend does exactly the same thing with page 652140.

* While those guys are still blocked, yet other backends write into
pages 652141..652939. These writes do make it to WAL.

* One or two backends initialize pages 652940 and 652941, but these
writes don't make it to WAL. (This could be just one backend, if
you assume it had WAL-logged the first write but that didn't get
out of WAL buffers in time.)

* Crash.

This is not entirely out of the question, because of the designed-in
property that a freshly initialized page is only inserted into by
the backend that got it --- no one else will know there is any
free space in it until VACUUM first passes over it. So if there
are a lot of different sessions writing into this table you don't
need to assume more than about one tuple per page. Still, it's
kinda hard to believe that the first two backends could remain stuck
for so long as to let ~800 other insertions happen.
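
For illustration only, the hypothesized sequence can be written down as a
toy model. This is a sketch and not PostgreSQL code: the page numbers are
the ones quoted above, while simulate_crash, extend, and the wal_flushed
flag are made-up names, and the model assumes that every file extension
reached disk while only some of the insert records reached durable WAL.

```python
# Toy model of the hypothesized sequence; not PostgreSQL code.
# Page numbers come from the message above; all names are hypothetical.

def simulate_crash():
    wal = set()     # pages whose insert record reached durable WAL
    extended = []   # pages added to the relation and zeroed on disk

    def extend(page, wal_flushed):
        # Assumption: the zeroed page always reaches disk, but the WAL
        # record for the first insert into it may or may not.
        extended.append(page)
        if wal_flushed:
            wal.add(page)

    extend(652139, wal_flushed=False)   # first backend blocks before WAL
    extend(652140, wal_flushed=False)   # second backend, same story
    for page in range(652141, 652940):  # pages 652141..652939
        extend(page, wal_flushed=True)  # these inserts do make it to WAL
    extend(652940, wal_flushed=False)   # initialized, WAL never flushed
    extend(652941, wal_flushed=False)

    # Crash: replay can only rebuild pages that have a WAL record, so the
    # rest stay all-zero, i.e. "uninitialized".
    return [p for p in extended if p not in wal]


if __name__ == "__main__":
    print(simulate_crash())
```

Under those assumptions the model leaves exactly pages 652139, 652140,
652940 and 652941 zeroed, with 799 replayed pages in between.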

What do you mean by "two separate SAN switches pulled out" --- is the
DB spread across multiple SAN controllers?

regards, tom lane
