RE: [HACKERS] Frustration

From: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Michael Simms" <grim(at)argh(dot)demon(dot)co(dot)uk>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: RE: [HACKERS] Frustration
Date: 1999-09-27 00:13:38
Message-ID: 001501bf087d$25ea27a0$2801007e@cadzone.tpf.co.jp
Lists: pgsql-hackers

> -----Original Message-----
> From: owner-pgsql-hackers(at)postgreSQL(dot)org
> [mailto:owner-pgsql-hackers(at)postgreSQL(dot)org]On Behalf Of Tom Lane
> Sent: Friday, September 24, 1999 11:27 PM
> To: Michael Simms
> Cc: pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] Frustration
>
>
> Michael Simms <grim(at)argh(dot)demon(dot)co(dot)uk> writes:
> > Well, thanks to tom, I know what was wrong, and I have found the problem,
> > or one of them at least...
> > FATAL: s_lock(0c9ef824) at bufmgr.c:1106, stuck spinlock. Aborting.
> > Okee, that segment of code is, well, its some deep down internals that
> > are as clear as mud to me.
>
> Hmph. Apparently, some backend was waiting for some other backend to
> finish reading a page in or writing it out, and gave up after deciding
> it had waited an unreasonable amount of time (~ 1 minute, which does
> seem plenty long enough). Probably, the I/O did in fact finish, but
> the waiting backend didn't get the word for some reason.
>

[snip]

>
> Another likely explanation is that there's something wrong in
> bufmgr.c's logic for setting and releasing the io_in_progress lock ---
> but a quick look doesn't show any obvious error, and I would have
> thought we'd have found out about any such problem long since.
> Since we're not being buried in reports of stuck-spinlock errors,
> I'm guessing there is some platform-specific problem on your machine.
> No good ideas what it is if it isn't a spinlock failure.
>

Unlike other spinlocks, the io_in_progress spinlock is a per-buffer-page
spinlock, and ProcReleaseSpins() doesn't release it.
If an error (in md.c in most cases) occurred while a backend was holding
the spinlock, the spinlock would inevitably stay locked.
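
To illustrate the failure mode described above, here is a minimal
single-process sketch in C. It is not the bufmgr.c code: Backend,
BufferDesc, load_buffer(), read_block(), release_tracked_spins(),
wait_for_io() and MAX_SPINS are all hypothetical stand-ins, and longjmp()
only imitates an error exit. The point it shows is that the error cleanup
releases the locks the backend keeps track of, while the per-buffer flag
stays set, so a later waiter gives up.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

#define NUM_TRACKED_LOCKS 4   /* hypothetical number of tracked spinlocks */
#define MAX_SPINS 1000        /* stand-in for the ~1 minute timeout */

/* Hypothetical backend state: only these locks are known to the cleanup. */
typedef struct {
    bool tracked_held[NUM_TRACKED_LOCKS];
} Backend;

/* Hypothetical buffer header: the per-buffer lock is NOT tracked above. */
typedef struct {
    bool io_in_progress;
} BufferDesc;

static jmp_buf error_context;

/* Stand-in for a storage-level read that reports an error by jumping out. */
static void read_block(bool fail)
{
    if (fail)
        longjmp(error_context, 1);
}

/* Stand-in for the error cleanup: releases only the tracked locks. */
static void release_tracked_spins(Backend *b)
{
    for (int i = 0; i < NUM_TRACKED_LOCKS; i++)
        b->tracked_held[i] = false;
}

/* Returns true if the read completed, false if it ended in an error. */
static bool load_buffer(Backend *b, BufferDesc *buf, bool fail)
{
    assert(b && buf);
    if (setjmp(error_context) != 0) {
        /* Error path: tracked locks are freed, the buffer lock is not. */
        release_tracked_spins(b);
        return false;
    }
    b->tracked_held[0] = true;    /* a lock the cleanup knows about */
    buf->io_in_progress = true;   /* the per-buffer lock */
    read_block(fail);             /* may jump to the error path */
    buf->io_in_progress = false;  /* normal release */
    b->tracked_held[0] = false;
    return true;
}

/* A second backend waiting for the I/O: true if it finished in time. */
static bool wait_for_io(const BufferDesc *buf)
{
    for (int spins = 0; spins < MAX_SPINS; spins++)
        if (!buf->io_in_progress)
            return true;
    return false;                 /* the "stuck spinlock" case */
}
```

Calling load_buffer() with fail set to true and then wait_for_io() on the
same buffer ends in the timeout branch. In this sketch, clearing
io_in_progress in the error path as well would avoid that.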

Michael Simms says that
ERROR: cannot read block 641 of server
occurred before the stuck-spinlock abort.

That is probably the original cause of the spinlock freeze.

However, I don't understand the following disk status on his machine.

Filesystem            1k-blocks              Used Available Use% Mounted on
/dev/hda3               1109780            704964    347461  67% /
/dev/hda1                 33149              6140     25297  20% /boot
/dev/hdc1               9515145           3248272   5773207  36% /home
/dev/hdb1                402852            154144    227903  40% /tmp
/dev/sda1  30356106785018642307 43892061535609608         0 100% /var/lib/pgsql

Regards.

Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp
