From: | "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Michael Simms" <grim(at)argh(dot)demon(dot)co(dot)uk> |
Cc: | <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | RE: [HACKERS] Frustration |
Date: | 1999-09-27 00:13:38 |
Message-ID: | 001501bf087d$25ea27a0$2801007e@cadzone.tpf.co.jp |
Lists: | pgsql-hackers |
> -----Original Message-----
> From: owner-pgsql-hackers(at)postgreSQL(dot)org
> [mailto:owner-pgsql-hackers(at)postgreSQL(dot)org]On Behalf Of Tom Lane
> Sent: Friday, September 24, 1999 11:27 PM
> To: Michael Simms
> Cc: pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] Frustration
>
>
> Michael Simms <grim(at)argh(dot)demon(dot)co(dot)uk> writes:
> > Well, thanks to tom, I know what was wrong, and I have found the problem,
> > or one of them at least...
> > FATAL: s_lock(0c9ef824) at bufmgr.c:1106, stuck spinlock. Aborting.
> > Okee, that segment of code is, well, its some deep down internals that
> > are as clear as mud to me.
>
> Hmph. Apparently, some backend was waiting for some other backend to
> finish reading a page in or writing it out, and gave up after deciding
> it had waited an unreasonable amount of time (~ 1 minute, which does
> seem plenty long enough). Probably, the I/O did in fact finish, but
> the waiting backend didn't get the word for some reason.
>
[snip]
>
> Another likely explanation is that there's something wrong in
> bufmgr.c's logic for setting and releasing the io_in_progress lock ---
> but a quick look doesn't show any obvious error, and I would have
> thought we'd have found out about any such problem long since.
> Since we're not being buried in reports of stuck-spinlock errors,
> I'm guessing there is some platform-specific problem on your machine.
> No good ideas what it is if it isn't a spinlock failure.
>
Unlike other spinlocks, the io_in_progress spinlock is a per-buffer-page
spinlock, and ProcReleaseSpins() doesn't release it.
If an error (in md.c in most cases) occurred while the spinlock was held,
the spinlock would inevitably stay locked.
Michael Simms says that
ERROR: cannot read block 641 of server
occurred before the stuck-spinlock abort.
That error is probably the original cause of the stuck spinlock.
However, I don't understand the following status of his machine.
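To illustrate the failure mode described above, here is a minimal sketch in C. It is a toy model and not the actual bufmgr.c code: every name in it (ToyLock, toy_release_tracked_spins, toy_read_block and so on) is hypothetical, and setjmp/longjmp only stands in for the error path. It assumes that error cleanup releases only the locks it tracks, while the per-buffer I/O lock is not among them.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

#define MAX_TRACKED_SPINS 8   /* hypothetical number of tracked locks */
#define NUM_BUFFERS       4   /* hypothetical number of buffer pages */

typedef struct { volatile bool held; } ToyLock;

/* Locks that the error-cleanup routine knows about. */
static ToyLock tracked_locks[MAX_TRACKED_SPINS];
/* One I/O-in-progress lock per buffer page; NOT known to the cleanup. */
static ToyLock buffer_io_locks[NUM_BUFFERS];

static jmp_buf error_context;

static void toy_acquire(ToyLock *lock) { lock->held = true; }
static void toy_release(ToyLock *lock) { lock->held = false; }

/* Stand-in for error cleanup: releases only the tracked locks. */
static void toy_release_tracked_spins(void)
{
    for (int i = 0; i < MAX_TRACKED_SPINS; i++)
        tracked_locks[i].held = false;
}

/* Stand-in for a storage-level read that may raise an error. */
static void toy_read_block(bool fail)
{
    if (fail)
        longjmp(error_context, 1);   /* models "ERROR: cannot read block" */
}

/* Reports whether the first tracked lock is still held. */
bool toy_tracked_lock_held(void)
{
    return tracked_locks[0].held;
}

/*
 * Reads a page into buffer `buf` while holding both kinds of lock.
 * Returns true if the per-buffer I/O lock is still held afterwards.
 */
bool toy_io_lock_leaks_on_error(int buf, bool fail)
{
    assert(buf >= 0 && buf < NUM_BUFFERS);

    if (setjmp(error_context) == 0)
    {
        toy_acquire(&tracked_locks[0]);
        toy_acquire(&buffer_io_locks[buf]);
        toy_read_block(fail);                 /* may jump to the else branch */
        toy_release(&buffer_io_locks[buf]);
        toy_release(&tracked_locks[0]);
    }
    else
    {
        /* Error path: only the tracked locks are cleaned up. */
        toy_release_tracked_spins();
    }
    return buffer_io_locks[buf].held;
}
```

In this model, toy_io_lock_leaks_on_error(1, true) leaves the lock for buffer 1 held even though the tracked lock has been released, so any later waiter on that buffer would spin until it gives up. With fail set to false, both locks are released normally.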
Filesystem             1k-blocks               Used  Available  Use%  Mounted on
/dev/hda3                1109780             704964     347461   67%  /
/dev/hda1                  33149               6140      25297   20%  /boot
/dev/hdc1                9515145            3248272    5773207   36%  /home
/dev/hdb1                 402852             154144     227903   40%  /tmp
/dev/sda1   30356106785018642307  43892061535609608          0  100%  /var/lib/pgsql
Regards.
Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp