Apparent deadlock for simultaneous sequential scans

From:	pgsql-bugs(at)postgresql(dot)org
To:	pgsql-bugs(at)postgresql(dot)org
Subject:	Apparent deadlock for simultaneous sequential scans
Date:	2001-06-07 22:39:14
Message-ID:	200106072239.f57MdEm11440@hub.org
Lists:	pgsql-bugs

Robert Bruccoleri (bruc(at)stone(dot)congen(dot)com) reports a bug with a severity of 1
The lower the number the more severe it is.

Short Description
Apparent deadlock for simultaneous sequential scans

Long Description
On an SGI Origin 2000 with 32 CPUs, I'm running PostgreSQL 7.1beta4
using 32768 buffers. I have an application which does a two-table join
with a nested loop plan: one table is scanned sequentially, and
the other is index scanned for each hit.

If I run this application by itself, performance is fine. The queries
take a few minutes to execute, which is reasonable given the number
of tuples that must be returned.

However, if more than one application is run at once, the performance
deteriorates drastically. Monitoring the backends using the SGI par
command (which monitors system calls) shows that all the affected
backends are running select in timeout mode. Running dbx on the
running backends reveals that all of them are waiting for the bufmgr
spinlock (BufMgrLock). Here is the traceback for all the backends:

> 0 __select(0x0, 0x0, 0x0, 0x0) ["select.s":17, 0xfa34e00]
1 _select(0x0, 0x0, 0x0, 0x0) ["selectSCI.c":30, 0xfa34e74]
2 s_lock_sleep(0x0, 0x0, 0x64000010, 0x100580b4) ["s_lock.c":90, 0x557d2c]
3 s_lock(0x64000010, 0x100580b4, 0x9c, 0x0) ["s_lock.c":113, 0x557db0]
4 SpinAcquire(0x0, 0x0, 0x0, 0x0) ["spin.c":156, 0x55ee74]
5 RelationGetBufferWithBuffer(0x103c5b78, 0x4, 0x171, 0x0) ["bufmgr.c":117, 0x551aa0]
6 heapgettup(0x103c5b78, 0x1040e81c, 0x1, 0x1040e850) ["heapam.c":411, 0x43f050]
7 heap_getnext(0x1040e800, 0x0, 0x0, 0x0) ["heapam.c":1072, 0x4416d4]
8 SeqNext(0x1040aa40, 0x0, 0x0, 0x0) ["nodeSeqscan.c":98, 0x4edc04]
9 ExecScan(0x1040aa40, 0x4edaf0, 0x0, 0x0) ["execScan.c":98, 0x4e2d64]
10 ExecSeqScan(0x1040aa40, 0x0, 0x0, 0x0) ["nodeSeqscan.c":137, 0x4edc74]
11 ExecProcNode(0x1040aa40, 0x1040aad0, 0x0, 0x0) ["execProcnode.c":285, 0x4df39c]
12 ExecNestLoop(0x1040aad0, 0x0, 0x0, 0x0) ["nodeNestloop.c":173, 0x4ed140]
13 ExecProcNode(0x1040aad0, 0x1040dee0, 0x0, 0x0) ["execProcnode.c":305, 0x4df40c]
14 ExecUnique(0x1040dee0, 0x0, 0x0, 0x0) ["nodeUnique.c":71, 0x4ef40c]
15 ExecProcNode(0x1040dee0, 0x1040dee0, 0x0, 0x0) ["execProcnode.c":333, 0x4df4b4]
16 ExecutePlan(0x1040e020, 0x1040dee0, 0x1, 0x0) ["execMain.c":965, 0x4dd3e0]
17 ExecutorRun(0x1040e000, 0x1040e020, 0x3, 0x0) ["execMain.c":199, 0x4dc06c]
18 ProcessQuery(0x103fa5e8, 0x1040dee0, 0x2, 0x0) ["pquery.c":305, 0x56f8d4]
19 pg_exec_query_string(0x103f9d98, 0x2, 0x1037e300, 0x0) ["postgres.c":810, 0x56d444]
20 PostgresMain(0x4, 0x7fff2560, 0xa, 0x7fff2eb4) ["postgres.c":1882, 0x56ef1c]
21 DoBackend(0x100ba550, 0x0, 0x0, 0x0) ["postmaster.c":2035, 0x540058]
22 BackendStartup(0x100ba550, 0x0, 0x0, 0x0) ["postmaster.c":1812, 0x53f8e4]
23 ServerLoop(0x0, 0x0, 0x0, 0x0) ["postmaster.c":967, 0x53dfa4]
24 PostmasterMain(0xa, 0x7fff2eb4, 0x0, 0x0) ["postmaster.c":666, 0x53d60c]
25 main(0xa, 0x7fff2eb4, 0x0, 0x0) ["main.c":142, 0x4fdbac]
26 __istart() ["crt1tinit.s":13, 0x4255f0]

It's not clear to me why the spinlock needs to be grabbed at the
beginning of RelationGetBufferWithBuffer, but that does seem to
be the problem.
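
For readers unfamiliar with the pattern, frames 0 through 4 of the
traceback are consistent with a spinlock that retries an atomic
test-and-set and, after repeated failures, sleeps by calling select()
with a timeout and no file descriptors. What follows is a minimal
sketch of that general technique in C. It is not the s_lock.c code:
every name in it (demo_spinlock, demo_lock_acquire, and so on), the
spin count, and the 10 ms delay are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical names and constants; not PostgreSQL's implementation. */
typedef struct {
    atomic_flag held;               /* set while some caller owns the lock */
} demo_spinlock;

#define DEMO_SPINLOCK_INIT      { ATOMIC_FLAG_INIT }
#define DEMO_SPINS_BEFORE_SLEEP 100     /* assumed value */
#define DEMO_SLEEP_USEC         10000L  /* assumed value: 10 ms */

/* Sleep without burning CPU: select() with no descriptors, only a timeout.
 * A system-call monitor would show this as "select in timeout mode". */
static inline void demo_lock_sleep(long usec)
{
    struct timeval delay;

    assert(usec >= 0);
    delay.tv_sec = usec / 1000000L;
    delay.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &delay);
}

/* Acquire the lock. Returns how many times the caller had to sleep,
 * which is 0 when the lock was free or was released quickly. */
static inline unsigned demo_lock_acquire(demo_spinlock *lock)
{
    unsigned spins = 0;
    unsigned sleeps = 0;

    /* test-and-set returns the previous value: true means already held */
    while (atomic_flag_test_and_set_explicit(&lock->held,
                                             memory_order_acquire)) {
        if (++spins >= DEMO_SPINS_BEFORE_SLEEP) {
            demo_lock_sleep(DEMO_SLEEP_USEC);   /* back off, then retry */
            spins = 0;
            sleeps++;
        }
    }
    return sleeps;
}

/* Release the lock so that a spinning or sleeping waiter can take it. */
static inline void demo_lock_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Under a scheme like this, a waiter that gives up spinning pays at
least the full sleep interval before it tries again. If many backends
contend for one global lock, much of their time could go to sleeping
in select rather than scanning, which would match the par output
described above.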

If more information is required, please let me know.

I've compared the code for this file against PostgreSQL 7.1.1 and
this part is unchanged.

Sample Code

No file was uploaded with this report

Responses

Re: Apparent deadlock for simultaneous sequential scans at 2001-06-08 03:34:50 from Tom Lane
