Re: BUG #6086: Segmentation fault

From: noordsij <noordsij(at)cs(dot)helsinki(dot)fi>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6086: Segmentation fault
Date: 2011-07-24 21:06:46
Message-ID: e2e7673675e48059cf6712462467f4f6@cs.helsinki.fi
Lists: pgsql-bugs


> Any idea what query triggered this?

I can only narrow it down to the stored procedure involved (which itself
contains multiple statements).

However, in a fresh FreeBSD environment (under VirtualBox) I can reliably
trigger a similar segmentation fault (my guess is that it is the same
issue).

The application is a Python web app using psycopg2. It creates N worker
threads, and each worker thread opens its own connection to PostgreSQL.
The threads then block on a global queue from which they read and process
requests. That is, connections are not shared between threads, and all
queries that are part of servicing a request run inside a single
transaction bound to a single connection. Connections are reused. (Please
assume this is implemented correctly on the client; a faulty client should
not be able to crash a server.)
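
For reference, the client design described above can be sketched roughly
as follows. This is a minimal illustration and not the actual application
code: `worker`, `start_workers`, the `connect` factory and the `handle`
callback are hypothetical names, and a `None` sentinel for shutdown is an
assumption added so the sketch can terminate cleanly.

```python
import queue
import threading


def worker(connect, requests, handle):
    """Serve requests from a shared queue over one private connection.

    `connect` is any zero-argument callable returning a DB-API style
    connection (hypothetical; with psycopg2 it could be something like
    `lambda: psycopg2.connect(dsn)`).
    """
    conn = connect()  # each thread opens its own connection, never shared
    try:
        while True:
            req = requests.get()  # block on the global queue
            try:
                if req is None:  # assumed shutdown sentinel
                    return
                try:
                    cur = conn.cursor()
                    handle(cur, req)  # all queries for this request
                    conn.commit()     # one transaction per request
                except Exception:
                    conn.rollback()   # keep the connection reusable
            finally:
                requests.task_done()
    finally:
        conn.close()


def start_workers(n, connect, requests, handle):
    """Start n worker threads, each with its own connection."""
    threads = [
        threading.Thread(
            target=worker, args=(connect, requests, handle), daemon=True
        )
        for _ in range(n)
    ]
    for t in threads:
        t.start()
    return threads
```

In this sketch, N corresponds to the `n` argument of `start_workers`, and
`handle` would issue calls such as the `select some_proc(...)` statement
that appears in the backtrace below.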

If N=1, everything is OK. If N=3 (confirmed, but I suppose anything > 1)
then the following happens for one particular testcase every single time:

gdb `which postgres` /usr/local/pgsql/data/postgres.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `postgres'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/local/lib/libintl.so.9...done.
Loaded symbols for /usr/local/lib/libintl.so.9
Reading symbols from /usr/local/lib/libxml2.so.5...done.
Loaded symbols for /usr/local/lib/libxml2.so.5
Reading symbols from /usr/lib/libssl.so.6...done.
Loaded symbols for /usr/lib/libssl.so.6
Reading symbols from /lib/libcrypto.so.6...done.
Loaded symbols for /lib/libcrypto.so.6
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/local/lib/libiconv.so.3...done.
Loaded symbols for /usr/local/lib/libiconv.so.3
Reading symbols from /lib/libz.so.5...done.
Loaded symbols for /lib/libz.so.5
Reading symbols from /usr/local/lib/postgresql/plpgsql.so...done.
Loaded symbols for /usr/local/lib/postgresql/plpgsql.so
Reading symbols from /usr/local/lib/postgresql/citext.so...done.
Loaded symbols for /usr/local/lib/postgresql/citext.so
Reading symbols from /lib/libthr.so.3...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/local/lib/postgresql/pg_trgm.so...done.
Loaded symbols for /usr/local/lib/postgresql/pg_trgm.so
Reading symbols from /usr/local/lib/postgresql/dict_snowball.so...done.
Loaded symbols for /usr/local/lib/postgresql/dict_snowball.so
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000802a0d5dc in pthread_mutex_lock () from /lib/libthr.so.3
[New Thread 801a42540 (LWP 100325)]
(gdb) bt
#0 0x0000000802a0d5dc in pthread_mutex_lock () from /lib/libthr.so.3
#1 0x0000000800d619ae in xmlRMutexLock (tok=0x801a12360) at threads.c:369
#2 0x0000000800dd6864 in xmlDictReference (dict=0x801a05760) at dict.c:510
#3 0x0000000800dd98e2 in xmlSAX2StartDocument (ctx=0x801a6c600) at SAX2.c:999
#4 0x0000000800cdf992 in xmlParseDocument (ctxt=0x801a6c600) at parser.c:10285
#5 0x0000000800ce8bc7 in xmlDoRead (ctxt=0x801a6c600, URL=0x0,
encoding=0x0, options=0, reuse=1) at parser.c:14612
#6 0x0000000800ce915d in xmlCtxtReadMemory (ctxt=0x801a6c600,
buffer=0x801b55440 "<anonymized xml"...,
size=272, URL=0x0, encoding=0x0, options=0) at parser.c:14890
#7 0x000000000076fcd2 in xpath (fcinfo=0x7fffffffd310) at xml.c:3400
#8 0x000000000059e7bb in ExecMakeFunctionResult (fcache=0x80293fee0,
econtext=0x802833140, isNull=0x7fffffffd84b "",
isDone=0x0) at execQual.c:1827
#9 0x000000000059f183 in ExecEvalFunc (fcache=0x80293fee0,
econtext=0x802833140, isNull=0x7fffffffd84b "", isDone=0x0)
at execQual.c:2263
#10 0x00000008025c7514 in exec_eval_simple_expr (estate=0x7fffffffdb20,
expr=0x8028df8e0, result=0x7fffffffd7e8,
isNull=0x7fffffffd84b "", rettype=0x7fffffffd84c) at pl_exec.c:4597
#11 0x00000008025c6c8c in exec_eval_expr (estate=0x7fffffffdb20,
expr=0x8028df8e0, isNull=0x7fffffffd84b "",
rettype=0x7fffffffd84c) at pl_exec.c:4188
#12 0x00000008025c3755 in exec_stmt_raise (estate=0x7fffffffdb20,
stmt=0x8028df800) at pl_exec.c:2485
#13 0x00000008025c18cb in exec_stmt (estate=0x7fffffffdb20,
stmt=0x8028df800) at pl_exec.c:1326
#14 0x00000008025c168a in exec_stmts (estate=0x7fffffffdb20,
stmts=0x801b5b0e8) at pl_exec.c:1233
#15 0x00000008025c14d5 in exec_stmt_block (estate=0x7fffffffdb20,
block=0x8028e99a0) at pl_exec.c:1170
#16 0x00000008025bfa85 in plpgsql_exec_function (func=0x801bd8440,
fcinfo=0x7fffffffdda0) at pl_exec.c:316
#17 0x00000008025ba99e in plpgsql_call_handler (fcinfo=0x7fffffffdda0)
at pl_handler.c:122
#18 0x000000000059e7bb in ExecMakeFunctionResult (fcache=0x801bec440,
econtext=0x801bec250, isNull=0x801bece98 "",
isDone=0x801becfb0) at execQual.c:1827
#19 0x000000000059f183 in ExecEvalFunc (fcache=0x801bec440,
econtext=0x801bec250, isNull=0x801bece98 "",
isDone=0x801becfb0) at execQual.c:2263
#20 0x00000000005a5347 in ExecTargetList (targetlist=0x801becf80,
econtext=0x801bec250, values=0x801bece80,
isnull=0x801bece98 "", itemIsDone=0x801becfb0, isDone=0x7fffffffe344)
at execQual.c:5089
#21 0x00000000005a5913 in ExecProject (projInfo=0x801beceb0,
isDone=0x7fffffffe344) at execQual.c:5304
#22 0x00000000005b9da3 in ExecResult (node=0x801bec140) at nodeResult.c:155
#23 0x000000000059b3bf in ExecProcNode (node=0x801bec140) at execProcnode.c:355
#24 0x00000000005993dd in ExecutePlan (estate=0x801bec030,
planstate=0x801bec140, operation=CMD_SELECT,
sendTuples=1 '\001', numberTuples=0, direction=ForwardScanDirection,
dest=0x801b3fb30) at execMain.c:1188
#25 0x0000000000598063 in standard_ExecutorRun (queryDesc=0x801bbc830,
direction=ForwardScanDirection, count=0)
at execMain.c:280
#26 0x0000000000597f55 in ExecutorRun (queryDesc=0x801bbc830,
direction=ForwardScanDirection, count=0) at execMain.c:229
#27 0x0000000000697ab8 in PortalRunSelect (portal=0x801a68030,
forward=1 '\001', count=0, dest=0x801b3fb30) at pquery.c:952
#28 0x00000000006977ae in PortalRun (portal=0x801a68030,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x801b3fb30,
altdest=0x801b3fb30, completionTag=0x7fffffffe640 "") at pquery.c:796
#29 0x0000000000691cb1 in exec_simple_query (
query_string=0x801a2a030 "select some_proc(2, E'<anonymized xml
..."...)
at postgres.c:1058
#30 0x0000000000695c63 in PostgresMain (argc=2, argv=0x801a1a6f8,
username=0x801a1a6c0 "pgsql") at postgres.c:3936
#31 0x0000000000657a9d in BackendRun (port=0x801a6c300) at postmaster.c:3555
#32 0x00000000006571da in BackendStartup (port=0x801a6c300) at postmaster.c:3242
#33 0x000000000065463c in ServerLoop () at postmaster.c:1431
#34 0x0000000000653e10 in PostmasterMain (argc=3, argv=0x7fffffffebb8)
at postmaster.c:1092
#35 0x00000000005d7bca in main (argc=3, argv=0x7fffffffebb8) at main.c:188

Is some XML parser context somehow being shared or carried over between
PostgreSQL processes? The postmaster does not seem to want to run under
valgrind at all (perhaps because of low memory in the VM; should it work
easily?), so I have no further details on what valgrind thinks is going
wrong. I'll keep looking, but any suggestions are very welcome!

Best,
Dennis
