Re: Segfaults and assertion failures with not too extraordinary views and queries

From: Phil Frost <phil(at)macprofessionals(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Segfaults and assertion failures with not too extraordinary views and queries
Date: 2007-02-15 18:25:07
Message-ID: E50CC563-6C4C-46B5-8C8D-5F98AEB4BED9@macprofessionals.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Yeah, maybe I am the only one doing this to the planner. The
optimizations which caused this error will be greatly appreciated
when I stop getting segfaults, however :)

I rebuilt my pg with the attached patch and it did indeed resolve the
assertion failure the previous test produced. I tinkered some more
and got this patched build to segfault with a slightly different
script, below.

Part of the funkiness seems to be in that the view is recursive;
orderitem_with_prices calls sum_of_subitem_total_taxable_prices,
which queries orderitem_with_prices. I would not be suprised if no
one else is doing this, since I think the only way to get it done
under normal conditions is to create an empty function first (since
the view doesn't exist), make the view, then replace the function; or
to set check_function_bodies = false. As weird as it is though, it
worked in 8.1 and a segfault is bad in any case.

-------------------------------------

SET check_function_bodies = false;
SET client_min_messages = warning;

CREATE SCHEMA private;

CREATE TABLE private.orderitem (
objectid integer,
showsubitems boolean,
longdescription character varying,
showsubitemprices boolean,
quantity numeric,
sortorder real,
superitem integer,
notes character varying,
name character varying,
"order" integer
);

CREATE FUNCTION public.i_have_global_priv() RETURNS boolean
AS $$
select true
$$
LANGUAGE sql STABLE SECURITY DEFINER;

CREATE VIEW public.orderitem AS
SELECT orderitem.objectid,
orderitem.superitem
FROM private.orderitem WHERE i_have_global_priv();

CREATE FUNCTION public.sum_of_subitem_total_taxable_prices(integer)
RETURNS numeric
AS $_$
select 1.0 from orderitem_with_prices where superitem = $1 --
changing this to just "select 1.0" avoids the problem
$_$
LANGUAGE sql STABLE STRICT;

CREATE TABLE private.orderitemproduct (
objectid integer
);

CREATE VIEW public.orderitemproduct AS
SELECT orderitemproduct.objectid
FROM private.orderitemproduct WHERE i_have_global_priv();

CREATE VIEW public.orderitem_with_prices AS
SELECT orderitem.objectid,
orderitem.superitem,
sum_of_subitem_total_taxable_prices(orderitem.objectid) AS
taxablepriceeach
FROM (orderitem LEFT JOIN orderitemproduct USING (objectid));

insert into private.orderitem(objectid) select generate_series(1, 6000);
insert into private.orderitem(objectid, superitem) select
generate_series(6001, 12000), generate_series(1, 6000);

insert into private.orderitemproduct(objectid) select generate_series
(1, 6000);

ALTER TABLE ONLY private.orderitem
ADD CONSTRAINT orderitem_pkey PRIMARY KEY (objectid);

ALTER TABLE ONLY private.orderitemproduct
ADD CONSTRAINT orderitemproduct_pkey PRIMARY KEY (objectid);

--------------------------------

To produce the segfault, I do this:

vacuum analyze;
set enable_seqscan = 0;
prepare crash as select * from orderitem_with_prices limit 1;
explain execute crash;
execute crash;

The error seems to depend on the plan chosen; here it won't fail if I
don't vacuum. The plan generated for me is:

QUERY PLAN
------------------------------------------------------------------------
---------------------------------------------------
Limit (cost=0.01..0.06 rows=1 width=8)
-> Result (cost=0.01..680.51 rows=12000 width=8)
One-Time Filter: i_have_global_priv()
-> Merge Left Join (cost=0.01..650.51 rows=12000 width=8)
Merge Cond: (private.orderitem.objectid =
private.orderitemproduct.objectid)
-> Index Scan using orderitem_pkey on orderitem
(cost=0.00..361.25 rows=12000 width=8)
-> Result (cost=0.00..184.25 rows=6000 width=4)
One-Time Filter: i_have_global_priv()
-> Index Scan using orderitemproduct_pkey on
orderitemproduct (cost=0.00..184.25 rows=6000 width=4)

I get this backtrace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x0000000c
ExecEvalVar (exprstate=0xed0aec, econtext=0xed0b08, isNull=0xbfffc5c8
"????", isDone=0x0) at execQual.c:496
496 TupleDesc slot_tupdesc = slot-
>tts_tupleDescriptor;
(gdb) bt
#0 ExecEvalVar (exprstate=0xed0aec, econtext=0xed0b08,
isNull=0xbfffc5c8 "????", isDone=0x0) at execQual.c:496
#1 0x000e9304 in ExecEvalExprSwitchContext (expression=0xed0aec,
econtext=0xed0b08, isNull=0xbfffc5c8 "????", isDone=0x0) at
execQual.c:3302
#2 0x000f5750 in ExecIndexEvalRuntimeKeys (econtext=0xed0b08,
runtimeKeys=0xed0b08, numRuntimeKeys=1) at nodeIndexscan.c:249
#3 0x000f5810 in ExecIndexReScan (node=0xed0458, exprCtxt=0xed0b08)
at nodeIndexscan.c:202
#4 0x000e2040 in ExecReScan (node=0xed0458, exprCtxt=0x20692ec) at
execAmi.c:125
#5 0x000e1ff0 in ExecReScan (node=0x206a5a8, exprCtxt=0x20692ec) at
execAmi.c:105
#6 0x000f7f88 in ExecNestLoop (node=0x2069260) at nodeNestloop.c:162
#7 0x000e59e0 in ExecProcNode (node=0x2069260) at execProcnode.c:382
#8 0x000f8a5c in ExecResult (node=0x2068cb4) at nodeResult.c:130
#9 0x000e5950 in ExecProcNode (node=0x2068cb4) at execProcnode.c:334
#10 0x000e405c in ExecutorRun (queryDesc=0xeccef4,
direction=ForwardScanDirection, count=1) at execMain.c:1082
#11 0x000eee44 in postquel_getnext (es=0xece2b8) at functions.c:359
#12 0x000eff98 in fmgr_sql (fcinfo=0xbfffcdf8) at functions.c:460
#13 0x000e8fe8 in ExecMakeFunctionResult (fcache=0x2072794,
econtext=0x2072664, isNull=0xec4fca "", isDone=0xec5030) at
execQual.c:1269
#14 0x000e99b0 in ExecProject (projInfo=0x2072bec, isDone=0xbfffd108)
at execQual.c:4119
#15 0x000f8a90 in ExecResult (node=0x20725d8) at nodeResult.c:157
#16 0x000e5950 in ExecProcNode (node=0x20725d8) at execProcnode.c:334
#17 0x000fa268 in ExecLimit (node=0x20724b4) at nodeLimit.c:84
#18 0x000e5a80 in ExecProcNode (node=0x20724b4) at execProcnode.c:425
#19 0x000e405c in ExecutorRun (queryDesc=0x205f048,
direction=ForwardScanDirection, count=0) at execMain.c:1082
#20 0x0018567c in PortalRunSelect (portal=0x20466a8, forward=1
'\001', count=0, dest=0x204603c) at pquery.c:831
#21 0x00186ef8 in PortalRun (portal=0x20466a8, count=2147483647,
dest=0x204603c, altdest=0x204603c, completionTag=0x0) at pquery.c:656
#22 0x000bc7b4 in ExecuteQuery (stmt=0x2031cf0, params=0x0,
dest=0x204603c, completionTag=0x0) at prepare.c:216
#23 0x00185d60 in PortalRunUtility (portal=0x204661c,
query=0x2031d98, dest=0x204603c, completionTag=0x0) at pquery.c:1063
#24 0x00186900 in FillPortalStore (portal=0x204661c) at pquery.c:937
#25 0x00186fac in PortalRun (portal=0x204661c, count=2147483647,
dest=0x2031d4c, altdest=0x2031d4c, completionTag=0xbfffdc5a "") at
pquery.c:679
#26 0x001823c8 in exec_simple_query (query_string=0x2031a1c "execute
crash;") at postgres.c:939
#27 0x00183c54 in PostgresMain (argc=4, argv=0x201544c,
username=0x2015424 "pfrost") at postgres.c:3424
#28 0x0015483c in ServerLoop () at postmaster.c:2931
#29 0x00155cf4 in PostmasterMain (argc=3, argv=0x1900750) at
postmaster.c:963
#30 0x001094b8 in main (argc=3, argv=0x1900780) at main.c:188

On Feb 14, 2007, at 10:13 PM, Tom Lane wrote:

> I wrote:
>> Hm, I see the assert failure, but this example doesn't seem to crash
>> when asserts are off, and I'd not expect it to: it should either
>> work or
>> elog(ERROR) in ExecRestrPos. So maybe you've found more than one
>> issue.
>
> The attached patch fixes the failure you exhibited, but I still don't
> see how this problem would lead to a non-assert crash --- what you
> should get is elog(ERROR, "unrecognized node type") from ExecRestrPos.
> So please try some more test cases and see if there's another problem
> lurking.
>
> BTW, the problem was due to code added in response to an earlier
> gripe of
> yours (allowing gating conditions to be pushed further down in the
> join
> tree). Maybe you're the only person stressing that particular aspect
> of the planner ...
>
> regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message churi 2007-02-15 18:52:32 BUG #3015: libpq: PQftype() on a lo type result column returns Oid of type oid instead of Oid of type lo.
Previous Message Phil Endecott 2007-02-15 15:27:33 Fwd: Re: BUG #3002: PQexecParams only supports some commands; needs improved error reporting, documenting or fixing