Endless loop in ExecNestLoop

From: Philipp Reisner <philipp(dot)reisner(at)linbit(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: sd-log(at)linbit(dot)com
Subject: Endless loop in ExecNestLoop
Date: 2006-01-30 16:46:49
Message-ID: 200601301745.07031.philipp.reisner@linbit.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

The symptom:

The PostgreSQL backend sometimes does not terminate a query. (Killed after
the backend consumed 1h30 of CPU time.)

That query usually executes in less than 2 seconds, see also the explain
analyze.

Version: 8.0.2

Query:
select cel.objid,
cel.shortname, ser.shortname,
con.isInventoryFlag,
dev.objid, dty.objid, loc.objid,
dev.shortname, loc.shortname, loc.name, man.shortName,
dty.type, dev.serNr, dev.invNr, dev.component,
loc.zip, loc.city, con.shortname, con.name,
CASE
WHEN CURRENT_DATE between loc.locationstateFrom and
loc.locationstateTo THEN loc.locationstateText
WHEN CURRENT_DATE between loc.locationstate2From and
loc.locationstate2To THEN loc.locationstate2Text
ELSE null
END
, CASE
WHEN CURRENT_DATE between loc.locationstateFrom and
loc.locationstateTo THEN loc.locationstateColorCode
WHEN CURRENT_DATE between loc.locationstate2From and
loc.locationstate2To THEN loc.locationstate2ColorCode
ELSE null
END
,
loc.locationcategory, loc.province
from contracts con, contractelements cel
left outer join servicetypes ser on ser.objid = cel.ser_objid,
devices dev, locations loc, devicetypes dty, bpartners man
where con.csp_objid in
(587703,587726,728917,969346,1043154,1368907,1368915,1368944,4176279,4492405,4493459,5682484,7465538)
and cel.con_objid = con.objid
and dev.cel_objid = cel.objid
and loc.objid = dev.loc_objid
and dty.objid = dev.dty_objid
and man.objid = dty.bpa_objid
and upper(cel.isActiv) = 'Y'
and upper(coalesce(dev.isActiv,'Y')) = 'Y'
and upper(coalesce(dev.IsCommittedSP,'Y')) = 'Y'
and upper(coalesce(dev.IsCommittedCust,'Y')) = 'Y'
and upper(loc.shortName) = 'WNB531'
order by 18 ASC

Explain Analyze:

Sort (cost=12084.89..12084.89 rows=1 width=477) (actual
time=1344.417..1344.424 rows=5 loops=1)
Sort Key: con.shortname
-> Nested Loop (cost=79.18..12084.88 rows=1 width=477) (actual
time=977.884..1343.987 rows=5 loops=1)
-> Nested Loop (cost=79.18..12081.71 rows=1 width=130) (actual
time=6.363..1215.421 rows=12829 loops=1)
-> Nested Loop (cost=79.18..12078.64 rows=1 width=125)
(actual time=6.315..1045.691 rows=12829 loops=1)
-> Nested Loop (cost=79.18..12075.58 rows=1 width=111)
(actual time=6.256..855.119 rows=12829 loops=1)
-> Hash Join (cost=79.18..4321.10 rows=8
width=52) (actual time=5.668..666.173 rows=101 loops=1)
Hash Cond: ("outer".con_objid =
"inner".objid)
-> Hash Left Join (cost=1.18..4241.09
rows=385 width=27) (actual time=3.067..535.397 rows=73383 loops=1)
Hash Cond: ("outer".ser_objid =
"inner".objid)
-> Seq Scan on contractelements cel
(cost=0.00..4237.98 rows=385 width=21) (actual time=2.869..285.115 rows=73383
loops=1)
Filter: (upper((isactiv)::text) =
'Y'::text)
-> Hash (cost=1.14..1.14 rows=14
width=14) (actual time=0.092..0.092 rows=14 loops=1)
-> Seq Scan on servicetypes ser
(cost=0.00..1.14 rows=14 width=14) (actual time=0.024..0.052 rows=14 loops=1)
-> Hash (cost=77.82..77.82 rows=73
width=33) (actual time=0.378..0.378 rows=31 loops=1)
-> Bitmap Heap Scan on contracts con
(cost=26.26..77.82 rows=73 width=33) (actual time=0.196..0.304 rows=31
loops=1)
Recheck Cond: ((csp_objid =
587703) OR (csp_objid = 587726) OR (csp_objid = 728917) OR (csp_objid =
969346) OR (csp_objid = 1043154) OR (csp_objid = 1368907) OR (csp_objid =
1368915) OR (csp_objid = 1368944) OR (csp_objid = 4176279) OR (csp_objid =
4492405) OR (csp_objid = 4493459) OR (csp_objid = 5682484) OR (csp_objid =
7465538))
-> BitmapOr (cost=26.26..26.26
rows=74 width=0) (actual time=0.160..0.160 rows=0 loops=1)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.12 rows=34 width=0) (actual
time=0.077..0.077 rows=31 loops=1)
Index Cond:
(csp_objid = 587703)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.005..0.005 rows=0 loops=1)
Index Cond:
(csp_objid = 587726)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 728917)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 969346)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 1043154)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.010..0.010 rows=0 loops=1)
Index Cond:
(csp_objid = 1368907)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 1368915)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 1368944)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.005..0.005 rows=0 loops=1)
Index Cond:
(csp_objid = 4176279)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 4492405)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 4493459)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.004..0.004 rows=0 loops=1)
Index Cond:
(csp_objid = 5682484)
-> Bitmap Index Scan on
contracts_csp_objid (cost=0.00..2.01 rows=3 width=0) (actual
time=0.005..0.005 rows=0 loops=1)
Index Cond:
(csp_objid = 7465538)
-> Index Scan using devices_cel_objid on devices
dev (cost=0.00..969.30 rows=1 width=63) (actual time=0.588..1.401 rows=127
loops=101)
Index Cond: (dev.cel_objid = "outer".objid)
Filter: ((upper((COALESCE(isactiv,
'Y'::character varying))::text) = 'Y'::text) AND
(upper((COALESCE(iscommittedsp, 'Y'::character varying))::text) = 'Y'::text)
AND (upper((COALESCE(iscommittedcust, 'Y'::character varying))::text) =
'Y'::text))
-> Index Scan using devicetypes_pkey on devicetypes dty
(cost=0.00..3.04 rows=1 width=18) (actual time=0.005..0.007 rows=1
loops=12829)
Index Cond: (dty.objid = "outer".dty_objid)
-> Index Scan using bpartners_pkey on bpartners man
(cost=0.00..3.06 rows=1 width=13) (actual time=0.004..0.006 rows=1
loops=12829)
Index Cond: (man.objid = "outer".bpa_objid)
-> Index Scan using locations_pkey on locations loc
(cost=0.00..3.11 rows=1 width=351) (actual time=0.006..0.006 rows=0
loops=12829)
Index Cond: (loc.objid = "outer".loc_objid)
Filter: (upper((shortname)::text) = 'WNB531'::text)
Total runtime: 1345.103 ms
(55 rows)

Stack trace of the hanging backend.

#0 0x000000000043c743 in heap_deformtuple ()
#1 0x000000000043c949 in slot_getattr ()
#2 0x00000000004e3fae in ExecMakeTableFunctionResult ()
#3 0x00000000004e210d in ExecMakeFunctionResult ()
#4 0x00000000004e210d in ExecMakeFunctionResult ()
#5 0x00000000004e5063 in ExecQual ()
#6 0x00000000004e554e in ExecScan ()
#7 0x00000000004e0cc8 in ExecProcNode ()
#8 0x00000000004ebeb6 in ExecEndHashJoin ()
#9 0x00000000004eb7fa in ExecHashJoin ()
#10 0x00000000004e0d50 in ExecProcNode ()
#11 0x00000000004ee29d in ExecNestLoop ()
#12 0x00000000004e0d36 in ExecProcNode ()
#13 0x00000000004ee261 in ExecNestLoop () <-------
#14 0x00000000004e0d36 in ExecProcNode ()
#15 0x00000000004ee261 in ExecNestLoop ()
#16 0x00000000004e0d36 in ExecProcNode ()
#17 0x00000000004ee261 in ExecNestLoop ()
#18 0x00000000004e0d36 in ExecProcNode ()
#19 0x00000000004ee261 in ExecNestLoop ()
#20 0x00000000004e0d36 in ExecProcNode ()
#21 0x00000000004ef57c in ExecSort ()
#22 0x00000000004e0d6a in ExecProcNode ()
#23 0x00000000004df73f in ExecEndPlan ()
#24 0x00000000004dec09 in ExecutorRun ()
#25 0x000000000055f87a in PortalRun ()
#26 0x000000000055f5f9 in PortalRun ()
#27 0x000000000055b561 in pg_plan_queries ()
#28 0x000000000055df8c in PostgresMain ()
#29 0x0000000000539b01 in ClosePostmasterPorts ()
#30 0x00000000005395d0 in ClosePostmasterPorts ()
#31 0x0000000000537b4d in PostmasterMain ()
#32 0x0000000000536e59 in PostmasterMain ()
#33 0x00000000004fda0c in main ()

With gdb's ability to execute until the current function (stack frame) return,
I found out which function (stack frame) is the up-most in this endless loop.
That frame is marked with the arrow.

Is this bug-report of any use so far ?

best regards,
Philipp Reisner
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Fuhr 2006-01-30 17:12:22 Re: BUG #2224: unlogical syntax error
Previous Message Tom Lane 2006-01-30 16:42:57 Re: Endless loop in ExecNestLoop