More parallel-query fun

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: More parallel-query fun
Date: 2016-06-16 18:14:30
Message-ID: 22782.1466100870@sss.pgh.pa.us
Lists: pgsql-hackers

As of HEAD you can exercise quite a lot of parallel query behavior
by running the regression tests with these settings applied:

force_parallel_mode = regress
max_parallel_workers_per_gather = 2 -- this is default at the moment
min_parallel_relation_size = 0
parallel_setup_cost = 0
parallel_tuple_cost = 0
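
One way to apply these settings to a temporary-install run, sketched under a few assumptions: the file name parallel_regress.conf is hypothetical, and the run relies on pg_regress's --temp-config option, reached here through the TEMP_CONFIG make variable. The settings themselves are the ones listed above, written in postgresql.conf syntax (comments start with #).

```
# parallel_regress.conf (hypothetical file name)
force_parallel_mode = regress
max_parallel_workers_per_gather = 2    # the default as of this message
min_parallel_relation_size = 0
parallel_setup_cost = 0
parallel_tuple_cost = 0
```

With that file in place, something like "make check TEMP_CONFIG=/path/to/parallel_regress.conf" should start the temporary server with these values; against an already-running server, the same lines can go into its postgresql.conf before "make installcheck".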

This results in multiple interesting failures, including a core dump
here:

Program terminated with signal 11, Segmentation fault.
#0 shm_mq_set_handle (mqh=0x0, handle=0x1ac3090) at shm_mq.c:312
312 Assert(mqh->mqh_handle == NULL);
(gdb) bt
#0 shm_mq_set_handle (mqh=0x0, handle=0x1ac3090) at shm_mq.c:312
#1 0x00000000004e0fd9 in LaunchParallelWorkers (pcxt=0x1ac2dd8)
    at parallel.c:479
#2 0x00000000005f40fd in ExecGather (node=0x1b05508) at nodeGather.c:168
#3 0x00000000005e3011 in ExecProcNode (node=0x1b05508) at execProcnode.c:515
#4 0x00000000005fe795 in ExecNestLoop (node=0x1afe7f0) at nodeNestloop.c:174
#5 0x00000000005e2f87 in ExecProcNode (node=0x1afe7f0) at execProcnode.c:476
#6 0x000000000060135b in ExecSort (node=0x1afe520) at nodeSort.c:103
#7 0x00000000005e2fc7 in ExecProcNode (node=0x1afe520) at execProcnode.c:495
#8 0x00000000005e15c8 in ExecutePlan (queryDesc=0x1a44f98,
    direction=NoMovementScanDirection, count=0) at execMain.c:1567
#9 standard_ExecutorRun (queryDesc=0x1a44f98,
    direction=NoMovementScanDirection, count=0) at execMain.c:338
#10 0x00000000005e16b6 in ExecutorRun (queryDesc=<value optimized out>,
    direction=<value optimized out>, count=<value optimized out>)
    at execMain.c:286

(gdb) p debug_query_string
$1 = 0x1a965e8 "SELECT n.nspname as \"Schema\",\n p.proname AS \"Name\",\n pg_catalog.format_type(p.prorettype, NULL) AS \"Result data type\",\n CASE WHEN p.pronargs = 0\n THEN CAST('*' AS pg_catalog.text)\n ELSE pg_ca"...

The statement that triggers this varies from run to run, but the proximate
cause, namely error_mqh being null at parallel.c:479, seems consistent.
It looks to me like parallel.c's handling of insufficiently-many-workers
is a few bricks shy of a load.

I saw another previously-unreported problem before getting to the crash:

*** /home/postgres/pgsql/src/test/regress/expected/enum.out Mon Oct 20 10:50:24 2014
--- /home/postgres/pgsql/src/test/regress/results/enum.out Thu Jun 16 14:00:58 2016
***************
*** 284,306 ****
  -- Aggregates
  --
  SELECT min(col) FROM enumtest;
!  min
! -----
!  red
! (1 row)
!
  SELECT max(col) FROM enumtest;
!   max
! --------
!  purple
! (1 row)
!
  SELECT max(col) FROM enumtest WHERE col < 'green';
!   max
! --------
!  yellow
! (1 row)
!
  --
  -- Index tests, force use of index
  --
--- 284,294 ----
  -- Aggregates
  --
  SELECT min(col) FROM enumtest;
! ERROR:  type matched to anyenum is not an enum type: anyenum
  SELECT max(col) FROM enumtest;
! ERROR:  type matched to anyenum is not an enum type: anyenum
  SELECT max(col) FROM enumtest WHERE col < 'green';
! ERROR:  type matched to anyenum is not an enum type: anyenum
  --
  -- Index tests, force use of index
  --

Haven't tried to trace that one down yet.

regards, tom lane
