Re: Parallel Seq Scan

From: Thom Brown <thom(at)linux(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-03-12 17:05:48
Message-ID: CAA-aLv5VvRXMf-Ov1NThoLmHDLwP8S1+ngtS-Xzfp8pTrtibow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12 March 2015 at 16:20, Thom Brown <thom(at)linux(dot)com> wrote:
> On 12 March 2015 at 15:29, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, Mar 12, 2015 at 8:33 PM, Thom Brown <thom(at)linux(dot)com> wrote:
>>>
>>> On 12 March 2015 at 14:46, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> > One additional change (we need to SetLatch() in
>>> > HandleParallelMessageInterrupt)
>>> > is done to handle the hang issue reported on parallel-mode thread.
>>> > Without this change it is difficult to verify the patch (will remove
>>> > this
>>> > change
>>> > once new version of parallel-mode patch containing this change will be
>>> > posted).
>>>
>>> Applied parallel-mode-v7.patch and parallel_seqscan_v10.patch, but
>>> getting this error when building:
>>>
>>> gcc -Wall -Wmissing-prototypes -Wpointer-arith
>>> -Wdeclaration-after-statement -Wendif-labels
>>> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
>>> -fwrapv -fexcess-precision=standard -O2 -I../../../../src/include
>>> -D_GNU_SOURCE -c -o brin.o brin.c -MMD -MP -MF .deps/brin.Po
>>> In file included from ../../../../src/include/nodes/execnodes.h:18:0,
>>> from ../../../../src/include/access/brin.h:14,
>>> from brin.c:18:
>>> ../../../../src/include/access/heapam.h:119:34: error: unknown type
>>> name ‘ParallelHeapScanDesc’
>>> extern void heap_parallel_rescan(ParallelHeapScanDesc pscan,
>>> HeapScanDesc scan);
>>> ^
>>>
>>> Am I missing another patch here?
>>
>> Yes, the below parallel-heap-scan patch.
>> http://www.postgresql.org/message-id/CA+TgmoYJETgeAXUsZROnA7BdtWzPtqExPJNTV1GKcaVMgSdhug@mail.gmail.com
>>
>> Please note that parallel_setup_cost and parallel_startup_cost are
>> still set to zero by default, so you need to set it to higher values
>> if you don't want the parallel plans once parallel_seqscan_degree
>> is set. I have yet to comeup with default values for them, needs
>> some tests.
>
> Thanks. Getting a problem:
>
> createdb pgbench
> pgbench -i -s 200 pgbench
>
> CREATE TABLE pgbench_accounts_1 (CHECK (bid = 1)) INHERITS (pgbench_accounts);
> ...
> CREATE TABLE pgbench_accounts_200 (CHECK (bid = 200)) INHERITS
> (pgbench_accounts);
>
> WITH del AS (DELETE FROM pgbench_accounts WHERE bid = 1 RETURNING *)
> INSERT INTO pgbench_accounts_1 SELECT * FROM del;
> ...
> WITH del AS (DELETE FROM pgbench_accounts WHERE bid = 200 RETURNING *)
> INSERT INTO pgbench_accounts_200 SELECT * FROM del;
>
> VACUUM ANALYSE;
>
> # SELECT name, setting FROM pg_settings WHERE name IN
> ('parallel_seqscan_degree','max_worker_processes','seq_page_cost');
> name | setting
> -------------------------+---------
> max_worker_processes | 20
> parallel_seqscan_degree | 8
> seq_page_cost | 1000
> (3 rows)
>
> # EXPLAIN SELECT DISTINCT bid FROM pgbench_accounts;
> ERROR: too many dynamic shared memory segments
>
>
> And separately, I've seen this in the logs:
>
> 2015-03-12 16:09:30 GMT [7880]: [4-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [5-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [6-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [7-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [8-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [9-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [10-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [11-1] user=,db=,client= LOG:
> registering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [12-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [13-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [14-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [15-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [16-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [17-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [18-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [19-1] user=,db=,client= LOG:
> starting background worker process "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [20-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7913) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [21-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [22-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7919) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [23-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [24-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7916) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [25-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [26-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7918) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [27-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [28-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7917) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [29-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [30-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7914) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [31-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [32-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7915) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [33-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [34-1] user=,db=,client= LOG: worker
> process: parallel worker for PID 7889 (PID 7912) exited with exit code
> 0
> 2015-03-12 16:09:30 GMT [7880]: [35-1] user=,db=,client= LOG:
> unregistering background worker "parallel worker for PID 7889"
> 2015-03-12 16:09:30 GMT [7880]: [36-1] user=,db=,client= LOG: server
> process (PID 7889) was terminated by signal 11: Segmentation fault
> 2015-03-12 16:09:30 GMT [7880]: [37-1] user=,db=,client= DETAIL:
> Failed process was running: SELECT pg_catalog.quote_ident(c.relname)
> FROM pg_catalog.pg_class c WHERE c.relkind IN ('r', 'S', 'v', 'm',
> 'f') AND substring(pg_catalog.quote_ident(c.relname),1,10)='pgbench_br'
> AND pg_catalog.pg_table_is_visible(c.oid) AND c.relnamespace <>
> (SELECT oid FROM pg_catalog.pg_namespace WHERE nspname = 'pg_catalog')
> UNION
> SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM
> pg_catalog.pg_namespace n WHERE
> substring(pg_catalog.quote_ident(n.nspname) || '.',1,10)='pgbench_br'
> AND (SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE
> substring(pg_catalog.quote_ident(nspname) || '.',1,10) =
> substring('pgbench_br',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1))
>> 1
> UNION
> SELECT pg_catalog.quote_ident(n.nspname) || '.' ||
> pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c,
> pg_catalog.pg_namespace n WHERE c.relnamespace = n.oid AND c.relkind
> IN ('r', 'S', 'v', 'm', 'f') AND
> substring(pg_catalog.quote_ident(n.nspname) || '.' ||
> pg_catalog.quote_ident(c.relname),1,10)='pgbench_br' AND substri
> 2015-03-12 16:09:30 GMT [7880]: [38-1] user=,db=,client= LOG:
> terminating any other active server processes
> 2015-03-12 16:09:30 GMT [7886]: [2-1] user=,db=,client= WARNING:
> terminating connection because of crash of another server process
> 2015-03-12 16:09:30 GMT [7886]: [3-1] user=,db=,client= DETAIL: The
> postmaster has commanded this server process to roll back the current
> transaction and exit, because another server process exited abnormally
> and possibly corrupted shared memory.
> 2015-03-12 16:09:30 GMT [7886]: [4-1] user=,db=,client= HINT: In a
> moment you should be able to reconnect to the database and repeat your
> command.
> 2015-03-12 16:09:30 GMT [7880]: [39-1] user=,db=,client= LOG: all
> server processes terminated; reinitializing
> 2015-03-12 16:09:30 GMT [7920]: [1-1] user=,db=,client= LOG: database
> system was interrupted; last known up at 2015-03-12 16:07:26 GMT
> 2015-03-12 16:09:30 GMT [7920]: [2-1] user=,db=,client= LOG: database
> system was not properly shut down; automatic recovery in progress
> 2015-03-12 16:09:30 GMT [7920]: [3-1] user=,db=,client= LOG: invalid
> record length at 2/7E269A0
> 2015-03-12 16:09:30 GMT [7920]: [4-1] user=,db=,client= LOG: redo is
> not required
> 2015-03-12 16:09:30 GMT [7880]: [40-1] user=,db=,client= LOG:
> database system is ready to accept connections
> 2015-03-12 16:09:30 GMT [7924]: [1-1] user=,db=,client= LOG:
> autovacuum launcher started
>
> I can recreate this by typing:
>
> EXPLAIN SELECT DISTINCT bid FROM pgbench_<tab>
>
> This happens with seq_page_cost = 1000, but not when it's set to 1.

Another problem. I restarted the instance (just in case), and get this error:

# \df+ *.*
ERROR: cannot retain locks acquired while in parallel mode

I get this even with seq_page_cost = 1, parallel_seqscan_degree = 1
and max_worker_processes = 1.
--
Thom

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-03-12 17:06:16 Re: pg_dump: CREATE TABLE + CREATE RULE vs. relreplident
Previous Message Robert Haas 2015-03-12 16:58:55 Re: pg_dump: CREATE TABLE + CREATE RULE vs. relreplident