min_parallel_table_size and inheritence

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <rhaas(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: min_parallel_table_size and inheritence
Date: 2018-12-09 00:54:14
Message-ID: 20181209005414.GA4848@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The docs say:
https://www.postgresql.org/docs/current/runtime-config-query.html
|min_parallel_table_scan_size Sets the minimum amount of table data that must be scanned in order for a parallel scan to be considered. [...]

I'd like to set parallel_min_table_size=32MB, but it looks like that won't do
what I intend for at least one of our tables using inheritence.

It seems to me that an individual table should not be scanned in parallel if
its size is below the threshold, even if it's a child and has siblings which
are larger and scanned in parallel.

I found that the current behavior seems to be more or less deliberate, but
maybe should be revisited following implementation of "parallel append" node,
as previously discussed.

commit 2609e91fcf9dcf36af40cd0c5b755dccf6057df6
Author: Robert Haas <rhaas(at)postgresql(dot)org>
Date: Tue Mar 14 14:33:14 2017 -0400
| Fix regression in parallel planning against inheritance tables.
|
| Commit 51ee6f3160d2e1515ed6197594bda67eb99dc2cc accidentally changed
| the behavior around inheritance hierarchies; before, we always
| considered parallel paths even for very small inheritance children,
| because otherwise an inheritance hierarchy with even one small child
| wouldn't be eligible for parallelism. That exception was inadverently
| removed; put it back.
[...]
| Discussion: http://postgr.es/m/CAE9k0PmgSoOHRd60SHu09aRVTHRSs8s6pmyhJKWHxWw9C_x+XA@mail.gmail.com

postgres=# CREATE TABLE x(i int);
postgres=# CREATE TABLE x1() INHERITS(x);
postgres=# ANALYZE x,x1;
postgres=# INSERT INTO x1 SELECT * FROM generate_series(1,999999);

postgres=# SET min_parallel_table_scan_size='99MB';
postgres=# explain (COSTS OFF) SELECT * FROM x a NATURAL JOIN x b WHERE a.i<99;
| Gather
| Workers Planned: 2
| -> Parallel Hash Join
| Hash Cond: (b_1.i = a_1.i)
| -> Parallel Append
| -> Parallel Seq Scan on x1 b_1
| -> Parallel Seq Scan on x b
| -> Parallel Hash
| -> Parallel Append
| -> Parallel Seq Scan on x1 a_1
| Filter: (i < 99)
| -> Parallel Seq Scan on x a
| Filter: (i < 99)

Does parallel seq scan on table which is less than half the threshold.
| public | x1 | table | pryzbyj | 35 MB |

Similar if x is a partitioned/relkind=p:

|postgres=# explain (COSTS OFF) SELECT * FROM x a NATURAL JOIN x b WHERE a.i<99;
| Gather
| Workers Planned: 1
| -> Parallel Append
| -> Parallel Hash Join
| Hash Cond: (b.i = a.i)
| -> Parallel Seq Scan on x1 b
| -> Parallel Hash
| -> Parallel Seq Scan on x1 a
| Filter: (i < 99)

But not parallel if I join x1 directly:

postgres=# explain (COSTS OFF) SELECT * FROM x1 a NATURAL JOIN x1 b WHERE a.i<99;
| Hash Join
| Hash Cond: (b.i = a.i)
| -> Seq Scan on x1 b
| -> Hash
| -> Seq Scan on x1 a
| Filter: (i < 99)

..unless I lower threshold:

postgres=# SET min_parallel_table_scan_size='34MB';
postgres=# explain (COSTS OFF) SELECT * FROM x1 a NATURAL JOIN x1 b WHERE a.i<99;
| Gather
| Workers Planned: 1
| -> Parallel Hash Join
| Hash Cond: (b.i = a.i)
| -> Parallel Seq Scan on x1 b
| -> Parallel Hash
| -> Parallel Seq Scan on x1 a
| Filter: (i < 99)

Justin

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2018-12-09 00:57:17 automatically assigning catalog toast oids
Previous Message Jeremy Finzel 2018-12-09 00:34:49 Re: No such file or directory in pg_replslot