Unwanted expression simplification in PG12b2

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Unwanted expression simplification in PG12b2
Date: 2019-07-17 18:54:21
Message-ID: CAC8Q8tJkKaG8CirjKV_7bHBXJYcwdW11faTLyZDGB5CFKXTzQg@mail.gmail.com
Lists: pgsql-hackers

Hi,

Many thanks for the parallel improvements in Postgres 12. Here is one of the
cases where a costly function gets moved from a parallel worker into the main
one, rendering spatial processing single-core once again on some queries.
Perhaps the assumption "expressions should be mashed together as much as
possible" should be reviewed in favor of something along the lines of "the
biggest part of an expression should be pushed down into the parallel worker"?

PostgreSQL 12beta2 (Ubuntu 12~beta2-1.pgdg19.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit

Here is a reproducer:

-- setup
create extension postgis;
create table postgis_test_table (a geometry, b geometry, id int);
set force_parallel_mode to on;
insert into postgis_test_table (select 'POINT EMPTY', 'POINT EMPTY',
generate_series(0,1000) );

-- unwanted inlining moves difference and unary union calculation into
-- master worker
21:43:06 [gis] > explain verbose select ST_Collect(geom), id from
(select ST_Difference(a,ST_UnaryUnion(b)) as geom, id from
postgis_test_table) z group by id;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=159.86..42668.93 rows=200 width=36)
   Output: (st_collect(st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b)))), postgis_test_table.id
   Workers Planned: 1
   Single Copy: true
   ->  GroupAggregate  (cost=59.86..42568.73 rows=200 width=36)
         Output: st_collect(st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b))), postgis_test_table.id
         Group Key: postgis_test_table.id
         ->  Sort  (cost=59.86..61.98 rows=850 width=68)
               Output: postgis_test_table.id, postgis_test_table.a, postgis_test_table.b
               Sort Key: postgis_test_table.id
               ->  Seq Scan on public.postgis_test_table  (cost=0.00..18.50 rows=850 width=68)
                     Output: postgis_test_table.id, postgis_test_table.a, postgis_test_table.b
(12 rows)

-- when constrained by OFFSET 0, the costly calculation is kept in parallel workers
21:43:12 [gis] > explain verbose select ST_Collect(geom), id from
(select ST_Difference(a,ST_UnaryUnion(b)) as geom, id from
postgis_test_table offset 0) z group by id;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=13863.45..13872.33 rows=200 width=36)
   Output: st_collect(z.geom), z.id
   Group Key: z.id
   ->  Sort  (cost=13863.45..13865.58 rows=850 width=36)
         Output: z.id, z.geom
         Sort Key: z.id
         ->  Subquery Scan on z  (cost=100.00..13822.09 rows=850 width=36)
               Output: z.id, z.geom
               ->  Gather  (cost=100.00..13813.59 rows=850 width=36)
                     Output: (st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b))), postgis_test_table.id
                     Workers Planned: 3
                     ->  Parallel Seq Scan on public.postgis_test_table  (cost=0.00..13712.74 rows=274 width=36)
                           Output: st_difference(postgis_test_table.a, st_unaryunion(postgis_test_table.b)), postgis_test_table.id
(13 rows)
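
For anyone comparing the two plans side by side, a compact variant of the same
two queries may help. This is only a sketch: it assumes the table from the
setup step still exists, and it uses the standard EXPLAIN option COSTS OFF so
that only the plan shape and the Output lines remain. The thing to look for is
which node's Output line carries the st_difference(...) expression, and
whether that node is the Parallel Seq Scan.

```sql
-- Sketch: same queries as in the report, with cost estimates suppressed
-- so the two plan shapes are easier to compare.

-- subquery gets flattened; check where st_difference(...) is evaluated
explain (verbose, costs off)
select ST_Collect(geom), id
from (select ST_Difference(a, ST_UnaryUnion(b)) as geom, id
      from postgis_test_table) z
group by id;

-- OFFSET 0 keeps the subquery as a separate planning unit
explain (verbose, costs off)
select ST_Collect(geom), id
from (select ST_Difference(a, ST_UnaryUnion(b)) as geom, id
      from postgis_test_table
      offset 0) z
group by id;
```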

-- teardown
drop table postgis_test_table;

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
