Re: Parallel Query - Can it be used within functions?

From: Michael Krüger <michael(at)kruegers(dot)email>
To: Andreas Kretschmer <andreas(at)a-kretschmer(dot)de>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Parallel Query - Can it be used within functions?
Date: 2018-02-28 08:05:51
Message-ID: CACSnzzWCYPBrKdNzu5cT4YDcCwRxr2k2gDTMMrqQvtkxWVtE1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ok, to close this thread. The problem is, that plpgsql function do seem to
return data using a cursor. That fact is disabling parallel execution. So
if we instead hand over the SQL to a function with e.g. a python body, then
parallel execution is happening, because the data is first assembled in
memory before it is returned, without using a cursor:

mkrueger=# explain analyze select * from reports.generic_query_python($$
select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where
mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1 $$) as foo (mediatrunkid bigint, count numeric);
LOG: 00000: duration: 35.158 ms plan:
Query Text: select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Partial HashAggregate (cost=56693.72..56696.72 rows=300 width=16) (actual
time=35.144..35.149 rows=17 loops=1)
Group Key: mediatrunkid
Buffers: shared hit=1641
-> Parallel Bitmap Heap Scan on mediatrunkkpi (cost=4525.01..56279.28
rows=82889 width=8) (actual time=2.350..24.584 rows=63794 loops=1)
Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Heap Blocks: exact=1641
Buffers: shared hit=1641
-> Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (never executed)
Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
LOCATION: explain_ExecutorEnd, auto_explain.c:359
LOG: 00000: duration: 35.165 ms plan:
Query Text: select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Partial HashAggregate (cost=56693.72..56696.72 rows=300 width=16) (actual
time=35.152..35.157 rows=17 loops=1)
Group Key: mediatrunkid
Buffers: shared hit=1630
-> Parallel Bitmap Heap Scan on mediatrunkkpi (cost=4525.01..56279.28
rows=82889 width=8) (actual time=2.364..24.702 rows=63272 loops=1)
Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Heap Blocks: exact=1630
Buffers: shared hit=1630
-> Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (never executed)
Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
LOCATION: explain_ExecutorEnd, auto_explain.c:359
LOG: 00000: duration: 47.855 ms plan:
Query Text: select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Finalize GroupAggregate (cost=57784.41..57792.66 rows=300 width=40)
(actual time=45.331..45.344 rows=17 loops=1)
Group Key: mediatrunkid
Buffers: shared hit=2735
-> Sort (cost=57784.41..57785.91 rows=600 width=16) (actual
time=45.322..45.325 rows=51 loops=1)
Sort Key: mediatrunkid
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=2735
-> Gather (cost=57693.72..57756.72 rows=600 width=16) (actual
time=45.270..45.295 rows=51 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=2735
-> Partial HashAggregate (cost=56693.72..56696.72 rows=300
width=16) (actual time=38.387..38.391 rows=17 loops=3)
Group Key: mediatrunkid
Buffers: shared hit=6006
-> Parallel Bitmap Heap Scan on mediatrunkkpi
(cost=4525.01..56279.28 rows=82889 width=8) (actual time=5.564..27.399
rows=67080 loops=3)
Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Heap Blocks: exact=1912
Buffers: shared hit=6006
-> Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (actual time=11.229..11.229
rows=201241 loops=1)
Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Buffers: shared hit=823
LOCATION: explain_ExecutorEnd, auto_explain.c:359
LOG: 00000: duration: 49.924 ms plan:
Query Text: explain analyze select * from reports.generic_query_python($$
select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where
mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1 $$) as foo (mediatrunkid bigint, count numeric);
Function Scan on generic_query_python foo (cost=0.25..50.25 rows=5000
width=40) (actual time=49.920..49.922 rows=17 loops=1)
Buffers: shared hit=6388
LOCATION: explain_ExecutorEnd, auto_explain.c:359
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Function Scan on generic_query_python foo (cost=0.25..50.25 rows=5000
width=40) (actual time=49.920..49.922 rows=17 loops=1)
Planning time: 0.029 ms
Execution time: 49.977 ms
(3 rows)

Michael Krüger <michael(at)kruegers(dot)email> schrieb am Fr., 16. Feb. 2018 um
11:42 Uhr:

> Dear all,
>
> still same behavior with Postgres 10.2 ...
>
> Just as a reminder that the issue still exists.
>
> Regards,
> Michael
>
> Andreas Kretschmer <andreas(at)a-kretschmer(dot)de> schrieb am Di., 6. Feb. 2018
> um 08:35 Uhr:
>
>> Hi,
>>
>>
>> Am 06.02.2018 um 08:24 schrieb Michael Krüger:
>> > create or replace function reports.generic_query(_sql text)
>> > RETURNS SETOF record
>> > LANGUAGE 'plpgsql'
>> > PARALLEL SAFE
>> > COST 100
>>
>> there is an other parameter, parallel_setup_cost, with default = 1000. I
>> think, you should set this parameter too.
>>
>> Please keep me informed, it is interessting me.
>>
>>
>> Regards, Andreas
>>
>> --
>> 2ndQuadrant - The PostgreSQL Support Company.
>> www.2ndQuadrant.com
>>
>>
>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Krüger 2018-02-28 08:08:47 Re: Parallel Query - Can it be used within functions?
Previous Message Andres Freund 2018-02-28 00:19:40 Re: index-only-scan when there is an index on all columns