Re: [HACKERS] [POC] Faster processing at Gather node

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [POC] Faster processing at Gather node
Date: 2017-11-16 12:05:22
Message-ID: CAOGQiiMtsWKXN1xu3_aTDq2RxtMPasFtrzsRGVRK62AqFxwi0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Thu, Nov 16, 2017 at 12:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I suspect that one factor is that many of the queries actually send
> very few rows through the Gather. You didn't send EXPLAIN ANALYZE
> outputs for these runs, but I went back and looked at some old tests I

Please find the attached zip for the same. The results are for head
and Case 2c. Since, there was no difference in plan or in performance
for the other cases except for Q12, I haven't kept the runs for each
of the cases mentioned upthread.

> Now obviously your plans are different -- otherwise you couldn't be
> seeing a speedup on Q12. So you have to look at the plans and try to
> understand what the big picture is here. Spending a lot of time
> running queries where the time taken by Gather is not the bottleneck
> is not a good way to figure out whether we've successfully sped up
> Gather. What would be more useful? How about:
>
For this scale factor, I found that the queries where gather or
gather-merge process relatively large number of rows were - Q2, Q3,
Q10, Q12, Q16, Q18, Q20, and Q21. However, as per the respective
explain analyse outputs, for all these queries except Q12, the
contribution of gather/gather-merge node individually in the total
execution time of the respective query is insignificant, so IMO we
can't expect any performance improvement from such cases for this set
of patches. We have already discussed the case of Q12 enough, so need
not to say anything about it here again.

> - Once you've identified the queries where Gather seems like it might
> be a bottleneck, run perf without the patch set and see whether Gather
> or shm_mq related functions show up high in the profile. If they do,
> run perf which the patch set and see if they become less prominent.
>
Sure, I'll do that.

> - Try running the test cases that Andres and I tried with and without
> the patch set. See if it helps on those queries. That will help
> verify that your testing procedure is correct, and might also reveal
> differences in the effectiveness of that patch set on different
> hardware.
The only TPC-H query I could find upthread analysed by either you or Andres is,
explain analyze SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET
1000000000 LIMIT 1;

So, here are the results for it with the parameter settings as
suggested by Andres upthread,
set parallel_setup_cost=0;set parallel_tuple_cost=0;set
min_parallel_table_scan_size=0;set max_parallel_workers_per_gather=8;
with the addition of max_parallel_workers = 100, just to ensure that
it uses as many workers as it planned.

With the patch-set,
explain analyze SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET
1000000000 LIMIT 1;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=430530.95..430530.95 rows=1 width=129) (actual
time=57651.076..57651.076 rows=0 loops=1)
-> Gather (cost=0.00..430530.95 rows=116888930 width=129) (actual
time=0.581..50528.386 rows=116988791 loops=1)
Workers Planned: 8
Workers Launched: 8
-> Parallel Seq Scan on lineitem (cost=0.00..430530.95
rows=14611116 width=129) (actual time=0.015..3904.101 rows=12998755
loops=9)
Filter: (l_suppkey > '5012'::bigint)
Rows Removed by Filter: 333980
Planning time: 0.143 ms
Execution time: 57651.722 ms
(9 rows)

on head,
explain analyze SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET
1000000000 LIMIT 1;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=430530.95..430530.95 rows=1 width=129) (actual
time=100024.995..100024.995 rows=0 loops=1)
-> Gather (cost=0.00..430530.95 rows=116888930 width=129) (actual
time=0.282..93607.947 rows=116988791 loops=1)
Workers Planned: 8
Workers Launched: 8
-> Parallel Seq Scan on lineitem (cost=0.00..430530.95
rows=14611116 width=129) (actual time=0.029..3866.321 rows=12998755
loops=9)
Filter: (l_suppkey > '5012'::bigint)
Rows Removed by Filter: 333980
Planning time: 0.409 ms
Execution time: 100025.303 ms
(9 rows)

So, there is a significant improvement in performance with the
patch-set. The only point that confuses me is that Andres mentioned
upthread,

EXPLAIN ANALYZE SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET
1000000000 LIMIT 1;
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ QUERY PLAN
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ Limit (cost=635802.67..635802.69 rows=1 width=127) (actual
time=5984.916..5984.916 rows=0 loops=1)
│ -> Gather (cost=0.00..635802.67 rows=27003243 width=127) (actual
time=0.214..5123.238 rows=26989780 loops=1)
│ Workers Planned: 8
│ Workers Launched: 7
│ -> Parallel Seq Scan on lineitem (cost=0.00..635802.67
rows=3375405 width=127) (actual time=0.025..649.887 rows=3373722
loops=8)
│ Filter: (l_suppkey > 5012)
│ Rows Removed by Filter: 376252
│ Planning time: 0.076 ms
│ Execution time: 5986.171 ms
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
(9 rows)

so there clearly is still benefit (this is scale 100, but that shouldn't
make much of a difference).

In my tests, the scale factor is 20 and the number of rows in gather
is 116988791, however, for Andres it is 26989780, plus, the time taken
by query in 20 scale factor is some 100s without patch and for Andres
it is 8s. So, may be when Andres wrote scale 100 it is typo for scale
10 or what he meant by scale is not scale factor of TPC-H, in that
case I'd like to know what he meant there.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment Content-Type Size
gather_analysis_all.zip application/zip 215.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message a.akenteva 2017-11-16 12:05:59 Re: [HACKERS] A weird bit in pg_upgrade/exec.c
Previous Message legrand legrand 2017-11-16 12:04:59 RE: Transaction control in procedures