Re: How to set parallel_tuple_cost

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to set parallel_tuple_cost
Date: 2019-12-21 01:37:15
Message-ID: CAMkU=1xfwNb89JT4PvZwRas5BgxxyNkh+BGQmhnMc7==SVNwtQ@mail.gmail.com
Thread:
Lists: pgsql-performance

On Fri, Dec 20, 2019 at 1:58 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > The docs for parallel_tuple_cost are quite terse, as the reference
> section
> > of the docs usually are:
> > "Sets the planner's estimate of the cost of transferring one tuple from a
> > parallel worker process to another process. The default is 0.1."
>
> > If you were take the doc description literally, then the default value
> > seems much too high, as it doesn't take 10x the (default) cpu_tuple_cost
> to
> > transfer a tuple up from a parallel worker.
>
> Really? If anything, I'd have thought it might be worse than 10x.
> Cross-process communication isn't cheap, at least not according to
> my instincts.
>

I was a bit surprised. I set it up so that there was a fine-tunable filter
which can be applied in the parallel workers, and then only the surviving
tuples get passed up to the leader. The use of a parallel seq scan didn't
become slower than the non-parallel version until over 95% of the tuples
were surviving the filter. If I wanted to make the estimated cost
cross-over point match the measured time cross-over point, I had to mark
the parallel_tuple_cost down to about 0.011. This was an 8 CPU machine, an
AWS m5.4xlarge, with max_parallel_workers_per_gather=7. (On my crummy
2-CPU Windows 10 laptop running ubuntu via VirtualBox, the cross-over point
was closer to 40% of the tuples surviving, and the parallel_tuple_cost to
match the cross-over point would be about 0.016, but I don't have enough RAM
to make a large enough all-in-shared-buffers table to really get a good
assessment).
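(If anyone wants to reproduce this, the fitted value can be applied
per-session; note 0.011 is just the number measured on that particular
8-CPU machine, not a recommended setting:)

```sql
-- Session-local override of the default 0.1, using the value
-- fitted on the m5.4xlarge above
SET parallel_tuple_cost = 0.011;
```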

My method was to make shared_buffers be a large fraction of RAM (55GB, out
of 64GB), then make a table slightly smaller than that and forced it into
shared_buffers with pg_prewarm. I set seq_page_cost = random_page_cost = 0,
to accurately reflect the fact that no IO is occurring.
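Concretely, the setup was something like this (shared_buffers goes in
postgresql.conf and needs a restart; the cost settings are per-session):

```sql
-- postgresql.conf (restart required):
--   shared_buffers = 55GB
-- Per-session cost settings, reflecting an all-in-memory table:
SET seq_page_cost = 0;
SET random_page_cost = 0;
SET max_parallel_workers_per_gather = 7;
```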

create table para_seq as
    select floor(random()*10000)::int as id,
           random() as x,
           md5(random()::text)||md5(random()::text) as t
    from generate_series(1, 8000000*55);

vacuum analyze para_seq;

select pg_prewarm('para_seq');

explain (analyze, buffers, settings, timing off) select * from para_seq
where id<9500;

Where you can change the 9500 to tune the selectivity of the filter. Is
this the correct way to try to isolate just the overhead of transferring of
a tuple away from other considerations so it can be measured?
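(The serial comparison point at the same selectivity can be obtained by
just disabling parallelism for the session, e.g.:)

```sql
-- Force the serial plan for comparison at the same selectivity
SET max_parallel_workers_per_gather = 0;

explain (analyze, buffers, settings, timing off)
    select * from para_seq where id < 9500;
```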

I don't think the fact that EXPLAIN ANALYZE throws away the result set
without reading it should change anything. Reading it should add the same
fixed overhead to both the parallel and non-parallel cases, so it would
dilute the percentage difference without changing the absolute differences.

I tried it with wider tuples as well, but not so wide they would activate
TOAST, and didn't really see a difference in the conclusion.

> > On the other hand, you
> > probably don't want a query which consumes 8x the CPU resources just to
> > finish only 5% faster (on an otherwise idle server with 8 CPUs). Maybe
> > this Amdahl factor is what inspired the high default value?
>
> I think the large value of parallel_setup_cost is what's meant to
> discourage that scenario.
>

I think that can only account for overhead like forking and setting up
memory segments. The overhead of moving around tuples (more than
single-threaded execution already moves them around) would need to scale
with the number of tuples moved around.

Cheers,

Jeff
