Re: [HACKERS] Parallel Hash take II

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: [HACKERS] Parallel Hash take II
Date: 2017-11-15 19:09:13
Message-ID: CA+TgmoZTUKK_fRyRLH1dCk00gqXnJDZW6i+h+N=9E0cpZ6aZTQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 15, 2017 at 1:35 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> But this does bug me, and I think it's what made me pause here to make a
> bad joke. The way that parallelism treats work_mem makes it even more
> useless of a config knob than it was before. Parallelism, especially
> after this patch, shouldn't compete / be benchmarked against a
> single-process run with the same work_mem. To make it "fair" you need to
> compare parallelism against a single threaded run with work_mem *
> max_parallelism.

I don't really know how to do a fair comparison between a parallel
plan and a non-parallel plan. Even if the parallel plan contains zero
nodes that use work_mem, it might still use more memory than the
non-parallel plan, because a new backend uses a bunch of memory. If
you really want a comparison that is fair on the basis of memory
usage, you have to take that into account somehow.

But even then, the parallel plan is also almost certainly consuming
more CPU cycles to produce the same results. Parallelism is all about
trading away efficiency for execution time. Not just because of
current planner and executor limitations, but intrinsically, parallel
plans are less efficient. The globally optimal solution on a system
that is short on either memory or CPU cycles is to turn parallelism
off.

> Thomas argues that this makes hash joins be treated fairly vis-a-vis
> parallel-oblivious hash join etc. And I think he has somewhat of a
> point. But I don't think it's quite right either: In several of these
> cases the planner will not prefer the multi-process plan because it uses
> more work_mem, it's a cost to be paid. Whereas this'll optimize towards
> using work_mem * max_parallel_workers_per_gather amount of memory.

In general, we have no framework for evaluating the global impact on
the system of our decisions. Not just with parallelism, but in
general, plans that use memory are typically going to beat plans that
don't, because using more memory is a good way to make things run
faster, so the cost goes down, and the cost is what matters.
Everything optimizes for eating as many resources as possible, even if
there is only an extremely marginal gain over a more costly plan that
uses dramatically fewer resources.

A good example of this is a parallel bitmap heap scan vs. a parallel
sequential scan. With enough workers, we switch from having one
worker scan the index and then having all workers do a joint scan of
the heap -- to just performing a parallel sequential scan. Because of
Amdahl's law, as you add workers, waiting for the non-parallel index
scan to build the bitmap eventually looks less desirable than
accepting that you're going to uselessly scan a lot of pages -- you
stop caring, because you have enough raw power to just blast through
it.
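
To make that concrete (made-up table name and selectivity, just to
illustrate the behavior, not output from any particular test), you can
usually watch the planner flip over by varying the worker cap:

-- hypothetical table "big" with an index on column "a"
SET max_parallel_workers_per_gather = 2;
EXPLAIN SELECT count(*) FROM big WHERE a < 1000000;
-- with few workers, a Parallel Bitmap Heap Scan tends to win

SET max_parallel_workers_per_gather = 8;
EXPLAIN SELECT count(*) FROM big WHERE a < 1000000;
-- with enough workers, the planner may switch to a Parallel Seq Scan,
-- accepting the useless page reads because the raw power is there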

The best non-parallel example I can think of off-hand is sorts.
Sometimes, reducing the amount of memory available for a sort really
doesn't cost very much in terms of execution time, but we always run a
sort with the full allotted work_mem, even if that's gigantic. We
won't use it all if the data is smaller than work_mem, but if there's
batching going on then we will, even if it doesn't really help.
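
You can see that with a quick test (invented table name; the exact
numbers will vary): run the same ORDER BY under EXPLAIN (ANALYZE) at a
small and a large work_mem and compare the reported sort method and
timing:

SET work_mem = '4MB';
EXPLAIN (ANALYZE) SELECT * FROM big ORDER BY b;
-- typically reports something like: Sort Method: external merge  Disk: ...

SET work_mem = '1GB';
EXPLAIN (ANALYZE) SELECT * FROM big ORDER BY b;
-- typically reports: Sort Method: quicksort  Memory: ...
-- the runtime gap is often much smaller than the memory gap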

> This makes it pretty much impossible to afterwards tune work_mem on a
> server in a reasonable manner. Previously you'd tune it to something
> like free_server_memory - (max_connections * work_mem *
> 80%_most_complex_query). Which you can't really do anymore now, you'd
> also need to multiply by max_parallel_workers_per_gather. Which means
> that you might end up "forcing" parallelism on a bunch of plans that'd
> normally execute in too short a time to make parallelism worth it.

I think you just need to use max_connections +
Min(max_parallel_workers, max_worker_processes) instead of
max_connections. You can't use parallel query for every query at the
same time with reasonable settings...
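
For example (numbers invented purely for illustration): with
max_connections = 100, work_mem = 64MB, max_worker_processes = 8, and
max_parallel_workers = 8, the per-process term in that formula goes from
100 * 64MB = 6400MB to (100 + Min(8, 8)) * 64MB = 6912MB, before applying
whatever per-query complexity factor you were already using.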

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
