Re: modeling parallel contention (was: Parallel Append implementation)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: modeling parallel contention (was: Parallel Append implementation)
Date: 2017-05-02 19:13:58
Message-ID: CA+TgmoYL-SQZ2gRL2DpenAzOBd5+SW30QB=A4CseWtOgejz4aQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 18, 2017 at 2:48 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> After searching through earlier mails about parallel scan, I am not
> sure whether the shared state was considered as a potential factor
> that might reduce parallel query gains when deciding the number of
> workers for a parallel seq scan. I mean, even today, if we allocate
> 10 workers instead of a calculated count of 4 for a parallel seq
> scan, they might help. I think it's just that we don't know whether
> they would *always* help or whether it would sometimes regress.

No, that's not considered, currently. This is actually an issue even
for nodes that are not parallel-aware at all. For example, consider
this:

Hash Join
  -> Parallel Seq Scan
  -> Hash
        -> Seq Scan

It is of course possible that the Parallel Seq Scan could run into
contention problems if the number of workers is large, but in my
experience there are bigger problems here. The non-parallel Seq Scan
can also contend -- not of course over the shared mutex because there
isn't one, but over access to the blocks themselves. Every one of
those blocks has a content lock and a buffer header and so on, and
having multiple processes accessing those things at the same time
scales well, but not perfectly. The Hash node can also contend: if
the hash join spills to disk, you've got multiple processes reading
and writing to the temp directory at the same time and, of course,
that can be worse than just one process doing it -- sometimes much
worse. It can also be better, depending on how much I/O gets
generated and how much I/O bandwidth you have.
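
To make the shape concrete, here's a rough sketch with made-up table
names (big/small are purely illustrative, not from any real
workload). With a large enough probe-side table you tend to get a
plan like the one above, where the workers share the Parallel Seq
Scan of big but each one builds its own copy of the hash table from a
plain Seq Scan of small:

CREATE TABLE big   (id bigint, payload text);
CREATE TABLE small (id bigint, label text);
-- load some data into both, then ANALYZE

EXPLAIN (COSTS OFF)
SELECT count(*)
FROM big b
JOIN small s ON s.id = b.id;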

The main thing that keeps this from being a crippling issue right now
is the fact that we tend not to use that many parallel workers in the
first place. We're trying to scale a query that would otherwise use 1
process out to 3 or 5 processes, and so the contention effects, in
many cases, are not too bad. Multiple people (including David Rowley
as well as folks here at EnterpriseDB) have demonstrated that for
certain queries, we can actually use a lot more workers and everything
works great. The problem is that for other queries, using a lot of
workers works terribly. The planner doesn't know how to figure out
which it'll be - and honestly, I don't either.
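
Amit's "force 10 workers instead of the calculated 4" experiment is
easy to run by hand, if anyone wants to poke at this. A minimal
sketch, reusing the made-up tables from above and assuming the usual
knobs (the max_parallel_workers_per_gather GUC and the
parallel_workers storage parameter):

SET max_parallel_workers_per_gather = 10;     -- raise the per-Gather cap
ALTER TABLE big SET (parallel_workers = 10);  -- override the size-based heuristic
-- (max_worker_processes also has to be high enough for them all to launch)
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*)
FROM big b
JOIN small s ON s.id = b.id;
-- compare against the plan/runtime with the defaults; whether the
-- extra workers help or regress is exactly what the planner can't
-- currently predict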

/me crosses fingers, hopes someone smarter will work on this problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
