Re: System load consideration before spawning parallel workers

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: System load consideration before spawning parallel workers
Date: 2016-09-06 00:17:03
Message-ID: CAJrrPGdDzDgU8XgDJW1BvgZj72DcHVy3PdH5Ya-z4_7TGLW_6A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 2, 2016 at 3:01 AM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> > Yes, we need to consider many parameters as a system load, not just only
> > the CPU. Here I attached a POC patch that implements the CPU load
> > calculation and decide the number of workers based on the available CPU
> > load. The load calculation code is not an optimized one, there are many
> > ways that can be used to calculate the system load. This is just for an
> > example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need. So I think building a highly specific
> system like this seems premature. We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum). We
> might well want to enhance this area, but we'll need more experience and
> information.
>

Yes, I agree that parallel query is a new feature and we cannot judge its
effect yet.

> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes? I could
> imagine that limiting connections by load could be useful. Parallel
> workers is only one specific niche of this problem.
>

Yes, I agree that parallel workers are only one instance of this problem.

How about having the postmaster calculate the system load (CPU and other
metrics) and publish it in a shared location where every backend can read
it? Using that, we can decide which operations to control. The postmaster
would refresh the system load at an interval specified by a GUC, so this
will not affect the performance of other backends.
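
To make the idea concrete, here is a rough sketch of the publish/read
split. It is Python only for brevity and is not the POC patch: the class
LoadPublisher, its methods, and the interval_s parameter (standing in for
the proposed GUC) are all hypothetical names, and the default sampler
assumes a Unix-like system where os.getloadavg() is available.

```python
import os
import threading
import time


class LoadPublisher:
    """Hypothetical stand-in for the postmaster-side load sampler.

    One writer refreshes the value periodically; any number of readers
    (the "backends") only read the last published value and never
    compute the load themselves.
    """

    def __init__(self, interval_s, sampler=None):
        # interval_s plays the role of the proposed GUC (assumption).
        self.interval_s = interval_s
        # Default sampler: 1-minute load average (Unix-like systems only).
        self._sampler = sampler or (lambda: os.getloadavg()[0])
        self._lock = threading.Lock()
        self._load = None        # nothing published yet
        self._sampled_at = None  # monotonic timestamp of the last sample

    def refresh(self):
        """Sample the load once and publish it (writer side)."""
        value = self._sampler()
        with self._lock:
            self._load = value
            self._sampled_at = time.monotonic()

    def read(self):
        """Return the last published load, or None (reader side)."""
        with self._lock:
            return self._load

    def run(self, stop_event):
        """Refresh every interval_s seconds until stop_event is set."""
        while not stop_event.is_set():
            self.refresh()
            stop_event.wait(self.interval_s)
```

The point of the split is that the cost of sampling is paid once per
interval by a single process, while readers only take a short lock to
fetch a cached number.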

> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper. They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.
>
> All of this seems very platform specific, too. You have
> Windows-specific code, but the rest seems very Linux-specific. The
> dstat tool I had never heard of before. There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations. Something about Solaris was mentioned. What about the
> rest? How can we maintain this in the long term? How do we know that
> these facilities actually work correctly and not cause mysterious problems?
>

The CPU load calculation patch is a POC patch; I didn't evaluate its
behavior on all platforms.

> Maybe a couple of hooks could be useful to allow people to experiment
> with this. But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.

With the parallel query GUC settings, it is possible to tune the behavior
so that more parallel workers improve performance when there is very
little load on the system. But when the system load increases, using more
parallel workers can add overhead instead of improving on the existing
behavior.

In such cases, the number of parallel workers needs to be reduced by
changing the GUC settings. Instead of that, I just thought, how about
doing the same automatically?
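
As an illustration of what "automatically" could mean, here is one
possible policy, again sketched in Python. It is not the formula used in
the POC patch: the function name workers_for_load and the rule "one
worker per idle CPU" are assumptions made only for this example.

```python
import math


def workers_for_load(requested, load_avg, ncpus):
    """Cap the requested parallel workers by the estimated idle CPUs.

    Hypothetical policy: treat (ncpus - load_avg) as the number of idle
    CPUs and never launch more workers than that, nor more than the
    planner asked for.
    """
    idle_cpus = max(ncpus - load_avg, 0.0)
    return min(requested, math.floor(idle_cpus))


# Lightly loaded 8-CPU system: the full request is granted.
print(workers_for_load(4, 0.5, 8))   # 4
# Heavily loaded: only one CPU is estimated to be idle.
print(workers_for_load(4, 6.5, 8))   # 1
# Overloaded: run without parallel workers.
print(workers_for_load(4, 9.0, 8))   # 0
```

With a rule like this, the GUC settings would still act as the upper
limit, and the load only ever lowers the worker count below it.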

Regards,
Hari Babu
Fujitsu Australia
