Re: Parallel query execution introduces performance regressions

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jinho Jung <jinhojun(at)usc(dot)edu>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel query execution introduces performance regressions
Date: 2019-04-01 18:52:54
Message-ID: CAH2-Wz=wzQKkGBCRjB7f0yCo1ddPS7T3O51hHiQ1a63JnTpuyA@mail.gmail.com
Lists: pgsql-bugs

On Mon, Apr 1, 2019 at 11:30 AM Jinho Jung <jinhojun(at)usc(dot)edu> wrote:
> Surprisingly, we found that even on a larger TPC-C database (scale factor of 50, roughly 4GB of size), parallel scan is still slower than the non-parallel execution plan in the old version.

That's not a large database, and it's certainly not a large TPC-C
database. If you attempt to stay under the spec's maximum
tpmC/throughput per warehouse, which is 12.86 tpmC per warehouse, then
you'll need several thousand warehouses on modern hardware. We're
talking several hundred gigabytes. Otherwise, as far as the spec is
concerned you're testing an unrealistic workload. There will be
individual customers that make many more purchases than is humanly
possible. You're modelling an app involving hypothetical warehouse
employees that must enter data into their terminals at a rate that is
not humanly possible.
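
The sizing arithmetic above can be sketched in a few lines of Python. This is only a rough illustration, not part of the original message: the 12.86 tpmC-per-warehouse ceiling comes from the text, while the 50,000 tpmC target is a hypothetical figure, and the per-warehouse size is an assumption extrapolated from the quoted "scale factor of 50, roughly 4GB" (taking scale factor to mean warehouse count). Real on-disk size depends on indexes, fillfactor, and how long the benchmark has run.

```python
import math

# Spec ceiling quoted in the message: tpmC per warehouse.
MAX_TPMC_PER_WAREHOUSE = 12.86

# Assumption: derived from the quoted "scale factor of 50, roughly 4GB",
# treating scale factor as the number of warehouses.
ASSUMED_GB_PER_WAREHOUSE = 4 / 50


def min_warehouses(target_tpmc: float) -> int:
    """Smallest warehouse count that keeps throughput at or under the ceiling."""
    return math.ceil(target_tpmc / MAX_TPMC_PER_WAREHOUSE)


def max_spec_tpmc(warehouses: int) -> float:
    """Highest throughput the spec allows for a given warehouse count."""
    return warehouses * MAX_TPMC_PER_WAREHOUSE


def estimated_size_gb(warehouses: int,
                      gb_per_warehouse: float = ASSUMED_GB_PER_WAREHOUSE) -> float:
    """Very rough database size estimate; linear scaling is an assumption."""
    return warehouses * gb_per_warehouse


if __name__ == "__main__":
    target = 50_000  # hypothetical tpmC for a modern server
    w = min_warehouses(target)
    print(f"warehouses needed: {w}")
    print(f"estimated size: {estimated_size_gb(w):.0f} GB")
    print(f"spec ceiling at 50 warehouses: {max_spec_tpmc(50):.0f} tpmC")
```

Under these assumptions, a 50-warehouse database is only within spec up to a few hundred tpmC, whereas a target in the tens of thousands lands in the range of several thousand warehouses and several hundred gigabytes.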

More importantly, this kind of analysis seems much too simplistic to
be useful. *Any* change to the optimizer or optimizer settings is
certain to regress some queries. We expect users who are very
sensitive to small regressions to take an active interest in
performance tuning their database. It would certainly be very useful
if somebody came up with a less complicated way of assessing these
questions, but that seems to be elusive.

--
Peter Geoghegan
