Re: pgsql: Add parallel-aware hash joins.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2018-01-24 20:08:09
Message-ID: 20180124200809.5g62oaifo7s75mge@alap3.anarazel.de
Lists: pgsql-committers pgsql-hackers

Hi,

On 2018-01-24 14:31:47 -0500, Tom Lane wrote:
> However ... if you spend any time looking at the behavior of that,
> the hashjoin tests are still problematic.

I think my main problem with your argument is that you basically seem
to say that one of the more complex features in postgres can't be
allowed to increase the test time, and I just don't agree with that.
If we can reduce some unnecessary overhead (as Thomas, IIRC, has done
somewhere nearby) - great; if we can hide the overhead by scheduling
the tests better or breaking them up - also great. But if that's a
good chunk of work, I think it's entirely reasonable to not
necessarily consider that the best use of time.

It doesn't seem too surprising that a test that relies on starting
multiple background processes in multiple places will be among the
more expensive ones. We would clearly benefit from, e.g., being able
to reuse workers instead of constantly starting and stopping them.

> (The overall runtime for "make installcheck-parallel" on this machine
> is about 17.3 seconds right now.) The next slowest test script in
> the join test's group is "update", at 0.373 seconds; so over 1.5 sec
> of the total 17.3 sec runtime is being spent solely in the join script.

It might be worth breaking up the join test a bit; that won't get rid
of all the wall-time overhead, but it should reduce it. Reordering it
to run in parallel with other slow tests might also be worthwhile,
along the lines sketched below.
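
To illustrate (a hypothetical sketch - the join_hash split and the
group membership here are invented): pg_regress runs all scripts named
on one "test:" line of parallel_schedule concurrently, so the split
plus the reordering could look something like

    # src/test/regress/parallel_schedule (sketch, invented grouping)
    # all scripts on one "test:" line run concurrently as a group
    test: join join_hash update hash_index

That only helps if the split-out half doesn't depend on state set up
by the other, of course.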

> So I continue to maintain that an unreasonable fraction of the total
> resources devoted to the regular regression tests is going into these
> new hashjoin tests.

> > One caveat is that old machines also
> > somewhat approximate testing with more instrumentation / debugging
> > enabled (say valgrind, CLOBBER_CACHE_ALWAYS, etc). So removing excessive
> > test overhead has still quite some benefits. But I definitely do not
> > want to lower coverage to achieve it.
>
> I don't want to lower coverage either. I do want some effort to be
> spent on achieving test coverage intelligently, rather than just throwing
> large test cases at the code without consideration of the costs.

I think this accusation is unfair. Are you really suggesting that
nobody else cares about the runtime of the new tests? Just because
other people's tradeoffs come down at a somewhat different place
doesn't mean they add tests "without consideration of the costs".

> Based on these numbers, it seems like one easy thing we could do to
> reduce parallel check time is to split the plpgsql test into several
> scripts that could run in parallel. But independently of that,
> I think we need to make an effort to push hashjoin's time back down.

If we had a dependency-based system, as I suggested nearby, we could
have pg_regress order the tests so that the slowest ones whose
dependencies are fulfilled are started first...
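
A minimal sketch of that policy (Python for brevity; the test names,
timings, and dependencies below are all made up): whenever a worker
slot is free, start the longest-running test whose prerequisites have
completed.

    import heapq

    def schedule(runtime, deps, max_workers):
        # unmet dependencies of every not-yet-started test
        pending = {t: set(deps.get(t, ())) for t in runtime}
        running = []                # min-heap of (finish time, test)
        clock, order = 0.0, []
        while pending or running:
            # all currently runnable tests, longest first
            ready = sorted((t for t, d in pending.items() if not d),
                           key=runtime.get, reverse=True)
            for t in ready:
                if len(running) >= max_workers:
                    break
                del pending[t]
                order.append((clock, t))
                heapq.heappush(running, (clock + runtime[t], t))
            if not running:
                raise ValueError("dependency cycle")
            # advance to the next completion, unblocking dependents
            clock, done = heapq.heappop(running)
            for d in pending.values():
                d.discard(done)
        return clock, order

    # invented numbers, loosely shaped like the ones quoted above
    runtime = {"test_setup": 0.2, "plpgsql": 3.0, "join": 1.5,
               "update": 0.4, "misc": 0.3}
    deps = {t: ["test_setup"] for t in runtime if t != "test_setup"}
    total, order = schedule(runtime, deps, max_workers=2)
    for start, name in order:
        print("%5.1fs  start %s" % (start, name))
    print("total: %.1fs" % total)

With numbers like these, the win comes from starting plpgsql as early
as possible rather than wherever it happens to fall in the schedule
file.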

Greetings,

Andres Freund
