Re: pgsql: Add parallel-aware hash joins.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2018-01-24 21:07:07
Message-ID: 20180124210707.k3sju7bv4xxeoyik@alap3.anarazel.de
Lists: pgsql-committers pgsql-hackers

Hi,

On 2018-01-24 15:58:16 -0500, Tom Lane wrote:
> Yeah. We already have topo sort code in pg_dump, maybe we could push that
> into someplace like src/common or src/fe_utils? Although pg_dump hasn't
> got any need for edge weights, so maybe sharing code isn't worth it.

I suspect sharing the code may be more work than it's worth, but either
way, it shouldn't be too hard. Hm, isn't dbObjectTypePriority kind of an
edge weight? Seems like we could properly implement it as that.

> We could flush the existing schedule files and use a simple format like
> testname: list of earlier tests it depends on
> (I guess there would be more properties than just the dependencies,
> but still not hard to parse.)

Yeah, I think there'd need to be a few more. There are some tests that
use multiple connections, and I suspect it'll be useful to have
"implicit" ordering dependencies for a few tests, like a "barrier".
Otherwise e.g. the tablespace test will be annoying to order.
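
To make the proposed format concrete, here is a rough Python sketch (3.9+, for graphlib) of parsing "testname: list of earlier tests it depends on" lines and grouping tests so that each group only depends on earlier groups. The function names, the "#" comment syntax and the dependencies in EXAMPLE are made up for illustration; nothing like this exists in pg_regress, and the "barrier" idea and other per-test properties are not modelled.

```python
from graphlib import TopologicalSorter


def parse_schedule(text):
    """Parse 'testname: dep1 dep2' lines into {test: set(deps)}."""
    deps = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # hypothetical comment syntax
        if not line:
            continue
        name, _, rest = line.partition(":")
        deps[name.strip()] = set(rest.split())
    return deps


def parallel_groups(deps):
    """Return groups of tests; each group depends only on earlier groups."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        groups.append(ready)
        sorter.done(*ready)
    return groups


# Test names are real regression tests, the dependencies are invented.
EXAMPLE = """
tablespace:
boolean: tablespace
char: tablespace
create_table: boolean char
insert: create_table
"""

if __name__ == "__main__":
    for group in parallel_groups(parse_schedule(EXAMPLE)):
        print(" ".join(group))
```

Each printed line corresponds to one set of tests that could run concurrently, roughly what a "test:" line in the current schedule files expresses by hand.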

> > If we keep the timings from an earlier
> > run somewhere, we can use the timing of runs as edge weights, making the
> > schedule better.
>
> I think we could just use constant values hand-coded in the schedule file.
> It might occasionally be worth updating them, but realistically it's not
> going to matter that they be very accurate. Probably weights like 1, 2,
> and 3 would be plenty ;-)

The reason I like the idea of using prior test runs as scheduling input
is that the slowness actually depends a lot on the type of machine it's
run on, and more importantly on things like valgrind, CCA, fsync=on/off,
jit=on/off (by far the most expensive test is e.g. the recursion test in
errors.sql :)).
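
As a sketch of why measured timings could matter, the Python below simulates simple list scheduling: among the tests whose dependencies are done, start the one with the longest estimated remaining path first. All names, the dependency graph and the durations are invented for illustration (only "errors being the slow one" comes from the discussion above), and every dependency is assumed to appear as a key in deps.

```python
import heapq
from collections import defaultdict


def critical_path(deps, estimate):
    """Estimated cost of the longest dependency chain starting at each test."""
    dependents = defaultdict(list)
    for test, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(test)
    memo = {}

    def visit(t):
        if t not in memo:
            tail = max((visit(d) for d in dependents[t]), default=0.0)
            memo[t] = estimate.get(t, 1.0) + tail
        return memo[t]

    for t in deps:
        visit(t)
    return memo


def simulate(deps, estimate, actual, workers):
    """Schedule by 'estimate', run with 'actual'; return (start order, makespan)."""
    prio = critical_path(deps, estimate)
    remaining = {t: set(p) for t, p in deps.items()}
    ready = [(-prio[t], t) for t, p in remaining.items() if not p]
    heapq.heapify(ready)
    running = []  # heap of (finish_time, test)
    now, order = 0.0, []
    while ready or running:
        # Fill free worker slots with the highest-priority ready tests.
        while ready and len(running) < workers:
            _, t = heapq.heappop(ready)
            order.append(t)
            heapq.heappush(running, (now + actual.get(t, 1.0), t))
        # Advance time to the next test that finishes.
        now, finished = heapq.heappop(running)
        for t, prereqs in remaining.items():
            if finished in prereqs:
                prereqs.discard(finished)
                if not prereqs:
                    heapq.heappush(ready, (-prio[t], t))
    return order, now


# Invented graph and durations; unlisted tests default to weight 1.0.
DEPS = {"a": set(), "b": set(), "errors": set(), "c": {"a"}}
MEASURED = {"errors": 10.0}

if __name__ == "__main__":
    print(simulate(DEPS, {}, MEASURED, workers=2))        # constant weights
    print(simulate(DEPS, MEASURED, MEASURED, workers=2))  # timings from a prior run
```

With two workers and these made-up numbers, scheduling from constant weights finishes at 11.0 while scheduling from the measured timings starts errors first and finishes at 10.0. The gap is small here, but it depends entirely on the per-machine timings fed in.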

Greetings,

Andres Freund
