Re: pgsql: Add parallel-aware hash joins.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2018-01-24 20:58:16
Message-ID: 19265.1516827496@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2018-01-24 17:18:26 -0300, Alvaro Herrera wrote:
>> Yeah, I proposed this a decade ago but never had the wits to write the
>> code.

> It shouldn't be too hard, right? Leaving defining the file format,
> parsing it, creating the new schedule with dependencies and adapting
> tests aside (hah), it mostly seems a relatively simple graph ordering /
> topological sort problem, right?

Yeah. We already have topo sort code in pg_dump, maybe we could push that
into someplace like src/common or src/fe_utils? Although pg_dump hasn't
got any need for edge weights, so maybe sharing code isn't worth it.

We could flush the existing schedule files and use a simple format like
testname: list of earlier tests it depends on
(I guess there would be more properties than just the dependencies,
but still not hard to parse.)
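
To make the proposed format concrete, here is a minimal sketch of a parser
for it. Everything beyond the "testname: dependencies" shape is an assumption
made for illustration: the function name parse_schedule, whitespace-separated
dependency lists, "#" comments, and the test names in the usage example are
all hypothetical. It reads "earlier tests" literally, so a dependency must
already have been defined on a previous line.

```python
def parse_schedule(text):
    """Parse lines of the form 'testname: dep1 dep2 ...'.

    Returns a dict mapping each test name to the list of tests it
    depends on. Assumptions (not from any existing file format):
    dependencies are whitespace-separated, '#' starts a comment, and
    a dependency must name a test defined on an earlier line.
    """
    deps = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"line {lineno}: expected 'testname: deps'")
        if name in deps:
            raise ValueError(f"line {lineno}: duplicate test '{name}'")
        wanted = rest.split()
        for dep in wanted:
            if dep not in deps:
                raise ValueError(
                    f"line {lineno}: '{name}' depends on '{dep}', "
                    "which is not an earlier test"
                )
        deps[name] = wanted
    return deps


# Usage with made-up test names:
example = """
setup:
fast: setup
slow: setup
after_slow: slow fast
"""
schedule = parse_schedule(example)
```

Requiring dependencies to appear earlier in the file also rules out cycles
for free, which keeps any later ordering step simpler.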

> If we keep the timings from an earlier
> run somewhere, we can use the timing of runs as edge weights, making the
> schedule better.

I think we could just use constant values hand-coded in the schedule file.
It might occasionally be worth updating them, but realistically it's not
going to matter that they be very accurate. Probably weights like 1, 2,
and 3 would be plenty ;-)
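
As a rough illustration of how coarse weights could drive the ordering, the
sketch below does a topological sort (Kahn's algorithm) and, among tests
whose dependencies are satisfied, starts the one with the heaviest remaining
chain first. This is not the pg_dump code or any existing pg_regress
behavior. The function name schedule_order, the treatment of weights as
per-test costs defaulting to 1, and the test names are all assumptions.

```python
import heapq


def schedule_order(deps, weights):
    """Order tests so dependencies come first, heaviest chains earliest.

    deps:    dict of test name -> list of tests it depends on
    weights: dict of test name -> small integer cost (default 1)
    Both structures are hypothetical; this is only a sketch.
    """
    dependents = {t: [] for t in deps}
    pending = {}
    for test, wanted in deps.items():
        pending[test] = len(wanted)
        for dep in wanted:
            if dep not in deps:
                raise ValueError(f"'{test}' depends on unknown test '{dep}'")
            dependents[dep].append(test)

    # Pass 1: plain Kahn's algorithm, to get some valid order and detect cycles.
    remaining = dict(pending)
    stack = [t for t, n in remaining.items() if n == 0]
    topo = []
    while stack:
        test = stack.pop()
        topo.append(test)
        for nxt in dependents[test]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                stack.append(nxt)
    if len(topo) != len(deps):
        raise ValueError("dependency cycle in schedule")

    # Critical path: a test's own weight plus its heaviest chain of dependents.
    crit = {}
    for test in reversed(topo):
        tail = max((crit[n] for n in dependents[test]), default=0)
        crit[test] = weights.get(test, 1) + tail

    # Pass 2: Kahn's algorithm again, picking the largest critical path first
    # (ties broken by name so the result is deterministic).
    heap = [(-crit[t], t) for t, n in pending.items() if n == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        _, test = heapq.heappop(heap)
        order.append(test)
        for nxt in dependents[test]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(heap, (-crit[nxt], nxt))
    return order


# Usage with made-up tests; only "slow" gets a non-default weight.
deps = {"setup": [], "fast": ["setup"], "slow": ["setup"], "after_slow": ["slow"]}
order = schedule_order(deps, {"slow": 3})
```

With weights this coarse, the ordering only changes when one chain is
clearly heavier than another, which fits the point that the values need not
be very accurate.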

regards, tom lane
