Re: [HACKERS] Parallel Hash take II

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: [HACKERS] Parallel Hash take II
Date: 2017-11-17 23:20:00
Message-ID: 20171117232000.uwgpndkhlbbt3hpo@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-11-14 01:30:30 +1300, Thomas Munro wrote:
> New patch attached.

0002-Add-a-barrier-primitive-for-synchronizing-backends.patch
- Intro starts with:
+ *
+ * This implementation of barriers allows for static sets of participants
+ * known up front, or dynamic sets of participants which processes can join or
that's a bit low-level, no? Should explain what they're for, not
everyone's going to be familiar with the concept.

- The comment for BarrierInit() reads a bit weird:
+ * Initialize this barrier, setting a static number of participants that we
+ * will wait for at each computation phase. To use a dynamic number of
+ * participants, this number should be zero, and BarrierAttach and
Set a static number! Except maybe not?

- I'm not entirely convinced that we want the barrier debug stuff
merged, what are your feelings about that? It's like half the code,
and adds some complexity to the non debug code... If we indeed want to
keep it, it probably should be documented in config.sgml? And get an
entry in pg_config_manual.h?

- Can we add assertions ensuring nobody attaches to a static barrier?

- If I understand correctly currently the first participant to arrive at
a barrier is going to be selected, and the last wakes everyone
up. Wouldn't it be better to do have the last arrived participant be
selected? It already got a scheduler timeslice, it's memory is most
likely to be in cache etc? Given that in cases where selection plays a
role it's going to be blocking everyone else, using the process most
likely to finish first seems like a good idea. I guess the
BarrierDetach() implementation would be slightly more complex?

Regards,

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-11-17 23:23:40 Re: Treating work_mem as a shared resource (Was: Parallel Hash take II)
Previous Message Jeremy Schneider 2017-11-17 22:46:42 Re: [HACKERS] pg_upgrade to clusters with a different WAL segment size