Re: [HACKERS] [POC] Faster processing at Gather node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [POC] Faster processing at Gather node
Date: 2017-11-16 02:34:36
Message-ID: CAA4eK1LJ880y7ny0rXybwEWBrZx7GLWxU0LpqLEY6126B+kQzg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 16, 2017 at 12:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 14, 2017 at 7:31 AM, Rafia Sabih
> <rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> Similarly, I think that faster_gather_v3.patch is effective here
> because it lets all the workers run at the same time, not because
> Gather gets any faster. The local queue is 100x bigger than the
> shared queue, and that's big enough that the workers never have to
> block, so they all run at the same time and things are great. I don't
> see much advantage in pursuing this route. For the local queue to
> make sense it needs to have some advantage that we can't get by just
> making the shared queue bigger, which is easier and less code.
>

The main advantage of the local queue idea is that it won't consume
any memory by default for running parallel queries. It would consume
memory only when required and thereby help speed up those cases.
However, increasing the size of the shared queues by default will
increase memory usage even in cases where it is not required. Even if
we provide a GUC to tune the amount of shared memory, I am not sure how
convenient it will be for users, as it needs different values for
different workloads and it is not easy to make a general
recommendation. I am not saying we can't work around this with the
help of a GUC, but it seems like it would be better if we had some
autotune mechanism, and I think Rafia's patch is one way to achieve it.
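
To make the memory trade-off concrete, here is a toy Python model of
the idea. It is only a sketch and not PostgreSQL code: WorkerQueue,
shared_capacity, send, flush and receive are names invented for
illustration, and the deque merely stands in for a fixed-size shared
queue. The point it shows is that the local overflow queue is
allocated only once the shared queue fills up, so a query that never
fills the shared queue pays nothing extra.

```python
from collections import deque


class WorkerQueue:
    """Toy model of one worker's tuple queue (hypothetical, for illustration).

    A fixed-capacity "shared" queue is backed by a "local" overflow queue
    that is created only when the shared queue is full.
    """

    def __init__(self, shared_capacity):
        self.shared_capacity = shared_capacity
        self.shared = deque()  # stands in for the fixed-size shared queue
        self.local = None      # overflow queue; not allocated until needed

    def flush(self):
        """Move as many tuples as fit from local to shared; return the count."""
        moved = 0
        while self.local and len(self.shared) < self.shared_capacity:
            self.shared.append(self.local.popleft())
            moved += 1
        return moved

    def send(self, tup):
        """Worker side: never blocks; spills to the local queue when full."""
        self.flush()
        # Tuples may go straight to shared only if nothing is waiting in
        # local, otherwise ordering would be lost.
        if not self.local and len(self.shared) < self.shared_capacity:
            self.shared.append(tup)
        else:
            if self.local is None:
                self.local = deque()  # memory is consumed only from here on
            self.local.append(tup)

    def receive(self):
        """Leader side: take one tuple from the shared queue, or None."""
        return self.shared.popleft() if self.shared else None
```

In this model a bigger shared_capacity costs memory for every worker
of every parallel query, whereas the local queue grows only for the
workers that actually outrun the leader.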

> The
> original idea was that we'd reduce latch traffic and spinlock
> contention by moving data from the local queue to the shared queue in
> bulk, but the patches I posted attack those problems more directly.
>

I think the idea was to solve both problems (shm_mq communication
overhead and Gather Merge related pipeline stalls) with the local
queue approach [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1Jk465W2TTWT4J-RP3RXK2bJWEtYY0xhYpnSc1mcEXfkA%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
