Re: Gather performance analysis

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Gather performance analysis
Date: 2021-09-18 13:18:43
Message-ID: f0f20798-10ff-c773-d645-992cdbebc790@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/8/21 8:05 AM, Dilip Kumar wrote:
> On Tue, Sep 7, 2021 at 8:41 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com <mailto:tomas(dot)vondra(at)enterprisedb(dot)com>>
> wrote:
>
> Hi,
>
> The numbers presented in this thread seem very promising - clearly
> there's significant potential for improvements. I'll run similar
> benchmarks too, to get a better understanding of this.
>
>
> Thanks for showing interest.
>
>
> Can you share some basic details about the hardware you used?
> Particularly the CPU model - I guess this might explain some of the
> results, e.g. if CPU caches are ~1MB, that'd explain why setting
> tup_queue_size to 1MB improves things, but 4MB is a bit slower.
> Similarly, number of cores might explain why 4 workers perform better
> than 8 or 16 workers.
>
>
> I have attached the output of the lscpu.  I think batching the data
> before updating in the shared memory will win because we are avoiding
> the frequent cache misses and IMHO the benefit will be more in the
> machine with more CPU sockets.
>
> Now, this is mostly expected, but the consequence is that maybe things
> like queue size should be tunable/dynamic, not hard-coded?
>
>
> Actually, my intention behind the tuple queue size was to just see the
> behavior. Do we really have the problem of workers stalling on queue
> while sending the tuple, the perf report showed some load on WaitLatch
> on the worker side so I did this experiment.  I saw some benefits but it
> was not really huge.  I am not sure whether we want to just increase the
> tuple queue size or make it tunable,  but if we want to support
> redistribute operators in future sometime then maybe we should make it
> dynamically growing at runtime, maybe using dsa or dsa + shared files.
>

Thanks. I ran a couple more benchmarks, with different queue sizes
(16kB, 64kB, 256kB and 1MB) and according to the results the queue size
really makes almost no difference. It might make a difference for some
queries, but I wouldn't bother tuning this until we have a plausible
example - increasing the queue size is not free either.

So it was worth checking, but I'd just leave it as 64kB for now. We may
revisit this later in a separate patch/thread.

> As for the patches, I think the proposed changes are sensible, but I
> wonder what queries might get slower. For example with the batching
> (updating the counter only once every 4kB, that pretty much transfers
> data in larger chunks with higher latency. So what if the query needs
> only a small chunk, like a LIMIT query? Similarly, this might mean the
> upper parts of the plan have to wait for the data for longer, and thus
> can't start some async operation (like send them to a FDW, or something
> like that). I do admit those are theoretical queries, I haven't tried
> creating such query.
>
>
> Yeah, I was thinking about such cases, basically, this design can
> increase the startup cost of the Gather node, I will also try to derive
> such cases and test them.
>
>
> FWIW I've tried applying both patches at the same time, but there's a
> conflict in shm_mq_sendv - not a complex one, but I'm not sure what's
> the correct solution. Can you share a "combined" patch?
>
>
> Actually, these both patches are the same,
> "v1-0001-Optimize-parallel-tuple-send-shm_mq_send_bytes.patch" is the
> cleaner version of the first patch.  For configurable tuple queue size I
> did not send a patch, because that is I just used for the testing
> purpose and never intended to to propose anything.  My most of the
> latest performance data I sent with only
> "v1-0001-Optimize-parallel-tuple-send-shm_mq_send_bytes.patch" and with
> default tuple queue size.
>
> But I am attaching both the patches in case you want to play around.
>

Ah, silly me. I should have noticed the second patch is just a refined
version of the first one.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-09-18 17:06:47 So, about that cast-to-typmod-minus-one business
Previous Message Cameron Murdoch 2021-09-18 12:20:27 Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert