Re: Flushing large data immediately in pqcomm

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Flushing large data immediately in pqcomm
Date: 2024-04-06 20:21:27
Message-ID: 20240406202127.2asud6cjfq3exqew@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2024-04-06 14:34:17 +1300, David Rowley wrote:
> I don't see any issues with v5, so based on the performance numbers
> shown on this thread for the latest patch, it would make sense to push
> it. The problem is, I just can't recreate the performance numbers.
>
> I've tried both on my AMD 3990x machine and an Apple M2 with a script
> similar to the test.sh from above. I mostly just stripped out the
> buffer size stuff and adjusted the timing code to something that would
> work with mac.

I think a few issues with the test script are keeping the gain from showing
up:

1) I think using the textual protocol, with the text datatype, will make it
harder to spot differences. That's a lot of overhead.

2) Afaict the test is connecting over the unix socket; I think we expect
bigger wins for TCP.

3) The larger string in particular is bottlenecked by pglz compression in
TOAST.

Where I had most noticed the overhead of the current approach was streaming
out base backups, which is all binary, of course.

I added WITH BINARY, SET STORAGE EXTERNAL and tested both unix socket and
localhost. I also reduced row counts and iteration counts, because I am
impatient, and I don't think it matters much here. Attached the modified
version.
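
For readers without the attachment, here is a rough sketch of what those two
changes usually look like in SQL. It is not taken from the attached script:
the table name "bench_table" and column name "payload" are hypothetical, and
it assumes the test reads rows back with COPY ... TO STDOUT.

```python
# Sketch only: "bench_table" and "payload" are hypothetical names, not taken
# from the attached test script.

def setup_statements(table: str = "bench_table", column: str = "payload") -> list[str]:
    """Return SQL that stores large values uncompressed and reads them in binary."""
    return [
        # EXTERNAL storage allows out-of-line (TOAST) storage but disables pglz
        # compression. It only affects values stored after the change, so it
        # has to run before the test rows are loaded.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET STORAGE EXTERNAL;",
        # Binary COPY skips the text-format conversion on output.
        f"COPY {table} TO STDOUT WITH BINARY;",
    ]


if __name__ == "__main__":
    for stmt in setup_statements():
        print(stmt)
```

The statements would be run through psql or any client; the point is only to
remove text conversion and compression so that the send path dominates.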

On a dual Xeon Gold 5215, turbo boost disabled, server pinned to one core,
script pinned to another:

unix:

master:
Run 100 100 1000000: 0.058482377
Run 1024 10240 100000: 0.120909810
Run 1024 1048576 2000: 0.153027916
Run 1048576 1048576 1000: 0.154953512

v5:
Run 100 100 1000000: 0.058760126
Run 1024 10240 100000: 0.118831396
Run 1024 1048576 2000: 0.124282503
Run 1048576 1048576 1000: 0.123894962

localhost:

master:
Run 100 100 1000000: 0.067088000
Run 1024 10240 100000: 0.170894273
Run 1024 1048576 2000: 0.230346632
Run 1048576 1048576 1000: 0.230336078

v5:
Run 100 100 1000000: 0.067144036
Run 1024 10240 100000: 0.167950948
Run 1024 1048576 2000: 0.135167027
Run 1048576 1048576 1000: 0.135347867

The perf difference for 1MB via TCP is really impressive.
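
To put rough numbers on that, the relative change of v5 against master can be
computed from the timings above. This is just a small helper sketch; the
function and variable names are made up for illustration, and the timing unit
is whatever the script reports (it is not stated in this message).

```python
# Timings copied verbatim from the runs above. The labels are the three
# values printed after "Run" in each result line.
LABELS = [
    "100 100 1000000",
    "1024 10240 100000",
    "1024 1048576 2000",
    "1048576 1048576 1000",
]

RESULTS = {
    "unix": {
        "master": [0.058482377, 0.120909810, 0.153027916, 0.154953512],
        "v5": [0.058760126, 0.118831396, 0.124282503, 0.123894962],
    },
    "localhost": {
        "master": [0.067088000, 0.170894273, 0.230346632, 0.230336078],
        "v5": [0.067144036, 0.167950948, 0.135167027, 0.135347867],
    },
}


def percent_change(master: float, patched: float) -> float:
    """Relative change in percent; negative means the patched build was faster."""
    return (patched - master) / master * 100.0


def summarize(results=RESULTS):
    """Return (transport, label, percent change) for every run."""
    rows = []
    for transport, runs in results.items():
        for label, m, p in zip(LABELS, runs["master"], runs["v5"]):
            rows.append((transport, label, percent_change(m, p)))
    return rows


if __name__ == "__main__":
    for transport, label, change in summarize():
        print(f"{transport:9s} Run {label:22s} {change:+6.1f}%")
```

With these inputs the two large-row cases come out roughly 41% faster over
localhost and roughly 19-20% faster over the unix socket, while the smallest
case is under 1% slower on both.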

The small regression for small results is still kinda visible; I haven't yet
tested the patch downthread.

Greetings,

Andres Freund

Attachment: test1a.sh.txt (text/plain, 1.6 KB)
