Re: Flushing large data immediately in pqcomm

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Subject: Re: Flushing large data immediately in pqcomm
Date: 2024-03-21 23:45:52
Message-ID: CAGPVpCQ-265P-DY8hZADEKE5GO0-1NVB9kn7dH82BQgEUbdv1g@mail.gmail.com
Lists: pgsql-hackers

Hi,

Please see attached v3.

On Thu, 21 Mar 2024 at 12:58, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:

> On Thu, 21 Mar 2024 at 01:24, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
> > What if I do a simple comparison like PqSendStart == PqSendPointer
> > instead of calling pq_is_send_pending()
>
> Yeah, that sounds worth trying out. So the new suggestions to fix the
> perf issues on small message sizes would be:
>
> 1. add "inline" to internal_flush function
> 2. replace pq_is_send_pending() with PqSendStart == PqSendPointer
> 3. (optional) swap the order of PqSendStart == PqSendPointer and
>    len >= PqSendBufferSize
>

I made all of the above changes, and they seem to have resolved the
regression.
Since the previous results were measured over unix sockets, here are the v3
results over unix sockets for comparison.
I'm sharing only the case where all messages are 100 bytes, since that is
where the regression was most visible.

row size = 100 bytes, # of rows = 1000000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │  2KB │  4KB │  8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │       1106 │ 1006 │  947 │  920 │  899 │  888 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │       1094 │  997 │  943 │  913 │  894 │  881 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │       6389 │ 6195 │ 6214 │ 6271 │ 6325 │ 6211 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘
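
The three suggestions quoted above can be illustrated with a minimal, self-contained sketch. This is not the patch itself: `SEND_BUFFER_SIZE`, `send_start`, `send_pointer`, `fake_send()`, `flush_buffer()` and `put_bytes()` are hypothetical stand-ins for PqSendBufferSize, PqSendStart, PqSendPointer and the pqcomm routines, and the "socket" is just a byte counter. It only shows the shape of the check: a plain comparison of the two cursors, with an inline flush helper, decides whether a large message can skip the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_BUFFER_SIZE 8192           /* stand-in for PqSendBufferSize */

static char   send_buffer[SEND_BUFFER_SIZE];
static size_t send_start = 0;           /* stand-in for PqSendStart */
static size_t send_pointer = 0;         /* stand-in for PqSendPointer */

static size_t bytes_on_wire = 0;        /* total bytes handed to fake_send() */
static size_t direct_sends = 0;         /* messages that bypassed the buffer */

/* Hypothetical socket write: pretends every byte is accepted at once. */
static void fake_send(const char *data, size_t len)
{
    (void) data;
    bytes_on_wire += len;
}

/* Marked inline, mirroring suggestion 1 for internal_flush. */
static inline void flush_buffer(void)
{
    if (send_pointer > send_start)
        fake_send(send_buffer + send_start, send_pointer - send_start);
    send_start = send_pointer = 0;
}

static void put_bytes(const char *data, size_t len)
{
    while (len > 0)
    {
        assert(send_pointer <= SEND_BUFFER_SIZE);

        /* A full buffer must be flushed before anything else happens. */
        if (send_pointer == SEND_BUFFER_SIZE)
            flush_buffer();

        /*
         * Fast path: the buffer is empty (cursor comparison instead of a
         * function call) and the data would fill it anyway, so send the
         * caller's memory directly and skip the memcpy.
         */
        if (send_start == send_pointer && len >= SEND_BUFFER_SIZE)
        {
            fake_send(data, len);
            direct_sends++;
            return;
        }

        /* Slow path: copy as much as fits into the buffer. */
        size_t room = SEND_BUFFER_SIZE - send_pointer;
        size_t amount = len < room ? len : room;

        memcpy(send_buffer + send_pointer, data, amount);
        send_pointer += amount;
        data += amount;
        len -= amount;
    }
}
```

With 100-byte messages the fast path is never taken, so the only extra cost per call is the two comparisons, which is why the small-message case is the one worth measuring.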

On Thu, 21 Mar 2024 at 00:57, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Fri, 15 Mar 2024 at 01:46, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > - the "(int *) &len)" cast is not ok, and will break visibly on
> > big-endian systems where sizeof(int) != sizeof(size_t).
>
> I think fixing this requires adjusting the signature of
> internal_flush_buffer() to use size_t instead of int. That also
> means that PqSendStart and PqSendPointer must also become size_t, or
> internal_flush() must add local size_t variables to pass to
> internal_flush_buffer and assign these back again to the global after
> the call. Upgrading the globals might be the cleaner option.
>
> David

This is done too.
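
For readers following the size_t discussion, here is a small hedged sketch of why the cast is fragile and what a size_t-based signature looks like. None of these names come from the patch: `int_view_of_size_t()`, `is_little_endian()` and `flush_range()` are hypothetical helpers, and the "socket" is a counter that pretends to accept at most 4096 bytes per call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns the first sizeof(int) bytes of a size_t, i.e. what code reading
 * through "(int *) &len" would see. memcpy is used so that the sketch itself
 * stays well-defined.
 */
static int int_view_of_size_t(size_t len)
{
    int view = 0;
    size_t n = sizeof(view) < sizeof(len) ? sizeof(view) : sizeof(len);

    memcpy(&view, &len, n);
    return view;
}

/* Runtime byte-order probe, used only to explain the result above. */
static int is_little_endian(void)
{
    unsigned int one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 1;
}

static size_t bytes_sent = 0;

/*
 * Hypothetical flush helper whose cursors are size_t, so callers can pass
 * addresses of size_t variables without any cast.
 */
static int flush_range(const char *buf, size_t *start, size_t *end)
{
    (void) buf;
    assert(*start <= *end);

    while (*start < *end)
    {
        size_t chunk = *end - *start;

        if (chunk > 4096)
            chunk = 4096;       /* pretend the socket takes 4096 bytes at most */
        bytes_sent += chunk;
        *start += chunk;
    }
    *start = *end = 0;          /* everything sent, reset both cursors */
    return 0;
}
```

On a little-endian machine `int_view_of_size_t(100)` happens to give 100, which hides the problem; on a big-endian machine with a 64-bit size_t and a 32-bit int it gives 0, because the first four bytes are the high-order ones.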

I actually tried to test it over a real network for a while. However, I
couldn't get reliable enough numbers for either HEAD or the patch due to
network-related issues.
I decided to go with Jelte's suggestion [1], which is to decrease the MTU of
the loopback interface to 1500 and use localhost.

Here are the results:

1- row size = 100 bytes, # of rows = 1000000
┌───────────┬────────────┬───────┬───────┬───────┬───────┬───────┐
│           │ 1400 bytes │   2KB │   4KB │   8KB │  16KB │  32KB │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ HEAD      │       1351 │  1233 │  1074 │   988 │   944 │   916 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ patch     │       1369 │  1232 │  1073 │   981 │   928 │   907 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ no buffer │      14949 │ 14533 │ 14791 │ 14864 │ 14612 │ 14751 │
└───────────┴────────────┴───────┴───────┴───────┴───────┴───────┘

2- row size = half of the rows are 1KB and the rest are 10KB, # of rows = 1000000
┌───────────┬────────────┬───────┬───────┬───────┬───────┬───────┐
│           │ 1400 bytes │   2KB │   4KB │   8KB │  16KB │  32KB │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ HEAD      │      37212 │ 31372 │ 25520 │ 21980 │ 20311 │ 18864 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ patch     │      23006 │ 23127 │ 23147 │ 22229 │ 20367 │ 19155 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ no buffer │      30725 │ 31090 │ 30917 │ 30796 │ 30984 │ 30813 │
└───────────┴────────────┴───────┴───────┴───────┴───────┴───────┘

3- row size = half of the rows are 1KB and the rest are 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │  2KB │  4KB │  8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │       4296 │ 3713 │ 3040 │ 2711 │ 2528 │ 2449 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │       2401 │ 2411 │ 2404 │ 2374 │ 2395 │ 2408 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │       2399 │ 2403 │ 2408 │ 2389 │ 2402 │ 2403 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘

4- row size = all rows are 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │  2KB │  4KB │  8KB │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │       8335 │ 7370 │ 6017 │ 5368 │ 5009 │ 4843 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │       4711 │ 4722 │ 4708 │ 4693 │ 4724 │ 4717 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │       4704 │ 4712 │ 4746 │ 4728 │ 4709 │ 4730 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘
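
To read the 1400-byte column of the four localhost tables in relative terms, a small sketch (not part of the patch) can compute the percentage change from HEAD to patch. The helper names `pct_change()` and `print_changes()` are made up for illustration; the numbers are copied from the tables above.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage change from HEAD to patch; negative means the patch is lower. */
static double pct_change(double head, double patch)
{
    assert(head != 0.0);
    return (patch - head) / head * 100.0;
}

/* {HEAD, patch} for the 1400-byte column of localhost tests 1 to 4. */
static const double col_1400[4][2] = {
    {1351, 1369},
    {37212, 23006},
    {4296, 2401},
    {8335, 4711},
};

static void print_changes(void)
{
    for (int i = 0; i < 4; i++)
        printf("test %d: %+.1f%%\n", i + 1,
               pct_change(col_1400[i][0], col_1400[i][1]));
}
```

For this column the patch is within roughly 1.5% of HEAD on the 100-byte test and roughly 38% to 44% lower on the three tests that include large rows.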

[1]
https://www.postgresql.org/message-id/CAGECzQQMktuTj8ijJgBRXCwLEqfJyAFxg1h7rCTej-6%3DcR0r%3DQ%40mail.gmail.com

Thanks,
--
Melih Mutlu
Microsoft

Attachment Content-Type Size
v3-0001-Flush-large-data-immediately-in-pqcomm.patch application/octet-stream 4.2 KB
