Re: Performance degradation on concurrent COPY into a single relation in PG16.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date: 2023-07-25 05:34:36
Message-ID: 20230725053436.ez65s2vjri5c6sqh@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Hm, in some cases your patch is better, but in others both the old code
(8692f6644e7) and HEAD beat yours on my machine. TBH, not entirely sure why.

prep:
COPY (SELECT generate_series(1, 2000000) a, (random() * 100000 - 50000)::int b, 3243423 c) TO '/tmp/lotsaints.copy';
DROP TABLE lotsaints; CREATE UNLOGGED TABLE lotsaints(a int, b int, c int);

benchmark:
psql -qX -c 'truncate lotsaints' && pgbench -n -P1 -f <( echo "COPY lotsaints FROM '/tmp/lotsaints.copy';") -t 15

I disabled turbo mode, pinned the server to a single core of my Xeon Gold 5215:

HEAD: 812.690

your patch: 821.354

strtoint from 8692f6644e7: 824.543

strtoint from 6b423ec677d^: 806.678

(when I say strtoint from, I did not replace the goto labels, so that part is
unchanged and unrelated)

IOW, for me the code from 15 is the fastest by a good bit... There's an imul,
sure, but the fact that it sets a flag makes it faster than having to add more
tests and branches.

Looking at a profile reminded me of how silly it is that we are wasting a good
chunk of the time in these isdigit() checks, even though we already rely on on
the ascii values via (via *ptr++ - '0'). I think that's done in the headers
for some platforms, but not others (glibc). And we've even done already for
octal and binary!

Open coding isdigit() gives us:

HEAD: 797.434

your patch: 803.570

strtoint from 8692f6644e7: 778.943

strtoint from 6b423ec677d^: 777.741

It's somewhat odd that HEAD and your patch switch position here...

> - else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O'))
> + /* process hex digits */
> + else if (ptr[1] == 'x' || ptr[1] == 'X')
> {
>
> firstdigit = ptr += 2;

I find this unnecessarily hard to read. I realize it's been added in
6fcda9aba83, but I don't see a reason to use such a construct here.

I find it somewhat grating how much duplication there now is in this
code due to being repeated for all the bases...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-07-25 06:28:35 Re: logical decoding and replication of sequences, take 2
Previous Message Andres Freund 2023-07-25 03:24:11 Re: [BUG] Crash on pgbench initialization.