Re: sequences vs. synchronous replication

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: sequences vs. synchronous replication
Date: 2021-12-22 18:49:15
Message-ID: 683c128a-5961-9b7b-ad68-0ea850e62363@enterprisedb.com
Lists: pgsql-hackers

On 12/21/21 03:49, Tomas Vondra wrote:
> On 12/21/21 02:01, Tom Lane wrote:
>> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>>> OK, I did a quick test with two very simple benchmarks - simple select
>>> from a sequence, and 'pgbench -N' on scale 1. Benchmark was on current
>>> master, patched means SEQ_LOG_VALS was set to 1.
>>
>> But ... pgbench -N doesn't use sequences at all, does it?
>>
>> Probably inserts into a table with a serial column would constitute a
>> plausible real-world case.
>>
>
> D'oh! For some reason I thought pgbench has a sequence on the history
> table, but clearly I was mistaken. There's another thinko, because after
> inspecting pg_waldump output I realized "SEQ_LOG_VALS 1" actually logs
> only every 2nd increment. So it should be "SEQ_LOG_VALS 0".
>
> So I repeated the test fixing SEQ_LOG_VALS, and doing the pgbench with a
> table like this:
>
>   create table test (a serial, b int);
>
> and a script doing
>
>   insert into test (b) values (1);
>
> The results look like this:
>
> 1) select nextval('s');
>
>      clients          1         4
>     ------------------------------
>      master       39533    124998
>      patched       3748      9114
>     ------------------------------
>      diff          -91%      -93%
>
>
> 2) insert into test (b) values (1);
>
>      clients          1         4
>     ------------------------------
>      master        3718      9188
>      patched       3698      9209
>     ------------------------------
>      diff            0%        0%
>
> So the nextval() results are a bit worse, due to not caching 1/2 the
> nextval calls. The -90% is roughly expected, due to generating about 32x
> more WAL (and having to wait for commit).
>
> But results for the more realistic insert workload are about the same as
> before (i.e. no measurable difference). Also kinda expected, because
> those transactions have to wait for WAL anyway.
>

Attached is a patch tweaking WAL logging: with wal_level=minimal we do
the same thing as now, and at higher levels we WAL-log every sequence fetch.

After thinking about this a bit more, I think even the nextval workload
is not such a big issue, because we can set cache for the sequences.
Until now the cache setting had fairly limited impact on performance,
but with the patch it can significantly reduce the performance drop
caused by WAL-logging every sequence fetch.
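
A rough back-of-the-envelope model of why a larger cache helps. It assumes, as described above, that the patched code writes one WAL record per sequence fetch and that each fetch hands a backend `cache` values. The function name is hypothetical, and the model ignores everything else that affects throughput (commit waits, contention, other WAL traffic), so treat it as a sketch rather than a prediction:

```python
import math


def wal_records_patched(nextval_calls: int, cache: int) -> int:
    """Estimated WAL records written by one backend under the patch.

    Assumption: every sequence fetch is WAL-logged, and one fetch hands
    out `cache` values, so a record is needed once per `cache` calls.
    """
    if nextval_calls < 0 or cache < 1:
        raise ValueError("nextval_calls must be >= 0 and cache >= 1")
    return math.ceil(nextval_calls / cache)


# Compare the three cache sizes used in the benchmark for 1000 nextval() calls.
for cache in (1, 32, 128):
    print(cache, wal_records_patched(1000, cache))
```

The cache here is the per-sequence CACHE option, set with CREATE SEQUENCE or ALTER SEQUENCE ... CACHE n.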

I've repeated the nextval test on a different machine (the one I used
before is busy with something else), and the results look like this:

1) 1 client

  cache             1       32      128
  -------------------------------------
  master        13975    14425    19886
  patched         886     7900    18397
  -------------------------------------
  diff           -94%     -45%      -7%

2) 4 clients

  cache             1       32      128
  -------------------------------------
  master         8338    12849    18248
  patched         331     8124    18983
  -------------------------------------
  diff           -96%     -37%       4%
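
For reference, the diff rows are just the relative change of patched vs. master, rounded to whole percent. A small sketch that recomputes them from the numbers above (the helper and variable names are hypothetical, not part of the patch or the benchmark scripts):

```python
def diff_pct(master: int, patched: int) -> int:
    """Relative change of patched vs. master, in whole percent."""
    return round((patched / master - 1) * 100)


# cache size -> (master, patched), copied from the two tables above
one_client = {1: (13975, 886), 32: (14425, 7900), 128: (19886, 18397)}
four_clients = {1: (8338, 331), 32: (12849, 8124), 128: (18248, 18983)}

for name, results in (("1 client", one_client), ("4 clients", four_clients)):
    diffs = {cache: diff_pct(m, p) for cache, (m, p) in results.items()}
    print(name, diffs)
```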

So I think this makes it acceptable / manageable. Of course, this means
the values are much less monotonic (across backends), but I don't think
we ever promised that. And I doubt anyone is really using sequences
like this (just nextval) in performance-critical use cases.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: 0001-WAL-log-individual-sequence-fetches-20211222.patch (text/x-patch, 1.9 KB)
