Re: Speed up transaction completion faster after many relations are accessed in a transaction

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date: 2022-01-01 01:44:46
Message-ID: CAApHDvr=Xx8j05g3Op9chdtft4PhAu6Quzy8BFQO5ORpxVSngw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 3 Dec 2021 at 20:36, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Two months later, this has been switched to RwF.

I was discussing this patch with Andres. He's not very keen on my
densehash hash table idea and suggested that instead of relying on
trying to make the hash table iteration faster, why don't we just
ditch the lossiness of the resource owner code which only records the
first 16 locks, and just make it have a linked list of all locks.

This is a little more along the lines of the original patch, however,
it does not increase the size of the LOCALLOCK struct.

I've attached a patch which does this. This was mostly written
(quickly) by Andres, I just did a cleanup, fixed up a few mistakes and
fixed a few bugs.

I ran the same performance tests on this patch as I did back in [1]:

-- Test 1. Select 1 record from a 140 partitioned table. Tests
creating a large number of locks with a fast query.

create table hp (a int, b int) partition by hash(a);
select 'create table hp'||x||' partition of hp for values with
(modulus 140, remainder ' || x || ');' from generate_series(0,139)x;
create index on hp (b);
insert into hp select x,x from generate_series(1, 140000) x;
analyze hp;

select3.sql: select * from hp where b = 1

select3.sql master

drowley(at)amd3990x:~$ pgbench -n -f select3.sql -T 60 -M prepared postgres
tps = 2099.708748 (without initial connection time)
tps = 2100.398516 (without initial connection time)
tps = 2094.882341 (without initial connection time)
tps = 2113.218820 (without initial connection time)
tps = 2104.717597 (without initial connection time)

select3.sql patched

drowley(at)amd3990x:~$ pgbench -n -f select3.sql -T 60 -M prepared postgres
tps = 2010.070738 (without initial connection time)
tps = 1994.963606 (without initial connection time)
tps = 1994.668849 (without initial connection time)
tps = 1995.948168 (without initial connection time)
tps = 1985.650945 (without initial connection time)

You can see that there's a performance regression here. I've not yet
studied why this appears.

-- Test 2. Tests a prepared query which will perform a generic plan on
the 6th execution then fallback on a custom plan due to it pruning all
but one partition. Master suffers from the lock table becoming
bloated after locking all partitions when planning the generic plan.

create table ht (a int primary key, b int, c int) partition by hash (a);
select 'create table ht' || x::text || ' partition of ht for values
with (modulus 8192, remainder ' || (x)::text || ');' from
generate_series(0,8191) x;
\gexec

select.sql:
\set p 1
select * from ht where a = :p

select.sql master
drowley(at)amd3990x:~$ pgbench -n -f select.sql -T 60 -M prepared postgres
tps = 18014.460090 (without initial connection time)
tps = 17973.358889 (without initial connection time)
tps = 17847.480647 (without initial connection time)
tps = 18038.332507 (without initial connection time)
tps = 17776.143206 (without initial connection time)

select.sql patched
drowley(at)amd3990x:~$ pgbench -n -f select.sql -T 60 -M prepared postgres
tps = 32393.457106 (without initial connection time)
tps = 32277.204349 (without initial connection time)
tps = 32160.719830 (without initial connection time)
tps = 32530.038130 (without initial connection time)
tps = 32299.019657 (without initial connection time)

You can see that there are some quite good performance gains with this test.

I'm going to add this to the January commitfest.

David

[1] https://www.postgresql.org/message-id/CAKJS1f8Lt0kS4bb5EH%3DhV%2BksqBZNnmVa8jujoYBYu5PVhWbZZg%40mail.gmail.com

Attachment Content-Type Size
speedup_releasing_all_locks.patch application/octet-stream 37.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-01-01 02:19:18 Re: Fix BUG #17335: Duplicate result rows in Gather node
Previous Message Tomas Vondra 2021-12-31 22:06:42 Collecting statistics about contents of JSONB columns