Quick Links

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

From:	"Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date:	2023-11-06 07:35:55
Message-ID:	4D5FCD99-1BAC-4353-942F-B022457A3D80@yandex-team.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> On 6 Nov 2023, at 09:09, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
>
>> Having hashtable to find SLRU page in the buffer IMV is too slow. Some comments on this approach can be found here [0].
>> I'm OK with having HTAB for that if we are sure performance does not degrade significantly, but I really doubt this is the case.
>> I even think SLRU buffers used HTAB in some ancient times, but I could not find commit when it was changed to linear search.
>
> The main intention of having this buffer mapping hash is to find the
> SLRU page faster than sequence search when banks are relatively bigger
> in size, but if we find the cases where having hash creates more
> overhead than providing gain then I am fine to remove the hash because
> the whole purpose of adding hash here to make the lookup faster. So
> far in my test I did not find the slowness. Do you or anyone else
> have any test case based on the previous research on whether it
> creates any slowness?
PFA test benchmark_slru_page_readonly(). In this test we run SimpleLruReadPage_ReadOnly() (essential part of TransactionIdGetStatus())
before introducing HTAB for buffer mapping I get
Time: 14837.851 ms (00:14.838)
with buffer HTAB I get
Time: 22723.243 ms (00:22.723)

This hash table makes getting transaction status ~50% slower.

Benchmark script I used:
make -C $HOME/postgresMX -j 8 install && (pkill -9 postgres; rm -rf test; ./initdb test && echo "shared_preload_libraries = 'test_slru'">> test/postgresql.conf && ./pg_ctl -D test start && ./psql -c 'create extension test_slru' postgres && ./pg_ctl -D test restart && ./psql -c "SELECT count(test_slru_page_write(a, 'Test SLRU'))
FROM generate_series(12346, 12393, 1) as a;" -c '\timing' -c "SELECT benchmark_slru_page_readonly(12377);" postgres)

>
>> Maybe we could decouple locks and counters from SLRU banks? Banks were meant to be small to exploit performance of local linear search. Lock partitions have to be bigger for sure.
>
> Yeah, that could also be an idea if we plan to drop the hash. I mean
> bank-wise counter is fine as we are finding a victim buffer within a
> bank itself, but each lock could cover more slots than one bank size
> or in other words, it can protect multiple banks. Let's hear more
> opinion on this.
+1

>
>>
>> On 30 Oct 2023, at 09:20, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>>
>> I have taken 0001 and 0002 from [1], done some bug fixes in 0001
>>
>>
>> BTW can you please describe in more detail what kind of bugs?
>
> Yeah, actually that patch was using the same GUC
> (multixact_offsets_buffers) in SimpleLruInit for MultiXactOffsetCtl as
> well as for MultiXactMemberCtl, see the below patch snippet from the
> original patch.

Ouch. We were running this for serveral years with this bug... Thanks!

Best regards, Andrey Borodin.

Attachment	Content-Type	Size
0001-Implement-benchmark_slru_page_readonly-to-assess-SLR.patch	application/octet-stream	2.4 KB
unknown_filename	text/plain	2 bytes

In response to

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock at 2023-11-06 04:09:54 from Dilip Kumar

Responses

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock at 2023-11-06 09:21:04 from Dilip Kumar

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Richard Guo	2023-11-06 07:46:01	Re: Compiling warnings on old GCC
Previous Message	Michael Paquier	2023-11-06 07:05:56	Re: Requiring recovery.signal or standby.signal when recovering with a backup_label