From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: ReadRecentBuffer() doesn't scale well |
Date: | 2023-06-29 15:39:48 |
Message-ID: | 20230629153948.rk7re27qz3qresdt@awork3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2023-06-29 19:35:30 +1200, Thomas Munro wrote:
> I (re)discovered why I used the lock-then-pin approach. In the
> comments I mentioned InvalidBuffer(), but the main problem is in its
> caller GetVictimBuffer() which has various sanity checks about
> reference counts that can occasionally fail if you have code randomly
> pinning any old buffer.
You're right. Specifically non-valid buffers are the issue.
> New idea: use the standard PinBuffer() function, but add a mode that
> doesn't pin invalid buffers (with caveat that you can perhaps get a
> false negative due to unlocked read, but never a false positive; see
> commit message). Otherwise we'd have to duplicate all the same logic
> to use cmpxchg for ReadRecentBuffer(), or rethink the assumptions in
> that other code.
It might be worth using lock free code in more places before long, but I agree
with the solution here.
> As for the lack of usage bump in the back-branches, I think the
> options are: teach PinBuffer_Locked() to increment it optionally, or
> back-patch whatever we come up with for this.
Hm, or just leave it as is.
> For the root buffer optimisation, the obvious place for storage seems
> to be under rd_amcache. It was originally invented for the cached
> metapage (commit d2896a9ed14) but could accommodate a new struct
> holding whatever we want. Here is a patch to try that out.
> BTAMCacheData would also be a natural place to put future things
> including a copy of the root page itself, in later work on lock-free
> tricks.
I am wondering if we don't want something more generic than stashing this in
rd_amcache. Don't want to end up duplicating relevant code across the uses of
rd_amcache in every AM.
> @@ -663,38 +663,17 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
> else
> {
> bufHdr = GetBufferDescriptor(recent_buffer - 1);
> - have_private_ref = GetPrivateRefCount(recent_buffer) > 0;
>
> - /*
> - * Do we already have this buffer pinned with a private reference? If
> - * so, it must be valid and it is safe to check the tag without
> - * locking. If not, we have to lock the header first and then check.
> - */
> - if (have_private_ref)
> - buf_state = pg_atomic_read_u32(&bufHdr->state);
> - else
> - buf_state = LockBufHdr(bufHdr);
> -
> - if ((buf_state & BM_VALID) && BufferTagsEqual(&tag, &bufHdr->tag))
> + /* Is it still valid and holding the right tag? */
> + if (PinBuffer(bufHdr, NULL, true))
I do wonder if we should have an unlocked pre-check for a) the buffer being
valid and b) BufferTagsEqual() matching. With such a pre-check the race for
increasing the usage count of the wrong buffer is quite small, without the
pre-check it doesn't seem that small anymore.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Jacob Champion | 2023-06-29 16:05:59 | Re: [PATCH] Honor PG_TEST_NOCLEAN for tempdirs |
Previous Message | Robert Haas | 2023-06-29 15:19:38 | Re: pgsql: Fix search_path to a safe value during maintenance operations. |