From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-06-04 03:11:17 |
Message-ID: | CABPTF7UBRAFHbd5iM=QYLDeSVSwJxqE=XYJKxH8D58x8+B79mg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi all,
Sorry, I forgot to Cc everyone on my previous message. Resending here so
they're on the thread:
On Wed, Jun 4, 2025 at 11:07 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi Alexander,
>
> Thanks again for the feedback!
>
> 1) Batch-processing CompactCheckpointerRequestQueue() and AbsorbSyncRequests()?
>
> After some thought, I realized my previous take was incomplete; sorry
> for the confusion. Heikki suggested capping num_requests at 10 million
> [1]. With that limit, the largest hash table is ~500 MB and the
> skip_slot[] array is ~10 MB in CompactCheckpointerRequestQueue, and the
> max size of the request array in AbsorbSyncRequests is well under
> 400 MB, so we never exceed the 1 GB allocation limit. Even without
> batching, compaction stays under the cap. Batching in AbsorbSyncRequests
> may still help by amortizing memory allocation, but it adds extra
> lock/unlock overhead. I'm not sure that overhead is worth it under the
> cap.
>
> Of course, all of this depends on having a cap in place. Picking the
> right cap size can be tricky (see point 2). If we decide not to
> enforce a cap now or in future versions, then batching both
> CompactCheckpointerRequestQueue (maybe?) and AbsorbSyncRequests becomes
> essential. We also need to consider the batch size. Heikki suggested
> 10k for AbsorbSyncRequests, but I'm not sure whether that suits typical
> or extreme workloads.
>
> > Right, but another point is to avoid lengthy holding of
> > CheckpointerCommLock. What do you think about that?
>
> I am not clear on this. Could you elaborate on it?
>
> [1] https://www.postgresql.org/message-id/c1993b75-a5bc-42fd-bbf1-6f06a1b37107%40iki.fi
>
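For reference, the memory figures in point 1 can be checked with a bit of arithmetic. This is only a rough sketch: the per-entry sizes are my assumptions (32 bytes per request is consistent with "well under 400 MB", and 50 bytes per hash entry is simply back-derived from the ~500 MB estimate), and all DEMO_/demo_ names are hypothetical, not taken from the patch. The 1 GB limit is written as 0x3fffffff, the value of MaxAllocSize.

```c
#include <stdbool.h>
#include <stddef.h>

/* 1 GB - 1, the same value as PostgreSQL's MaxAllocSize. */
#define DEMO_MAX_ALLOC_SIZE   ((size_t) 0x3fffffff)

/* Cap on num_requests suggested in the thread. */
#define DEMO_REQUEST_CAP      ((size_t) 10000000)

/* Assumed per-entry sizes, for illustration only. */
#define DEMO_REQUEST_BYTES    ((size_t) 32)  /* one queued request */
#define DEMO_HASH_ENTRY_BYTES ((size_t) 50)  /* one hash entry incl. overhead */
#define DEMO_SKIP_FLAG_BYTES  ((size_t) 1)   /* one bool in skip_slot[] */

/* Total bytes needed for nentries entries of the given size. */
size_t
demo_alloc_bytes(size_t nentries, size_t bytes_per_entry)
{
    return nentries * bytes_per_entry;
}

/* True if a single allocation of nbytes stays within the 1 GB limit. */
bool
demo_fits_in_palloc(size_t nbytes)
{
    return nbytes <= DEMO_MAX_ALLOC_SIZE;
}
```

With these assumed sizes, each of the three allocations fits at 10 million entries, while a request array of 40 million entries (1.28 GB) would not.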
>
> 2) Back-branch fixes with MAX_CHECKPOINT_REQUESTS?
>
> This is simple and effective, but it can be hard to get the value
> right. I think we should give it more thought. For very large-scale use
> cases, like hundreds of GB of shared_buffers, 10 million seems small if
> the checkpointer is not able to absorb the changes before the queue
> fills up. In this case, making compaction more efficient, as in 3),
> would be helpful. However, if we do this for the back branches as well,
> the solution is not that simple anymore.
>
>
> 3) Fill gaps by pulling from the tail instead of rewriting the whole queue?
>
> I misunderstood this at first; it is a generally helpful optimization.
> I'll integrate it into the current patch.
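To make point 3 concrete, here is a minimal standalone sketch of filling gaps from the tail. It is not the patch itself: DemoRequest and compact_from_tail are hypothetical names, and the sketch assumes a skip_slot[] array that has already been filled in to mark the entries to drop. Note that it does not preserve the relative order of the surviving entries; whether that is acceptable for the real queue is an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue entry; not the real request struct. */
typedef struct DemoRequest
{
    int         id;
} DemoRequest;

/*
 * Drop the entries flagged in skip_slot[] by moving live entries from the
 * tail of the array into the gaps, instead of shifting every later entry.
 * Each live entry is moved at most once. Returns the new entry count.
 */
size_t
compact_from_tail(DemoRequest *requests, const bool *skip_slot, size_t n)
{
    size_t      head = 0;       /* next slot to examine from the front */
    size_t      tail = n;       /* one past the last unexamined slot */

    while (head < tail)
    {
        if (!skip_slot[head])
        {
            head++;             /* live entry, already in place */
            continue;
        }

        /* head is a gap: discard skipped entries at the tail first */
        while (tail > head + 1 && skip_slot[tail - 1])
            tail--;

        /* consume the tail slot; it is either live or the gap itself */
        tail--;
        if (tail > head)
        {
            requests[head] = requests[tail];
            head++;
        }
    }

    return head;
}
```

The number of copies is bounded by the number of gaps rather than by the queue length, which is where the saving over rewriting the whole queue comes from.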