Re: Bypassing shared_buffers

From: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Bypassing shared_buffers
Date: 2023-06-15 07:32:03
Message-ID: 1bb973d7-f047-032c-6375-970ca2cda7f7@garret.ru
Lists: pgsql-hackers

On 15.06.2023 4:37 AM, Vladimir Churyukin wrote:
> Ok, got it, thanks.
> Is there any alternative approach to measuring the performance as if
> the cache was empty?
> The goal is basically to calculate the max possible I/O time for a
> query, to get a range between min and max timing.
> It's ok if it's done during EXPLAIN ANALYZE call only, not for regular
> executions.
> One thing I can think of is even if the data in storage might be
> stale, issue read calls from it anyway, for measuring purposes.
> For EXPLAIN ANALYZE it should be fine as it doesn't return real data
> anyway.
> Is it possible that some pages do not exist in storage at all? Is
> there a different way to simulate something like that?
>

I do not completely understand what you want to measure: how fast the
cache can be prewarmed, or what the performance is when the working set
doesn't fit in memory?

Why doesn't setting `shared_buffers` to some very small value (e.g.
1MB) work?
As was already noted, there are two levels of caching: shared buffers
and the OS file cache.
By reducing the size of shared buffers you rely mostly on the OS file
cache. And actually there is no big gap in performance here (on most
workloads I didn't see more than a 15% difference).

You can certainly flush the OS cache with
`echo 3 > /proc/sys/vm/drop_caches` and so simulate a cold start.
But the OS cache will be prewarmed quite fast (unlike shared buffers,
because of strange Postgres ring-buffer strategies which cause eviction
of pages from shared buffers even if there is a lot of free space).
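
A minimal sketch of such a cold-start run, assuming Linux, a local
server controlled with pg_ctl, and sudo access for dropping the OS
cache. The data directory, database name and query are hypothetical
placeholders, and the DRY_RUN switch and helper functions are only
illustrative. Restarting the server is what empties shared buffers;
drop_caches only affects the OS file cache. By default the script only
prints the commands it would run.

```shell
#!/bin/sh
# Cold-cache timing sketch (Linux only). All defaults are assumptions.
PGDATA="${PGDATA:-/var/lib/postgresql/data}"      # assumed data directory
DBNAME="${DBNAME:-postgres}"                      # assumed database name
QUERY="${QUERY:-SELECT count(*) FROM my_table}"   # hypothetical query
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

cold_run() {
    # Stopping the server discards the contents of shared buffers.
    run pg_ctl -D "$PGDATA" stop -m fast
    # Write dirty pages out, then drop the OS page cache (needs root).
    run sync
    run sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
    run pg_ctl -D "$PGDATA" start -w
    # track_io_timing adds I/O timings to the BUFFERS output;
    # changing it normally requires superuser privileges.
    run psql -d "$DBNAME" \
        -c "SET track_io_timing = on" \
        -c "EXPLAIN (ANALYZE, BUFFERS) $QUERY"
}

cold_run
```

Running the same EXPLAIN a second time without the restart gives the
warm-cache number for comparison. Set DRY_RUN=0 only on a test machine,
since the script stops the server.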

So please specify the goal of your experiment more precisely.
"max possible I/O time for a query" depends on so many factors...
Do you consider just one client working in isolation, or will there be
many concurrent queries and background tasks like autovacuum and the
checkpointer competing for resources?

My point is that if you need a deterministic result, then you will
have to exclude a lot of different factors which may affect performance,
and then ... you are calculating the speed of a horse in a vacuum, which
has almost no relation to real performance.
