Quick Links

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc:	Melanie Plageman <melanieplageman(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject:	Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date:	2022-11-25 22:46:11
Message-ID:	20221125224611.efs6mtrijedkjhzs@awork3.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2022-11-22 23:43:29 -0600, Justin Pryzby wrote:
> I think there may be a problem/deficiency with hint bits:
>
> |postgres=# DROP TABLE u2; CREATE TABLE u2 AS SELECT generate_series(1,999999)a; SELECT pg_stat_reset_shared('io'); explain (analyze,buffers) SELECT * FROM u2;
> |...
> | Seq Scan on u2 (cost=0.00..15708.75 rows=1128375 width=4) (actual time=0.111..458.239 rows=999999 loops=1)
> | Buffers: shared hit=2048 read=2377 dirtied=2377 written=2345
>
> |postgres=# SELECT COUNT(1), relname, COUNT(1) FILTER(WHERE isdirty) FROM pg_buffercache b LEFT JOIN pg_class c ON pg_relation_filenode(c.oid)=b.relfilenode GROUP BY 2 ORDER BY 1 DESC LIMIT 11;
> | count | relname | count
> |-------+---------------------------------+-------
> | 13619 | | 0
> | 2080 | u2 | 2080
> | 104 | pg_attribute | 4
> | 71 | pg_statistic | 1
> | 51 | pg_class | 1
>
> It says that SELECT caused 2377 buffers to be dirtied, of which 2080 are
> associated with the new table in pg_buffercache.

Note that there's 2048 dirty buffers for u2 in shared_buffers before the
SELECT, despite the relation being 4425 blocks long, due to the CTAS using
BAS_BULKWRITE.

I don't think there's any "circumvention" of the ringbuffer here. There's 2048
buffers for u2 in s_b before, all dirty, there's 2080 after, also all
dirty. So the ringbuffer restricted the increase in shared buffers used for u2
to 2080-2048=32 additional buffers.

The reason hint bits don't prevent pages from being written out here is that a
BAS_BULKREAD strategy doesn't cause all buffer writes to be rejected, it just
causes buffer writes to be rejected when the page LSN would require a WAL
flush. And that's not typically the case when you just set a hint bit, unless
you use wal_log_hint_bits = true.

If I turn on wal_log_hints=true and add a CHECKPOINT after the CTAS I see 0
reuses (and 4425 dirty buffers), which is what I'd expect.

> But the IO view says that 2345 of the pages were "reused", which seems
> misleading to me. Maybe that just follows from the behavior and the view is
> fine. If the view is fine, maybe this case should still be specifically
> mentioned in the docs.

I think that's just confusing due to the reset. 2048 + 2345 = 4393, but we
only have 2080 buffers for u2 in s_b.

Greetings,

Andres Freund

In response to

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) at 2022-11-23 05:43:29 from Justin Pryzby

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Geoghegan	2022-11-25 22:47:57	Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Previous Message	Zheng Li	2022-11-25 22:23:09	Re: Support logical replication of DDLs