Re: PATCH: Attempt to make dbsize a bit more consistent

From: gkokolatos(at)pm(dot)me
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PATCH: Attempt to make dbsize a bit more consistent
Date: 2021-03-15 12:18:12
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, March 15, 2021 7:10 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

&gt; On Wed, Feb 24, 2021 at 02:35:51PM +0000, gkokolatos(at)pm(dot)me wrote:
&gt; &gt; Now with attachment. Apologies for the chatter.
&gt; The patch has no documentation for the two new functions, so it is a
&gt; bit hard to understand what is the value brought here, and what is the
&gt; goal wanted just by reading the patch, except that this switches the
&gt; size reported for views to NULL instead of zero bytes by reading the
&gt; regression tests.

Understood and agreed. Thank you very much for looking.

&gt; Please note that reporting zero is not wrong for views IMO as these
&gt; have no physical storage, so, while it is possible to argue that a
&gt; size of zero could imply that this relation size could not be zero, it
&gt; will never change, so I'd rather let this behavior as it is. I
&gt; would bet, additionally, that this could break existing use cases.
&gt; Reporting NULL, on the contrary, as your patch does, makes things
&gt; worse in some ways because that's what pg_relation_size would report
&gt; when a relation does not exist anymore. Imagine for example the case
&gt; of a DROP TABLE run in parallel of a scan of pg_class using
&gt; pg_relation_size(). So it becomes impossible to know if a relation
&gt; has been removed or not. This joins some points raised by Soumyadeep
&gt; upthread.

Yeah, I would have agreed with you before actually looking a bit closer.
I think that the gist of it is that the intention of the functions and
the actual usage/documentation of those have diverged. I honestly thought
that pg_relation_size() was a general function intended for any kind of
relation, whereas pg_table_size() was a function intended only for tables
(toast and mat views included).

Then I read the comment describing pg_relation_size() which reads
'disk space usage for the main fork of the specified table or index' and
'disk space usage for the specified fork of a table or index', found
initially in commit 358a897fa1d and maintained since.

But I digress.

&gt; Anyway, as mentioned by other people upthread, I am not really
&gt; convinced either that we should have more flavors of size functions,
&gt; particularly depending on the relkind as this would be confusing for
&gt; the end-user. pg_relation_size() can already do this job for all
&gt; relkinds that use segment files, so it should still be able to hold
&gt; its ground and maintain a consistent behavior with what it does
&gt; currently.

I must have missunderstood the other people upthread and I thought
that new flavours were requested. Thank you for clarifying and
correcting me.

&gt; +static int64
&gt; +calculate_table_fork_size(Relation rel, ForkNumber forkNum)
&gt; +{
&gt; - return (int64)table_relation_size(rel, forkNum);
&gt; +}
&gt; Why isn't that just unsigned, like table_relation_size()? This is an
&gt; internal function so it does not matter to changes its signature, but
&gt; I think that you had better do a cast at a higher level instead.
&gt; for (int i = 0; i &lt; MAX_FORKNUM; i++)
&gt; - nblocks += smgrnblocks(rel-&gt;rd_smgr, i);
&gt; - nblocks += smgrexists(rel-&gt;rd_smgr, i)
&gt; - ? smgrnblocks(rel-&gt;rd_smgr, i)
&gt; - : 0;
&gt; Is that actually the part for views? Why is that done this way?

No, it is not for views. The codebase should never reach this function
for a view or it would be a serious bug elsewhere.

This is addressing a bug in table_relation_size(). This function is correctly
not requiring for its caller to know any specifics about the forks. A heap
table is not required to have created any fsm, or vm, forks neither. Finally
smgrnblocks() assumes that the fork actually exists or errors out.

This change, makes certain that calling table_relation_size() with a
non existing fork will not generate an error.

&gt; - if (rel-&gt;rd_rel-&gt;relkind == RELKIND_RELATION ||
&gt; - rel-&gt;rd_rel-&gt;relkind == RELKIND_TOASTVALUE ||
&gt; - rel-&gt;rd_rel-&gt;relkind == RELKIND_MATVIEW)
&gt; - size = calculate_table_fork_size(rel,
&gt; - forkname_to_number(text_to_cstring(forkName)));
&gt; - else if (rel-&gt;rd_rel-&gt;relkind == RELKIND_INDEX)
&gt; - size = calculate_relation_size(&amp;(rel-&gt;rd_node), rel-&gt;rd_backend,
&gt; - forkname_to_number(text_to_cstring(forkName)));
&gt; - else
&gt; - {
&gt; - relation_close(rel, AccessShareLock);
&gt; - PG_RETURN_NULL();
&gt; - }
&gt; The approach you are taking to bring some consistency in all that by
&gt; making the size estimations go through table AMs via
&gt; table_relation_size() is promising. However, this code breaks the
&gt; size estimation for sequences, which is not good. If attempting to
&gt; use an evaluation that's based on a table AM, shouldn't this code use
&gt; a check based on rd_tableam rather than the relkind then? We assume
&gt; that rd_tableam is set depending on the relkind in relcache.c, for
&gt; one.

Thank you for the remarks. Please find v5 attached which is a minimal
patch to try to use the table am api if the relation is using the table
am api. Otherwise all else remains the same.


&gt; --
&gt; Michael


Attachment Content-Type Size
v5-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch application/octet-stream 6.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2021-03-15 12:18:29 Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
Previous Message Amul Sul 2021-03-15 12:15:22 Re: [Patch] ALTER SYSTEM READ ONLY