Re: PATCH: Attempt to make dbsize a bit more consistent

From: gkokolatos(at)pm(dot)me
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: David Zhang <david(dot)zhang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: PATCH: Attempt to make dbsize a bit more consistent
Date: 2020-10-13 13:28:02
Message-ID: 4kwHwcWKttIHpkklBxFGV9DqvkBTF1PyJ8m0ba38ukztGFnW1DLU7W7MSAwNLVfG_lsNvR7wSk-Ga_Hav-oK0rDypd56eiPhrUoo_ii2K9I=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, September 10, 2020 12:51 PM, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:

[snip]

> Since we have introduced the table AM api I support going throug it for all
> table accesses, so +1 on the overall idea.
>

Thank you for reviewing! Please find v2 of the patch attached.

In addition to addressing the comments, this patch contains a slightly opinionated approach during describe. In short, only relations that have storage, are returning non null size during when \d*+ commands are emitted. Such an example is a view which can be found in the psql tests. If a view was returning a size of 0 Bytes, it would indicate that it can potentially be non zero, which is of course wrong.

> Some comments on the patch:
>
> - - Note that this also behaves sanely if applied to an index or toast table;
>
> - - Note that this also behaves sanely if applied to a toast table;
> - those won't have attached toast tables, but they can have multiple forks.
> This comment reads a bit odd now and should probably be reworded.
>

Agreed and amended.

>
> - return size;
>
> - Assert(size < PG_INT64_MAX);
>
> -
> - return (int64)size;
> I assume that this change, and the other ones like that, aim to handle int64
> overflow? Using the extra legroom of uint64 can still lead to an overflow,
> however theoretical it may be. Wouldn't it be better to check for overflow
> before adding to size rather than after the fact? Something along the lines of
> checking for headroom left:
>
> rel_size = table_relation_size(..);
> if (rel_size > (PG_INT64_MAX - total_size))
> < error codepath >
>
>
> total_size += rel_size;

Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.

The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the smaller type can cover more than 9200 PetaByte tables.

If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.

>
> - if (rel->rd_rel->relkind != RELKIND_INDEX)
>
> - {
>
> - relation_close(rel, AccessShareLock);
>
>
> - PG_RETURN_NULL();
>
>
> - }
> pg_indexes_size is defined as returning the size of the indexes attached to the
> specified relation, so this hunk is wrong as it instead requires rel to be an
> index?

You are absolutely correct, amended.

>
> cheers ./daniel
>

Attachment Content-Type Size
v2-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch application/octet-stream 10.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Li Japin 2020-10-13 13:30:47 Remove unnecessary else branch
Previous Message Russell Foster 2020-10-13 13:10:43 [Patch] Using Windows groups for SSPI authentication