Quick Links

Re: PATCH: Attempt to make dbsize a bit more consistent

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
Cc:	gkokolatos(at)pm(dot)me, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject:	Re: PATCH: Attempt to make dbsize a bit more consistent
Date:	2021-02-01 12:18:52
Message-ID:	CAD21AoC49DM1TE-XfsJ2qswbeM9UT8ns2XWtOXuYqXtchRHTQw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
<soumyadeep2007(at)gmail(dot)com> wrote:
>
> Hey Georgios,
>
> On Tue, Nov 10, 2020 at 6:20 AM <gkokolatos(at)pm(dot)me> wrote:
> >
> >
> >
> >
> >
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> wrote:
> >
> > > Hey Georgios,
> > >
> > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > my review below:
> >
> > A great review Soumyadeep, it is much appreciated.
> > Please remember to add yourself as a reviewer in the commitfest
> > [https://commitfest.postgresql.org/30/2701/]
>
> Ah yes. Sorry, I forgot that!
>
> > >
> > > On Tue, Oct 13, 2020 at 6:28 AM gkokolatos(at)pm(dot)me wrote:
> > >
> > > 1.
> > >
> > > > /*
> > > >
> > > > - - heap size, including FSM and VM
> > > >
> > > > - - table size, including FSM and VM
> > > > */
> > > >
> > >
> > > We should not mention FSM and VM in dbsize.c at all as these are
> > > heap AM specific. We can say:
> > > table size, excluding TOAST relation
> >
> > Yeah, I was thinking that the notion that FSM and VM are still taken
> > into account should be stated. We are iterating over ForkNumber
> > after all.
> >
> > How about phrasing it as:
> >
> > + table size, including all implemented forks from the AM (e.g. FSM, VM)
> > + excluding TOAST relations
> >
> > Thoughts?
>
> Yes, I was thinking along the same lines. The concept of a "fork" forced
> should not be forced down into the tableAM. But that is a discussion for
> another day. We can probably say:
>
> + table size, including all implemented forks from the AM (e.g. FSM, VM
> + for the heap AM) excluding TOAST relations
>
> > >
> > > 2.
> > >
> > > > /*
> > > >
> > > > - Size of toast relation
> > > > */
> > > > if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > >
> > > >
> > > > - size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > >
> > > > - {
> > > > - Relation toastRel;
> > > > -
> > > > - toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > >
> > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > make things more readable.
> >
> > Please, allow me to kindly disagree.
> >
> > Relation is already open at this stage. Even create_toast_table(), the
> > internal workhorse for creating toast relations, does check reltoastrelid
> > to test if the relation is already toasted.
> >
> > Furthermore, we do call:
> >
> > + toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> >
> > and in order to avoid elog() errors underneath us, we ought to have
> > verified the validity of reltoastrelid.
> >
> > In short, I think that the code in the proposal is not too unreadable
> > nor that it breaks the coding patterns throughout the codebase.
> >
> > Am I too wrong?
>
> No not at all. The code in the proposal is indeed adhering to the
> codebase. What I was going for here was to increase the usage of
> relation_needs_toast_table(). What I meant was:
>
> if (table_relation_needs_toast_table(rel))
> {
> if (!OidIsValid(rel->rd_rel->reltoastrelid))
> {
> elog(ERROR, <errmsg that toast table wasn't found>);
> }
> //size calculation here..
> }
>
> We want to error out if the toast table can't be found and the relation
> is expected to have one, which the existing code doesn't handle.
>
>
> >
> > >
> > > 3.
> > >
> > > > - if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > - rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > - rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > - size = calculate_table_size(rel);
> > > > - else
> > > > - {
> > > > - relation_close(rel, AccessShareLock);
> > > > - PG_RETURN_NULL();
> > > > - }
> > >
> > > This leads to behavioral changes:
> > >
> > > I am talking about the case where one calls pg_table_size() on an index.
> > > W/o your patch, it returns the size of the index. W/ your patch it
> > > returns NULL. Judging by the documentation, this function should not
> > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > of users use it for this purpose, as there is no function to calculate
> > > the size of a single specific index (except pg_relation_size()).
> > > The same argument I have made above applies to sequences. Users may have
> > > trial-and-errored their way into finding that pg_table_size() can tell
> > > them the size of a specific index/sequence! I don't know how widespread
> > > the use is in the user community, so IMO maybe we should be conservative
> > > and not introduce this change? Alternatively, we could call out that
> > > pg_table_size() is only for tables by throwing an error if anything
> > > other than a table is passed in.
> > >
> > > If we decide to preserve the existing behavior of the pg_table_size():
> > > It seems that for things not backed by the tableAM (indexes and
> > > sequences), they should still go through calculate_relation_size().
> > > We can call table_relation_size() based on if relkind is
> > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > such as a partitioned table (Currently w/ the patch applied, we return
> > > NULL for those cases, which is another behavior change)
> >
> >
> > Excellent point. This is the discussion I was longing to have.
> >
> > I stand by the decision coded in the patch, that pg_table_size() should
> > return NULL for other kinds of relations, such as indexes, sequences
> > etc.
> >
> > It is a conscious decision based on the following:
> >
> > * Supported by the documentation, pg_table_size() applies to tables only.
> > For other uses the higher-level functions pg_total_relation_size() or
> > pg_relation_size() should be used.
> > * Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> > scenarios described in the commit:
> >
> > That avoids errors when the functions are used in queries like
> > "SELECT pg_relation_size(oid) FROM pg_class",
> > and a table is dropped concurrently.
> >
> > IMHO: It is more consistent to return NULL when the relation does exist
> > OR it is not a table kind.
> > * Returning 0 for things that do not have storage, is nonsensical because
> > it implies that it can be NON zero at some point. Things that do not
> > have storage have an unknown size.
>
> Fair. We will have to document the behavior change.
>
> >
> > As far as for the argument that users might have trialed and errored
> > their way into undocumented behaviour, I do not think it is strong
> > enough to stop us from implementing the documented behaviour.
>
> Fair. I would strongly vote for having two additional functions
> (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> of banning that kind of use of pg_table_size(). I think it would help
> users a lot. It is not easy to find what function to call when you want
> the size of a single index/sequence.
>
> > >
> > > 4.
> > >
> > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > gettext_noop("Access Method"));
> > > >
> > > > /*
> > > >
> > > >
> > > > - * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > >
> > > >
> > > > - * sequences as it does not behave sanely for those.
> > > >
> > > >
> > > > - *
> > > > * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > > * size of a table, including FSM, VM and TOAST tables.
> > > > */
> > > >
> > > >
> > > >
> > > > - if (pset.sversion >= 90000)
> > > >
> > > >
> > > >
> > > > - if (pset.sversion >= 140000)
> > > >
> > > >
> > > > - appendPQExpBuffer(&buf,
> > > >
> > > >
> > > > - ",\\n CASE"
> > > >
> > > >
> > > > - " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > >
> > > >
> > > > - CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > >
> > > >
> > > > - CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > >
> > > >
> > > > - " pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > >
> > > >
> > > > - " ELSE"
> > > >
> > > >
> > > > - " pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > >
> > > >
> > > > - " END as \\"%s\\"",
> > > >
> > > >
> > > > - gettext_noop("Size"));
> > > >
> > > >
> > > > - else if (pset.sversion >= 90000)
> > > > appendPQExpBuffer(&buf,
> > > > ",\\n pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\"%s\\"",
> > > > gettext_noop("Size"));
> > > >
> > > >
> > >
> > > Following on from point 3, if we decide to preserve the existing behavior,
> > > we wouldn't need this change, as it would be internalized within
> > > pg_table_size().
> >
> > We really should not decide to preserve the existing behaviour.
> >
> > I will reiterate my point: Returning 0 for things that do not have
> > storage, implies that it can be NON zero at some point. We should not
> > treat pg_table_size() as an alias for pg_relation_size().
>
> +1
>
> > >
> > > 4.
> > >
> > > > > - return size;
> > > > >
> > > > > - Assert(size < PG_INT64_MAX);
> > > > >
> > > > > -
> > > > > - return (int64)size;
> > > > > I assume that this change, and the other ones like that, aim to handle int64
> > > > > overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > > however theoretical it may be. Wouldn't it be better to check for overflow
> > > > > before adding to size rather than after the fact? Something along the lines of
> > > > > checking for headroom left:
> > > > > rel_size = table_relation_size(..);
> > > > > if (rel_size > (PG_INT64_MAX - total_size))
> > > > > < error codepath >
> > > > >
> > > > >
> > > > > total_size += rel_size;
> > > >
> > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> > >
> > > Changing the signature would be the ideal change for all of this here.
> > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > to what Daniel said above.
> >
> > Apologies, I do not follow. Are you suggesting that we should
> > introduce overflow tests?
>
> Yes, to introduce the same overflow test that Daniel suggested above as
> returning a uint64 in PG does not really return a uint64 AFAIU (since
> the pg_**size() functions all return bigint which is signed and there
> is no uint64 user-facing type).

Status update for a commitfest entry.

This patch gets review comments and there was some discussion. It
seems we're waiting for the patch update. So I've moved this patch to
the next commitfest and set it to "Waiting on Author".

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Re: PATCH: Attempt to make dbsize a bit more consistent at 2020-11-11 17:53:48 from Soumyadeep Chakraborty

Responses

Re: PATCH: Attempt to make dbsize a bit more consistent at 2021-02-19 15:57:22 from gkokolatos

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2021-02-01 12:26:01	Re: Single transaction in the tablesync worker?
Previous Message	Dilip Kumar	2021-02-01 10:52:34	Re: Faulty HEAP_XMAX_LOCK_ONLY & HEAP_KEYS_UPDATED hintbit combination