Re: ToDo: show size of partitioned table

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, mathias(at)brossard(dot)org, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: ToDo: show size of partitioned table
Date: 2019-02-07 10:54:46
Message-ID: CAFj8pRBNkLsVhHpTsZGHg9H=cOc7xNQVF6qJ3wsK6LdtzLEYFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

čt 7. 2. 2019 v 11:25 odesílatel Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
napsal:

> Hi,
>
> On 2019/02/07 18:08, Pavel Stehule wrote:
> > čt 7. 2. 2019 v 9:51 odesílatel Amit Langote <
> Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
> >> \dPn seems to work fine, but I don't quite understand why \dPn+ should
> >> show the sizes only for nested partitions of level. Consider the
>
> (correcting words of my previous email: ... of level 1.)
>
> > Show nested objects in rectangular output is a problem. I prefer a design
> > where any times the sum of displayed sizes is same like total size.
> >
> > So if I have partitions on level1 of size 16KB, and on level 2 8KB, then
> I
> > would to display 16 and 8, and not 24 and 8. If I remember, this rule is
> > modified, when filter is used.
>
> Just to recap, the originally proposed feature is to show the size of a
> partitioned table by summing the sizes of *all* of its (actually leaf)
> partitions, which \dP[t|i]+ gives us. As you mentioned, a limitation of
> the feature as initially proposed is that it only shows partitioned tables
> that are roots of their respective partition trees. That is, there is no
> way to see the sizes of the intermediate partitioned tables using any of
> psql's \d commands. So, you introduced the "n" modifier, whereby
> \dP[t|i]n+ now shows *also* the intermediate partitioned tables with their
> sizes. But it only considers the directly attached partitions of each
> partitioned table that's shown. So, only those partitioned tables that
> have leaf partitions directly attached them are shown with non-0 size (if
> leaf partitions are non-empty) and others with size 0 (root partitioned
> tables in most cases where nested partitioned tables are involved). But I
> think that means the "n" modifier is changing the behavior of the base
> command (\dP+) which is meant to show the total size of *all* partitions
> under a given partitioned table. Maybe, the "n" modifier should only
> result in including the nested/intermediate partitioned tables and nothing
> more than that.
>

It was a Michael's request to see all hierarchy, and I think so it has some
benefits

> I see your point that all these tables are appearing in the display as one
> flat list and so the sizes of same leaf partitions may be multiply
> counted, but it's not totally a flat representation given that you have
> added "Parent name" column. We could document that the size of a nested
> partitioned table shown in the display is also counted in the size of its
> parent partitioned table. That I think may be easier to understand than
> that the size of each partitioned table shown in the display only
> considers the sizes of leaf partitions that are directly attached to it.
>
>
Personally I don't agree - a) who read a documentation, b) it is really
violation of some relation principles. It is clean, if we have only one
table, but if we see a report with more tables, than multiple size
calculation can be messy.

It is not problem if you have clean schema like P1, P2, tables (when tree
is balanced). But when some tables can be assigned to P1 and some to P2
(tree is not balanced) then it is not clean what is size of directly
attached tables and what is size of subpartitions. So it is better don't
sum apples and oranges.

\dPn shows all subroots and related minimal size. I think so this is very
clear definition.

Your example

postgres=# \dPtn+
List of partitioned tables
┌────────┬────────┬───────┬─────────────┬────────────┬─────────────┐
│ Schema │ Name │ Owner │ Parent name │ Size │ Description │
╞════════╪════════╪═══════╪═════════════╪════════════╪═════════════╡
│ public │ p │ pavel │ │ 8192 bytes │ │
│ public │ p_1 │ pavel │ p │ 8192 bytes │ │
│ public │ p_1_bc │ pavel │ p_1 │ 8192 bytes │ │
└────────┴────────┴───────┴─────────────┴────────────┴─────────────┘
(3 rows)

I hope so the interpretation is clean .. there are three partitioned tables
(two are subpartitioned tables). Any partitioned table has assigned 8KB of
data.

We can introduce new column "size with sub partitions" where these numbers
can be counted together. But for term "size" I expect valid rule S1+S2+..
SN = total size.

It is acceptable for you?

> Thoughts? Any more opinions on this?
>

>
> Thanks,
> Amit
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-02-07 11:49:43 Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Previous Message Jose Luis Tallon 2019-02-07 10:37:22 Re: phase out ossp-uuid?