Re: pgstattuple documentation clarification

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgstattuple documentation clarification
Date: 2016-12-21 14:04:01
Message-ID: 57f4e886-92f3-8440-59c5-47a2ae9d818d@dunslane.net
Lists: pgsql-hackers

On 12/20/2016 11:41 PM, Robert Haas wrote:
> On Tue, Dec 20, 2016 at 10:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> Recently a client was confused because there was a substantial
>>> difference between the reported table_len of a table and the sum of the
>>> corresponding tuple_len, dead_tuple_len and free_space. The docs are
>>> fairly silent on this point, and I agree that in the absence of
>>> explanation it is confusing, so I propose that we add a clarification
>>> note along the lines of:
>>> The table_len will always be greater than the sum of the tuple_len,
>>> dead_tuple_len and free_space. The difference is accounted for by
>>> page overhead and space that is not free but cannot be attributed to
>>> any particular tuple.
>>> Or perhaps we should be more explicit and refer to the item pointers on
>>> the page.
>> I find "not free but cannot be attributed to any particular tuple"
>> to be entirely useless weasel wording, not to mention wrong with
>> respect to item pointers in particular.
>>
>> Perhaps we should start counting the item pointers in tuple_len.
>> We'd still have to explain about page header overhead, but that
>> would be a pretty small and fixed-size discrepancy.
> It's pretty weird to count unused or dead line pointers as part of
> tuple_len, and it would screw things up for anybody trying to
> calculate the average width of their tuples, which is an entirely
> reasonable thing to want to do. I think if we're going to count item
> pointers as anything, it needs to be some new category -- either item
> pointers specifically, or an "other stuff" bucket.
>

Yes, I agree. In any case, before we change anything, can we agree on a
description of what we currently do?

Here's a second attempt:

The table_len will always be greater than the sum of the tuple_len,
dead_tuple_len and free_space. The difference is accounted for by
fixed page overhead, the per-page table of pointers to tuples, and
padding to ensure that tuples are correctly aligned.

I don't think any of that is weaselish :-)
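
To make the accounting concrete, here is a rough sketch (in Python, only for
illustration) of how the gap between table_len and the three reported
lengths could be split up. It assumes a default build: 8192-byte blocks, a
24-byte page header and 4-byte item pointers. The function names and the
sample numbers are made up, and the item pointer figure is only a lower
bound, since unused pointers are not visible in the pgstattuple output.

```python
# Assumed constants for a default 64-bit PostgreSQL build.
BLOCK_SIZE = 8192        # default block size
PAGE_HEADER_BYTES = 24   # fixed page header
LINE_POINTER_BYTES = 4   # one item pointer per tuple slot


def unaccounted_bytes(table_len, tuple_len, dead_tuple_len, free_space):
    """Bytes in table_len not covered by the three reported categories."""
    return table_len - (tuple_len + dead_tuple_len + free_space)


def estimate_breakdown(table_len, tuple_len, dead_tuple_len, free_space,
                       tuple_count, dead_tuple_count):
    """Split the gap into page headers, item pointers and a remainder.

    The item pointer figure counts only pointers for tuples that
    pgstattuple reports, so the remainder also absorbs unused pointers,
    alignment padding and any other per-page overhead.
    """
    pages = table_len // BLOCK_SIZE
    gap = unaccounted_bytes(table_len, tuple_len, dead_tuple_len, free_space)
    headers = pages * PAGE_HEADER_BYTES
    pointers = (tuple_count + dead_tuple_count) * LINE_POINTER_BYTES
    return {
        "gap": gap,
        "page_headers": headers,
        "line_pointers_min": pointers,
        "padding_and_other": gap - headers - pointers,
    }


# Made-up figures for a 10-page table, not taken from a real system.
example = estimate_breakdown(
    table_len=81920, tuple_len=70000, dead_tuple_len=2000, free_space=5000,
    tuple_count=1000, dead_tuple_count=30,
)
```

With narrow tuples the item pointers dominate the gap, which is why the
difference can look substantial on some tables.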

cheers

andrew
