Re: Explain buffers display units.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain buffers display units.
Date: 2010-02-16 02:48:36
Message-ID: 603c8f071002151848o5cea7d02hc58c739b84728a2a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 15, 2010 at 6:44 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> I did respond to it. The whole point is that the text output is for a
> human to read. It should be printed in human-readable units. Not some
> arbitrary internal unit of accounting that they then have to do
> arithmetic on to make sense of.

Well, I disagree with your statement the previous output was not
printed in human-readable units: it was printed in blocks, which I
find to be a perfectly good unit. It's true that the basic unit of
blocks can be converted into kilobytes, but so what? We aren't really
measuring kilobytes; we're measuring blocks. We could equally well
convert the sort and hash output from kilobytes into blocks, but it
would be equally wrong: the sort and hash statistics are measuring
memory usage by adding up actual memory allocations. The buffer
statistics are simply counting the number of blocks that are read or
written. Multiplying by the block size makes it sound as if all the
memory was read or used, which is simply not the case - especially for
things like buffer hits, which don't actually read or allocate any
memory at all.

> We do *not* display raw block numbers anywhere else. Generally I think
> we should have a policy of outputing human-readable standard units of
> memory whenever displaying a memory quantity. Actually I thought we
> already had that policy, hence things like:
>
> postgres=# show shared_buffers;
>  shared_buffers
> ----------------
>  28MB
> (1 row)
>
> postgres=# show checkpoint_timeout;
>  checkpoint_timeout
> --------------------
>  5min
> (1 row)

We certainly do that for GUCs, and in that context it seems to me to
make sense. If you set your shared buffers to a gigabyte, PG will use
an additional GB of memory. But if you hit a "gigabyte" of shared
buffers, you may be examining anywhere from one 8K block over and over
again all the way up to a full GB of memory. Block hits and reads
just don't add in the same way that actual memory allocations do.

And at any rate, what we DON'T do for GUCs is produce differing output
format for the same parameter based on the magnitude of the output
value, as you've done here. We accept input in several different
formats, but there is only one canonical output formal for any
particular GUC, which is furthermore always chosen in such a way that
the exact value of the setting is preserved (again, unlike what you've
done here).

> The other examples you name are all internal or machine-readable
> fomats which have to be formatted somehow using sql queries or tools
> if you want to inspect the values directly. The user is free to format
> the output of the pg_stat* functions using pg_size_pretty() though
> it's annoying that it's not in the same base unit that
> pg_relation_size() outputs but these are the only interface to these
> internal counters so there's no way to know if they're being used for
> human-readable output or for gathering raw data for statistics or
> other purposes.

So, you're saying we shouldn't look at the way that the pg_stat
functions format the output because somebody might write a view over
it that formats it in some different way that may or may not match
what you've done for the EXPLAIN output? What makes you think that
people don't just look at the raw numbers? I certainly have, and
there's no suggestion in the documentation that users should do
anything else.

pg_stat_statements doesn't do what you're suggesting either; it, too,
presents raw numbers, and lets the user make of it what they will.
They might, for example, want to compute a hit ratio, as in the
example provided in the docs. In the case of EXPLAIN of an index
scan, they might want to estimate the number of seeks, on the theory
that an inner-indexscan is going to be all random IO.

>> I think this is a really terrible idea.  You've got a lot of very
>> specific formatting code in explain.c which anyone who wants to use
>> the JSON and XML output will very possibly need to reimplement.  I
>> have worked really hard to keep the text format in sync with all the
>> others, and up until now they have been.
>
> You're assuming the JSON and XML program is planning to display the
> measurements? They might not be. They might be gathering them for
> charting or for alerts or all kinds of other things. Even if they do
> plan to output them they'll want to format it in way that makes sense
> for the context it's used in which might include more or fewer digits
> or plug into some widget which requires raw values and does the
> formatting automatically.

Yes, they might want to write their own formatting code, but they also
might not. They might want to calculate hit ratios, or they might
want to alter the number of decimal places, or they might just want to
output the exact same information as the text format, but in a GUI
format rather than using ASCII art.

> Whereas the human-readable format should display values in a form
> humans can parse, the machine-readable output should include the raw
> measurements with enough information for the tool to make sense of it.
>  Probably the XML schema should include the units as an attribute for
> each tag so tools don't have to hard-code knowledge about what unit
> each tag is in.

This doesn't seem to be a very carefully thought out proposal, because
you haven't explained how it would work for JSON or YAML output. A
format-neutral solution which we've already used for sort and hash
information (and for GUCs) is to include the unit designator in the
output.. But I generally think that trying to make the EXPLAIN output
self-documenting to the point where programs don't need any specific
knowledge of the meanings of the various properties is rather
hopeless. If we want to provide that knowledge, it should be encoded
in some sort of separate XML schema or data dictionary rather than
encoded in the output of every run of EXPLAIN. I think Andres Freund
actually wrote a relaxNG schema or something of that sort, which might
or might not be a useful starting point, but nobody felt motivated to
do anything with it.

Upon further review, I also notice that this patch seems to have
falsified the EXPLAIN documentation - both the description of the
BUFFERS option and the description of the FORMAT option are no longer
accurate.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2010-02-16 03:25:23 Re: Error when building postgresql with contrib functions
Previous Message Fujii Masao 2010-02-16 02:02:20 Re: Streaming Replication on win32