Re: Explain buffers display units.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain buffers display units.
Date: 2010-02-15 19:58:52
Message-ID: 603c8f071002151158r229611bew2986e8dee39ef7fe@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 15, 2010 at 1:29 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Mon, Feb 15, 2010 at 6:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Well there was a 30+ message thread almost a week ago where there
>>> seemed to be some contention over the issue of whether the numbers
>>> should be averages or totals. But were there was no dispute over the
>>> idea of printing in memory units instead of blocks.
>>
>> Hmm.... yeah, I guess it wasn't discussed.  I'm still not sure it's an
>> improvement.  If a query hit one buffer, is that really the same as
>> saying it hit 8kB?
>
> Well you can always convert between them. The only time it would make
> a difference is if you're sure it's random i/o and you're concerned
> with the number of iops. However it's impossible to tell from this
> output how many of these buffers are read sequentially and how many
> randomly. Even if it's sequential you don't know how much it read
> between interruptions to handle the inner side of a join or whether
> the cached blocks were interspersed throughout the file or were all at
> the beginning or end.

All true, although "you can always converted between them" assumes you
know the block size. I don't imagine many people change that, but...

> I think we should provide better tools to measure these things
> directly rather than force users to make deductions from buffer
> counts. I'm still excited about using dtrace to get real counts of
> iops, seeks, etc.

Sure.

>>  To me, buffers seem like discrete (and unitless)
>> entities, and we handle them that way elsewhere in the system (see,
>> e.g. pg_stat_database, pg_statio_all_tables).  I don't know that it's
>> a good idea to display that same information here in a different
>> format.

This seems like an important point that you need to respond to. Why
should we print out this information in kB here when we display it as
raw numbers elsewhere? I can't see any reason at all.

>> I definitely do not want to do anything that loses accuracy.  This is
>> probably accurate enough for most uses, but it's still not as accurate
>> as just printing the raw numbers.
>
> I left the XML/JSON output in terms of blocks on the theory that tools
> reading this data can look up the block size and convert all it wants.

I think this is a really terrible idea. You've got a lot of very
specific formatting code in explain.c which anyone who wants to use
the JSON and XML output will very possibly need to reimplement. I
have worked really hard to keep the text format in sync with all the
others, and up until now they have been.

> Incidentally looking at the pg_size_pretty() functions reminds me that
> these counters are all 32-bit. That means they'll do funny things if
> you have a query which accesses over 16TB of data... I suspect this
> should probably be changed though I'm feeling lazy about it unless
> someone else wants to push me to do it now.

Well that will require fixing a whole lot of bits in the stats
infrastructure that are only minimally related to this patch. That is
certainly 9.1 material.

Basically, I think this whole change is a bad idea and should be
reverted. You've made the text format EXPLAIN inconsistent with both
the non-text formats and with the rest of the buffer statistics stuff
for absolutely no benefit that I can see.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-02-15 20:00:16 Re: Listen / Notify - what to do when the queue is full
Previous Message David E. Wheeler 2010-02-15 19:52:01 Re: PostgreSQL::PLPerl::Call - Simple interface for calling SQL functions from PostgreSQL PL/Perl