On Mon, Feb 15, 2010 at 1:29 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Mon, Feb 15, 2010 at 6:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Well there was a 30+ message thread almost a week ago where there
>>> seemed to be some contention over the issue of whether the numbers
>>> should be averages or totals. But were there was no dispute over the
>>> idea of printing in memory units instead of blocks.
>> Hmm.... yeah, I guess it wasn't discussed. I'm still not sure it's an
>> improvement. If a query hit one buffer, is that really the same as
>> saying it hit 8kB?
> Well you can always convert between them. The only time it would make
> a difference is if you're sure it's random i/o and you're concerned
> with the number of iops. However it's impossible to tell from this
> output how many of these buffers are read sequentially and how many
> randomly. Even if it's sequential you don't know how much it read
> between interruptions to handle the inner side of a join or whether
> the cached blocks were interspersed throughout the file or were all at
> the beginning or end.
All true, although "you can always converted between them" assumes you
know the block size. I don't imagine many people change that, but...
> I think we should provide better tools to measure these things
> directly rather than force users to make deductions from buffer
> counts. I'm still excited about using dtrace to get real counts of
> iops, seeks, etc.
>> To me, buffers seem like discrete (and unitless)
>> entities, and we handle them that way elsewhere in the system (see,
>> e.g. pg_stat_database, pg_statio_all_tables). I don't know that it's
>> a good idea to display that same information here in a different
This seems like an important point that you need to respond to. Why
should we print out this information in kB here when we display it as
raw numbers elsewhere? I can't see any reason at all.
>> I definitely do not want to do anything that loses accuracy. This is
>> probably accurate enough for most uses, but it's still not as accurate
>> as just printing the raw numbers.
> I left the XML/JSON output in terms of blocks on the theory that tools
> reading this data can look up the block size and convert all it wants.
I think this is a really terrible idea. You've got a lot of very
specific formatting code in explain.c which anyone who wants to use
the JSON and XML output will very possibly need to reimplement. I
have worked really hard to keep the text format in sync with all the
others, and up until now they have been.
> Incidentally looking at the pg_size_pretty() functions reminds me that
> these counters are all 32-bit. That means they'll do funny things if
> you have a query which accesses over 16TB of data... I suspect this
> should probably be changed though I'm feeling lazy about it unless
> someone else wants to push me to do it now.
Well that will require fixing a whole lot of bits in the stats
infrastructure that are only minimally related to this patch. That is
certainly 9.1 material.
Basically, I think this whole change is a bad idea and should be
reverted. You've made the text format EXPLAIN inconsistent with both
the non-text formats and with the rest of the buffer statistics stuff
for absolutely no benefit that I can see.
In response to
pgsql-hackers by date
|Next:||From: Tom Lane||Date: 2010-02-15 20:00:16|
|Subject: Re: Listen / Notify - what to do when the queue is full |
|Previous:||From: David E. Wheeler||Date: 2010-02-15 19:52:01|
|Subject: Re: PostgreSQL::PLPerl::Call - Simple interface for calling SQL functions from PostgreSQL PL/Perl|