Quick Links

Re: Patch: add timing of buffer I/O requests

From:	Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	"ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Patch: add timing of buffer I/O requests
Date:	2012-04-13 20:33:42
Message-ID:	CAEYLb_VtJtPa7SpxEaoqejGb0YWikHyAo68adQfByQFbFN6Gzw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 13 April 2012 20:15, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Apr 11, 2012 at 9:02 AM, ktm(at)rice(dot)edu <ktm(at)rice(dot)edu> wrote:
>> By using all 64-bits of the hash that we currently calculate, instead
>> of the current use of 32-bits only, the collision probabilities are
>> very low.
>
> That's probably true, but I'm not sure it's worth worrying about -
> one-in-four-billion is a pretty small probability.

That assumes that our hashing of query trees will exhibit uniform
distribution. While it's easy enough to prove that hash_any does that,
it would seem to me that that does not imply that we will exhibit
perfectly uniform distribution within pg_stat_statements. The reason
that I invented the jumble nomenclature to distinguish it from a
straight serialisation is that, apart from the fact that many fields
are simply ignored, we still couldn't deserialize what we do store
from the jumble, because the correct way to serialise a tree entails
serialising NULL pointers too - two non-equivalent jumbles could
actually be bitwise identical. In practice, I think that adding
NodeTags ought to be sufficient here, but I wouldn't like to bet my
life on it. Actually, that is a point that perhaps should have been
touched on in the comments at the top of the file.

You may wish to take a look at my original analysis in the
pg_stat_statements normalisation thread, which attempts to quantify
the odds by drawing upon the mathematics of the birthday problem.

> On the broader issue, Peter's argument here seems to boil down to
> "there is probably a reason why this is useful" and Tom's argument
> seems to boil down to "i don't want to expose it without knowing what
> that reason is". Peter may well be right, but that doesn't make Tom
> wrong. If we are going to expose this, we ought to be able to
> document why we have it, and right now we can't, because we don't
> *really* know what it's good for, other than raising awareness of the
> theoretical possibility of collisions, which doesn't seem like quite
> enough.

Well, it's important to realise that the query string isn't really
stable, to the extent that it could differ as the query is evicted and
re-enters pg_stat_statements over time. The hash value is necessarily
a stable identifier, as it is the very identifier used by
pg_stat_statements. People are going to want to aggregate this
information over long periods and across database clusters, and I
would really like to facilitate that.

To be honest, with the plans that we have for replication, with the
addition of things like cascading replication, I think it will
increasingly be the case that people prefer built-in replication. The
fact that the actual hash value is affected by factors like the
endianness of the architecture in question seems mostly irrelevant.

If we were to expose this, I'd suggest that the documentation in the
table describing each column say of the value:

query_hash | stable identifier used by pg_stat_statements for the query

There'd then be additional clarification after the existing reference
to the hash value, which gave the required caveats about the hash
value being subject to various implementation artefacts that
effectively only make the values persistently indentify queries
referencing particular objects in the same database (that is, the
object OID values are used), or across databases that are binary
clones, such as between a streaming replica master and its standby.
The OID restriction alone effectively shadows all others, so there's
no need to mention endianness or anything like that.

Anyone who jumped to the conclusion that their aggregation tool would
work fine with Slony or something based on the query_hash description
alone would probably have bigger problems.

> On another note, I think this is a sufficiently major change that we
> ought to add Peter's name to the "Author" section of the
> pg_stat_statements documentation.

Thanks. I actually thought this myself, but didn't want to mention it
because I didn't think that it was up to me to decide, or to attempt
to influence that decision.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Re: Patch: add timing of buffer I/O requests at 2012-04-13 19:15:28 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jim Nasby	2012-04-13 20:38:09	Re: Patch: add timing of buffer I/O requests
Previous Message	Robert Haas	2012-04-13 20:30:16	Re: xlog location arithmetic