Re: Hash id in pg_stat_statements

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash id in pg_stat_statements
Date: 2012-10-01 17:05:57
Message-ID: 20121001170557.GW1267@tamriel.snowman.net
Lists: pgsql-hackers

Peter,

* Peter Geoghegan (peter(at)2ndquadrant(dot)com) wrote:
> That won't really help matters. There'd still be duplicate entries,
> from before and after the change, even if we make it immediately
> obvious which is which. The only reasonable solution in that scenario
> is to bump PGSS_FILE_HEADER, which will cause all existing entries to
> be invalidated.

You're going to have to help me here, 'cause I don't see how there can
be duplicates if we include the PGSS_FILE_HEADER as part of the hash,
unless we're planning to keep PGSS_FILE_HEADER constant while we change
what the hash value is for a given query, yet that goes against the
assumptions that were laid out, aiui.

If there's a change that results in a given query X no longer hashing to
a value A, we need to change PGSS_FILE_HEADER to invalidate statistics
which were collected for value A (or else we risk an independent query Y
hashing to value A and ending up with completely invalid stats).
Provided we apply that consistently and don't reuse PGSS_FILE_HEADER
values along the way, a combination of PGSS_FILE_HEADER and the hash
value for a given query should be unique over time.
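
To make that concrete, here's a rough sketch in Python. It is not the
pg_stat_statements code: the function name, the header values, and the
hash value are all made up for illustration. The only assumption is that
stats get keyed on the pair (file header, hash) rather than on the hash
alone.

```python
# Illustrative sketch only; none of these names or values come from
# pg_stat_statements itself.
from typing import Dict, Tuple

# Key is (file header version, query hash), not the hash alone.
StatsKey = Tuple[int, int]


def record_call(stats: Dict[StatsKey, int], file_header: int, query_hash: int) -> None:
    """Count one execution under the combined (header, hash) key."""
    key = (file_header, query_hash)
    stats[key] = stats.get(key, 0) + 1


OLD_HEADER = 1  # hypothetical header value before the hashing change
NEW_HEADER = 2  # hypothetical header value after the bump
HASH_A = 0xA    # hypothetical hash value "A"

stats: Dict[StatsKey, int] = {}

# Before the change, query X hashes to A.
record_call(stats, OLD_HEADER, HASH_A)

# After the change (header bumped), an unrelated query Y hashes to A.
record_call(stats, NEW_HEADER, HASH_A)

# The two entries stay separate because the header differs.
for (header, query_hash), calls in sorted(stats.items()):
    print(f"header={header} hash={query_hash:#x} calls={calls}")
```

If the header were left unchanged across the hashing change, both calls
would land on the same key and Y's stats would be folded into X's, which
is the case the bump is meant to rule out.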

We do need to document that the hash value for a given query may
change.

Thanks,

Stephen
