Re: 64-bit queryId?

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 64-bit queryId?
Date: 2017-09-30 20:03:40
Message-ID: CAPpHfdtv_nwNkqwS6k_VgdExN3d6FKqFAMpSyqgc3UVJctEJTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 30, 2017 at 6:39 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Sat, Sep 30, 2017 at 7:34 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> > Assuming, however, that you don't manage to prove all known
> > mathematics inconsistent, what one might reasonably hope to do is
> > render collisions remote enough that one need not worry about them too
> > much in practice.
>
> Isn't that already true in the case of queryId? I've never heard any
> complaints about collisions. Most people don't change
> pg_stat_statements.max, so the probability of a collision is more like
> 1%. And, that's the probability of *any* collision, not the
> probability of a collision that the user actually cares about. The
> majority of entries in pg_stat_statements among those ten thousand
> will not be interesting. Often, 90%+ will be one-hit wonders. If that
> isn't true, then you're probably not going to find pg_stat_statements
> very useful, because you have nothing to focus on.
>

I heard from customers that they periodically dump contents of
pg_stat_statements and then build statistics over long period of time. If
even they leave default pg_stat_statements.max unchanged, probability of
collision would be significantly higher.

I have heard complaints about a number of different things in
> pg_stat_statements, like the fact that we don't always manage to
> replace constants with '?'/'$n' characters in all cases. I heard about
> that quite a few times during my time at Heroku. But never this.
>

I'm not sure that we should be oriented only by users complaints in this
problem by couple reasons.

1) Some defects could leave unrevealed by users during long periods of
time. We have examples of GiST picksplit algorithm to be bad resulting bad
query performance. However, users never complained about that because they
didn't know what is real fair performance for this kind of queries. In the
same way users may suffer from queryId collisions during long time without
noticing it. They might have inaccurate statistics of queries execution,
but they don't notice it because they have nothing to compare.

2) Ideally, we should fix potential problems before they materialize, not
only after users complaints.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2017-09-30 20:05:30 Re: Causal reads take II
Previous Message Andrew Dunstan 2017-09-30 19:58:04 Re: alter server for foreign table