Re: pg_stat_statements: calls under-estimation propagation

From: samthakur74 <samthakur74(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_statements: calls under-estimation propagation
Date: 2013-09-20 06:04:49
Message-ID: CABzZFEuj+fWpwJ8Gah7zciwysyQTdGHYbBMo0m=6uN1mmLwZUw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 19, 2013 at 11:32 AM, Fujii Masao-2 [via PostgreSQL] <
ml-node+s1045698n5771565h39(at)n5(dot)nabble(dot)com> wrote:

> On Thu, Sep 19, 2013 at 2:25 PM, samthakur74 <[hidden email]>
> wrote:
>
> >>I got the segmentation fault when I tested the case where the least-executed
> >>query statistics is discarded, i.e., when I executed different queries more than
> >>pg_stat_statements.max times. I guess that the patch might have a bug.
> > Thanks, will try to fix it.
> >
> >> >pg_stat_statements--1.1.sql should be removed.
> >> Yes will do that
> >
> >
> >>
> >> >+ <entry><structfield>queryid</structfield></entry>
> >> >+ <entry><type>bigint</type></entry>
> >> >+ <entry></entry>
> >> >+ <entry>Unique value of each representative statement for the current statistics session.
> >> >+ This value will change for each new statistics session.</entry>
> >>
> >> >What does "statistics session" mean?
> >> The time period when statistics are gathered by statistics collector
> >> without being reset. So the statistics session continues across normal
> >> shutdowns, but in case of abnormal situations like crashes, format upgrades
> >> or statistics being reset for any other reason, a new time period of
> >> statistics collection starts i.e. a new statistics session. The queryid
> >> value generation is linked to statistics session to emphasize the fact that
> >> in case of crashes, format upgrades or any situation of statistics reset, the
> >> queryid for the same queries will also change.
>
> >I'm afraid that this behavior narrows down the use case of queryid very much.
> >For example, since the queryid of the same query would not be the same in
> >the master and the standby servers, we cannot associate those two statistics
> >by using the queryid. The queryid changes through the crash recovery, so
> >we cannot associate the query statistics generated before the crash with that
> >generated after the crash recovery even if the query is the same.
>
Yes, these are limitations of this approach. The other approaches
suggested were:
1. Expose the query id hash value as is in the view, but document the fact
that it will be unstable between releases.
2. Expose the query id hash value via an undocumented function and let more
expert users decide whether they want to use it.

Using the statistics session id to generate the queryid is a compromise
between not exposing it at all and exposing it without warning users that
the query tree hash value is unstable between releases.

> >This is not directly related to the patch itself, but why does the queryid
> >need to be calculated based on also the "statistics session"?
>
If we expose the hash value of the query tree without mixing in the
statistics session, users might wrongly assume that the value remains
stable across version upgrades, when in reality that depends on whether
the new version has made changes to query tree internals. To explicitly
keep users from making this assumption, hash value generation uses the
statistics session id, which is newly created in the situations described
above.
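
To make the idea concrete, here is a rough sketch in Python rather than the
patch's C code. Everything in it is an assumption made for illustration:
the names query_tree_hash, new_session_id and exposed_queryid are
hypothetical, SHA-256 stands in for whatever hash the patch really uses,
and the 32-bit widths are guesses. It only shows the principle that the
exposed value is derived from both the session id and the query tree hash.

```python
import hashlib
import os
import struct


def query_tree_hash(normalized_query: str) -> int:
    """Hypothetical stand-in for the internal query-tree hash (32-bit)."""
    digest = hashlib.sha256(normalized_query.encode("utf-8")).digest()
    return struct.unpack("<I", digest[:4])[0]


def new_session_id() -> int:
    """Hypothetical: pick a fresh random id whenever statistics are reset."""
    return struct.unpack("<I", os.urandom(4))[0]


def exposed_queryid(session_id: int, tree_hash: int) -> int:
    """Mix the session id into the tree hash; return a signed 64-bit value."""
    material = struct.pack("<II", session_id, tree_hash)
    digest = hashlib.sha256(material).digest()
    # "<q" is a signed 64-bit integer, matching the bigint column type.
    return struct.unpack("<q", digest[:8])[0]


if __name__ == "__main__":
    h = query_tree_hash("SELECT * FROM t WHERE id = ?")
    before_reset = exposed_queryid(new_session_id(), h)
    after_reset = exposed_queryid(new_session_id(), h)
    # Same query, different statistics sessions: the ids will almost
    # certainly differ, so they cannot be joined across a reset.
    print(before_reset, after_reset)
```

Within one statistics session the same query always maps to the same
queryid, but a master and a standby, or the same server before and after a
crash, would hold different session ids and so report different values.
That is exactly the limitation raised above.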

>
> >> Will update documentation to
> >> clearly explain the term statistics session in this context
>
> >Yep, that's helpful!
>
Regards,
Sameer

