Re: pg_stat_statements: calls under-estimation propagation

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Sameer Thakur <samthakur74(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements: calls under-estimation propagation
Date: 2013-10-10 17:49:34
Message-ID: 20131010174934.GE4825@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Daniel Farina escribió:
> On Thu, Oct 10, 2013 at 7:40 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> > In my test, I found that pg_stat_statements.total_time always indicates a zero.
> > I guess that the patch might handle pg_stat_statements.total_time wrongly.
> >
> > + values[i++] = DatumGetTimestamp(
> > + instr_get_timestamptz(pgss->session_start));
> > + values[i++] = DatumGetTimestamp(
> > + instr_get_timestamptz(entry->introduced));
> >
> > These should be executed only when detected_version >= PGSS_TUP_LATEST?
>
> Yes. Oversight.

Hmm, shouldn't this be conditional on a new PGSS_TUP_V1_2? I mean, if
later somebody patches pgss to have a PGSS_TUP_V2 or V1_3 and that
becomes latest, somebody running the current definition with the updated
.so will no longer get these values. Or is the intention that
PGSS_TUP_LATEST will never be updated again, and future versions will
get PGSS_TUP_REALLY_LATEST and PGSS_TUP_REALLY_LATEST_HONEST and so on?
I mean, surely we can always come up with new symbols to use, but it
seems hard to follow.

There's one other use of PGSS_TUP_LATEST here which I think should also
be changed to >= SOME_SPECIFIC_VERSION:

+ if (detected_version >= PGSS_TUP_LATEST)
+ {
+ uint64 qid = pgss->private_stat_session_key;
+
+ qid ^= (uint64) entry->query_id;
+ qid ^= ((uint64) entry->query_id) << 32;
+
+ values[i++] = Int64GetDatumFast(qid);
+ }

This paragraph reads a bit strange to me:

+ A statistics session is the time period when statistics are gathered by statistics collector
+ without being reset. So a statistics session continues across normal shutdowns,
+ but whenever statistics are reset, like during a crash or upgrade, a new time period
+ of statistics collection commences i.e. a new statistics session.
+ The query_id value generation is linked to statistics session to emphasize the fact
+ that whenever statistics are reset,the query_id for the same queries will also change.

"time period when"? Shouldn't that be "time period during which".
Also, doesn't a new "statistics session" start when a stats reset is
invoked by the user? The bit after "commences" appears correct (to me,
not a native by any means) but seems also a bit strange.

I just noticed that the table describing the output indicates that
session_start and introduced are of type text, but the SQL defines
timestamptz.

The instr_time thingy being used for these times maps to
QueryPerformanceCounter() on Windows, and I'm not sure how useful this
is for long-term time tracking; see
http://stackoverflow.com/questions/5296990/queryperformancecounter-and-overflows#5297163
for instance. I think it'd be better to use TimestampTz and
GetCurrentTimestamp() for this.

Just noticed that you changed the timer to struct Instrumentation. Not
really sure about that change. Since you seem to be using only the
start time and counter, wouldn't it be better to store only those?
Particularly unsure about passing INSTRUMENT_ALL to InstrAlloc().

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2013-10-10 17:59:37 Re: Auto-tuning work_mem and maintenance_work_mem
Previous Message Josh Berkus 2013-10-10 17:37:22 Re: Auto-tuning work_mem and maintenance_work_mem