Re: pg_stat_statements with query tree based normalization

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements with query tree based normalization
Date: 2011-12-07 01:19:24
Message-ID: CAEYLb_WX50qKziuc2BHejYDmDYnBqbgc0cWvrBtK9G5iGSqnxA@mail.gmail.com
Lists: pgsql-hackers

On 14 November 2011 04:42, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> The approach Peter used adds a single integer to the Const structure in
> order to have enough information to substitute "?" in place of those.
>  Adding and maintaining that is the only change outside of the extension
> made here, and that overhead is paid by everyone--not just consumers of this
> new code.

I've attempted to isolate that overhead, so far unsuccessfully. Attached are:

1. A simple Python + psycopg2 script that repeatedly runs a
succession of similar queries, each of which EXPLAIN would show as
containing a single "Result" node. Each query simply selects a list
of integer Const nodes, 300 by default.

2. The results of running the script on Greg's server, which has CPU
frequency scaling disabled, as an ODS spreadsheet. To keep the file
size down, I've deleted the query column from each sheet, since it
wasn't actually useful information.
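
For readers without the attachment, here is a minimal sketch of the kind of script item 1 describes. It is not the attached performance_test.py: the function names (build_query, time_queries), the range of the random integers, and the default query count are all assumptions made for illustration.

```python
import random
import time


def build_query(n_consts=300, rng=random):
    """Return a SELECT whose target list is n_consts integer literals.

    Each literal becomes one integer Const node, and a FROM-less SELECT
    like this plans as a single "Result" node.
    The 0..10000 value range is an arbitrary choice for this sketch.
    """
    consts = ", ".join(str(rng.randint(0, 10000)) for _ in range(n_consts))
    return "SELECT " + consts + ";"


def time_queries(execute, n_queries=1000, n_consts=300):
    """Run n_queries similar queries through execute(sql).

    execute is any callable taking one SQL string; returns a list of
    per-query wall-clock durations in seconds.
    """
    timings = []
    for _ in range(n_queries):
        sql = build_query(n_consts)
        start = time.perf_counter()
        execute(sql)
        timings.append(time.perf_counter() - start)
    return timings
```

With psycopg2, execute could be the execute method of a cursor obtained from psycopg2.connect(...).cursor(). Passing the callable in keeps the query generation and timing logic testable without a running server.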

Taking the median value of each set of runs as representative, my
patch appears to run marginally faster than head. Of course, there is
no reason to believe that it should, and I'm certain that the
difference can be explained by noise, even though I've naturally
tried to minimise noise.
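
The median comparison described above could be sketched roughly as follows. The function name compare_medians and the returned keys are hypothetical, and this is not how the spreadsheet itself was produced; it only illustrates the arithmetic.

```python
import statistics


def compare_medians(head_runs, patch_runs):
    """Summarise two sets of timings by their medians.

    head_runs and patch_runs are sequences of durations (same unit).
    relative_diff is negative when the patched build's median is lower,
    i.e. when the patch appears faster than head.
    """
    head_median = statistics.median(head_runs)
    patch_median = statistics.median(patch_runs)
    return {
        "head_median": head_median,
        "patch_median": patch_median,
        "relative_diff": (patch_median - head_median) / head_median,
    }
```

The median is used rather than the mean because a few outlying runs do not shift it much, though a small relative_diff in either direction may still be nothing more than noise.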

If someone could suggest a more telling test case, or even a
worst case, that would be useful. This was just my first run at this.
I know that the overhead will also exist in code paths not
well-exercised by these queries, but I imagine that any real-world
query that attempts to exercise all of those paths is going to add
other costs that dwarf the additional overhead and further muddy the
waters.

I intend to work through the known issues with this patch in the next
couple of days.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:
performance_test.py (text/x-python, 3.0 KB)
field_addition_results.ods (application/vnd.oasis.opendocument.spreadsheet, 30.2 KB)
