Re: [PATCHES] WIP: executor_hook for pg_stat_statements

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-15 07:25:46
Message-ID: 20080715150758.3016.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> > I wonder whether we ought to change things so that the real query
> > source text is available at the executor level. Since we are (at least
> > usually) storing the query text in cached plans, I think this might just
> > require some API refactoring, not extra space and copying. It would
> > amount to a permanent decision that we're willing to pay the overhead
> > of keeping the source text around, though.
>
> I think its a reasonable decision to do that. Knowing what you're doing
> while you do it is pretty important.

I worked around to it, I found I can use ActivePortal->sourceText in some
situations. But there are still some problems:

- SQL functions:
They don't modify ActivePortal->sourceText, but we could get the
source from SQLFunctionCache->src. If it is required, we might
need to add a new field in QueryDesc and copy the src to the field.
- Multiple queries:
Query text is not divided into each query.
Only the original combined text is available.
- RULEs:
There are similar issues with multiple queries.
Also, they don't have original query texts.

The same can be said for planner_hook(). Only available query text is
debug_query_string in it, and it is the top-level query. We cannot use
the actual SQL text which the Query object comes from. The treu query
text might be SQL functions used in the top-level query, a part of multiple
queries, or another query rewritten by RULE.

For these reasons, now I'm thinking to collect only top-query level
statistics, not per-planner+executor level statistics. i.e, when we
receive a multiple query "SELECT 1; SELECT 2;", pg_stat_statements uses
the original combined text as a key. Comsumed resource associated with
the key is sum of resources used in both "SELECT 1" and "SELECT 2".

> > Also, after looking at the patch more closely, was there a good reason
> > for making the hook intercept ExecutePlan rather than ExecutorRun?
>
> That raises the question of whether we should have ExecutorStart() and
> ExecutorEnd() hooks as well, to round things off.

Yeah, and also ExecutorRewind() hook. There are 4 interface functions in
executor. My addin only needs Run hook because it doesn't modify the actual
behavior of executor. However, when someone hope to replace the behavior,
they need all of the hooks. (Is multi-threaded executor project still alive?)

How about adding new Executor class
and ExecutorStart() returns an instance of Executor?

typedef struct Executor
{
ExecutorRunFunc run;
ExecutorEndFunc end;
ExecutorRewindFunc rewind;
/* there might be private fields. */
} Executor;

Executor *e = ExecutorStart_hook(...);
ExecutorRun(e, ...) => { e->run(e, ...); }
ExecutorEnd(e, ...) => { e->end(e, ...); }

It could be make APIs cleaner because QueryDesc has 3 fields only for
executor (tupDesc, estate, planstate). We can move those fields to
Executor's private fields. Is this modification acceptable?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2008-07-15 08:23:50 Re: [PATCHES] VACUUM Improvements - WIP Patch
Previous Message David E. Wheeler 2008-07-15 05:09:27 Re: PATCH: CITEXT 2.0 v3

Browse pgsql-patches by date

  From Date Subject
Next Message Heikki Linnakangas 2008-07-15 08:23:50 Re: [PATCHES] VACUUM Improvements - WIP Patch
Previous Message Greg Smith 2008-07-15 05:07:28 Re: posix advises ...