Re: Planning counters in pg_stat_statements (using pgss_store)

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Marco Slot <marco(dot)slot(at)gmail(dot)com>
Cc: "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Marco Slot <marco(at)citusdata(dot)com>
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)
Date: 2020-03-12 18:36:33
Message-ID: CAOBaU_YekXNsQ819w=eBTM_aQ+BYPcVWLX0sDO-5xRNhqipbRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 12, 2020 at 1:11 PM Marco Slot <marco(dot)slot(at)gmail(dot)com> wrote:
>
> On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > There's at least the current version of IVM patchset that lacks the
> > querytext. Looking at various extensions, I see that pg_background
> > and pglogical call pg_plan_query internally but shouldn't have any
> > issue providing the query text. But there's also citus extension,
> > which don't keep around the query string at least when distributing
> > plans, which makes sense since it's of no use and they're heavily
> > modifying the original Query. I think that citus folks opinion on the
> > subject would be useful, so I'm Cc-ing Marco.
>
> Most of the time we keep our Query * data structures in a form that
> can be deparsed back into a query string by a modified copy of
> ruleutils.c, so we could generate a correct query string if absolutely
> necessary, though there are performance-sensitive cases where we'd
> rather not have the deparsing overhead.

Yes, deparsing is probably too expensive for that use case.

> In case of INSERT..SELECT into a distributed table, we might call
> pg_plan_query on the SELECT part (Query *) and send the output into a
> DestReceiver that sends tuples into shards of the distributed table
> via COPY. The fact that SELECT does not show up in pg_stat_statements
> separately is generally fine because it's an implementation detail,
> and it would probably be a little confusing because the user never ran
> the SELECT query. Moreover, the call to pg_plan_query would already be
> reflected in the planning or execution time of the top-level query, so
> it would be double counted if it had its own entry.

Well, surprising statements can already appears in pg_stat_statements
when you use some psql features, or if you have triggers as those will
run additional queries under the hood.

The difference here is that since citus is a CustomNode, underlying
calls to planner will be accounted for that node, and that's indeed
annoying. I can see that citus is doing some calls to spi_exec or
Executor* (in ExecuteLocalTaskPlan), which could also trigger
pg_stat_statements, but I don't know if a queryid is present there.

> Another case is when some of the shards turn out to be local to the
> server that handles the distributed query. In that case we plan the
> queries on those shards via pg_plan_query instead of deparsing and
> sending the query string to a remote server. It would be less
> confusing for these queries to show in pg_stat_statements, because the
> queries on the shards on remote servers will show up as well. However,
> this is a performance-sensitive code path where we'd rather avoid
> deparsing.

Agreed.

> In general, I'd prefer if there was no requirement to pass a correct
> query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
> if that does not lead to issues. Passing NULL to signal that the
> planner call should not be tracked separately does seem a bit cleaner.

That's very interesting feedback, thanks! I'm not a fan of giving a
way for queries to say that they want to be ignored by
pg_stat_statements, but double counting the planning counters seem
even worse, so I'm +0.5 to accept NULL query string in the planner,
incidentally making pgss ignore those.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-03-12 18:51:09 Re: The flinfo->fn_extra question, from me this time.
Previous Message Tomas Vondra 2020-03-12 17:30:47 Re: PATCH: add support for IN and @> in functional-dependency statistics use