Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Daniel Farina <daniel(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date: 2012-03-01 21:27:25
Message-ID: CAEYLb_UgpcBCOT=HV__74FHrG8YcAJakLUcBa+m55H-sXxYC4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1 March 2012 00:48, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> I'm curious about the LeafNode stuff.  Is this something that could be
> done by expression_tree_walker?  I'm not completely familiar with it so
> maybe there's some showstopper such as some node tags not being
> supported, or maybe it just doesn't help.  But I'm curious.

Good question. The LeafNode function (which is a bit of a misnomer, as
noted in a comment) looks rather like a walker function, as appears in
the example in a comment in nodeFuncs.c:

* expression_tree_walker() is designed to support routines that traverse
* a tree in a read-only fashion (although it will also work for routines
* that modify nodes in-place but never add/delete/replace nodes).
* A walker routine should look like this:
*
* bool my_walker (Node *node, my_struct *context)
* {
* if (node == NULL)
* return false;
* // check for nodes that special work is required for, eg:
* if (IsA(node, Var))
* {
* ... do special actions for Var nodes
* }
* else if (IsA(node, ...))
* {
* ... do special actions for other node types
* }
* // for any node type not specially processed, do:
* return expression_tree_walker(node, my_walker, (void *) context);
* }

My understanding is that the expression-tree walking support is mostly
useful for the majority of walker code, which only cares about a small
subset of nodes, and hopes to avoid including boilerplate code just to
walk those other nodes that it's actually disinterested in.

This code, unlike most clients of expression_tree_walker(), is pretty
much interested in everything, since its express purpose is to
fingerprint all possible query trees.

Another benefit of expression_tree_walker is that if you miss a
certain node being added, (say a FuncExpr-like node), you get to
automatically have that node walked over to walk to the nodes that you
do in fact care about (such as those within this new nodes args List).
That makes perfect sense in the majority of cases, but here you've
already missed the fields within this new node that FuncExpr itself
lacks, so you're already finger-printing inaccurately. I suppose you
could still at least get the nodetag and still have a warning about
the fingerprinting being inadequate by going down the
expression_tree_walker path, but I'm inclined to wonder if it you
aren't just better of directly walking the tree, if only to encourage
the idea that this code needs to be maintained over time, and to cut
down on the little extra bit of indirection that that imposes.

It's not going to be any sort of burden to maintain it - it currently
stands at a relatively meagre 800 lines of code for everything to do
with tree walking - and the code that will have to be added with new
nodes or refactored along with the existing tree structure is going to
be totally trivial.

All of that said, I wouldn't mind making LeafNode into a walker, if
that approach is judged to be better, and you don't mind documenting
the order in which the tree is walked as deterministic, because the
order now matters in a way apparently not really anticipated by
expression_tree_walker, though that's probably not a problem.

My real concern now is that it's March 1st, and the last commitfest
may end soon. Even though this patch has extensive regression tests,
has been floating around for months, and, I believe, will end up being
a timely and important feature, a committer has yet to step forward to
work towards this patch getting committed. Can someone volunteer,
please? My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-03-01 21:51:38 Re: Collect frequency statistics for arrays
Previous Message Tom Lane 2012-03-01 21:08:26 Re: 16-bit page checksums for 9.2