Re: Generating code for query jumbling through gen_node_support.pl

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Subject: Re: Generating code for query jumbling through gen_node_support.pl
Date: 2023-01-06 06:07:32
Message-ID: CALDaNm2O26SE7b05i-D+SffsnqD-xRt0c-ZgMzrDSB8+DfYCxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 7 Dec 2022 at 13:27, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> Hi all,
>
> This thread is a follow-up of the recent discussion about query
> jumbling with DDL statements, where the conclusion was that we'd want
> to generate all this code automatically for all the nodes:
> https://www.postgresql.org/message-id/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com
>
> What this patch allows to do it to compute the same query ID for
> utility statements using their parsed Node state instead of their
> string, meaning that things like "BEGIN", "bEGIN" or "begin" would be
> treated the same, for example. But the main idea is not only that.
>
> I have implemented that as of the attached, where the following things
> are done:
> - queryjumble.c is moved to src/backend/nodes/, to stick with the
> other things for node equal/read/write/copy, renamed to
> jumblefuncs.c.
> - gen_node_support.c is extended to generate the functions and the
> switch for the jumbling. There are a few exceptions, as of the Lists
> and RangeTblEntry to do the jumbling consistently.
> - Two pg_node_attr() are added in consistency with the existing ones:
> no_jumble to discard completely a node from the the query jumbling
> and jumble_ignore to discard one field from the jumble.
>
> The patch is in a rather good shape, passes check-world and the CI,
> but there are a few things that need to be discussed IMO. Things
> could be perhaps divided in more patches, now the areas touched are
> quite different so it did not look like a big deal to me as the
> changes touch different areas and are straight-forward.
>
> The location of the Nodes is quite invasive because we only care about
> that for T_Const now in the query jumbling, and this could be
> compensated with a third pg_node_attr() that tracks for the "int
> location" of a Node whether it should participate in the jumbling or
> not. There is also an argument where we would want to not include by
> default new fields added to a Node, but that would require added more
> pg_node_attr() than what's done here.
>
> Note that the plan is to extend the normalization to some other parts
> of the Nodes, like CALL and SET as mentioned on the other thread. I
> have done nothing about that yet but doing so can be done in a few
> lines with the facility presented here (aka just add a location
> field). Hence, the normalization is consistent with the existing
> queryjumble.c for the fields and the nodes processed.
>
> In this patch, things are done so as the query ID is not computed
> anymore from the query string but from the Query. I still need to
> study the performance impact of that with short queries. If things
> prove to be noticeable in some cases, this stuff could be switched to
> use a new GUC where we could have a code path for the computation of
> utilityStmt using its string as a fallback. I am not sure that I want
> to enter in this level of complications, though, to keep things
> simple, but that's yet to be done.
>
> A bit more could be cut but pg_ident got in the way.. There are also
> a few things for pg_stat_statements where a query ID of 0 can be
> implied for utility statements in some cases.
>
> Generating this code leads to an overall removal of code as what
> queryjumble.c is generated automatically:
> 13 files changed, 901 insertions(+), 1113 deletions(-)
>
> I am adding that to the next commit fest.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
b82557ecc2ebbf649142740a1c5ce8d19089f620 ===
=== applying patch
./0001-Support-for-automated-query-jumble-with-all-Nodes.patch
...
patching file src/backend/utils/misc/queryjumble.c
Hunk #1 FAILED at 1.
Not deleting file src/backend/utils/misc/queryjumble.c as content
differs from patch
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/utils/misc/queryjumble.c.rej

[1] - http://cfbot.cputube.org/patch_41_4047.log

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2023-01-06 06:13:43 Re: making relfilenodes 56 bits
Previous Message vignesh C 2023-01-06 06:03:49 Re: Force streaming every change in logical decoding