Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes
Date: 2023-03-30 17:20:19
Message-ID: 9209b8f9-5e63-9f6f-b011-84274823f89e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30/3/2023 12:57, Julien Rouhaud wrote:
>> Extensions could need to pass some query-related data through all stages of
>> the query planning and execution. As a trivial example, pg_stat_statements
>> uses queryid at the end of execution to save some statistics. One more
>> reason - extensions now conflict on queryid value and the logic of its
>> changing. With this patch, it can be managed.
>
> I just had a quick lookc at the patch, and IIUC it doesn't really help on that
> side, as there's still a single official "queryid" that's computed, stored
> everywhere and later used by pg_stat_statements (which does then store in
> additionally to that official queryid).
Thank you for the attention!
This patch doesn't try to solve the problem of oneness of queryId. In
this patch we change pg_stat_statements and it doesn't set 0 into
queryId field according to its internal logic. And another extension
should do the same - use queryId on your own but not erase it - erase
your private copy in the ext_field.

>> Ruthless pgbench benchmark shows that we got some overhead:
>> 1.6% - in default mode
>> 4% - in prepared mode
>> ~0.1% in extended mode.
>
> That's a quite significant overhead. But the only reason to accept such a
> change is to actually use it to store additional data, so it would be
> interesting to see what the overhead is like once you store at least 2
> different values there.
Yeah, but as I said earlier, it can be reduced to much smaller value
just with simple optimization. Here I intentionally avoid it to discuss
the core concept.
>
>> - Are we need to invent a registration procedure to do away with the names
>> of entries and use some compact integer IDs?
>
> Note that the patch as proposed doesn't have any defense for two extensions
> trying to register something with the same name, or update a stored value, as
> AddExtensionDataToNode() simply prepend the new value to the list. You can
> actually update the value by just storing the new value, but it will add a
> significant overhead to every other extension that want to read another value.
Thanks a lot! Patch in attachment implements such an idea - extension
can allocate some entries and use these private IDs to add entries. I
hope, an extension would prefer to use only one entry for all the data
to manage overhead, but still.
>
> The API is also quite limited as each stored value has a single identifier.
> What if your extension needs to store multiple things? Since it's all based on
> Node you can't really store some custom struct, so you probably have to end up
> with things like "my_extension.my_val1", "my_extension.my_val2" which isn't
> great.
Main idea here - if an extension holds custom struct and want to pass it
along all planning and execution stages it should use extensible node
with custom read/write/copy routines.

--
regards,
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
v0-0002-Add-ExtendedData-entry-registration-routines-to-use-.patch text/plain 6.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-03-30 18:04:54 Re: Thoughts on using Text::Template for our autogenerated code?
Previous Message Tom Lane 2023-03-30 17:07:20 pgsql: Clean up role created in new subscription test.