| From: | Andy Fan <zhihuifan1213(at)163(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
| Subject: | Re: Make printtup a bit faster |
| Date: | 2026-05-06 13:00:29 |
| Message-ID: | 877bpghevm.fsf@163.com |
| Lists: | pgsql-hackers |
Hi,
> Andy Fan <zhihuifan1213(at)163(dot)com> writes:
>> You understood me correctly and I thought we should maintain one version
>> two years ago, so I tried to implement this idea today. The first issue I
>> want to talk about now is how to define the function protocol in SQL. Take
>> int4out for example:
>
>> master: cstring int4out(integer);
>> New protocol: void int4out(integer, internal), where the internal is
>> actually a StringInfo.
>
>> The direct impact would be:
>
>> master supports:
>> postgres=# select int4out(8);
>> int4out
>> ---------
>> 8
>> (1 row)
>
>> After our change, users could not invoke any {type}out function anymore
>> in SQL since it takes 'internal' as an argument. I am not sure if people
>> would write SQL like this, but it'd be good to have a talk about this.
>
> I think you missed the point of what I said two years ago: you will
> never be able to remove the existing output function API, nor the
> per-datatype functions that implement that API. Even if we were
> willing to convert every last one of the in-core callers and callees,
> doing that would break too much non-core code. So the above example
> is never going to stop working.
OK, I thought David supported this and you didn't object to it, so I
planned to try it this way.
> We can consider implementing a new datatype output API alongside the
> existing one. But it'd likely not be callable from SQL, so the
> question of SQL compatibility is moot.
>
> My own guess is that even if we built a new API, only a fairly small
> number of datatypes would find it worth the trouble to support.
> The potential win seems clear for, say, textout: there's really no
> computation to do, only data copying, so halving the amount of
> copying is attractive. But I bet you won't measure much percentage
> improvement for numeric_out or point_out.
Is this similar to the solution I called a print function at [1]?
"""
My high-level proposal is to define a type-specific print function like:
oidprint(Datum datum, StringInfo buf)
textprint(Datum datum, StringInfo buf)
"""
Then we can have benefits like compatibility and incremental development
(starting with commonly used data types or those with a potential win).
But David asked "what would be the point of having both versions?" at [2],
and I was actually persuaded by this: I thought David's method puts more
effort into the current patch now and saves maintenance effort in the
future. (To be honest, I am not good at this trade-off...)
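To make the difference between the existing output functions and the
print-function shape quoted above concrete, here is a minimal standalone
sketch. It does not use PostgreSQL headers: DemoBuf and all demo_* functions
are hypothetical stand-ins (for StringInfo, int4out and an imagined
int4print), not code from the attached patches, and error handling is
omitted.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData; NOT the real definition. */
typedef struct DemoBuf
{
	char	   *data;
	size_t		len;		/* bytes used, excluding the trailing NUL */
	size_t		cap;		/* bytes allocated */
} DemoBuf;

static void
demo_buf_init(DemoBuf *buf)
{
	buf->cap = 64;
	buf->len = 0;
	buf->data = malloc(buf->cap);	/* allocation failure not handled */
	buf->data[0] = '\0';
}

/* Ensure room for 'needed' more bytes plus the trailing NUL. */
static void
demo_buf_enlarge(DemoBuf *buf, size_t needed)
{
	if (buf->len + needed + 1 <= buf->cap)
		return;
	while (buf->len + needed + 1 > buf->cap)
		buf->cap *= 2;
	buf->data = realloc(buf->data, buf->cap);
}

static void
demo_buf_append(DemoBuf *buf, const char *src, size_t n)
{
	demo_buf_enlarge(buf, n);
	memcpy(buf->data + buf->len, src, n);
	buf->len += n;
	buf->data[buf->len] = '\0';
}

/* Existing style: the output function returns a freshly allocated string. */
static char *
demo_int4out(int32_t value)
{
	char	   *result = malloc(12);	/* sign + 10 digits + NUL */

	snprintf(result, 12, "%d", (int) value);
	return result;
}

/* The caller then needs a strlen() and a second copy into its buffer. */
static void
demo_emit_via_out(DemoBuf *buf, int32_t value)
{
	char	   *str = demo_int4out(value);

	demo_buf_append(buf, str, strlen(str));
	free(str);
}

/* Print style: format straight into the caller's buffer, no extra copy. */
static void
demo_int4print(int32_t value, DemoBuf *buf)
{
	int			n;

	demo_buf_enlarge(buf, 11);	/* sign + 10 digits; NUL is accounted for */
	n = snprintf(buf->data + buf->len, 12, "%d", (int) value);
	buf->len += (size_t) n;
}
```

Both paths produce the same bytes in the buffer; the print style just skips
the temporary allocation, the strlen() and the second copy, which is where
I'd expect any win to come from.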
So it looks like we have 3 solutions for now, IIUC.
(1) Maintaining one copy of the output functions (David's proposal).
(2) Adding a new type of API for some specific data types, like above.
(3) Andres's method, which I actually can't follow well yet.
From Andres:
> FWIW, I've experimented fixing this overhead before, and what I did was to
> pass an optional context via the fcinfo, and output / send functions could use
> memory allocated via that optional context object, rather than doing it
> allocating in CurrentMemoryContext. For the send functions that looks
> reasonably clean, given that it already deals with a stringinfo. For out
> functions it's a bit uglier, but still somewhat acceptable.
Putting an optional context in the fcinfo looks novel to me (I have zero
experience with using fcinfo this way), so I'm not sure how to use the
optional context. Will it be a MemoryContext or a StringInfo? If it is a
MemoryContext, then how do we avoid the memory copy in the printtup
situation, or does this method have a different target?
> This line of thought suggests that maybe some special-purpose hack
> would be a better answer than defining a new datatype API. It's hard
> to tell without some concrete performance numbers, which are sadly
> lacking in this thread.
Is the data in [3] helpful? Quoting the message there:
"The attached is a PoC of this idea. No matter which method is adopted
(rewriting all the output functions or an optional print function), I think
the benefit will be similar. The test case below shows a 10%+
improvement (0.134ms vs 0.110ms).
create table demo as select oid as oid1, relname::text as text1, relam,
relname::text as text2 from pg_class;
pgbench:
select * from demo;"
I have re-attached the patch from there, just rebased onto the latest master.
[1] https://www.postgresql.org/message-id/87wmjzfz0h.fsf%40163.com
[2] https://www.postgresql.org/message-id/CAApHDvqHthJb6baDhgTE5T4RLW6nEX%3Dr239EYmpjfg%3DWq5CqQA%40mail.gmail.com
[3] https://www.postgresql.org/message-id/87v7zihaf1.fsf%40163.com
--
Best Regards
Andy Fan
| Attachment | Content-Type | Size |
|---|---|---|
| v20260912-0003-add-unlikely-hint-for-enlargeStringInfo.patch | text/x-diff | 1.3 KB |
| v20260912-0001-Refactor-float8out_internval-for-better-pe.patch | text/x-diff | 6.3 KB |
| v20260912-0002-Continue-to-remove-some-unnecesary-strlen-.patch | text/x-diff | 2.2 KB |
| v20260912-0004-Make-printtup-a-bit-faster-intermediate-st.patch | text/x-diff | 30.0 KB |