Re: Reducing output size of nodeToString

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>
Subject: Re: Reducing output size of nodeToString
Date: 2024-02-19 13:19:58
Message-ID: CAEze2WhfRn0cdNer0Vkye_61BwAmMqM6D9_cJp8i6JmZ8U4wAA@mail.gmail.com
Lists: pgsql-hackers

On Thu, 15 Feb 2024 at 15:37, Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
>
> On Thu, 15 Feb 2024 at 13:59, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> >
> > Thanks, this patch set is a good way to incrementally work through these
> > changes.
> >
> > I have looked at
> > v4-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch today.
> > Here are my thoughts:
> >
> > I believe we had discussed offline to not omit enum fields with value 0
> > (WRITE_ENUM_FIELD). This is because the values of enum fields are
> > implementation artifacts, and this could be confusing for readers.
>
> Thanks for reminding me, I didn't remember this when I worked on
> updating the patchset. I'll update this soon.

This has been split into patch 0008 in the set. A query on ev_action
shows that enum default-0-omission is effective on 1994 fields:

select match, count(*)
from pg_rewrite,
     lateral (
         select unnest(regexp_matches(ev_action, '(:\w+ 0)[^0-9]', 'g')) match
     )
group by 1 order by 2 desc;
      match      | count
-----------------+-------
 :funcformat 0   |   587
 :rtekind 0      |   449
 :limitOption 0  |   260
 :querySource 0  |   260
 :override 0     |   260
 :jointype 0     |   156
 :aggsplit 0     |    15
 :subLinkType 0  |     5
 :nulltesttype 0 |     2
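
For illustration only, here is a minimal, self-contained sketch of what "default-0-omission" for an enum-typed field could look like. It is not taken from the patch: `DemoJoinType`, `write_enum_field`, and `read_enum_field` are hypothetical names, and the real code lives in the WRITE_ENUM_FIELD / READ_* macros. The writer skips a field whose value is 0, and the reader treats a missing field as 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical enum standing in for an enum-typed node field. */
typedef enum DemoJoinType
{
	DEMO_JOIN_INNER = 0,
	DEMO_JOIN_LEFT = 1
} DemoJoinType;

/*
 * Hypothetical writer: formats " :name value" into buf and returns the
 * number of characters written. When omit_zero is set and the value is 0,
 * nothing is written and 0 is returned.
 */
static int
write_enum_field(char *buf, size_t buflen, const char *name,
				 int value, bool omit_zero)
{
	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';
	if (omit_zero && value == 0)
		return 0;
	return snprintf(buf, buflen, " :%s %d", name, value);
}

/*
 * Hypothetical reader: a NULL token means the field was omitted, so the
 * default of 0 is used.
 */
static int
read_enum_field(const char *token)
{
	return token == NULL ? 0 : (int) strtol(token, NULL, 10);
}
```

With omission enabled, each of the fields counted above would shrink from a string such as " :jointype 0" to nothing. The trade-off raised in review still applies: the reader of the text form has to know that a missing enum field means the value 0.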

> > On the reading side, the macro nesting has gotten a bit out of hand. :)
> > We had talked earlier in the thread about the _DIRECT macros and you
> > said they were left over from something else you want to try, but I see
> > nothing else in this patch set uses this. I think this could all be
> > much simpler, like (omitting required punctuation)
> [...]
> > Not only is this simpler, but it might also have better performance,
> > because we don't have separate pg_strtok_next() and pg_strtok() calls in
> > sequence.
>
> Good points. I'll see what I can do here.

Attached is the updated version of the patch set on top of 5497daf3,
which incorporates this last round of feedback. It moves the
default-0-omission for enums to the newly added 0008, and checks the
sign to deal with +0/-0 issues in float default checks.
See below for updated numbers.
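
To illustrate the +0/-0 issue (a sketch under assumptions, not the patch's code): in IEEE 754 arithmetic, `-0.0 == 0.0` is true, so an equality test alone would drop a negative zero and it would read back as positive zero. Checking the sign bit as well avoids that. The helper names below are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: true only for positive zero. The comparison alone
 * cannot tell +0.0 from -0.0, so the sign bit is checked separately.
 */
static bool
float_is_default_zero(double value)
{
	return value == 0.0 && !signbit(value);
}

/*
 * Hypothetical writer: formats " :name value" into buf and returns the
 * number of characters written, or writes nothing and returns 0 when the
 * value is the default (+0.0).
 */
static int
write_float_field(char *buf, size_t buflen, const char *name, double value)
{
	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';
	if (float_is_default_zero(value))
		return 0;
	return snprintf(buf, buflen, " :%s %.17g", name, value);
}
```

Here a field holding +0.0 is omitted, while one holding -0.0 is still written out, so the sign survives a write/read round trip.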

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

New numbers:

select 'master' as "version"
     , pg_database_size('template0') as "template0"
     , pg_total_relation_size('pg_rewrite') as "rel_total"
     , pg_relation_size('pg_rewrite', 'main') as "rel_main"
     , sum(pg_column_size(ev_action)) as "toasted"
     , sum(octet_length(ev_action)) as "raw"
  from pg_rewrite;

 version | template0 | rel_total | rel_main | toasted |   raw
---------+-----------+-----------+----------+---------+---------
 master  |   7528975 |    770048 |   114688 |  574051 | 3002981
 0001    |   7348751 |    630784 |   131072 |  448495 | 1972854
 0002    |   7250447 |    589824 |   131072 |  412261 | 1866880
 0003    |   7242255 |    581632 |   131072 |  410476 | 1864843
 0004    |   7225871 |    565248 |   139264 |  393801 | 1678735
 0005    |   7225871 |    565248 |   139264 |  393556 | 1675165
 0006    |   7217679 |    557056 |   139264 |  379062 | 1654178
 0007    |   7160335 |    491520 |   155648 |  322145 | 1363885
 0008    |   7135759 |    475136 |   155648 |  311294 | 1337649

Attachment                                                       Content-Type                 Size
v3-0004-gen_node_support.pl-Add-a-TypMod-type-for-signall.patch  application/octet-stream  10.4 KB
v3-0003-gen_node_support.pl-Mark-location-fields-as-type-.patch  application/octet-stream  26.4 KB
v3-0001-incremental-backups-Add-new-items-to-glossary-mon.patch  application/octet-stream   3.1 KB
v3-0002-pg_node_tree-Don-t-store-query-text-locations-in-.patch  application/octet-stream  19.2 KB
v3-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch  application/octet-stream  22.8 KB
v3-0005-nodeToString-omit-serializing-NULL-datums-in-Cons.patch  application/octet-stream   1.8 KB
v3-0007-gen_node_support.pl-Optimize-serialization-of-fie.patch  application/octet-stream   9.2 KB
v3-0008-nodeToString-omit-serializing-0s-in-enum-typed-fi.patch  application/octet-stream   2.1 KB
v3-0006-nodeToString-Apply-RLE-on-Bitmapset-and-numeric-L.patch  application/octet-stream   7.9 KB