Re: autoprewarm_dump_now

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daria Shanina <vilensipkdm(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: autoprewarm_dump_now
Date: 2025-06-03 17:56:09
Message-ID: CA+Tgmobh6d=jJTJqw_So5cUDR-AJtbEPFBp-jHFJBJe_E39_TQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 3, 2025 at 1:24 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I think the proposed patch should be committed and back-patched, after
> > fixing it so that it's pgindent-clean and adding a comment. Does
> > anyone have strong objection to that?
>
> Not here. I do wonder if we can't find a more memory-efficient way,
> but I concur that any such change would likely not be back-patch
> material.

Cool. Regarding memory efficiency, I was just discussing radix trees
this morning with Sawada-san and others. "lib/radixtree.h" seems
unsuitable here because it supposes that each node should cover 1 byte
of key material, which seems unsuitable for a 20-byte key, but I
wonder if we could reuse "common/blkreftable.h" which I created for
incremental backup but which seems like it could probably work here as
well.

Implementation sketch: The "limit block" concept would be irrelevant
in this case, so you just wouldn't call BlockRefTableSetLimitBlock.
Instead, you'd call BlockRefTableMarkBlockModified for each
memory-resident block -- the name would be a misnomer, and we might
want to do some renaming. blkreftable.c/h already has infrastructure
to read and write the data structure from and to disk, so we wouldn't
need to reinvent that.

The reason I think this might work out well is that for moderately
sparse block sets, the space usage is about 2 bytes per block, while
for dense sets block sets, it converges to 1 bit per block. If you
make the set of blocks super-duper sparse then it might use more
memory than what we do today, but I'm not sure that's a very realistic
scenario in practice -- you'd probably need to have a gigantic number
of relations in the database and absolutely no locality of access. See
the comment in blkreftable.c just above "#define BLOCKS_PER_CHUNK" for
what the format actually is.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2025-06-03 17:59:45 Re: Correcting freeze conflict horizon calculation
Previous Message Robert Haas 2025-06-03 17:27:29 Re: Avoid orphaned objects dependencies, take 3