From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2023-03-14 11:27:37
Message-ID: CAFBsxsF_y9Nwkk2-3pktAkPZq6NoRGRkGSFoto9isytQfNOUSA@mail.gmail.com
Lists: pgsql-hackers

I wrote:

> > > Since the block-level measurement is likely overestimating quite a
> > > bit, I propose to simply reverse the order of the actions here,
> > > effectively reporting progress for the *last page* and not the
> > > current one: First update progress with the current memory usage,
> > > then add tids for this page. If this allocated a new block, only a
> > > small bit of that will be written to. If this block pushes it over
> > > the limit, we will detect that up at the top of the loop. It's kind
> > > of like our earlier attempts at a "fudge factor", but simpler and
> > > less brittle. And, as far as OS pages we have actually written to, I
> > > think it'll effectively respect the memory limit, at least in the
> > > local mem case. And the numbers will make sense.
> > >
> > > Thoughts?
> >
> > It looks to work but it still doesn't work in a case where a shared
> > tidstore is created with a 64kB memory limit, right?
> > TidStoreMemoryUsage() returns 1MB and TidStoreIsFull() returns true
> > from the beginning.
>
> I have two ideas:
>
> 1. Make it optional to track chunk memory space by a template
> parameter. It might be tiny compared to everything else that vacuum
> does. That would allow other users to avoid that overhead.
> 2. When context block usage exceeds the limit (rare), make the
> additional effort to get the precise usage -- I'm not sure such a
> top-down facility exists, and I'm not feeling well enough today to
> study this further.
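
A side note on the complaint above: the 1MB is not arbitrary. A DSA
area's very first segment is DSA_INITIAL_SEGMENT_SIZE (1MB in dsa.c),
so any measurement based on the area's total size can never report
less than that, no matter how small the limit given to the tidstore.
A minimal illustration (the assert is mine, not something in the
patch):

    dsa_area   *area = dsa_create(tranche_id); /* grabs a 1MB segment up front */

    /* total-size-based accounting starts at 1MB, already past 64kB */
    Assert(dsa_get_total_size(area) >= 1024 * 1024);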

Since then, Masahiko incorporated #1 into v31, and that's what I'm looking
at now. Unfortunately, if I had spent five minutes reminding myself what
the original objections were to this approach, I could have saved us some
effort. Back in July (!), Andres raised two points: GetMemoryChunkSpace()
is slow [1], and fragmentation [2] (leading to underestimation).

In v31, in the local case at least, the underestimation is actually worse
than with tracking chunk space, since it ignores the chunk header and
alignment.
I'm not sure about the DSA case. This doesn't seem great.
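
To make the local-case underestimation concrete: with aset.c, a small
request is rounded up to the next power of two and then gets a chunk
header on top, so counting only the requested size can be off by nearly
half. A rough sketch (sizes are illustrative for a 64-bit build, not
exact):

    char   *p = palloc(20);
    Size    raw = 20;                        /* what v31 adds to its counter */
    Size    actual = GetMemoryChunkSpace(p); /* 32-byte chunk + header, ~40 */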

It shouldn't be a surprise that a simple increment of raw allocation size is
comparable in speed -- GetMemoryChunkSpace() calls the right function
through a pointer, which is slower. If we were willing to underestimate for
the sake of speed, that takes away the reason for making memory tracking
optional.
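
The indirection is easy to see in mcxt.c (paraphrased here) -- every
lookup has to find the owning context's method table and jump through
it:

    Size
    GetMemoryChunkSpace(void *pointer)
    {
        MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);

        return mcxt_methods[method].get_chunk_space(pointer);
    }

...whereas the raw-size bookkeeping is a single addition to a counter
the tree already has at hand (the field name is made up and may not
match v31):

    tree->ctl->mem_used += size;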

Further, if the option is not specified, in v31 there is no way to get the
memory use at all, which seems odd. Surely the caller should be able to ask
the context/area, if it wants to.
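
For example, something along these lines would work with no per-chunk
tracking at all (the function body is my sketch, and the area/context
members stand in for whatever handles the store actually keeps):

    size_t
    TidStoreMemoryUsage(TidStore *ts)
    {
        if (TidStoreIsShared(ts))
            return dsa_get_total_size(ts->area);
        else
            return MemoryContextMemAllocated(ts->context, true);
    }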

I still like my idea quoted at the top of this message -- at least for
vacuum and m_w_m. It's still not completely clear that it's right, but
I've got nothing better. It also ignores the work_mem issue, but I've
given up on anticipating all future cases for the moment.
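
For vacuum, the reordering would look roughly like this (hypothetical
names, and I invented the progress parameter for illustration):

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        /* report memory used so far, i.e. by all *previous* pages */
        pgstat_progress_update_param(PROGRESS_VACUUM_DEAD_TUPLE_BYTES,
                                     TidStoreMemoryUsage(dead_items));

        /* a block allocated for the last page may have pushed us over
         * the limit -- detect that here, at the top of the loop */
        if (TidStoreIsFull(dead_items))
            break;          /* go do a round of index vacuuming */

        /* only now add this page's tids, possibly allocating a new,
         * mostly-unwritten block */
        TidStoreSetBlockOffsets(dead_items, blkno, offsets, noffsets);
    }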

I'll put this item and a couple other things together in a separate email
tomorrow.

[1] https://www.postgresql.org/message-id/20220704211822.kfxtzpcdmslzm2dy%40awork3.anarazel.de
[2] https://www.postgresql.org/message-id/20220704220038.at2ane5xkymzzssb%40awork3.anarazel.de

--
John Naylor
EDB: http://www.enterprisedb.com
