Re: Adding REPACK [concurrently]

From: David Klika <david(dot)klika(at)atlas(dot)cz>
To: alvherre(at)alvh(dot)no-ip(dot)org, ah(at)cybertec(dot)at
Cc: jian(dot)universality(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, mihailnikalayeu(at)gmail(dot)com, rob(at)xzilla(dot)net
Subject: Re: Adding REPACK [concurrently]
Date: 2025-12-04 15:17:51
Message-ID: 84a6d065-1dc3-4b37-af7b-75904d967ab4@atlas.cz
Lists: pgsql-hackers

Hello

Great to hear about this feature.

You speak about a table rewrite (I assume a whole-table rewrite). I would
like to share an idea for an alternative approach that also takes into
account the amount of WAL generated during the operation. It is applicable
to the non-clustered case only.

Let's consider a large table where 80% of the blocks are fine (filled
enough with live tuples). The table could be scanned from the beginning
(left side) to identify "not filled enough" blocks, and at the same time
from the end (right side) to move live tuples into the blocks identified
by the left-side scan. The work is done when both scans reach the same
position.

Example:

_ stands for filled enough blocks

D stands for blocks with (many) dead tuples

123456789
___DD____

The left scan identifies page #4, and the tuples from the right scan
(page #9) are moved there. The same happens with the tuples from #8,
which go to #5. Two pages are trimmed from the data file and (only)
pages #4 and #5 are written to WAL; the others are untouched.
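
To make the idea concrete, here is a minimal sketch in Python that only
simulates the two scans. It is not PostgreSQL code: each page is reduced
to a count of live tuples, all tuples are assumed to have the same size,
and indexes, visibility and locking are ignored. The names compact_table,
capacity and min_fill are made up for this illustration.

```python
def compact_table(live, capacity, min_fill):
    """Simulate the two-sided scan on a list of live-tuple counts.

    live[i] is the number of live tuples on page #(i + 1); dead tuples
    are assumed to be pruned already, so only free slots matter.
    Returns (pages after compaction, pages that received tuples,
    number of pages trimmed from the end).
    """
    pages = list(live)
    left, right = 0, len(pages) - 1
    rewritten = set()

    while left < right:
        if pages[left] >= min_fill:
            left += 1          # filled enough, leave this page untouched
            continue
        if pages[right] == 0:
            right -= 1         # tail page fully drained, step to the left
            continue
        # Move as many tuples as fit from the tail page to the left page.
        moved = min(capacity - pages[left], pages[right])
        pages[left] += moved
        pages[right] -= moved
        rewritten.add(left + 1)

    # Empty pages at the end of the file can be trimmed.
    trimmed = 0
    while pages and pages[-1] == 0:
        pages.pop()
        trimmed += 1

    return pages, sorted(rewritten), trimmed


# The 9-page example: pages #4 and #5 hold only 2 live tuples each.
pages, rewritten, trimmed = compact_table(
    [8, 8, 8, 2, 2, 8, 8, 8, 8], capacity=10, min_fill=8
)
print(pages, rewritten, trimmed)
```

With these assumed numbers the sketch fills pages #4 and #5 from pages
#9 and #8 and trims two pages, matching the example above.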

Regards
David
