Quick Links

Re: Dead code with short varlenas in toast_save_datum()

From:	Nikhil Kumar Veldanda <veldanda(dot)nikhilkumar17(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Dead code with short varlenas in toast_save_datum()
Date:	2025-08-16 09:34:02
Message-ID:	CAFAfj_GjENTM1v_LAN+1AEhwuCEQj2kEkE_mBzQ2MBj+2YTggw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Aug 14, 2025 at 8:32 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
>
> First, in the SQL test. The trick where you are using a PLAIN storage
> to not allocate a chunk_id on initial storage with a value large
> enough to force TOAST on rewrite, while the value is small enough to
> fit on a single page, is a nice one. We could have used a \gset as
> well with a toasted value, but that won't change the fact that we
> check for a new value allocated. The location in strings.sql feels
> incorrect because this is a rewrite issue, so I have moved the test to
> vacuum.sql and applied a slightly-tweaked result. A second thing I
> have added is a test to make sure that the same chunk_id is reused
> after the rewrite. That's also worth tracking and cheap, covering the
> non-InvalidOid case.
>

Thank you, Michael, for adjusting the change and merging it.

> With the isolation test, the case is different, and it looks like the
> test is incomplete: we want to make sure that the new chunk IDs are
> the same before and after, but we cannot use \gset in this context.
> What I would suggest is to create an intermediate table storing the
> contents we want to compare after the CLUSTER, with a CTAS that stores
> the primary key of cluster_toast_value_reuse.id and the chunk_id
> associated to each row. Then, after the CLUSTER, we join the pkey
> values in the CTAS table and cluster_toast_value_reuse, compare their
> chunk IDs and they should match. The test triggers 29 times the
> todo=0 code path, as far as I can see, but we should not need that
> many tuples with generate_series(), no? If the test is written so as
> we compare the old and new chunk IDs with the pkey values, the number
> of tuples does not matter much, but that would be a bit cheaper to
> run. Could you update the isolation test to do something among these
> lines?
> --

Thanks for the guidance. I’ve updated the isolation test to use a CTAS
capturing (id, chunk_id) pre-CLUSTER and added a post-CLUSTER join to
verify the chunk IDs match. I also reduced the number of tuples to 1.

Please find the attached patch for your review. Thanks

--
Nikhil Veldanda

Attachment	Content-Type	Size
v1-0001-Add-isolation-test-for-TOAST-value-deduplication-.patch	application/octet-stream	5.6 KB

In response to

Re: Dead code with short varlenas in toast_save_datum() at 2025-08-15 03:32:42 from Michael Paquier

Responses

Re: Dead code with short varlenas in toast_save_datum() at 2025-08-17 06:26:20 from Michael Paquier

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kirill Reshke	2025-08-16 09:43:20	Re: Retail DDL
Previous Message	Tatsuo Ishii	2025-08-16 09:33:29	Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options