From: | Nikhil Kumar Veldanda <veldanda(dot)nikhilkumar17(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Dead code with short varlenas in toast_save_datum() |
Date: | 2025-08-16 09:34:02 |
Message-ID: | CAFAfj_GjENTM1v_LAN+1AEhwuCEQj2kEkE_mBzQ2MBj+2YTggw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Aug 14, 2025 at 8:32 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
>
> First, in the SQL test. The trick where you are using a PLAIN storage
> to not allocate a chunk_id on initial storage with a value large
> enough to force TOAST on rewrite, while the value is small enough to
> fit on a single page, is a nice one. We could have used a \gset as
> well with a toasted value, but that won't change the fact that we
> check for a new value allocated. The location in strings.sql feels
> incorrect because this is a rewrite issue, so I have moved the test to
> vacuum.sql and applied a slightly-tweaked result. A second thing I
> have added is a test to make sure that the same chunk_id is reused
> after the rewrite. That's also worth tracking and cheap, covering the
> non-InvalidOid case.
>
Thank you, Michael, for adjusting the change and merging it.
> With the isolation test, the case is different, and it looks like the
> test is incomplete: we want to make sure that the new chunk IDs are
> the same before and after, but we cannot use \gset in this context.
> What I would suggest is to create an intermediate table storing the
> contents we want to compare after the CLUSTER, with a CTAS that stores
> the primary key of cluster_toast_value_reuse.id and the chunk_id
> associated to each row. Then, after the CLUSTER, we join the pkey
> values in the CTAS table and cluster_toast_value_reuse, compare their
> chunk IDs and they should match. The test triggers 29 times the
> todo=0 code path, as far as I can see, but we should not need that
> many tuples with generate_series(), no? If the test is written so as
> we compare the old and new chunk IDs with the pkey values, the number
> of tuples does not matter much, but that would be a bit cheaper to
> run. Could you update the isolation test to do something among these
> lines?
> --
Thanks for the guidance. I’ve updated the isolation test to use a CTAS
capturing (id, chunk_id) pre-CLUSTER and added a post-CLUSTER join to
verify the chunk IDs match. I also reduced the number of tuples to 1.
Please find the attached patch for your review. Thanks
--
Nikhil Veldanda
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Add-isolation-test-for-TOAST-value-deduplication-.patch | application/octet-stream | 5.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Kirill Reshke | 2025-08-16 09:43:20 | Re: Retail DDL |
Previous Message | Tatsuo Ishii | 2025-08-16 09:33:29 | Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options |