Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)

From: "Burd, Greg" <greg(at)burd(dot)me>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nikita Malakhov <hukutoc(at)gmail(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)
Date: 2025-07-08 16:58:26
Message-ID: 783176F8-1F6C-444C-8997-6DD05A4A3B10@burd.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On Jul 7, 2025, at 7:38 PM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> I have also pushed this v2 on this branch, so feel free to grab it if
> that makes your life easier:
> https://github.com/michaelpq/postgres/tree/toast_64bit_v2
> --
> Michael

Thank you for spending time digging into this and for the well structured patch set (and GitHub branch which I personally find helpful). This $subject is important on its own, but even more so in the broader context of the zstd/dict work [1] and also allowing for innovation when it comes to how externalized Datum are stored. The current model for toasting has served the community well for years, but I think that Hannu [2] and Nikita and others have promising ideas that should be allowable without forcing core changes. I've worked a bit in this area too, I re-based the Pluggble TOAST work by Nikita [3] onto 18devel earlier this year as I was looking for a way to implement a toaster for a custom type.

All that aside, I think you're right to tackle this one step at a time and try not to boil too much of the ocean at once (the patch set is already large enough). With that in mind I've read once or twice over your changes and have a few basic comments/questions.

v2-0001 Refactor some TOAST value ID code to use uint64 instead of Oid

This set of changes make sense and as you say are mechanical in nature, no real comments other than I think that using uint64 rather than Oid is the right call and addresses #2 on Tom's list.

v2-0002 Minimize footprint of TOAST_MAX_CHUNK_SIZE in heap TOAST code

I like this as well, clarifies the code and reduces repetition.

v2-0003 Refactor external TOAST pointer code for better pluggability

+ * For now there are only two types, all defined in this file. For now this
+ * is the maximum value of vartag_external, which is a historical choice.

This provides a bridge for compatibility, but doesn't open the door to a truly pluggable API. I'm guessing the goal is incremental change rather than wholesale rewrite.

+ * The different kinds of on-disk external TOAST pointers. divided by
+ * vartag_external.

Extra '.' in "TOAST pointers. divided" I'm guessing.

v2-0004 Introduce new callback to get fresh TOAST values
v2-0005 Add catcache support for INT8OID
v2-0006 Add GUC default_toast_type

Easily understood, good progression of supporting changes.

v2-0007 Introduce global 64-bit TOAST ID counter in control file

Do you have any concern that this might become a bottleneck when there are many relations and many backends all contending for a new id? I'd imagine that this would show up in a flame graph, but I think your test focused on the read side detoast_attr_slice() rather than insert/update and contention on the shared counter. Would this be even worse on NUMA systems?

v2-0008 Switch pg_column_toast_chunk_id() return value from oid to bigint
v2-0009 Add support for bigint TOAST values
v2-0010 Add tests for TOAST relations with bigint as value type
v2-0011 Add support for TOAST table types in pg_dump and pg_restore
v2-0012 Add new vartag_external for 8-byte TOAST values
V2-0013 amcheck: Add test cases for 8-byte TOAST values

I read through each of these patches, I like the break down and the attention to detail. The inclusion of good documentation at each step is helpful. Thank you.

Thanks for the flame graphs examining a heavy detoast_attr_slice() workload. I agree that there is little or no difference between them which is nice.

I think the only call out Tom made [4] that isn't addressed was the ask for localized ID selection. That may make sense at some point, especially if there is contention on GetNewToastId(). I think that case is worth a separate performance test, something with a large number of relations and backends all performing a lot of updates generating a lot of new IDs. What do you think?

As for adding even more flexibility, I see the potential to move in that direction over time with this as a good focused incremental set of changes that address a few important issues now.

Really excited by this work, thank you.

-greg

[1] https://commitfest.postgresql.org/patch/5702/
[2] "Yes, the idea is to put the tid pointer directly in the varlena
external header and have a tid array in the toast table as an extra
column. If all of the TOAST fits in the single record, this will be
empty, else it will have an array of tids for all the pages for this
toasted field." - Hannu Krosing in an email to me after PGConf.dev/2025
[3] https://postgr.es/m/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru
[4] https://postgr.es/m/764273.1669674269%40sss.pgh.pa.us

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2025-07-08 17:43:58 Re: Remove redundant comment regarding RelationBuildRowSecurity in relcache.c
Previous Message Álvaro Herrera 2025-07-08 16:40:28 Re: Inconsistent LSN format in pg_waldump output