Re: Shared detoast Datum proposal

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: Andy Fan <zhihuifan1213(at)163(dot)com>, Michael Zhilin <m(dot)zhilin(at)postgrespro(dot)ru>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Shared detoast Datum proposal
Date: 2024-03-07 11:44:41
Message-ID: ad7c22ce-0e99-45a7-a843-05c7ecdc2826@enterprisedb.com
Lists: pgsql-hackers

On 3/7/24 08:33, Nikita Malakhov wrote:
> Hi!
>
> Tomas, I thought about this issue -
>> What if you only need the first slice? In that case decoding everything
>> will be a clear regression.
> And completely agree with you, I'm currently working on it and will post
> a patch a bit later.

Cool. I don't plan to work on improving my patch - it was merely a PoC,
so if you're interested in working on that, that's good.

> Another issue we have here - if multiple slices are requested,
> then we have to update the cached slice from both sides, the
> beginning and the end.
>

No opinion. I haven't thought much about how to handle slices.

> On update, yep, you're right
>> Surely the update creates a new TOAST value, with a completely new TOAST
>> pointer, new rows in the TOAST table etc. It's not updated in-place. So
>> it'd end up with two independent entries in the TOAST cache.
>
>> Or are you interested just in evicting the old value simply to free the
>> memory, because we're not going to need that (now invisible) value? That
>> seems interesting, but if it's too invasive or complex I'd just leave it
>> up to the LRU eviction (at least for v1).
> Again, yes, we do not need the old value after it was updated, and
> it is better to take care of it explicitly. It's a simple and
> non-invasive addition to your patch.
>

OK
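One way to realize that (purely hypothetical names; the actual patch may structure this differently) would be a hook in the UPDATE path that drops the cache entry keyed by the old TOAST pointer, instead of waiting for LRU eviction:

```
/*
 * Hypothetical sketch, not PostgreSQL code: the old value's TOAST
 * pointer (toastrelid, valueid) will never be read again once the
 * update commits, so evict it from the cache eagerly.
 */
if (old_value_was_toasted)
	toast_cache_remove(old_toastrelid, old_valueid);	/* hypothetical */
```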

> I cannot agree with you about large values - it makes sense
> to cache large values because the larger a value is, the more
> benefit we get from not detoasting it again and again. One way
> is to keep them compressed, but what if we have data with a very
> low compression rate?
>

I'm not against caching large values, but I also think it's not
difficult to construct cases where it's a losing strategy.

> I also changed the HASHACTION field value from HASH_ENTER to
> HASH_ENTER_NULL to deal gracefully with the case where we do not
> have enough memory, and added checks for whether the value was
> stored or not.
>

I'm not sure about this. HASH_ENTER_NULL is used in only three very
specific places (two in lock management, one in ShmemInitStruct). This
hash table is not unique in any way, so why not use HASH_ENTER like 99%
of the other hash tables?
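For reference, the difference between the two idioms (dynahash, PostgreSQL backend code, so not standalone-compilable; `toast_cache` and `key` are placeholder names):

```
/* HASH_ENTER: allocation failure raises an out-of-memory ERROR,
 * so the returned entry is never NULL. */
entry = hash_search(toast_cache, &key, HASH_ENTER, &found);

/* HASH_ENTER_NULL: allocation failure returns NULL instead,
 * leaving it to the caller to cope. */
entry = hash_search(toast_cache, &key, HASH_ENTER_NULL, &found);
if (entry == NULL)
	return;		/* value simply doesn't get cached */
```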

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
