Re: Shared detoast Datum proposal

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Nikita Malakhov <hukutoc(at)gmail(dot)com>, Michael Zhilin <m(dot)zhilin(at)postgrespro(dot)ru>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Shared detoast Datum proposal
Date: 2024-02-26 15:29:33
Message-ID: 875xyb1a6q.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 2/26/24 14:22, Andy Fan wrote:
>>
>>...
>>
>>> Also, toasted values
>>> are not always being used immediately and as a whole, i.e. jsonb values are fully
>>> detoasted (we're working on this right now) to extract the smallest value from
>>> big json, and these values are not worth keeping in memory. For text values too,
>>> we often do not need the whole value to be detoasted.
>>
>> I'm not sure how often this is true, espeically you metioned text data
>> type. I can't image why people acquire a piece of text and how can we
>> design a toasting system to fulfill such case without detoast the whole
>> as the first step. But for the jsonb or array data type, I think it is
>> more often. However if we design toasting like that, I'm not sure if it
>> would slow down other user case. for example detoasting the whole piece
>> use case. I am justing feeling both way has its user case, kind of heap
>> store and columna store.
>>
>
> Any substr/starts_with call benefits from this optimization, and such
> calls don't seem that uncommon. I certainly can imagine people doing
> this fairly often.

This leads me to pay more attention to pg_detoast_datum_slice user case
which I did overlook and caused some regression in this case. Thanks!

As you said later:

> Is there a way to identify cases that are likely to benefit from this
> slicing, and disable the detoasting for them? We already disable it for
> other cases, so maybe we can do this here too?

I think the answer is yes, if our goal is to disable the whole toast for
some speical calls like substr/starts_with. In the current patch, we have:

/*
* increase_level_for_pre_detoast
* Check if the given Expr could detoast a Var directly, if yes,
* increase the level and return true. otherwise return false;
*/
static inline void
increase_level_for_pre_detoast(Node *node, intermediate_level_context *ctx)
{
/* The following nodes is impossible to detoast a Var directly. */
if (IsA(node, List) || IsA(node, TargetEntry) || IsA(node, NullTest))
{
ctx->level_added = false;
}
else if (IsA(node, FuncExpr) && castNode(FuncExpr, node)->funcid == F_PG_COLUMN_COMPRESSION)
{
/* let's not detoast first so that pg_column_compression works. */
ctx->level_added = false;
}

while it works, but the blacklist looks not awesome.

> In any case, it's a long-standing behavior /
> optimization, and I don't think we can just dismiss it this quickly.

I agree. So I said both have their user case. and when I say the *both*, I
mean the other method is "partial detoast prototype", which Nikita has
told me before. while I am sure it has many user case, I'm also feeling
it is complex and have many questions in my mind, but I'd like to see
the design before asking them.

> Or maybe there's a way to do the detoasting lazily, and only keep the
> detoasted value when it's requesting the whole value. Or perhaps even
> better, remember what part we detoasted, and maybe it'll be enough for
> future requests?
>
> I'm not sure how difficult would this be with the proposed approach,
> which eagerly detoasts the value into tts_values. I think it'd be
> easier to implement with the TOAST cache I briefly described ...

I can understand the benefits of the TOAST cache, but the following
issues looks not good to me IIUC:

1). we can't put the result to tts_values[*] since without the planner
decision, we don't know if this will break small_tlist logic. But if we
put it into the cache only, which means a cache-lookup as a overhead is
unavoidable.

2). It is hard to decide which entry should be evicted without attaching
it to the TupleTableSlot's life-cycle. then we can't grantee the entry
we keep is the entry we will reuse soon?

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-02-26 16:08:33 Re: Shared detoast Datum proposal
Previous Message Tom Lane 2024-02-26 15:20:58 Re: Optimize planner memory consumption for huge arrays