Re: Problem with accessing TOAST data in stored procedures

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problem with accessing TOAST data in stored procedures
Date: 2021-02-19 15:19:00
Message-ID: e152c4d7-ea1b-d143-f61d-d72f795238f7@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19.02.2021 11:12, Pavel Stehule wrote:
>
>
> pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> napsal:
>
>
>
> On 19.02.2021 10:47, Pavel Stehule wrote:
>>
>>
>> pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik
>> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>>
>> napsal:
>>
>>
>>
>> On 19.02.2021 10:14, Pavel Stehule wrote:
>>>
>>>
>>> pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik
>>> <k(dot)knizhnik(at)postgrespro(dot)ru
>>> <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> napsal:
>>>
>>>
>>>
>>> On 18.02.2021 20:10, Pavel Stehule wrote:
>>>> This has a negative impact on performance - and a lot
>>>> of users use procedures without transaction control. So
>>>> it doesn't look like a good solution.
>>>>
>>>> I am more concentrated on the Pg 14 release, where the
>>>> work with SPI is redesigned, and I hope so this issue
>>>> is fixed there. For older releases, I don't know. Is
>>>> this issue related to Postgres or it is related to
>>>> PgPro only? If it is related to community pg, then we
>>>> should fix and we should accept not too good
>>>> performance, because there is no better non invasive
>>>> solution. If it is PgPro issue (because there are ATX
>>>> support) you can fix it (or you can try backport the
>>>> patch
>>>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
>>>> ). You have more possibilities on PgPro code base.
>>>
>>> Sorry, it is not PgPro specific problem and recent
>>> master suffers from this bug as well.
>>> In the original bug report there was simple scenario of
>>> reproducing the problem:
>>>
>>> CREATE TABLE toasted(id serial primary key, data text);
>>> INSERT INTO toasted(data) VALUES((SELECT
>>> string_agg(random()::text,':') FROM generate_series(1,
>>> 1000)));
>>> INSERT INTO toasted(data) VALUES((SELECT
>>> string_agg(random()::text,':') FROM generate_series(1,
>>> 1000)));
>>> DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data
>>> FROM toasted LOOP INSERT INTO toasted(data)
>>> VALUES(v_r.data);COMMIT;END LOOP;END;$$;
>>>
>>>
>>> can you use new procedure_resowner?
>>>
>> Sorry, I do not understanf your suggestion.
>> How procedure_resowner can help to solve this problem?
>>
>>
>> This is just an idea - I think the most correct with zero
>> performance impact is keeping snapshot, and this can be stored in
>> procedure_resowner.
>>
>> The fundamental question is if we want or allow more snapshots
>> per query. The implementation is a secondary issue.
>
> I wonder if it is correct from logical point of view.
> If we commit transaction in stored procedure, then we actually
> implicitly start new transaction.
> And new transaction should have new snapshot. Otherwise its
> behavior will change.
>
>
> I have no problem with this. I have a problem with cycle
> implementation - when I iterate over some result, then this result
> should be consistent over all cycles.  In other cases, the behaviour
> is not deterministic.

I have investigated more the problem with toast data in  stored
procedures and come to very strange conclusion:
to fix the problem it is enough to pass expand_external=false to
expanded_record_set_tuple instead of !estate->atomic:

                                {
                                        /* Only need to assign a new
tuple value */
expanded_record_set_tuple(rec->erh, tuptab->vals[i],
- true, !estate->atomic);
+ true, false);
                                }

Why it is correct?
Because in assign_simple_var we already forced detoasting for data:

    /*
     * In non-atomic contexts, we do not want to store TOAST pointers in
     * variables, because such pointers might become stale after a commit.
     * Forcibly detoast in such cases.  We don't want to detoast (flatten)
     * expanded objects, however; those should be OK across a transaction
     * boundary since they're just memory-resident objects. (Elsewhere in
     * this module, operations on expanded records likewise need to request
     * detoasting of record fields when !estate->atomic. Expanded
arrays are
     * not a problem since all array entries are always detoasted.)
     */
    if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
        VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
    {
        MemoryContext oldcxt;
        Datum        detoasted;

        /*
         * Do the detoasting in the eval_mcontext to avoid long-term
leakage
         * of whatever memory toast fetching might leak.  Then we have
to copy
         * the detoasted datum to the function's main context, which is a
         * pain, but there's little choice.
         */
        oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
        detoasted = PointerGetDatum(detoast_external_attr((struct
varlena *) DatumGetPointer(newvalue)));

So, there is no need to initialize TOAST snapshot and "no known
snapshots" error is false alarm.
What do you think about it?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-02-19 15:26:20 Re: Some regular-expression performance hacking
Previous Message Jan Wieck 2021-02-19 15:13:36 Re: Extensibility of the PostgreSQL wire protocol