Re: [BUG]Update Toast data failure in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG]Update Toast data failure in logical replication
Date: 2022-02-10 02:14:22
Message-ID: CAA4eK1+VGApXZ5sEyn-3O7nos+Jx_cGAUbukU=khyDZCreM9MA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 9, 2022 at 11:08 AM tanghy(dot)fnst(at)fujitsu(dot)com
<tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tue, Feb 8, 2022 3:18 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote:
> > > Right, and it is getting changed. We are just printing the first 200
> > > characters (by using SQL [1]) from the decoded tuple so what is shown
> > > in the results is the initial 200 bytes.
> >
> > Ah, I knew I must have been missing something.
> >
> >
> > > The complete decoded data after the patch is as follows:
> >
> > Hm. I think we should change the way the strings are shortened - otherwise we
> > don't really verify much in that test. Perhaps we could just replace the long
> > repetitive strings with something shorter in the output?
> >
> > E.g. using something like regexp_replace(data,
> > '(1234567890|9876543210){200}', '\1{200}','g')
> > inside the substr().
> >
> > Wonder if we should deduplicate the number of different toasted strings in the
> > file to something that'd allow us to have a single "redact_toast" function or
> > such. There are too many different ones to have a reasonably simple redaction
> > function right now. But that's perhaps better done separately.
> >
>
> I tried to make the output shorter using your suggestion, as in the following SQL.
> Please see the attached patch, which is based on the v8 patch[1].
>
> SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}','g'), 1, 200) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>
> Note that some strings are still longer than 200 characters even after being
> shortened, so they can't be shown entirely.
>
> e.g.
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[te
>
> The entire string is:
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}'
>
> Maybe it's better to change the substr length to 250 to show the entire string, or we
> can do it as a separate HEAD-only improvement where we can deduplicate some of the
> other long strings as well. Thoughts?
>

I think it is better to do this as a separate HEAD-only improvement, as
it can affect the results of other tests. We can also try to deduplicate
some of the other long strings used in the toast.sql file along with it.
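
For anyone who wants to check the lengths discussed above without
setting up a replication slot, here is a rough Python sketch. It is not
part of the patch. The helper name redact_toast and the reconstructed
raw row are assumptions made for illustration; only the pattern, the
replacement, and the redacted output come from the quoted SQL and
example.

```python
import re

# Same pattern and replacement as in the quoted SQL.
PATTERN = re.compile(r"(1234567890|9876543210){200}")


def redact_toast(data: str) -> str:
    """Collapse each 200-fold repetition into '<digits>{200}' (hypothetical helper)."""
    return PATTERN.sub(r"\1{200}", data)


# Assumed raw decoded row, rebuilt from the redacted example in the thread.
raw = (
    "table public.toasted_key: UPDATE: old-key: "
    "toasted_key[text]:'" + "1234567890" * 200 + "' "
    "new-tuple: id[integer]:1 "
    "toasted_key[text]:unchanged-toast-datum "
    "toasted_col1[text]:unchanged-toast-datum "
    "toasted_col2[text]:'" + "9876543210" * 200 + "'"
)

redacted = redact_toast(raw)

print(len(redacted))           # length of the row after redaction
print(redacted[:200])          # what substr(..., 1, 200) would keep
print(redacted[:250] == redacted)  # whether 250 characters cover the whole row
```

For this one row the redacted string is longer than 200 characters but
shorter than 250, which is why substr(..., 1, 200) cuts it off partway
through the last column. Other rows in toast.sql may behave differently.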

--
With Regards,
Amit Kapila.
