From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-01-28 05:57:47
Message-ID: CAA4eK1Li8L8-Ja-5d9DixV7Mwk6HJW3Z4rOzk0hYJ7PqY5XC-A@mail.gmail.com

On Wed, Jan 22, 2020 at 10:30 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Jan 14, 2020 at 10:44 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >
> > Hmm, I think this can turn out to be inefficient because we can easily
> > end up spilling the data even when we don't need to do so. Consider
> > cases where part of the streamed changes are for toast and the
> > remaining ones are changes we would have streamed and hence can be
> > removed. In such cases, we could have easily consumed the remaining
> > changes for toast without spilling. Also, I am not sure if spilling
> > changes from the hash table is a good idea, as they are no longer in
> > the same order as they were in the ReorderBuffer, which means the
> > order in which we would normally serialize the changes would change.
> > That might have some impact, so we would need some more study if we
> > want to pursue this idea.
> I have fixed this bug and attached it as a separate patch. I will
> merge it into the main patch once we agree on the idea and after some
> more testing.
>
> The idea is that whenever we get a toasted chunk, instead of directly
> inserting it into the toast hash, I insert it into a local list, so
> that if we never get the change for the main table we can insert
> these changes back into txn->changes. Once we do get the change for
> the main table, I prepare the hash table to merge the chunks.
>
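
To make sure I am reading the patch correctly, below is a minimal
sketch of my understanding of the buffering scheme. The
pending_toast_chunks field and the toast_hash_insert_chunk helper are
illustrative names of mine, not identifiers from the patch:

#include "postgres.h"

#include "lib/ilist.h"
#include "replication/reorderbuffer.h"

static void
buffer_toast_chunk(ReorderBufferTXN *txn, ReorderBufferChange *change)
{
	/* Park the chunk in a per-txn list; don't build the toast hash yet. */
	dlist_push_tail(&txn->pending_toast_chunks, &change->node);
}

static void
merge_pending_toast_chunks(ReorderBufferTXN *txn)
{
	dlist_mutable_iter iter;

	/* The main-table change has arrived; move the chunks into the hash. */
	dlist_foreach_modify(iter, &txn->pending_toast_chunks)
	{
		ReorderBufferChange *chunk =
			dlist_container(ReorderBufferChange, node, iter.cur);

		dlist_delete(&chunk->node);
		toast_hash_insert_chunk(txn, chunk);	/* hypothetical helper */
	}
}

If we never see the main-table change, the parked chunks instead have
to be pushed back onto txn->changes, which is where the repeated
remove/add cost in (b) below comes from.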

I think this idea will work but appears to be quite costly because (a)
you might need to serialize/deserialize the changes multiple times and
might attempt streaming multiple times even though you can't do so,
and (b) you need to remove/add the same set of changes from the main
list multiple times.

It seems to me that we need to add all of this new handling because,
at the point where we decide whether to stream or not, we don't know
whether the txn has changes that can't be streamed. One idea to make
it work is to identify this while decoding the WAL. I think we need
to set a bit in the insert/delete WAL record to identify whether the
tuple belongs to a toast relation. This won't add any additional
overhead to the WAL, will remove a lot of complexity from logical
decoding, and will also make decoding efficient. If this is feasible,
then we can do the same for speculative insertions.
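
To be concrete, something like the sketch below is what I have in
mind; the flag name, bit position, and the decode-side helper are my
assumptions for illustration, not from any applied patch:

/* hypothetical addition to src/include/access/heapam_xlog.h */
#define XLH_INSERT_ON_TOAST_RELATION	(1 << 4)

/* in heap_insert(), while filling in xl_heap_insert.flags */
if (RelationIsToastRelation(relation))
	xlrec.flags |= XLH_INSERT_ON_TOAST_RELATION;

/*
 * Then, in DecodeInsert(), the decoder can cheaply mark the
 * transaction as containing changes that can't be streamed on their
 * own.  ReorderBufferMarkTXNHasToast() is a hypothetical helper.
 */
if (xlrec->flags & XLH_INSERT_ON_TOAST_RELATION)
	ReorderBufferMarkTXNHasToast(ctx->reorder, XLogRecGetXid(r));

This way, the reorder buffer knows whether a transaction contains
toast chunks before it picks that transaction for streaming.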

In patch v8-0013-Bugfix-handling-of-incomplete-toast-tuple, why is
the change below required?

--- a/contrib/test_decoding/logical.conf
+++ b/contrib/test_decoding/logical.conf
@@ -1,3 +1,4 @@
wal_level = logical
max_replication_slots = 4
logical_decoding_work_mem = 64kB
+logging_collector=on

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
