Re: AtEOXact_ApplyLauncher() and subtransactions

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: AtEOXact_ApplyLauncher() and subtransactions
Date: 2018-07-05 10:07:31
Message-ID: CAJ3gD9c2xA6e_ZV23tTMA+o370rB__wB36xSS5FvVP11Ee_zow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4 July 2018 at 00:27, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 26, 2018 at 6:25 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>> Added this into the July 2018 commitfest :
>>
>> https://commitfest.postgresql.org/18/1696/
>
> It seems to me that it would probably be better to separate this into
> two patches, because I think there are really two separate issues.

Ok, will do that.

> With regard to the lack of proper subtransaction handling, I think it
> would be best if we could avoid introducing a new AtSubStart function.
> I wrote a patch for this issue that works that uses a slightly
> different kind of stack than yours, which I have attached to this
> email, and it creates stack levels lazily so that it doesn't need an
> AtSubStart function.

Yes, I agree that if we can avoid this function, that would be good.
Couldn't find a proper way to do this. Will have a look at your patch.

> As for the other part of your fix, which I think basically boils down
> to comparing the final states instead of just looking at what got
> changed, the logic looks complicated and I don't think I fully
> understand it.

Once I split the patch, let me try to add up some comments to make it clearer.

> but here are a few comments.
>
> + subrelids = GetSubscriptionRelids(sub->oid,
> + committed_subrels ?
> + CurrentMemoryContext :
> TopTransactionContext);
>
> This looks ugly and dangerous. In the first place, if
> GetSubscriptionRelids() needs to work in one of several memory
> contexts, the best thing would probably be for the caller to be
> responsible for saving and restoring the memory context as needed,
> rather than passing it as an argument.

I wanted to make the context switch only around the code which
allocates the list. I understand that all the other code in
GetSubscriptionRelids() such as heap_open() , systable_beginscan() etc
would not be affected if we run that in TopTransactionContext, because
that all gets freed in the cleanup functions at the bottom of the
function, but still I thought this way would be more robust for
future, in case there is some new code that allocates some temporary
memory which is not freed.

> Secondly, it's not very clear
> why we need to do this. The comment says we have to do it, but it
> doesn't give a reason.

When committed_subrels is NULL, it means this is the first time we are
running the ALTER command since the main transaction started, and we
want to store the committed subrels list in the transaction context,
so that the subsequent ALTER commands compare with this list. May be
when I split the patch, I will come up with an example.

>
> + * Merge the current list into the immediate parent.
> + * So say, parent has sub1(tab1, tab2),
> sub2(tab2, tab3),
> + * and current on_commit_workers has
> sub2(tab4) and sub3(tab1),
> + * then the merged list will have :
> + * sub1(tab1, tab2), sub2(tab4), sub3(tab1)
>
> I don't think this is very clear. Also, why is that the correct
> answer? Why not sub2(tab2, tab3, tab4)?

Parent sub-transaction has sub2(tab2, tab3), meaning that when the
subtransaction sub2 commits (and when the COMMIT happens), we should
stop the worker for (subscription sub2, table tab2) , and the worker
for (subscription sub2, table tab3).

And the current sub-transaction has modified the subscription sub2
such that the workers to be stopped are sub2(tab4). Note that this the
final list of workers to be stopped, regardless of the earlier ALTER
SUBSCRIPTION commands having run in the upper subtransaction. So this
list should completely replace the parent's sub2(tab2, tab3). We
always compare the subscription tables with the ones that are
committed, not with the ones that were last altered in the same
transaction. Hence the other patch.

>
> + foreach(lc, on_commit_stop_workers)
> + {
> + SubscriptionRels *subrels = lfirst(lc);
> + ListCell *lc1;
> +
> + /* Search this subrel in the subrels
> of the top of stack. */
> + foreach(lc1,
> subtrans_stop_workers->stop_workers)
>
> This might be very expensive if both lists are long. I guess that's
> probably not very likely. You would need to have modify a lot of
> subscriptions and then, within a subtransaction, modify a lot of
> subscriptions again.

Yes that's what I also think.

>
> + foreach(lc, committed_subrels_list)
> + {
> + SubscriptionRels *subrels = (SubscriptionRels *) lfirst(lc);
> +
> + if (sub->oid == subrels->subid)
> + {
> + committed_subrels = subrels;
> + break;
> + }
> + }
>
> This looks to be O(n^2) in the number of subscriptions modified in a
> single transaction.
>
> Like Peter, I'm not entirely sure how much we should worry about this
> problem.

At the upper sections, I have explained it a bit. Basically, without
this second patch, the main patch won't work if we have the same
subscription altered more than once in the sub-transaction stack.

> However, if we're going to worry about it, I wonder if we
> should worry harder and try to come up with a better data structure
> than a bunch of linked lists. Maybe a hash table indexed by subid?

I had thought about using hash to search a subscription id, but then
thought in practice, there would not be too many subscriptions in the
same transaction. Hence continued to use the existing linked list data
structure; just that instead of a single linked list having
everything, I thought let's keep a list of tables belonging to the
same subscription in one linked list. The reason for this was that it
is easy to abandon or merge a subscription table list when a
sub-transaction rolls back or commits.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-07-05 10:37:04 Re: TupleTableSlot abstraction
Previous Message Abinaya k 2018-07-05 09:55:47 Regarding shared_preload_libraries (postgresql.conf file)