Re: Race-like failure in recovery/t/009_twophase.pl

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Race-like failure in recovery/t/009_twophase.pl
Date: 2017-07-02 15:54:15
Message-ID: 28238.1499010855@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Part of the reason I'm confused is that the programming technique
>> being used in 009_twophase.pl, namely doing
>> ($node_master, $node_slave) = ($node_slave, $node_master);
>> and then working with the reversed variable names, is ENTIRELY TOO CUTE
>> FOR ITS OWN GOOD.

> This is my fault. I noticed this in the submitted test (and was pretty
> confused about it too) when I reviewed it for commit, but somehow it
> didn't reach my threshold to require a rewrite. I'll fix it sometime
> during the week.

I'd kind of like to fix it now, so I can reason in a less confused way
about the actual problem. Last night I didn't have a clear idea of how
to make it better, but what I'm thinking this morning is:

* Naming the underlying server objects "master" and "slave" is just
wrong, because the script makes them swap those roles repeatedly.
Better to choose neutral names, like "alice" and "bob" or "london"
and "paris".

* We could simply make the test script refer directly to the appropriate
server at each step, ie s/node_master/node_london/ in relevant parts of
the script and s/node_slave/node_london/ elsewhere. Maybe that's the
best way, but there is some value in identifying commands as to whether
we're issuing them to the current master or the current slave. Plan B
might be to do
($cur_master, $cur_slave) = ($node_paris, $node_london);
at the swap points and use those names where it seems clearer.

* Some effort should be put into emitting text to the log showing
what's going on, eg print("Now london is master."); as appropriate.

* Another reason why I had the feeling of being lost in a maze of
twisty little passages all alike was the relentless sameness of the
commands being sent to the servers. There is no good reason for the
prepared transactions to be all alike; they should be all different
so that you can match up postmaster log entries to points in the
script.

If anyone has other ideas please speak up.

Barring objection I'll go do something about this shortly.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-07-02 16:22:27 Re: Using postgres planner as standalone component
Previous Message Alvaro Herrera 2017-07-02 15:00:30 Re: Race-like failure in recovery/t/009_twophase.pl